This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-01-10, generated by the compressor, my personal LLM-based project.
Ronglai Zuo,Fangyun Wei,Zenggui Chen,Brian Mak,Jiaolong Yang,Xin Tong
http://arxiv.org/abs/2401.04730v1
Compressor summary: The paper proposes a Spoken2Sign system that translates spoken languages into 3D sign languages via a simple three-step baseline.
Xiyi Chen,Marko Mihajlovic,Shaofei Wang,Sergey Prokudin,Siyu Tang
http://arxiv.org/abs/2401.04728v1
Compressor summary: The paper presents a new diffusion model that generates photorealistic human avatars from a single image or text prompt, integrating a 3D morphable model for controllable facial expressions and body poses, and outperforming existing models on novel view and novel expression synthesis tasks.
Zeyu Wang,Xianhang Li,Hongru Zhu,Cihang Xie
http://arxiv.org/abs/2401.04727v1
Compressor summary: The paper proposes AdvXL, an efficient and effective framework for adversarial training with giant models and web-scale data, which achieves new state-of-the-art robust accuracy records under AutoAttack on ImageNet-1K; the code is available at https://github.com/UCSC-VLAA/AdvXL.
Benedikt Roth,Valentin Koch,Sophia J. Wagner,Julia A. Schnabel,Carsten Marr,Tingying Peng
http://arxiv.org/abs/2401.04720v1
Compressor summary: The study shows that foundation models in computer vision can be finetuned with minimal resources to match or outperform existing feature extractors for computational pathology tasks.
Xiaojuan Wang,Taesung Park,Yang Zhou,Eli Shechtman,Richard Zhang
http://arxiv.org/abs/2401.04718v1
Compressor summary: The paper presents a method to smooth jump cuts in talking head videos by using keypoints and landmarks from surrounding frames, interpolating them, and translating pixels with cross-modal attention.
Yunhua Zhang,Hazel Doughty,Cees G. M. Snoek
http://arxiv.org/abs/2401.04716v1
Compressor summary: This paper explores low-resource image tasks in computer vision, where foundation models struggle to transfer well due to data scarcity, fine-grained differences, and distribution shift.
Jia-Chen Gu,Hao-Xiang Xu,Jun-Yu Ma,Pan Lu,Zhen-Hua Ling,Kai-Wei Chang,Nanyun Peng
http://arxiv.org/abs/2401.04700v1
Compressor summary: The paper discusses the trade-off between improving factuality and preserving general abilities in large language models through model editing methods.
Gal Yona,Roee Aharoni,Mor Geva
http://arxiv.org/abs/2401.04695v1
Compressor summary: GRANOLA QA is a novel evaluation setting that considers multi-granularity answers to measure accuracy and informativeness of question answering models, which can improve their performance, especially for rare entities.
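One way to picture multi-granularity scoring (a hypothetical sketch of the idea, not GRANOLA's exact metric; `granola_score` is our own name): gold answers are ordered from most specific to coarsest, and a match at a finer granularity earns more credit.

```python
def granola_score(prediction, granular_answers):
    """Score a QA prediction against gold answers ordered from most
    specific to coarsest: a match at a finer granularity earns more
    credit, and a correct coarse answer still earns partial credit."""
    n = len(granular_answers)
    for i, gold in enumerate(granular_answers):
        if gold.lower() in prediction.lower():
            return (n - i) / n      # 1.0 at the most specific level
    return 0.0

gold = ["1961", "the 1960s", "the 20th century"]
exact = granola_score("He was born in 1961.", gold)              # full credit
coarse = granola_score("Sometime in the 1960s, I think.", gold)  # partial credit
wrong = granola_score("No idea.", gold)                          # no credit
```

A plain exact-match metric would give the coarse answer zero, which is exactly the informativeness-versus-accuracy tension the benchmark targets.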
Joaquim Estopinan,Maximilien Servajean,Pierre Bonnet,Alexis Joly,François Munoz
http://arxiv.org/abs/2401.04691v1
Compressor summary: This paper uses deep learning to map the global conservation status of orchids and finds high threat levels in Madagascar and Sumatra, highlighting the need for protection.
Kylliann De Santiago,Marie Szafranski,Christophe Ambroise
http://arxiv.org/abs/2401.04682v1
Compressor summary: The paper presents a novel method for aggregating multiple clusterings from different sources, using a mixture of multilayer stochastic block models (SBM) to group co-membership matrices and partition observations; it provides identifiability results and selection of the optimal number of clusters and components, compares favorably with other methods, and is applied to global food trading networks.
Sunny Howard,Peter Norreys,Andreas Döpp
http://arxiv.org/abs/2401.04680v1
Compressor summary: CoordGate is a new module for convolutional neural networks that enhances spatially-varying convolutions and improves image deblurring.
Mahdi Nikdan,Soroush Tabesh,Dan Alistarh
http://arxiv.org/abs/2401.04679v1
Compressor summary: RoSA is a new method for fine-tuning large language models that uses low-rank and sparse components to improve accuracy while saving computational resources.
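The low-rank-plus-sparse idea behind RoSA can be illustrated with a minimal NumPy sketch (our own toy code, not the authors' implementation; `rosa_forward` and the dimensions are invented): the frozen weight matrix is augmented with a trainable low-rank product and a trainable sparse matrix.

```python
import numpy as np

def rosa_forward(x, W, A, B, S):
    """Forward pass through a frozen weight matrix W augmented with a
    low-rank adapter (B @ A) and a sparse adapter S. Only A, B and the
    few non-zero entries of S would receive gradients during fine-tuning."""
    return x @ (W + B @ A + S).T

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 2                 # adapter rank r << min(d_out, d_in)
W = rng.normal(size=(d_out, d_in))        # frozen pre-trained weights
A = np.zeros((r, d_in))                   # low-rank factors, zero-initialized
B = np.zeros((d_out, r))
S = np.zeros((d_out, d_in))               # sparse component, mostly zeros
S[0, 0] = 0.5                             # one trained "outlier" weight

x = rng.normal(size=(4, d_in))
y = rosa_forward(x, W, A, B, S)
delta = y - x @ W.T                       # deviation from the frozen model
```

With the adapters zero-initialized, only the single sparse entry perturbs the frozen output: all updates live in a small low-rank-plus-sparse parameter budget rather than in W itself.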
Thomas Randall,Jaehoon Koo,Brice Videau,Michael Kruse,Xingfu Wu,Paul Hovland,Mary Hall,Rong Ge,Prasanna Balaprakash
http://arxiv.org/abs/2401.04669v1
Compressor summary: Generative transfer learning using Gaussian copula improves autotuning for high-performance computing systems by efficiently sampling high-performing configurations and estimating few-shot budgets.
Galib Muhammad Shahriar Himel,Md. Masudul Islam
http://arxiv.org/abs/2401.04666v1
Compressor summary: This research compares pre-trained image classification models on a cat-and-dog dataset across different computer architectures, with NASNet Large achieving 99.65% accuracy.
Zhen Qin,Weigao Sun,Dong Li,Xuyang Shen,Weixuan Sun,Yiran Zhong
http://arxiv.org/abs/2401.04658v1
Compressor summary: Lightning Attention-2 is a new linear attention algorithm that enables constant training speed for sequences of unlimited length by using tiling and GPU hardware optimization.
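The linear-attention identity that makes this possible can be sketched as follows (a toy, non-causal illustration with a feature map of our choosing; Lightning Attention-2's actual contribution, tiling and GPU-aware scheduling for the causal case, is not shown):

```python
import numpy as np

def feature_map(x):
    # A simple positive feature map; linear-attention variants differ here.
    return np.maximum(x, 0.0) + 1e-6

def linear_attention(Q, K, V):
    """Compute phi(Q) @ (phi(K).T @ V) with normalization. The small
    (d x d_v) state phi(K).T @ V is formed once, so the cost grows
    linearly with the sequence length n instead of quadratically."""
    phiQ, phiK = feature_map(Q), feature_map(K)
    state = phiK.T @ V                                   # (d, d_v)
    norm = phiQ @ phiK.sum(axis=0, keepdims=True).T      # (n, 1)
    return (phiQ @ state) / norm

def quadratic_attention(Q, K, V):
    """The mathematically identical computation via the explicit (n x n)
    attention matrix, which costs quadratic time and memory."""
    phiQ, phiK = feature_map(Q), feature_map(K)
    A = phiQ @ phiK.T
    return (A / A.sum(axis=1, keepdims=True)) @ V

rng = np.random.default_rng(1)
n, d = 32, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out_linear = linear_attention(Q, K, V)
out_quad = quadratic_attention(Q, K, V)    # same values, quadratic cost
```

Because the two routes are algebraically identical, the linear form trades the n-by-n attention matrix for a fixed-size state, which is what enables unbounded sequence lengths at constant per-token cost.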
Abu Bakar Siddiqur Rahman,Hoang-Thang Ta,Lotfollah Najjar,Azad Azadmanesh,Ali Saffet Gönül
http://arxiv.org/abs/2401.04655v1
Compressor summary: The paper introduces DepressionEmo, a novel dataset to detect emotions associated with depression from Reddit user posts, and evaluates various text classification methods using it.
Jiaxing Huang,Kai Jiang,Jingyi Zhang,Han Qiu,Lewei Lu,Shijian Lu,Eric Xing
http://arxiv.org/abs/2401.04651v1
Compressor summary: SSPrompt is a method that learns effective semantic and spatial prompts to improve the masks produced by promptable segmentation models such as SEEM and SAM.
Vijay Kag,Birupaksha Pal
http://arxiv.org/abs/2401.04648v1
Compressor summary: The paper proposes a new method to improve hidden physics models, allowing them to generalize to different input changes and system configurations.
Tanmay Garg,Deepika Vemuri,Vineeth N Balasubramanian
http://arxiv.org/abs/2401.04647v1
Compressor summary: The paper proposes an explanation module that generates visual concepts from classifier networks, which improves interpretability and performance in visual classification tasks using adversarial training.
Samuel Yanes Luis,Dmitriy Shutin,Juan Marchal Gómez,Daniel Gutiérrez Reina,Sergio Toral Marín
http://arxiv.org/abs/2401.04631v1
Compressor summary: The paper proposes a multi-agent system of autonomous vehicles that uses Local Gaussian Processes and Deep Reinforcement Learning to monitor water quality, improving the estimation accuracy of water quality variables and algae blooms over existing methods.
Shimin Li,Tianxiang Sun,Xipeng Qiu
http://arxiv.org/abs/2401.04620v1
Compressor summary: The paper proposes EvolutionaryAgent, a framework that aligns AI agents with social norms by evolving and selecting agents based on their fitness in adapting to changing norms.
Selva Kumar S,Afifah Khan Mohammed Ajmal Khan,Chirag Manjeshwar,Imadh Ajaz Banday
http://arxiv.org/abs/2401.04619v1
Compressor summary: The paper presents a dataset and methods to identify and classify languages in text messages that use the English alphabet to write native languages (transliteration), enhancing digital communication and overcoming language barriers.
Ziyue Huang,Mingming Zhang,Yuan Gong,Qingjie Liu,Yunhong Wang
http://arxiv.org/abs/2401.04614v1
Compressor summary: GeRSP is a novel pre-training framework for remote sensing images that combines self-supervised and supervised learning from both remote sensing and natural images to improve understanding tasks.
Victor Dheur,Tanguy Bosser,Rafael Izbicki,Souhaib Ben Taieb
http://arxiv.org/abs/2401.04612v1
Compressor summary: The paper proposes a framework to improve uncertainty quantification in neural Temporal Point Process models by using conformal prediction methods that generate more reliable and sharper joint prediction regions for event arrival times and marks.
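For intuition, the split conformal recipe that such methods build on can be sketched for a scalar target (our simplified illustration under toy synthetic data; the paper's joint regions over event arrival times and marks are more involved):

```python
import numpy as np

def conformal_interval(cal_preds, cal_targets, test_preds, alpha=0.1):
    """Split conformal prediction: residuals on a held-out calibration set
    yield a quantile q such that test_pred +/- q covers the true value
    with probability >= 1 - alpha, assuming exchangeable data."""
    scores = np.abs(cal_targets - cal_preds)        # nonconformity scores
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))         # finite-sample correction
    q = np.sort(scores)[min(k, n) - 1]
    return test_preds - q, test_preds + q

rng = np.random.default_rng(2)
truth = rng.normal(size=500)
preds = truth + rng.normal(scale=0.3, size=500)     # an imperfect "model"
lo, hi = conformal_interval(preds[:400], truth[:400], preds[400:])
coverage = np.mean((truth[400:] >= lo) & (truth[400:] <= hi))  # near 1 - alpha
```

The coverage guarantee holds regardless of how bad the underlying model is; what the model's quality controls is the sharpness (width) of the resulting region, which is the trade-off the paper studies for neural TPPs.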
Jingyuan Yang,Jiawei Feng,Hui Huang
http://arxiv.org/abs/2401.04608v1
Compressor summary: The paper introduces Emotional Image Content Generation (EICG), a new task to generate images that convey specific emotions using CLIP and novel losses for semantic clarity and emotion fidelity.
Mihael Arcan,Paul-David Niland,Fionn Delahunty
http://arxiv.org/abs/2401.04592v1
Compressor summary: The study evaluates the performance of different language models in understanding mental health expressions and finds that transformer-based models like BERT and XLNet perform better than others.
Xuewen Liu,Zhikai Li,Junrui Xiao,Qingyi Gu
http://arxiv.org/abs/2401.04585v1
Compressor summary: EDA-DM is a method to improve the performance of post-training quantization for diffusion models by addressing distribution mismatch issues in both calibration sample and reconstruction output levels.
Amro Abbas,Evgenia Rusak,Kushal Tirumala,Wieland Brendel,Kamalika Chaudhuri,Ari S. Morcos
http://arxiv.org/abs/2401.04578v1
Compressor summary: The authors prune large-scale multimodal datasets for training CLIP-style models to improve efficiency and performance, achieving better results with less data and compute.
Yatong Bai,Utsav Garg,Apaar Shanker,Haoming Zhang,Samyak Parajuli,Erhan Bas,Isidora Filipovic,Amelia N. Chu,Eugenia D Fomitcheva,Elliot Branson,Aerin Kim,Somayeh Sojoudi,Kyunghyun Cho
http://arxiv.org/abs/2401.04575v1
Compressor summary: The Let's Go Shopping (LGS) dataset provides high-quality, large-scale image-caption pairs from e-commerce websites for vision-language tasks, overcoming the limitations of existing general-domain datasets.
Pierluigi Vito Amadori,Timothy Bradley,Ryan Spick,Guy Moss
http://arxiv.org/abs/2401.04572v1
Compressor summary: EVOLUTE is a new architecture for automated game testing that combines behavioural cloning with energy based models, improving efficiency and generalisation in shooting-and-driving games.
Gyutae Hwang,Sang Jun Lee
http://arxiv.org/abs/2401.04560v1
Compressor summary: The paper presents a two-stage vision-based deep learning framework that first detects the rPPG signal and then estimates heart rate and blood pressure, achieving high accuracy on two datasets and reducing the heart rate estimation MAE by 34.31% compared to existing methods.
Shengli Zhang,Zhiyong Tao,Sen Lin
http://arxiv.org/abs/2401.04550v1
Compressor summary: The paper proposes a new Transformer-based wavelet network to improve foggy image recovery in complex real-world conditions by preserving texture details, avoiding color distortion, and enhancing feature extraction.
Tim R. Davidson,Veniamin Veselovsky,Martin Josifoski,Maxime Peyrard,Antoine Bosselut,Michal Kosinski,Robert West
http://arxiv.org/abs/2401.04536v1
Compressor summary: The text discusses evaluating language models in negotiation games, which better reflect real-world scenarios and reveal their decision-making processes.
Alena Fenogenova,Artem Chervyakov,Nikita Martynov,Anastasia Kozlova,Maria Tikhonova,Albina Akhmetgareeva,Anton Emelyanov,Denis Shevelev,Pavel Lebedev,Leonid Sinev,Ulyana Isaeva,Katerina Kolomeytseva,Daniil Moskovskiy,Elizaveta Goncharova,Nikita Savushkin,Polina Mikhailova,Denis Dimitrov,Alexander Panchenko,Sergei Markov
http://arxiv.org/abs/2401.04531v1
Compressor summary: The paper introduces MERA, a benchmark for evaluating Russian foundation models in various tasks and domains, aiming to better understand their capabilities, limitations, and risks.
Marat Saidov,Aleksandra Bakalova,Ekaterina Taktasheva,Vladislav Mikhailov,Ekaterina Artemova
http://arxiv.org/abs/2401.04522v1
Compressor summary: LUNA is a tool that simplifies the evaluation of Natural Language Generation (NLG) models by providing a unified interface for 20 NLG metrics, making it easy to add new ones.
Shichao Sun,Junlong Li,Weizhe Yuan,Ruifeng Yuan,Wenjie Li,Pengfei Liu
http://arxiv.org/abs/2401.04518v1
Compressor summary: This paper introduces MetaCritique, a framework that evaluates critiques of large language models using precision, recall, and rationale, improving the quality of generative AI.
Mikhail Tikhomirov,Natalia Loukachevitch
http://arxiv.org/abs/2401.04515v1
Compressor summary: The article explores using language models to predict hypernymy relationships and improve them with co-hyponym information and iterative methods.
Jiaqi Wang,Yuying Chang,Zhong Li,Ning An,Qi Ma,Lei Hei,Haibo Luo,Yifei Lu,Feiliang Ren
http://arxiv.org/abs/2401.04507v1
Compressor summary: TechGPT-2.0 is a large language model for knowledge graph construction tasks with improved capabilities in various domains and lengthy texts, trained on Huawei's Ascend server.
Tim Huisman,Jacobus G. M. van der Linden,Emir Demirović
http://arxiv.org/abs/2401.04489v1
Compressor summary: The paper presents a dynamic programming method with recursion for learning optimal survival trees that capture complex nonlinear relations in survival analysis, yielding an optimal and scalable approach that performs well in experiments.
Yufei Guo,Yuanpei Chen
http://arxiv.org/abs/2401.04486v1
Compressor summary: The paper proposes a shortcut back-propagation method for training spiking neural networks and an evolutionary training framework to balance accuracy and ease of training, achieving better performance than existing methods.
Christian Huber,Alexander Waibel
http://arxiv.org/abs/2401.04482v1
Compressor summary: ASR systems struggle to recognize special words in lectures; the paper proposes a self-supervised continual learning approach that uses a memory-enhanced ASR model with adaptation datasets built from lecture slides, improving recognition of new words while preserving general performance.
Shrey Satapara,Parth Mehta,Debasis Ganguly,Sandip Modha
http://arxiv.org/abs/2401.04481v1
Compressor summary: The paper proposes using large language models to generate summaries that contain specific types of misinformation, creating a ground-truth dataset for detecting misinformation in news articles.
Xue Zhang,Xiangyu Shi,Xinyue Lou,Rui Qi,Yufeng Chen,Jinan Xu,Wenjuan Han
http://arxiv.org/abs/2401.04471v1
Compressor summary: The authors introduce TransportationGames, a benchmark to evaluate large language models' performance in the transportation domain across various tasks based on Bloom's Taxonomy levels.
Weimin Wang,Jiawei Liu,Zhijie Lin,Jiangqiao Yan,Shuo Chen,Chetwin Low,Tuyen Hoang,Jie Wu,Jun Hao Liew,Hanshu Yan,Daquan Zhou,Jiashi Feng
http://arxiv.org/abs/2401.04468v1
Compressor summary: MagicVideo-V2 is a text-to-video system that creates high-quality videos from text descriptions using various modules and achieves superior performance compared to other Text-to-Video systems.
Casper Fibaek,Luke Camilleri,Andreas Luyts,Nikolaos Dionelis,Bertrand Le Saux
http://arxiv.org/abs/2401.04464v1
Compressor summary: The PhilEO Bench is a new evaluation framework for testing Earth Observation Foundation Models on a large Sentinel-2 dataset with three downstream tasks.
Justin Tebbe,Jawad Tayyub
http://arxiv.org/abs/2401.04463v1
Compressor summary: The text proposes a new framework that improves diffusion models for anomaly detection by enhancing reconstruction and localization of variously sized anomalies.
Magali Richard,Yuna Blum,Justin Guinney,Gustavo Stolovitzky,Adrien Pavão
http://arxiv.org/abs/2401.04452v1
Compressor summary: The chapter covers various aspects of organizing AI competitions, including motivating participants, engaging the community, managing logistics, and sharing results.
Eleonora Breci,Luca Guarnera,Sebastiano Battiato
http://arxiv.org/abs/2401.04448v1
Compressor summary: The text discusses a forensic handwriting analysis method that uses advanced software and machine learning to compare digitized documents, and presents a new dataset with both traditional and digital handwritten samples.
Yishuang Tian,Ning Wang,Liang Zhang
http://arxiv.org/abs/2401.04441v1
Compressor summary: The paper proposes a multi-level hierarchical deep learning algorithm that uses human cognition model and existing knowledge information to train deep neural networks, improving their interpretability and classification performance.
Dongeon Kim,YeongHyeon Park
http://arxiv.org/abs/2401.04437v1
Compressor summary: The paper proposes a feature selection method for hyperspectral imaging to detect foreign matters in products faster and with better explainability than existing methods.
Kuo Yang,Duo Li,Menghan Hu,Guangtao Zhai,Xiaokang Yang,Xiao-Ping Zhang
http://arxiv.org/abs/2401.04435v1
Compressor summary: The paper proposes a method to improve semi-supervised learning with imbalanced classes by using uncertainty-aware dynamic thresholds for pseudo-label selection, achieving better performance on long-tailed datasets.
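The flavor of class-dependent pseudo-label thresholds can be sketched as follows (our illustrative toy, not the paper's uncertainty-aware method: here each class's threshold is scaled by the model's current mean confidence on that class, so under-learned tail classes face a lower bar):

```python
import numpy as np

def select_pseudo_labels(probs, base_tau=0.95):
    """Keep an unlabeled sample only if its top-class probability beats a
    per-class threshold. Classes the model is currently weak on (low mean
    confidence among its predictions) get a lower bar, so tail classes
    still contribute pseudo-labels instead of being starved."""
    conf = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    n_classes = probs.shape[1]
    status = np.array([conf[labels == c].mean() if np.any(labels == c) else 0.0
                       for c in range(n_classes)])
    tau = base_tau * status / status.max()    # dynamic per-class thresholds
    keep = conf > tau[labels]
    return labels[keep], keep

probs = np.array([
    [0.99, 0.005, 0.005],   # confident head-class sample
    [0.97, 0.02, 0.01],     # confident head-class sample
    [0.55, 0.40, 0.05],     # ambiguous sample
    [0.05, 0.10, 0.85],     # tail-class sample: below 0.95, above its own bar
    [0.30, 0.10, 0.60],     # weak tail-class sample
])
labels, keep = select_pseudo_labels(probs)
```

A fixed 0.95 threshold would discard the 0.85-confidence tail-class sample; the dynamic scheme keeps it while still rejecting the genuinely ambiguous ones.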
Haoyang Chen,Peiyan Sun,Qiyuan Song,Wanyuan Wang,Weiwei Wu,Wencan Zhang,Guanyu Gao,Yan Lyu
http://arxiv.org/abs/2401.04429v1
Compressor summary: i-Rebalance is a personalized vehicle reposition technique using deep reinforcement learning to balance demand and supply in ride-hailing platforms while considering drivers' preferences.
Yuyang Sun,Panagiotis Kosmas
http://arxiv.org/abs/2401.04425v1
Compressor summary: Meta-forests is a novel domain generalization algorithm that improves classifier performance by using meta-learning and maximum mean discrepancy to reduce correlation among trees and increase their strength.
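Maximum mean discrepancy, the distribution distance used above, can be estimated directly from samples; a minimal RBF-kernel version (our own sketch, with an arbitrary bandwidth `gamma`):

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """(Biased) squared maximum mean discrepancy with an RBF kernel: a
    sample-based distance between the distributions behind X and Y that
    is near zero for matching distributions and grows as they diverge."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(4)
same_a = rng.normal(size=(200, 2))               # two draws from N(0, I)
same_b = rng.normal(size=(200, 2))
shifted = rng.normal(loc=2.0, size=(200, 2))     # a shifted distribution
```

Here `rbf_mmd2(same_a, same_b)` stays near zero while `rbf_mmd2(same_a, shifted)` is clearly larger, which is the signal a domain-generalization method can use to weigh how far a training domain sits from the others.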
Tim vor der Brück,Marc Pouly
http://arxiv.org/abs/2401.04422v1
Compressor summary: Semantic Concept Embeddings improve Word2Vec word embeddings by capturing human thought processes and handling ambiguity in highly context-dependent words.
Sander Riisøen Jyhne,Morten Goodwin,Per Arne Andersen,Ivar Oveland,Alexander Salveson Nossum,Karianne Ormseth,Mathilde Ørstavik,Andrew C. Flatman
http://arxiv.org/abs/2401.04406v1
Compressor summary: MapAI is a 2022 building segmentation competition using aerial images and LiDAR data, evaluated by IoU and Boundary IoU metrics.
Long Xu,Shanghong Li,Yongquan Chen,Jun Luo
http://arxiv.org/abs/2401.04403v1
Compressor summary: The paper proposes a new interactive segmentation algorithm that uses multi-scale tokens and contrastive loss to improve accuracy and efficiency, outperforming existing methods.
Ghadeer O. Ghosheh,Jin Li,Tingting Zhu
http://arxiv.org/abs/2401.04402v1
Compressor summary: IGNITE is a deep-learning model that learns patient dynamics from sparse and missing EHRs to generate personalized realistic values for data-driven models in personalized medicine.
Zilong Wang,Hao Zhang,Chun-Liang Li,Julian Martin Eisenschlos,Vincent Perot,Zifeng Wang,Lesly Miculicich,Yasuhisa Fujii,Jingbo Shang,Chen-Yu Lee,Tomas Pfister
http://arxiv.org/abs/2401.04398v1
Compressor summary: Chain-of-Table is a framework that uses tabular data to guide large language models in generating a chain of operations for table understanding tasks, improving accuracy and reliability.
Oskar Keurulainen,Gokhan Alcan,Ville Kyrki
http://arxiv.org/abs/2401.04397v1
Compressor summary: The text discusses a new paradigm for active learning with human feedback, considering how humans' higher levels of agency affect rational communication and providing an example and a computational study.
Heewon Kim,Hyun Sung Chang,Kiho Cho,Jaeyun Lee,Bohyung Han
http://arxiv.org/abs/2401.04390v1
Compressor summary: The paper proposes an EM-based method to learn with noisy labels in computer vision, where two networks collaborate to distinguish clean labels and refurbish corrupted ones.
Zhiwei Zuo,Zhuo Tang,Kenli Li,Anwitaman Datta
http://arxiv.org/abs/2401.04385v1
Compressor summary: Machine unlearning protects user privacy but is computationally costly; the authors propose fine-grained unlearning strategies that balance privacy protection and efficiency, together with new evaluation metrics and a data perturbation method (SPD-GAN) to measure the effectiveness and degree of unlearning.
Haoyi Xiong,Xuhong Li,Xiaofei Zhang,Jiamin Chen,Xinhao Sun,Yuchen Li,Zeyi Sun,Mengnan Du
http://arxiv.org/abs/2401.04374v1
Compressor summary: This work reviews data-centric approaches to make deep neural networks more interpretable by analyzing how data collection, processing, and analysis affect model behavior and knowledge discovery.
Mulomba Mukendi Christian,Hyebong Choi
http://arxiv.org/abs/2401.04369v1
Compressor summary: This study proposes a machine learning approach that uses two months of data to accurately predict air quality in 197 capital cities, with Random Forest algorithm achieving high performance and interpretability.
Gabriel D. M. Manalu,Mulomba Mukendi Christian,Songhee You,Hyebong Choi
http://arxiv.org/abs/2401.04368v1
Compressor summary: The study proposes a novel multimodal approach using patient prescription data and drug embeddings to improve AKI prediction in the critical care setting, showing significant improvement over baseline models.
Curtis Murray,Lewis Mitchell,Jonathan Tuke,Mark Mackay
http://arxiv.org/abs/2401.04367v1
Compressor summary: The study presents a new method to model emotions from online patient stories and develops a recommender system that predicts emotions and sentiments using topic analysis, outperforming existing methods.
Binh M. Le,Jiwon Kim,Shahroz Tariq,Kristen Moore,Alsharif Abuadbba,Simon S. Woo
http://arxiv.org/abs/2401.04364v1
Compressor summary: The paper reviews state-of-the-art deepfake detectors and categorizes them into groups based on critical criteria, providing insights into their effectiveness across different attack scenarios.
Kwan Yun,Youngseo Kim,Kwanggyoon Seo,Chang Wook Seo,Junyong Noh
http://arxiv.org/abs/2401.04362v1
Compressor summary: DiffSketch is a method that creates stylized sketches from images using deep features and can be trained with one manual drawing, outperforming other methods.
Jiaan Wang,Jianfeng Qu,Kexin Wang,Zhixu Li,Wen Hua,Ximing Li,An Liu
http://arxiv.org/abs/2401.04361v1
Compressor summary: The paper proposes a contrastive learning framework to improve knowledge-grounded dialogue's robustness against real-world noises like misspellings, abbreviations, incomplete, erroneous, and outdated facts in external knowledge graphs.
Yifan Xie,Boyu Wang,Shiqi Li,Jihua Zhu
http://arxiv.org/abs/2401.04357v1
Compressor summary: The paper proposes a novel Iterative Feedback Network (IFNet) that improves unsupervised point cloud registration by efficiently enriching low-level features with high-level ones and using a geometry-aware descriptor for more precise results.
Xuzheng Yu,Chen Jiang,Wei Zhang,Tian Gan,Linlin Chao,Jianan Zhao,Yuan Cheng,Qingpei Guo,Wei Chu
http://arxiv.org/abs/2401.04354v1
Compressor summary: The paper proposes a novel two-stream framework for video scene recognition that uses temporal and non-temporal perspectives, self-distillation, and knowledge-enhanced feature fusion to classify scenes in videos effectively.
Anushiya Arunan,Yan Qin,Xiaoli Li,Chau Yuen
http://arxiv.org/abs/2401.04351v1
Compressor summary: The paper proposes a model that detects changes in device health using temporal correlation features and improves remaining useful life estimation accuracy by considering heterogeneous change points.
Sibo Wang,Jie Zhang,Zheng Yuan,Shiguang Shan
http://arxiv.org/abs/2401.04350v1
Compressor summary: PMG-AFT is a method that enhances CLIP's zero-shot adversarial robustness by preserving the generalization features of the pre-trained model using an auxiliary branch and minimizing the distance between feature distributions.
Khoi M. Le,Trinh Pham,Tho Quan,Anh Tuan Luu
http://arxiv.org/abs/2401.04348v1
Compressor summary: The paper introduces LAMPAT, an unsupervised multilingual paraphrasing model that uses low-rank adaptation and adversarial training to generate diverse and human-like sentences from monolingual data without parallel corpora.
Hualie Jiang,Rui Xu,Minglang Tan,Wenjie Jiang
http://arxiv.org/abs/2401.04345v1
Compressor summary: The paper proposes a recurrent omnidirectional stereo matching algorithm that improves the performance over previous methods and introduces two techniques to enhance it further.
Xinyu Tang,Ashwinee Panda,Milad Nasr,Saeed Mahloujifar,Prateek Mittal
http://arxiv.org/abs/2401.04343v1
Compressor summary: DP-ZO is a method for privately fine-tuning large language models by randomizing gradient directions and privatizing step sizes with Laplace or Gaussian noise, achieving a strong trade-off between privacy and performance.
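A simplified sketch of the zeroth-order idea on a toy objective (our own code with the Gaussian variant; the paper also covers Laplace noise and per-example clipping over batches): because the random perturbation direction is data-independent, only the scalar loss-difference estimate needs to be clipped and noised for privacy.

```python
import numpy as np

rng = np.random.default_rng(3)

def dp_zo_step(theta, loss_fn, lr=0.1, eps=1e-3, clip=1.0, sigma=0.5):
    """One zeroth-order step: estimate the directional derivative along a
    random direction z from two loss evaluations, then clip and noise that
    single scalar. Privatizing the scalar is what protects the data,
    since z itself reveals nothing about the training examples."""
    z = rng.normal(size=theta.shape)                       # random direction
    g = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    g = float(np.clip(g, -clip, clip))                     # bound sensitivity
    g += rng.normal(scale=sigma)                           # Gaussian mechanism
    return theta - lr * g * z

loss = lambda t: float(np.sum(t ** 2))                     # toy objective
theta = np.full(5, 3.0)
for _ in range(200):
    theta = dp_zo_step(theta, loss)
# Despite noisy, clipped scalar updates, theta drifts toward the minimum.
```

Clipping a single scalar per step is far gentler than clipping a full gradient vector, which is one intuition for why the privacy/utility trade-off can be strong here.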
Hyogon Ryu,Seohyun Lim,Hyunjung Shim
http://arxiv.org/abs/2401.04339v1
Compressor summary: This paper explores fine-tuning quantized diffusion models for generative AI and proposes two strategies to improve personalization and prompt fidelity without compromising image quality.
Youshao Xiao,Shangchun Zhao,Zhenglei Zhou,Zhaoxin Huan,Lin Ju,Xiaolu Zhang,Lin Wang,Jun Zhou
http://arxiv.org/abs/2401.04338v1
Compressor summary: G-Meta is a high-performance framework for large-scale meta learning based recommendation models that improves efficiency and statistical performance in distributed training on GPU clusters.
Jiaxing He,Bingzhe Hou,Tieru Wu,Yue Xin
http://arxiv.org/abs/2401.04332v1
Compressor summary: The paper proposes three multifiltrations (multi-GENEO, multi-DGENEO, mix-GENEO) for Topological Data Analysis and shows their stability and effectiveness on MNIST dataset.
Yonghui Tan,Xiaolong Li,Yishu Chen,Jinquan Ai
http://arxiv.org/abs/2401.04330v1
Compressor summary: The BD-MSA model uses deep learning to detect changes in remote sensing images by extracting boundary information and separating the main body from the boundary, outperforming other models on public datasets.
Han Li,Yukai Ma,Yaqing Gu,Kewei Hu,Yong Liu,Xingxing Zuo
http://arxiv.org/abs/2401.04325v1
Compressor summary: The paper proposes a method to fuse Radar and image data for accurate dense depth estimation using four stages of processing.
Junjie Wang,Dan Yang,Binbin Hu,Yue Shen,Ziqi Liu,Wen Zhang,Jinjie Gu,Zhiqiang Zhang
http://arxiv.org/abs/2401.04319v1
Compressor summary: The paper proposes ARALLM, a method to use large language models for transforming natural language demands into structured logical languages for user targeting, by enhancing their reasoning ability with analogical prompts and distilling multi-task models.
Uri Stemmer
http://arxiv.org/abs/2401.04311v1
Compressor summary: PEP is a private learning model that predicts labels from unlabeled examples, with improvements in robustness, privacy-parameter independence, and reduced sample complexity.
Andreas Kirsch
http://arxiv.org/abs/2401.04305v1
Compressor summary: The thesis explores data subset selection techniques, such as active learning and active sampling, in deep learning models using information-theoretic principles to improve label and training efficiency.
Gbètondji J-S Dovonon,Michael M. Bronstein,Matt J. Kusner
http://arxiv.org/abs/2401.04301v1
Compressor summary: This paper analyzes the causes of oversmoothing in Transformers and proposes a way to control their spectrum, leading to improved generalization.