This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-01-12, generated by Compressor, my personal LLM-based summarization project.
Yue Zhao,Long Zhao,Xingyi Zhou,Jialin Wu,Chun-Te Chu,Hui Miao,Florian Schroff,Hartwig Adam,Ting Liu,Boqing Gong,Philipp Krähenbühl,Liangzhe Yuan
http://arxiv.org/abs/2401.06129v1
Compressor summary: The authors create a video-language model using synthesized data and fine-tuning, which improves performance on various benchmarks compared to existing methods.
Yifan Gong,Zheng Zhan,Qing Jin,Yanyu Li,Yerlan Idelbayev,Xian Liu,Andrey Zharkov,Kfir Aberman,Sergey Tulyakov,Yanzhi Wang,Jian Ren
http://arxiv.org/abs/2401.06127v1
Compressor summary: The text describes a novel approach to improve the efficiency of distilling GANs from diffusion models for real-time image editing on mobile devices using innovative techniques like generalized features, Low-Rank Adaptation, and minimal data fine-tuning.
Jack Saunders,Vinay Namboodiri
http://arxiv.org/abs/2401.06126v1
Compressor summary: The authors propose a method for high-quality visual dubbing using data-efficient neural rendering priors and actor-specific adaptation, which generalizes to limited data and outperforms existing approaches in terms of visual quality and recognizability.
Dilyara Bareeva,Marina M. -C. Höhne,Alexander Warnecke,Lukas Pirch,Klaus-Robert Müller,Konrad Rieck,Kirill Bykov
http://arxiv.org/abs/2401.06122v1
Compressor summary: The paper explores how adversarial manipulations can falsify neural network explanations and proposes a method to protect them.
Pratyush Maini,Zhili Feng,Avi Schwarzschild,Zachary C. Lipton,J. Zico Kolter
http://arxiv.org/abs/2401.06121v1
Compressor summary: TOFU is a benchmark for evaluating the effectiveness of unlearning methods in large language models, using synthetic author profiles and diverse metrics.
Vage Egiazarian,Andrei Panferov,Denis Kuznedelev,Elias Frantar,Artem Babenko,Dan Alistarh
http://arxiv.org/abs/2401.06118v1
Compressor summary: The paper proposes a new algorithm called Additive Quantization for Language Models (AQLM) that significantly improves the compression and accuracy of large language models, enabling them to run on end-user devices with very low bit counts.
Luis Bolanos,Shih-Yang Su,Helge Rhodin
http://arxiv.org/abs/2401.06116v1
Compressor summary: The proposed Gaussian shadow model enables realistic shadows and shading in neural character models by using a simple analytic formula instead of costly sampling, improving reconstructions and poses in various scenes.
Hiroaki Yamagiwa,Yusuke Takase,Hidetoshi Shimodaira
http://arxiv.org/abs/2401.06112v1
Compressor summary: Axis Tour is a novel method that optimizes the order of interpretable semantic axes in word embeddings using Independent Component Analysis (ICA) for improved clarity and performance on downstream tasks.
Moab Arar,Andrey Voynov,Amir Hertz,Omri Avrahami,Shlomi Fruchter,Yael Pritch,Daniel Cohen-Or,Ariel Shamir
http://arxiv.org/abs/2401.06105v1
Compressor summary: Prompt-aligned personalization is a new method for generating personalized images that align well with complex textual prompts, while preserving subject fidelity and user requirements.
Matanel Oren,Michael Hassid,Yossi Adi,Roy Schwartz
http://arxiv.org/abs/2401.06104v1
Compressor summary: This paper shows that transformers can be seen as infinite multi-state RNNs and introduces a new conversion policy, TOVA, which improves performance on long range tasks with less cache memory usage.
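The core idea of viewing a transformer as a bounded multi-state RNN is that the key-value cache can be capped at a fixed size by evicting entries. A minimal sketch of such an eviction policy follows; it keeps the highest-attention entries and is only an illustration of the general idea, not the paper's exact TOVA rule.

```python
def evict_to_budget(tokens, attn_scores, budget):
    """Keep only the `budget` cache entries with the highest attention
    scores, preserving token order. A toy fixed-size KV-cache policy in
    the spirit of bounded multi-state RNNs (illustrative, not TOVA itself)."""
    if len(tokens) <= budget:
        return tokens
    keep = sorted(range(len(tokens)), key=lambda i: -attn_scores[i])[:budget]
    return [tokens[i] for i in sorted(keep)]

# Four cached tokens, budget of two: the two lowest-scoring entries go.
print(evict_to_budget(["a", "b", "c", "d"], [0.1, 0.4, 0.3, 0.2], 2))  # ['b', 'c']
```

Because kept indices are re-sorted, positional order of the surviving tokens is preserved, which matters for positional encodings.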
Asma Ghandeharioun,Avi Caciularu,Adam Pearce,Lucas Dixon,Mor Geva
http://arxiv.org/abs/2401.06102v1
Compressor summary: Patchscopes is a framework that explains the hidden representations of large language models in natural language, addressing limitations of prior methods and enabling new applications.
Matthew B. A. McDermott,Lasse Hyldig Hansen,Haoran Zhang,Giovanni Angelotti,Jack Gallifant
http://arxiv.org/abs/2401.06091v1
Compressor summary: The paper challenges the notion that AUPRC is superior to AUROC for binary classification tasks with class imbalance, and shows that AUPRC can be biased and harmful in such cases.
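The divergence between the two metrics under class imbalance can be seen on a tiny example. Below is a sketch using the standard definitions (AUROC as the pairwise ranking probability, AUPRC approximated as average precision); the toy data is made up and is not from the paper's experiments.

```python
def auroc(y, s):
    # probability that a random positive outranks a random negative (ties count half)
    pos = [si for yi, si in zip(y, s) if yi == 1]
    neg = [si for yi, si in zip(y, s) if yi == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y, s):
    # AUPRC approximated as average precision over the ranked positives
    order = sorted(range(len(y)), key=lambda i: -s[i])
    tp, ap = 0, 0.0
    for k, i in enumerate(order, start=1):
        if y[i] == 1:
            tp += 1
            ap += tp / k
    return ap / tp

# Imbalanced toy data: one positive among four, ranked second.
y, s = [0, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]
print(auroc(y, s))              # 2/3: two of three negatives are outranked
print(average_precision(y, s))  # 0.5: the single ranking mistake halves AUPRC
```

A single misranked high-scoring negative costs AUPRC far more than AUROC when positives are rare, which is exactly the sensitivity the paper scrutinizes.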
Piyush Sao,Andrey Prokopenko,Damien Lebrun-Grandié
http://arxiv.org/abs/2401.06089v1
Compressor summary: PANDORA is a parallel algorithm that efficiently constructs dendrograms for hierarchical clustering using a recursive tree-contraction method, achieving significant speed-ups on CPUs and GPUs.
K M Sajjadul Islam,Ayesha Siddika Nipu,Praveen Madiraju,Priya Deshpande
http://arxiv.org/abs/2401.06088v1
Compressor summary: The study develops and evaluates an autocompletion tool for documenting Chief Complaints (CC) in medical records, comparing three BioGPT variants, an LSTM model, and a GPT-4 prompt on perplexity, modified BERTScore, and cosine similarity; BioGPT-Large performs best and yields accurate, well-formatted CC phrases and sentences.
Chawin Terawong,Dave Cliff
http://arxiv.org/abs/2401.06086v1
Compressor summary: XGBoost is a machine learning method that learns profitable betting strategies from synthetic data generated by an agent-based model of a sports-betting exchange, and can generalize to outperform the original strategies.
Zhipeng Chen,Kun Zhou,Wayne Xin Zhao,Junchen Wan,Fuzheng Zhang,Di Zhang,Ji-Rong Wen
http://arxiv.org/abs/2401.06081v1
Compressor summary: RLMEC is a new RL method that uses a generative model to provide token-level rewards for training large language models, improving their performance on complex reasoning tasks and reducing harmful outputs.
Binghai Wang,Rui Zheng,Lu Chen,Yan Liu,Shihan Dou,Caishuang Huang,Wei Shen,Senjie Jin,Enyu Zhou,Chenyu Shi,Songyang Gao,Nuo Xu,Yuhao Zhou,Xiaoran Fan,Zhiheng Xi,Jun Zhao,Xiao Wang,Tao Ji,Hang Yan,Lixing Shen,Zhan Chen,Tao Gui,Qi Zhang,Xipeng Qiu,Xuanjing Huang,Zuxuan Wu,Yu-Gang Jiang
http://arxiv.org/abs/2401.06080v1
Compressor summary: The paper proposes methods to improve reward models for reinforcement learning from human feedback by addressing challenges related to preference data quality and generalization.
Ruilin Luo,Tianle Gu,Haoling Li,Junzhe Li,Zicheng Lin,Jiayi Li,Yujiu Yang
http://arxiv.org/abs/2401.06072v1
Compressor summary: The paper presents a novel approach to predict missing event links in future timestamps using LLMs fine-tuned with historical data and structural information, achieving state-of-the-art results.
Zhaowei Li,Qi Xu,Dong Zhang,Hang Song,Yiqing Cai,Qi Qi,Ran Zhou,Junting Pan,Zefeng Li,Van Tu Vu,Zhida Huang,Tao Wang
http://arxiv.org/abs/2401.06071v1
Compressor summary: The paper introduces LEGO, a multi-modal model that captures both global and local information across different modalities, enhancing its performance in various tasks requiring fine-grained understanding of input data.
Damai Dai,Chengqi Deng,Chenggang Zhao,R. X. Xu,Huazuo Gao,Deli Chen,Jiashi Li,Wangding Zeng,Xingkai Yu,Y. Wu,Zhenda Xie,Y. K. Li,Panpan Huang,Fuli Luo,Chong Ruan,Zhifang Sui,Wenfeng Liang
http://arxiv.org/abs/2401.06066v1
Compressor summary: DeepSeekMoE is a new language model architecture that improves expert specialization and reduces computational costs compared to conventional MoE architectures like GShard.
Minhao Jiang,Ken Ziyu Liu,Ming Zhong,Rylan Schaeffer,Siru Ouyang,Jiawei Han,Sanmi Koyejo
http://arxiv.org/abs/2401.06059v1
Compressor summary: The paper explores how pre-training language models with evaluation data affects their performance on downstream tasks and highlights limitations of current contamination definitions.
Giuseppe Vecchio,Valentin Deschaintre
http://arxiv.org/abs/2401.06056v1
Compressor summary: MatSynth is a large and diverse dataset of high-quality, public domain materials for use in virtual environments.
Guanjun Wu,Taoran Yi,Jiemin Fang,Wenyu Liu,Xinggang Wang
http://arxiv.org/abs/2401.06052v1
Compressor summary: The paper proposes HDR-HexPlane, a dynamic HDR NeRF framework that can learn 3D scenes from dynamic 2D images with various exposures and render high-quality novel-view images at any time point with desired exposure.
Qipeng Qian,Tanwi Mallick
http://arxiv.org/abs/2401.06040v1
Compressor summary: WavGCRN is a novel method that combines wavelet transformation, graph convolutional recurrent networks, and road network information to improve spatiotemporal traffic forecasting by modeling multiscale structure in traffic data.
Partha Ghosh,Soubhik Sanyal,Cordelia Schmid,Bernhard Schölkopf
http://arxiv.org/abs/2401.06035v1
Compressor summary: The authors propose a new generative model for videos that can handle long-term dependencies, reduce computational complexity, and synthesize high-quality video clips efficiently.
Muhammad Farid Adilazuarda,Samuel Cahyawijaya,Alham Fikri Aji,Genta Indra Winata,Ayu Purwarianti
http://arxiv.org/abs/2401.06034v1
Compressor summary: LinguAlchemy improves PLMs' performance on unseen languages by regularizing them with linguistic constraints, making them more inclusive and accessible.
Zhiyu Zhu,Huaming Chen,Xinyi Wang,Jiayu Zhang,Zhibo Jin,Kim-Kwang Raymond Choo
http://arxiv.org/abs/2401.06031v1
Compressor summary: GE-AdvGAN improves the efficiency and transferability of adversarial attacks by optimizing generator training with a novel gradient editing mechanism in GANs.
Pablo Alonso,Jon Ander Iñiguez de Gordoa,Juan Diego Ortega,Sara García,Francisco Javier Iriarte,Marcos Nieto
http://arxiv.org/abs/2401.06019v1
Compressor summary: The paper proposes a UAV-based deep learning vision system that automatically detects runway and taxiway pavement defects, together with a synthetic dataset generation technique that overcomes training-data scarcity.
Cui Beilei,Islam Mobarakol,Bai Long,Ren Hongliang
http://arxiv.org/abs/2401.06013v1
Compressor summary: The paper introduces Surgical-DINO, a low-rank adaptation of DINOv2 for depth estimation in robotic surgery, which significantly outperforms existing models on the SCARED dataset.
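Low-rank adaptation in general adds a small trainable update to a frozen pretrained weight. The numpy sketch below shows the generic LoRA idea with toy sizes (all names and dimensions are illustrative assumptions, not Surgical-DINO's actual architecture).

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                          # hidden size and adapter rank (toy values)
W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized

def lora_forward(x):
    # frozen path plus low-rank correction; only A and B receive gradients,
    # so the adapter trains 2*d*r parameters instead of d*d
    return x @ W.T + x @ A.T @ B.T

x = rng.normal(size=(1, d))
# with B zero-initialized, the adapted model starts identical to the base model
assert np.allclose(lora_forward(x), x @ W.T)
```

Zero-initializing `B` is the standard LoRA trick that makes fine-tuning start from exactly the pretrained behavior.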
Rocío del Amor,Julio Silva-Rodríguez,Adrián Colomer,Valery Naranjo
http://arxiv.org/abs/2401.06010v1
Compressor summary: The authors propose guiding knowledge distillation with attention maps such as grad-CAMs, transferring discriminative information from high-resolution gigapixel pathology images to models operating at lower resolutions and improving performance despite the computational limits of digital pathology.
Martin S J Rogers,Maria Fox,Andrew Fleming,Louisa van Zeeland,Jeremy Wilkinson,J. Scott Hosking
http://arxiv.org/abs/2401.06009v1
Compressor summary: ViSual_IceD is a CNN that fuses multispectral and SAR imagery for accurate sea ice detection in polar regions, outperforming other models and complementing passive microwave data.
Linus Franke,Darius Rückert,Laura Fink,Marc Stamminger
http://arxiv.org/abs/2401.06003v1
Compressor summary: TRIPS is a novel technique that combines ideas from Gaussian Splatting and ADOP to render high-quality images of highly detailed scenes at real-time speeds with a differentiable pipeline.
Qian Gong,Jieyang Chen,Ben Whitney,Xin Liang,Viktor Reshniak,Tania Banerjee,Jaemoon Lee,Anand Rangarajan,Lipeng Wan,Nicolas Vidal,Qing Liu,Ana Gainaru,Norbert Podhorszki,Richard Archibald,Sanjay Ranka,Scott Klasky
http://arxiv.org/abs/2401.05994v1
Compressor summary: MGARD is a software tool for compressing and managing large scientific data on grids across various computing architectures.
Rouwan Wu,Xiaoya Cheng,Juelin Zhu,Xuxiang Liu,Maojun Zhang,Shen Yan
http://arxiv.org/abs/2401.05971v1
Compressor summary: The paper introduces a large-scale dataset for UAV localization and presents a two-stage pipeline that combines synthetic data generation and visual localization, as well as a hierarchical system for 3D ground target tracking.
Niklas Strauß,Matthias Schubert
http://arxiv.org/abs/2401.05969v1
Compressor summary: SATOP, a spatial-aware deep reinforcement learning method, improves the traveling officer problem by exploiting spatial relationships and learning future inter-action correlations, resulting in 22% more fines issued in Melbourne.
Yashwardhan Chaudhuri,Ankit Kumar,Orchid Chetia Phukan,Arun Balaji Buduru
http://arxiv.org/abs/2401.05968v1
Compressor summary: The paper introduces two lightweight crowd-counting models that use MobileNet and MobileViT backbones, feature fusion, and are computationally efficient compared to previous methods.
Yihua Zhu,Hidetoshi Shimodaira
http://arxiv.org/abs/2401.05967v1
Compressor summary: OrthogonalE is a novel Knowledge Graph embedding model that uses matrices for entities and block-diagonal orthogonal matrices with Riemannian optimization for relations, improving generality and flexibility over existing methods.
Hongjun Zhang
http://arxiv.org/abs/2401.05964v1
Compressor summary: The authors propose a method to generate new bridge types using generative artificial intelligence, which combines different structural components and can potentially lead to artificial general intelligence.
Xijun Li,Fangzhou Zhu,Hui-Ling Zhen,Weilin Luo,Meng Lu,Yimin Huang,Zhenan Fan,Zirui Zhou,Yufei Kuang,Zhihai Wang,Zijie Geng,Yang Li,Haoyang Liu,Zhiwu An,Muming Yang,Jianshu Li,Jie Wang,Junchi Yan,Defeng Sun,Tao Zhong,Yong Zhang,Jia Zeng,Mingxuan Yuan,Jianye Hao,Jun Yao,Kun Mao
http://arxiv.org/abs/2401.05960v1
Compressor summary: The paper presents a study on enhancing Huawei Cloud's OptVerse AI Solver with machine learning techniques to improve efficiency and performance in mathematical programming tasks.
Chujie Gao,Dongping Chen,Qihui Zhang,Yue Huang,Yao Wan,Lichao Sun
http://arxiv.org/abs/2401.05952v1
Compressor summary: The text introduces mixcase, a hybrid text form of machine-generated and human-generated content, and MixSet, the first dataset to study mixed modification scenarios in large language models.
Shuai Zhao,Meihuizi Jia,Luu Anh Tuan,Jinming Wen
http://arxiv.org/abs/2401.05949v1
Compressor summary: In-context learning is an effective NLP paradigm but has security risks as it can be exploited by ICLAttack, a new backdoor attack method that manipulates model behavior without fine-tuning.
Antoine Dedieu,Wolfgang Lehrach,Guangyao Zhou,Dileep George,Miguel Lázaro-Gredilla
http://arxiv.org/abs/2401.05946v1
Compressor summary: The paper proposes a transformer model with discrete bottlenecks that can learn compressed representations of observations and actions, enabling it to extract interpretable cognitive maps for path planning in partially observed environments.
Jushi Kai,Tianhang Zhang,Hai Hu,Zhouhan Lin
http://arxiv.org/abs/2401.05930v1
Compressor summary: The paper proposes a method to help language models generate text more truthfully by highlighting and hesitating on less probable but factual tokens.
Jiashuo Wang,Chunpu Xu,Chak Tou Leong,Wenjie Li,Jing Li
http://arxiv.org/abs/2401.05928v1
Compressor summary: Muffin is a framework that uses contrastive learning to reduce unhelpful responses in emotional support systems by considering multiple factors such as empathy, support strategies, and coherence.
Bin Dou,Tianyu Zhang,Yongjia Ma,Zhaohui Wang,Zejian Yuan
http://arxiv.org/abs/2401.05925v1
Compressor summary: Our method improves 3D scene segmentation speed and quality by optimizing Gaussian points, fusing spatial and semantic features, and using a shallow decoding network.
Sabina Elkins,Ekaterina Kochmar,Jackie C. K. Cheung,Iulian Serban
http://arxiv.org/abs/2401.05914v1
Compressor summary: The paper shows how a language model-based question generation system can be used by teachers to create quizzes with learning goals from Bloom's taxonomy, and demonstrates its advantages over handwritten quizzes in terms of quality and metrics.
Wesley Ramos dos Santos,Ivandre Paraboni
http://arxiv.org/abs/2401.05912v1
Compressor summary: The article describes how to use GPT 3.5 prompts and a simple text classifier to screen for mental health issues in social media posts, achieving similar results to a more complex BERT method but with less computation.
Xuyang Zhao,Qibin Zhao,Toshihisa Tanaka
http://arxiv.org/abs/2401.05908v1
Compressor summary: EpilepsyLLM is a customized language model that provides more accurate and relevant medical information about epilepsy in Japanese by fine-tuning a pre-trained LLM with domain-specific datasets.
Kang Chen,Yuanjie Liu
http://arxiv.org/abs/2401.05907v1
Compressor summary: Swintormer is a sliding-window model for defocus deblurring that uses diffusion, Transformer blocks, and optimized MACs (multiply-accumulate operations) to achieve high performance with low memory usage and improved SNR.
Hyunjin Kim,Minhyuk Sung
http://arxiv.org/abs/2401.05906v1
Compressor summary: PartSTAD adapts 2D segmentation models for 3D segmentation tasks using finetuning, merging weights, and a foreground segmentation model, achieving significant improvements on PartNet-Mobility dataset.
David Rivas-Villar,Álvaro S. Hervella,José Rouco,Jorge Novo
http://arxiv.org/abs/2401.05901v1
Compressor summary: ConKeD is a novel deep learning approach that learns descriptors for retinal image registration using a multi-positive multi-negative contrastive learning strategy, achieving comparable results to state-of-the-art methods with fewer training samples and detected keypoints.
Yuanzhao Zhai,Yiying Li,Zijian Gao,Xudong Gong,Kele Xu,Dawei Feng,Ding Bo,Huaimin Wang
http://arxiv.org/abs/2401.05899v1
Compressor summary: ORPO is an offline RL framework that uses optimistic rollouts to improve policy optimization and generalization with synthetic model rollouts.
Tianxiu Xie,Keke Gai,Jing Yu,Liehuang Zhu
http://arxiv.org/abs/2401.05895v1
Compressor summary: The paper proposes a novel ownership protection model for distributed machine learning using binary linear tree commitment, which ensures computational integrity with efficient proof aggregation and watermarking.
Ion Ciobotari,Adriana Príncipe,Maria Alexandra Oliveira,João Nuno Silva
http://arxiv.org/abs/2401.05891v1
Compressor summary: The text describes a low-cost terrestrial laser scanner (TLS) for ecological data acquisition and its application in two case studies, showing its effectiveness in measuring vegetation structure.
Xianming Li,Jing Li
http://arxiv.org/abs/2401.05883v1
Compressor summary: Generative deduplication is a novel approach that removes duplicate text from noisy social media data, mitigating model bias, improving performance, and saving training time.
Yu Jing,Tan Yujuan,Ren Ao,Liu Duo
http://arxiv.org/abs/2401.05879v1
Compressor summary: The YOIO framework improves optical flow prediction accuracy in occluded regions using spatiotemporal information from two frames and achieves state-of-the-art results with high efficiency.
Dominik Baumann,Thomas B. Schön
http://arxiv.org/abs/2401.05876v1
Compressor summary: The paper proposes a safe learning method for robotic systems that can handle discrete environmental changes without directly measuring them, using multi-class classification and experiments to estimate the context.
Yahui Fu,Haiyue Song,Tianyu Zhao,Tatsuya Kawahara
http://arxiv.org/abs/2401.05871v1
Compressor summary: The text describes a new method for improving personality recognition in robots using data augmentation and a specialized network architecture, leading to better human-robot interactions.
Hanzhang Wang,Haoran Wang,Jinze Yang,Zhongrui Yu,Zeke Xie,Lei Tian,Xinyan Xiao,Junjun Jiang,Xianming Liu,Mingming Sun
http://arxiv.org/abs/2401.05870v1
Compressor summary: HiCAST is a novel approach for arbitrary style transfer that allows flexible and customized stylization by using a Latent Diffusion Model and a Style Adapter, and can also apply to video AST with improved temporal consistency.
Pengzhi Gao,Zhongjun He,Hua Wu,Haifeng Wang
http://arxiv.org/abs/2401.05861v1
Compressor summary: The paper proposes a new method, XConST, to improve multilingual zero-shot translation by using prompt strategies and cross-lingual consistency regularization during instruction finetuning on pretrained LLMs.
Litian Li,Jord Molhoek,Jing Zhou
http://arxiv.org/abs/2401.05849v1
Compressor summary: The text studies whether AI can recognize intentions to speak from accelerometer data in real-life settings, but finds that it is not reliable enough and more data sources are needed.
Georgios Vardakas,John Pavlopoulos,Aristidis Likas
http://arxiv.org/abs/2401.05831v1
Compressor summary: The paper proposes a new way to evaluate data clustering quality, called macro-averaging, which is more robust to cluster imbalance and background noise than the standard micro-averaging method.
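The macro/micro distinction is easy to see with cluster purity as a stand-in quality measure (an illustrative choice, not necessarily the paper's exact criterion): pooling all points lets one large clean cluster mask a bad small one, while macro-averaging weighs every cluster equally.

```python
from collections import Counter

def micro_purity(clusters):
    # pool all points before averaging: dominated by large clusters
    correct = sum(max(Counter(c).values()) for c in clusters)
    return correct / sum(len(c) for c in clusters)

def macro_purity(clusters):
    # average per-cluster purity: every cluster weighs equally
    return sum(max(Counter(c).values()) / len(c) for c in clusters) / len(clusters)

# One large pure cluster and one small 50/50 cluster.
clusters = [["a"] * 98, ["a", "b"]]
print(micro_purity(clusters))  # 0.99 -- the big cluster hides the bad one
print(macro_purity(clusters))  # 0.75 -- the bad cluster now counts fully
```

This is the robustness-to-imbalance argument in miniature: the micro score barely registers the degenerate cluster, the macro score penalizes it.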
Jinge Wu,Yunsoo Kim,Honghan Wu
http://arxiv.org/abs/2401.05827v1
Compressor summary: The text discusses the potential of visual assistants for healthcare using large language and vision models, but warns that these models may hallucinate when faced with unfamiliar medical images.
Michael Free,Andrew Langworthy,Mary Dimitropoulaki,Simon Thompson
http://arxiv.org/abs/2401.05822v1
Compressor summary: The text describes a system that trains a chatbot with reinforcement learning to solve evolving problems by conversing with a simulated user about a virtual game.
Quentin Delfosse,Sebastian Sztwiertnia,Wolfgang Stammer,Mark Rothermel,Kristian Kersting
http://arxiv.org/abs/2401.05821v1
Compressor summary: SCoBots are transparent agents that use concept bottleneck layers to enable domain experts to understand and regularize their behavior, leading to better human-aligned RL.
Yannick Emonds,Kai Xi,Holger Fröning
http://arxiv.org/abs/2401.05820v1
Compressor summary: The text explores how image classification with neural networks can tolerate noise in resistive memory operations and proposes methods to improve resilience.
Zhuoyuan Mao,Yen Yu
http://arxiv.org/abs/2401.05811v1
Compressor summary: The article proposes AlignInstruct, a method that improves machine translation on large language models for unseen and low-resource languages by using cross-lingual supervision.
Alejandro Cobo,Roberto Valle,José M. Buenaposada,Luis Baumela
http://arxiv.org/abs/2401.05807v1
Compressor summary: The paper analyzes head pose estimation (HPE) methods and metrics, showing that Euler angles suit short-range HPE but not extreme rotations, and proposes a cross-dataset evaluation methodology, a generalization of the geodesic angular distance metric, and a wide-range HPE benchmark based on the CMU Panoptic dataset.
Xiaoyan Yu,Neng Dong,Liehuang Zhu,Hao Peng,Dapeng Tao
http://arxiv.org/abs/2401.05806v1
Compressor summary: The paper proposes a new method, CLIP-Driven Semantic Discovery Network (CSDN), that uses high-level semantics to bridge the modality gap between visible and infrared images for person re-identification.
Yu Zheng,Huan Yee Koh,Ming Jin,Lianhua Chi,Haishuai Wang,Khoa T. Phan,Yi-Ping Phoebe Chen,Shirui Pan,Wei Xiang
http://arxiv.org/abs/2401.05800v1
Compressor summary: The text introduces GST-Pro, a framework for detecting anomalies in irregularly-sampled multivariate time series using neural controlled differential equations and a distribution-based scoring mechanism.
Frank Xing
http://arxiv.org/abs/2401.05799v1
Compressor summary: This paper explores using large language models without fine-tuning for financial sentiment analysis, proposing a framework that leverages their generative power and domain knowledge to improve accuracy and discusses its implications on business and management.
Jesse Geneson,Linus Tang
http://arxiv.org/abs/2401.05794v1
Compressor summary: The paper presents improved online learning bounds for various scenarios, including sharpened upper and lower bounds and resolutions of open problems.
Zhihui Xie,Handong Zhao,Tong Yu,Shuai Li
http://arxiv.org/abs/2401.05792v1
Compressor summary: The paper proposes a method to remove language-specific factors from multilingual embeddings using singular value decomposition, which improves semantic tasks like cross-lingual sentence retrieval without finetuning.
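Projecting out dominant shared directions via SVD is a common post-processing trick for embeddings. A minimal sketch follows, under the assumption that language identity concentrates in the top singular directions of the centered embedding matrix; the paper's exact procedure may differ.

```python
import numpy as np

def remove_top_directions(X, k=1):
    # center the embeddings, then project out the top-k right singular
    # vectors -- directions that often encode language identity rather
    # than meaning in multilingual sentence embeddings
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k]                      # (k, d) dominant directions
    return Xc - Xc @ V.T @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y = remove_top_directions(X, k=1)
# the leading singular value of the cleaned matrix equals the second
# singular value of the centered original: the top direction is gone
s_orig = np.linalg.svd(X - X.mean(0), compute_uv=False)
s_new = np.linalg.svd(Y, compute_uv=False)
assert np.isclose(s_new[0], s_orig[1])
```

No finetuning is involved, matching the summary: the operation is a one-shot linear projection applied to fixed embeddings.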
Md Rizwan Parvez
http://arxiv.org/abs/2401.05787v1
Compressor summary: E2G is a novel framework that leverages evidence from context to improve LLMs' reasoning and generation performance across various tasks.
Jing Wu,Trung Le,Munawar Hayat,Mehrtash Harandi
http://arxiv.org/abs/2401.05779v1
Compressor summary: The paper introduces an algorithm for diffusion models to remove data while preserving their utility and effectiveness on other data.
Tianyu Cui,Yanling Wang,Chuanpu Fu,Yong Xiao,Sijia Li,Xinhao Deng,Yunpeng Liu,Qinglin Zhang,Ziyi Qiu,Peiyang Li,Zhixing Tan,Junwu Xiong,Xinyu Kong,Zujie Wen,Ke Xu,Qi Li
http://arxiv.org/abs/2401.05778v1
Compressor summary: This paper proposes a taxonomy for large language models (LLMs) to analyze and mitigate risks in their four essential modules, and reviews benchmarks for risk assessment.
Jinxin Liu,Shulin Cao,Jiaxin Shi,Tingjian Zhang,Lei Hou,Juanzi Li
http://arxiv.org/abs/2401.05777v1
Compressor summary: This paper evaluates how well large language models understand and generate structured logical forms for question answering using different formal languages and suggests generating more training data rather than directly relying on LLMs for answering questions.
Wujie Sun,Defang Chen,Jiawei Chen,Yan Feng,Chun Chen,Can Wang
http://arxiv.org/abs/2401.05772v1
Compressor summary: The paper proposes a novel framework called Knowledge Translation that uses neural networks to compress deep learning models without re-training or architectural constraints, inspired by language translation.
Kunpeng Qiu,Zhiying Zhou,Yongxin Guo
http://arxiv.org/abs/2401.05771v1
Compressor summary: The paper proposes a new method for classifying lesions in Wireless Capsule Endoscopy images using Decoupled Supervised Contrastive Learning and Saliency Augmentor, achieving state-of-the-art results.
Adrian Gheorghiu,Iulian-Marius Tăiatu,Dumitru-Clementin Cercel,Iuliana Marin,Florin Pop
http://arxiv.org/abs/2401.05768v1
Compressor summary: The paper presents a deep learning-based method for classifying Robusta coffee leaf diseases using the RoCoLe dataset, augmented with synthetic data generated by CycleGAN, improving the performance of the model.
Na Wang,Lei Qi,Jintao Guo,Yinghuan Shi,Yang Gao
http://arxiv.org/abs/2401.05752v1
Compressor summary: This paper proposes two modules to improve domain generalization by disentangling spurious correlations and enhancing potential correlations using sample and feature perspectives, achieving better results for CNNs or MLPs.
Peng Dai,Feitong Tan,Xin Yu,Yinda Zhang,Xiaojuan Qi
http://arxiv.org/abs/2401.05750v1
Compressor summary: The paper introduces GO-NeRF, a method for creating 3D objects within an existing Neural Radiance Field (NeRF) scene using scene context and producing high-quality, harmonious results with minimal artifacts.
Brian Thompson,Mehak Preet Dhaliwal,Peter Frisch,Tobias Domhan,Marcello Federico
http://arxiv.org/abs/2401.05749v1
Compressor summary: The text suggests that machine translation is widely used for creating low-quality translations of web content in multiple languages, especially in lower resource languages.
Barry Shichen Hu,Siyun Liang,Johannes Paetzold,Huy H. Nguyen,Isao Echizen,Jiapeng Tang
http://arxiv.org/abs/2401.05745v1
Compressor summary: The paper proposes a simple, fast, and robust Transformer-based model for predicting normals from noisy point clouds, outperforming previous methods based on PointNet variants and surface fitting and achieving state-of-the-art performance on two datasets.
Lorenzo Marconi,Riccardo Rosati
http://arxiv.org/abs/2401.05743v1
Compressor summary: The paper investigates how to answer queries over knowledge bases with existential rules, finding some cases where it is easy or possible to fix inconsistencies using new techniques.
Chenghao Li,Boheng Zeng,Yi Lu,Pengbo Shi,Qingzi Chen,Jirui Liu,Lingyun Zhu
http://arxiv.org/abs/2401.05738v1
Compressor summary: The paper introduces Large Kernel Convolutional Attention (LKCA), a new spatial attention method for visual transformers that simplifies the attention operation and combines the advantages of convolutional neural networks and visual transformers, achieving competitive performance in various visual tasks.
Antonio Manjavacas,Alejandro Campoy-Nieves,Javier Jiménez-Raboso,Miguel Molina-Solana,Juan Gómez-Romero
http://arxiv.org/abs/2401.05737v1
Compressor summary: This paper evaluates state-of-the-art Deep Reinforcement Learning algorithms for HVAC control, showing their potential in comfort and energy efficiency while highlighting challenges in generalization and incremental learning.
Paul Lerner,Olivier Ferret,Camille Guinaudeau
http://arxiv.org/abs/2401.05736v1
Compressor summary: The text discusses how cross-modal retrieval can help recognize named entities in visual question answering tasks by bridging the semantic gap between entities and their depictions.
Kumara Kahatapitiya,Adil Karjauv,Davide Abati,Fatih Porikli,Yuki M. Asano,Amirhossein Habibian
http://arxiv.org/abs/2401.05735v1
Compressor summary: The paper analyzes the inefficiencies of diffusion-based video editing, introduces Object-Centric Diffusion (OCD) to reduce computation costs by focusing on foreground regions, and proposes two novel techniques that can significantly speed up video editing without retraining.
Jaeill Kim,Duhun Hwang,Eunjung Lee,Jangwon Suh,Jimyeong Kim,Wonjong Rhee
http://arxiv.org/abs/2401.05730v1
Compressor summary: The paper proposes a multi-view strategy called ECPP that improves the speed and performance of contrastive and non-contrastive visual representation learning methods.
Sahil Chopra
http://arxiv.org/abs/2401.05727v1
Compressor summary: The text discusses using a hidden Markov model to predict part-of-speech tags in low-resource languages by transferring data from source languages, and finds that this method is effective for zero-resource languages.
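The decoding step of any HMM tagger is the Viterbi algorithm. Below is a self-contained sketch with a toy two-tag model; the transition and emission probabilities are made up for illustration and have nothing to do with the paper's cross-lingual transfer setup.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for an observation sequence."""
    # best[t][s]: probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s].get(obs[0], 0.0) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (best[t - 1][p] * trans_p[p][s] * emit_p[s].get(obs[t], 0.0), p)
                for p in states)
            best[t][s], back[t][s] = prob, prev
    # trace back from the best final state
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]

# Toy bigram tagger (probabilities are hypothetical).
states = ["DET", "NOUN"]
start_p = {"DET": 0.7, "NOUN": 0.3}
trans_p = {"DET": {"DET": 0.1, "NOUN": 0.9},
           "NOUN": {"DET": 0.4, "NOUN": 0.6}}
emit_p = {"DET": {"the": 0.9, "dog": 0.0},
          "NOUN": {"the": 0.05, "dog": 0.5}}
print(viterbi(["the", "dog"], states, start_p, trans_p, emit_p))  # ['DET', 'NOUN']
```

In the zero-resource setting the summary describes, tables like `trans_p` and `emit_p` would be estimated from (or transferred out of) source-language data rather than written by hand.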
Xu Cai,Jonathan Scarlett
http://arxiv.org/abs/2401.05716v1
Compressor summary: The paper investigates how the difficulty of estimating a normalizing constant using black-box function queries depends on the problem parameter $\lambda$, ranging from Bayesian quadrature to Bayesian optimization and considering noisy function evaluations.
Jiyu Jiao,Xiaojun Wang,Chenpei Han,Yuhua Huang,Yizhuo Zhang
http://arxiv.org/abs/2401.05711v1
Compressor summary: A meta-learning algorithm that uses historical localization tasks to improve adaptability and efficiency in indoor environments reduces data acquisition costs and increases accuracy.
Xi Chen,Zhihui Zhu,Andrew Perrault
http://arxiv.org/abs/2401.05710v1
Compressor summary: The paper proposes a new method for reinforcement learning with unknown reward perturbations that can recover the true rewards in various settings and outperforms existing methods.
Zhen Tao,Dinghao Xi,Zhiyu Li,Liumin Tang,Wei Xu
http://arxiv.org/abs/2401.05707v1
Compressor summary: The proposed Chinese Article-style Transfer framework (CAT-LLM) leverages Large Language Models to analyze and transfer text features from Chinese articles while preserving the original content's integrity.
Jiaxin Guo,Zhanglin Wu,Zongyao Li,Hengchao Shang,Daimeng Wei,Xiaoyu Chen,Zhiqiang Rao,Shaojun Li,Hao Yang
http://arxiv.org/abs/2401.05700v1
Compressor summary: Regularized Batched Inputs is a novel method for low-latency simultaneous speech translation that improves input diversity and reduces output errors with suitable regularization techniques.
Chengfeng Dou,Zhi Jin,Wenpin Jiao,Haiyan Zhao,Yongqiang Zhao,Zhenwei Tao
http://arxiv.org/abs/2401.05695v1
Compressor summary: PLPF is a method to improve medical dialogue generation by integrating diagnostic logic into large language models using rule modeling, preference data generation, and preference alignment.
Jiaxin Guo,Minghan Wang,Xiaosong Qiao,Daimeng Wei,Hengchao Shang,Zongyao Li,Zhengzhe Yu,Yinglu Li,Chang Su,Min Zhang,Shimin Tao,Hao Yang
http://arxiv.org/abs/2401.05689v1
Compressor summary: UCorrect is an unsupervised method for correcting errors in automatic speech recognition (ASR) output without relying on specific training data or fine-tuning, achieving significant word error rate reduction and outperforming popular correction models.
Blaise Appolinary,Alex Deaconu,Sophia Yang
http://arxiv.org/abs/2401.05686v1
Compressor summary: The paper proposes a method to dynamically expand CNNs during training using an expansion score, which improves performance, reduces resource use, and is eco-friendly.
Weibo Jiang,Weihong Ren,Jiandong Tian,Liangqiong Qu,Zhiyong Wang,Honghai Liu
http://arxiv.org/abs/2401.05676v1
Compressor summary: The paper proposes a new method for human-object interaction detection that considers both self-triplet and cross-triplet dependencies, as well as leveraging the CLIP model to obtain interaction-aware features.
Seung Hyun Lee,Yinxiao Li,Junjie Ke,Innfarn Yoo,Han Zhang,Jiahui Yu,Qifei Wang,Fei Deng,Glenn Entis,Junfeng He,Gang Li,Sangpil Kim,Irfan Essa,Feng Yang
http://arxiv.org/abs/2401.05675v1
Compressor summary: Parrot is a new framework for text-to-image generation that uses reinforcement learning to automatically balance multiple quality rewards and improve the generated images.
Xintao Wang,Zhouhong Gu,Jiaqing Liang,Dakuan Lu,Yanghua Xiao,Wei Wang
http://arxiv.org/abs/2401.05669v1
Compressor summary: ConcEPT is a novel pre-training method for language models that infuses conceptual knowledge by predicting the concepts of entities in the context, improving performance on tasks like entity typing.
Weijieying Ren,Vasant G Honavar
http://arxiv.org/abs/2401.05667v1
Compressor summary: EsaCL is a method for efficient continual learning of sparse models that prunes redundant parameters without retraining and uses intelligent data selection to improve data efficiency.
Jian Ma
http://arxiv.org/abs/2401.05664v1
Compressor summary: The paper proposes using transfer entropy to diagnose the root causes of low energy efficiency in industrial systems, and tests it on a real compressed-air system dataset.
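As a generic illustration of the transfer entropy quantity itself (a plug-in estimator for discrete series; this is not the paper's diagnosis pipeline, and the toy signals below are assumptions):

```python
from collections import Counter
from math import log2

def transfer_entropy(y, x):
    """Plug-in estimate of T_{Y->X} for discrete series:
    sum over (x_{t+1}, x_t, y_t) of
    p * log2[ p(x_{t+1} | x_t, y_t) / p(x_{t+1} | x_t) ]."""
    triples = Counter(zip(x[1:], x[:-1], y[:-1]))   # (x_{t+1}, x_t, y_t)
    pairs_xy = Counter(zip(x[:-1], y[:-1]))         # (x_t, y_t)
    pairs_xx = Counter(zip(x[1:], x[:-1]))          # (x_{t+1}, x_t)
    singles = Counter(x[:-1])                       # x_t
    n = len(x) - 1
    te = 0.0
    for (x1, x0, y0), c in triples.items():
        p = c / n
        p_cond_joint = c / pairs_xy[(x0, y0)]
        p_cond_marg = pairs_xx[(x1, x0)] / singles[x0]
        te += p * log2(p_cond_joint / p_cond_marg)
    return te

# Toy example: x copies y with a one-step lag, so Y strongly "drives" X,
# and T_{Y->X} should dominate T_{X->Y}.
y = [0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0] * 20
x = [0] + y[:-1]
```

The asymmetry of this quantity between sensor signals is what lets root-cause analysis orient a directed influence graph over system components.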
Kaixun Yang,Mladen Raković,Yuyang Li,Quanlong Guan,Dragan Gašević,Guanliang Chen
http://arxiv.org/abs/2401.05655v1
Compressor summary: The study investigates the trade-offs among accuracy, fairness, and generalizability in automated essay scoring (AES) models, finding that prompt-specific models are more accurate but may exhibit more bias across students' economic statuses than cross-prompt models, and that traditional machine learning models with engineered features achieve higher accuracy and fairness than complex neural networks.
Tao Tu,Anil Palepu,Mike Schaekermann,Khaled Saab,Jan Freyberg,Ryutaro Tanno,Amy Wang,Brenna Li,Mohamed Amin,Nenad Tomasev,Shekoofeh Azizi,Karan Singhal,Yong Cheng,Le Hou,Albert Webson,Kavita Kulkarni,S Sara Mahdavi,Christopher Semturs,Juraj Gottweis,Joelle Barral,Katherine Chou,Greg S Corrado,Yossi Matias,Alan Karthikesalingam,Vivek Natarajan
http://arxiv.org/abs/2401.05654v1
Compressor summary: AMIE is an AI system that can have diagnostic dialogues with patients and performs better than primary care physicians in some aspects, but it is not yet ready for real-world use.
Sean Tang,Sriya Musunuru,Baoshi Zong,Brooks Thornton
http://arxiv.org/abs/2401.05653v1
Compressor summary: The paper shows how Shapley Value Regression can measure individual partners' impact on marketing performance in financial services, without the need for complex and costly cooperative game theory testing.
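As background on the core quantity behind Shapley Value Regression (a generic exact-computation sketch, not the paper's method; the partner names and coalition payoffs below are hypothetical):

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution,
    averaged over all coalition orderings."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for coalition in combinations(others, k):
                # Weight = fraction of orderings where `coalition`
                # precedes p.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                base = frozenset(coalition)
                phi[p] += weight * (value(base | {p}) - value(base))
    return phi

# Hypothetical payoffs (e.g. conversions) for each partner coalition.
payoffs = {
    frozenset(): 0.0,
    frozenset({"A"}): 10.0,
    frozenset({"B"}): 20.0,
    frozenset({"C"}): 5.0,
    frozenset({"A", "B"}): 40.0,
    frozenset({"A", "C"}): 18.0,
    frozenset({"B", "C"}): 28.0,
    frozenset({"A", "B", "C"}): 50.0,
}
```

By the efficiency axiom, the attributions sum exactly to the grand-coalition value, which is what makes them usable as a fair decomposition of overall marketing performance across partners.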
Israa Jaradat,Haiqi Zhang,Chengkai Li
http://arxiv.org/abs/2401.05650v1
Compressor summary: Cherry is a novel approach that uses language models and multiple news sources to automatically detect cherry-picked statements in news articles by identifying missing important statements.
Chunlei Peng,Boyu Wang,Decheng Liu,Nannan Wang,Ruimin Hu,Xinbo Gao
http://arxiv.org/abs/2401.05646v1
Compressor summary: The paper proposes a method called MADE that uses attribute descriptions to enhance person re-identification across clothing changes, by masking and embedding them into the Transformer blocks of an image model.
Changtai Li,Xu Han,Chao Yao,Xiaojuan Ban
http://arxiv.org/abs/2401.05638v1
Compressor summary: MatSAM is a general and efficient solution for microstructure extraction in microscopic images based on SAM, using point-based prompts generation to adapt to different materials and microscopy types.
Gang Wu,Junjun Jiang,Junpeng Jiang,Xianming Liu
http://arxiv.org/abs/2401.05633v1
Compressor summary: The ConvFormer-based Super-Resolution network (CFSR) is a lightweight method for image super-resolution that replaces the self-attention module with large kernel convolution and uses an edge-preserving feed-forward network to preserve high-frequency information.
Aditya Joshi,Raj Dabre,Diptesh Kanojia,Zhuang Li,Haolan Zhan,Gholamreza Haffari,Doris Dippold
http://arxiv.org/abs/2401.05632v1
Compressor summary: The text surveys past research on natural language processing (NLP) for dialects, covering various tasks, languages, and methods, with a focus on improving the equity of language technologies.
Shaoru Chen,Mahyar Fazlyab
http://arxiv.org/abs/2401.05629v1
Compressor summary: The paper proposes a self-supervised learning framework for finding control barrier functions that maximize safety and accommodate complex constraints in nonlinear control systems.
Juni Kim,Zhikang Dong,Pawel Polak
http://arxiv.org/abs/2401.05625v1
Compressor summary: The authors present a new method that uses geometry, smoothing, and spectral analysis to measure facial muscle activity from videos, which could have various applications in security, medicine, and emotion recognition.
Matthew Renze,Erhan Guven
http://arxiv.org/abs/2401.05618v1
Compressor summary: The paper introduces Concise Chain-of-Thought (CCoT) prompting, which reduces response length and per-token cost for GPT-3.5 and GPT-4 on MCQA tasks, but may impair math problem-solving for GPT-3.5.
Victoria M. Dax,Jiachen Li,Kevin Leahy,Mykel J. Kochenderfer
http://arxiv.org/abs/2401.05610v1
Compressor summary: The paper shows how Graph Neural Networks (GNNs) can be used to optimize discrete solutions in Combinatorial Optimization problems by learning policies through Q-Learning.
Damjan Kalajdzievski
http://arxiv.org/abs/2401.05605v1
Compressor summary: Our study shows that LoRA, a PEFT strategy, causes significant catastrophic forgetting in LLMs and provides scaling laws for its relationship with performance and number of parameters.
Andrew Gritsevskiy,Arjun Panickssery,Aaron Kirtland,Derik Kauffman,Hans Gundlach,Irina Gritsevskaya,Joe Cavanagh,Jonathan Chiang,Lydia La Roux,Michelle Hung
http://arxiv.org/abs/2401.05604v1
Compressor summary: The paper introduces a rebus puzzle benchmark to evaluate multimodal language models' performance, which requires various skills and shows current models' weaknesses in reasoning and explanation.
Lucas W. Remedios,Shunxing Bao,Samuel W. Remedios,Ho Hin Lee,Leon Y. Cai,Thomas Li,Ruining Deng,Can Cui,Jia Li,Qi Liu,Ken S. Lau,Joseph T. Roland,Mary K. Washington,Lori A. Coburn,Keith T. Wilson,Yuankai Huo,Bennett A. Landman
http://arxiv.org/abs/2401.05602v1
Compressor summary: The paper proposes using inter-modality learning to classify more cell types on virtual H&E stains by synthesizing them from multiplexed immunofluorescence images and transferring labels from the latter.
Shilong Pan,Zhiliang Tian,Liang Ding,Zhen Huang,Zhihua Wen,Dongsheng Li
http://arxiv.org/abs/2401.05596v1
Compressor summary: The paper proposes a novel method called POMP that uses a dynamic graph of multiple auxiliary languages to improve unsupervised neural machine translation for low-resource languages by mitigating linguistic noise in large language models.