This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-01-16, generated by the compressor, my personal LLM-based project.
Michelle Wastl,Jannis Vamvas,Rico Sennrich
http://arxiv.org/abs/2401.06769v1
Compressor summary: The authors propose an unsupervised method to detect translation direction of parallel text using the simplification effect in machine translation and achieve high accuracies for both NMT and human translations.
Anton Voronov,Lena Wolf,Max Ryabinin
http://arxiv.org/abs/2401.06766v1
Compressor summary: The prompt template format significantly affects large language models' in-context learning performance, and using Template Ensembles can improve their accuracy by aggregating predictions across different templates.
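The aggregation step this summary describes can be sketched in a few lines: query the model once per prompt template and majority-vote the answers. The classifier and template strings below are hypothetical stand-ins for a real LLM call.

```python
from collections import Counter

def template_ensemble(classify, templates, text):
    # One prediction per template, then a majority vote over the answers.
    votes = [classify(tpl.format(text=text)) for tpl in templates]
    return Counter(votes).most_common(1)[0][0]

def toy_classify(prompt):
    # Pretend model: reacts to the word "great" anywhere in the prompt.
    return "positive" if "great" in prompt else "negative"

templates = [
    "Review: {text}\nSentiment:",
    "Text: {text}\nLabel:",
    "Input: {text}\nAnswer:",
]
print(template_ensemble(toy_classify, templates, "great movie"))  # positive
```

The vote makes the final answer less sensitive to any single template's quirks, which is the paper's point about format mattering.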
Caleb Robinson,Isaac Corley,Anthony Ortiz,Rahul Dodhia,Juan M. Lavista Ferres,Peyman Najafirad
http://arxiv.org/abs/2401.06762v1
Compressor summary: The paper introduces a new benchmark dataset (Chesapeake Roads Spatial Context) for testing how well geospatial machine learning models understand spatial context over long distances, and shows that current models often fail at this task.
Mingdao Liu,Aohan Zeng,Bowen Wang,Peng Zhang,Jie Tang,Yuxiao Dong
http://arxiv.org/abs/2401.06761v1
Compressor summary: The paper introduces a parallel auto-regressive generation method that speeds up text generation by LLMs and reduces resource consumption.
Tom Kocmi,Vilém Zouhar,Christian Federmann,Matt Post
http://arxiv.org/abs/2401.06760v1
Compressor summary: The paper explores how different metrics measure human-noticeable differences in machine translation quality using a new dataset, finding the resulting measurements more stable than p-values.
Muhammad Naveed Riaz,Maciej Wielgosz,Abel Garcia Romera,Antonio M. Lopez
http://arxiv.org/abs/2401.06757v1
Compressor summary: ARCANE generates diverse synthetic datasets for pedestrian intention prediction, complementing real-world data, and PedGNN is a fast and memory-efficient model for this task.
Muhammad Tayyab Zamir,Muhammad Asif Ayub,Asma Gul,Nasir Ahmad,Kashif Ahmad
http://arxiv.org/abs/2401.06752v1
Compressor summary: The paper presents a new framework that uses style analysis to detect authorship and changes in multi-authored documents, improving on existing methods with special characters and weight optimization.
Peter Hase,Mohit Bansal,Peter Clark,Sarah Wiegreffe
http://arxiv.org/abs/2401.06751v1
Compressor summary: This paper shows that current language models can generalize well from easy to hard data using simple training methods, and suggests collecting easy data instead of hard data for better performance.
Alexandra DeLucia,Mengjie Zhao,Yoshinori Maeda,Makoto Yoda,Keiichi Yamada,Hiromi Wakaki
http://arxiv.org/abs/2401.06742v1
Compressor summary: The authors propose a natural language inference method for adapting a persona extraction model to new settings without needing new data or human annotation.
Kaitlyn Zhou,Jena D. Hwang,Xiang Ren,Maarten Sap
http://arxiv.org/abs/2401.06730v1
Compressor summary: The text discusses the challenges of AI language models expressing uncertainties in their responses, which can lead to overconfidence and safety risks for users who rely on them.
Bozhen Hu,Zelin Zang,Jun Xia,Lirong Wu,Cheng Tan,Stan Z. Li
http://arxiv.org/abs/2401.06727v1
Compressor summary: The paper introduces a new method for embedding graph data in a low-dimensional space that preserves topological structure and improves stability and quality, outperforming existing approaches on various tasks.
Xinrui Zou,Ming Zhang,Nathaniel Weir,Benjamin Van Durme,Nils Holzenberger
http://arxiv.org/abs/2401.06715v1
Compressor summary: The paper proposes re-framing statutory reasoning as an analogy task to increase dataset size and interpretability, and improves upon existing methods by combining retrieval and analogy models.
Rafael Rivera Soto,Kailin Koch,Aleem Khan,Barry Chen,Marcus Bishop,Nicholas Andrews
http://arxiv.org/abs/2401.06712v1
Compressor summary: The authors propose a method to distinguish human from machine-written text without relying on training data from potentially abusive language models, using style features that work across different models.
Garud Iyengar,Raghav Singal
http://arxiv.org/abs/2401.06710v1
Compressor summary: The paper proposes an algorithm for optimizing personalized marketing interventions across consumer states that combines attribution-based decision-making with approximate Bayesian learning, showing high accuracy, interpretability, scalability, and asymptotic optimality on a real-world email marketing dataset.
Muskan Garg,MSVPJ Sathvik,Amrit Chadha,Shaina Raza,Sunghwan Sohn
http://arxiv.org/abs/2401.06709v1
Compressor summary: The paper presents a new dataset for analyzing low self-esteem in Reddit posts and suggests that current models should focus more on textual cues related to self-esteem rather than triggers or consequences of mental disturbances.
Sen Yang,Shujian Huang,Xinyu Dai,Jiajun Chen
http://arxiv.org/abs/2401.06706v1
Compressor summary: This paper improves speculative decoding by sampling multiple candidates and verifying them in batches, leading to higher acceptance rates for large language models.
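The candidate-and-verify loop behind this summary can be illustrated with a toy: a cheap draft model proposes several candidate continuations, the expensive target model checks them (in a real system, as one batched forward pass), and the longest prefix the target agrees with is accepted. Both "models" below are hypothetical deterministic stand-ins, not real LLMs.

```python
def target_next(ctx):
    # Expensive "target model": deterministic next token from context.
    return (sum(ctx) * 7 + 3) % 10

def draft_candidates(ctx, k, n):
    # Cheap "draft model": k candidate n-token continuations, where
    # noise > 0 makes some candidates diverge from the target.
    cands = []
    for noise in range(k):
        c, seq = list(ctx), []
        for _ in range(n):
            t = (sum(c) * 7 + 3 + (noise if len(seq) == 1 else 0)) % 10
            seq.append(t)
            c.append(t)
        cands.append(seq)
    return cands

def speculative_step(ctx, k=3, n=4):
    # Verify every candidate against the target; keep the longest
    # accepted prefix (sequential here, batched in a real system).
    best = []
    for cand in draft_candidates(ctx, k, n):
        c, accepted = list(ctx), []
        for t in cand:
            if t != target_next(c):   # target disagrees: stop this candidate
                break
            accepted.append(t)
            c.append(t)
        if len(accepted) > len(best):
            best = accepted
    return best

print(speculative_step([1, 2]))  # [4, 2, 6, 8]
```

With several candidates in flight, at least one often survives verification for many tokens, which is where the higher acceptance rate comes from.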
Damien Robert,Hugo Raguet,Loic Landrieu
http://arxiv.org/abs/2401.06704v1
Compressor summary: SuperCluster is an efficient and scalable method for panoptic segmentation of large 3D point clouds using graph clustering and superpoint adaptation, achieving state-of-the-art results on multiple datasets while being much smaller and faster than previous methods.
Slavisa Tomic,João Pedro Matos-Carvalho,Marko Beko
http://arxiv.org/abs/2401.06699v1
Compressor summary: The paper proposes a weight optimization method for neural networks using least squares that is faster and easier to implement than existing methods.
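A minimal sketch of the general idea (in the spirit of extreme learning machines, not necessarily the paper's exact formulation): keep a random hidden layer fixed and solve the output weights in closed form with least squares instead of iterative gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: noisy linear targets.
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=200)

# Fixed random hidden layer; only the output weights are optimized.
W_hidden = rng.normal(size=(5, 32))
H = np.hstack([np.tanh(X @ W_hidden), np.ones((200, 1))])  # features + bias
W_out, *_ = np.linalg.lstsq(H, y, rcond=None)              # one closed-form solve

mse = float(np.mean((H @ W_out - y) ** 2))
print(round(mse, 4))
```

The appeal is exactly what the summary states: a single linear solve replaces many gradient steps, making the method fast and simple to implement.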
Gantavya Bhatt,Yifang Chen,Arnav M. Das,Jifan Zhang,Sang T. Truong,Stephen Mussmann,Yinglun Zhu,Jeffrey Bilmes,Simon S. Du,Kevin Jamieson,Jordan T. Ash,Robert D. Nowak
http://arxiv.org/abs/2401.06692v1
Compressor summary: The paper proposes using experimental design techniques to reduce the annotation cost and increase the label efficiency for supervised finetuning on instruction datasets for large language models.
M. Erkin Yücel,Serkan Topaloğlu,Cem Ünsalan
http://arxiv.org/abs/2401.06690v1
Compressor summary: The study proposes an embedded system using computer vision and deep learning techniques for planogram compliance control in retail, achieving high F1 scores and working stand-alone for up to two years with energy harvesting options.
Giorgos Vernikos,Andrei Popescu-Belis
http://arxiv.org/abs/2401.06688v1
Compressor summary: QE-fusion is a method that uses quality estimation metrics to improve neural machine translation, achieving better results than existing techniques and scaling linearly with candidate diversity.
Jacob M. Chen,Rohit Bhattacharya,Katherine A. Keith
http://arxiv.org/abs/2401.06687v1
Compressor summary: The paper proposes a new causal inference method using text data and zero-shot models to handle unobserved confounding variables, which is novel and has low bias.
Yi Zeng,Hongpeng Lin,Jingwen Zhang,Diyi Yang,Ruoxi Jia,Weiyan Shi
http://arxiv.org/abs/2401.06373v1
Compressor summary: The paper proposes using social science research to generate persuasive prompts that can jailbreak large language models, highlighting the need for better defense mechanisms.
Wenbin Wang,Liang Ding,Li Shen,Yong Luo,Han Hu,Dacheng Tao
http://arxiv.org/abs/2401.06659v1
Compressor summary: WisdoM is a framework that uses large vision-language models to analyze images and text for sentiment analysis, incorporating contextual world knowledge and improving performance by +1.89 F1 score on average.
Stefan Blücher,Johanna Vielhaben,Nils Strodthoff
http://arxiv.org/abs/2401.06654v1
Compressor summary: The study proposes the R-OMS score to compare occlusion strategies in XAI and the SRG measure to resolve disagreements between MIF and LIF measures.
Le Thi Khanh Hien,Valentin Leplat,Nicolas Gillis
http://arxiv.org/abs/2401.06646v1
Compressor summary: BMMe is a new method to solve multi-convex optimization problems faster by updating parameters adaptively and using extrapolation, which is shown to be efficient for nonnegative matrix factorization with different $\beta$-divergences.
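For the Frobenius-norm special case, the flavor of "block updates plus extrapolation" can be sketched with classic Lee-Seung multiplicative NMF updates and a fixed-weight extrapolation step. BMMe's adaptive parameter rules and general $\beta$-divergences are more sophisticated than this sketch, and the extrapolation weight below is a hypothetical fixed choice.

```python
import numpy as np

rng = np.random.default_rng(1)

X = rng.random((30, 20))          # nonnegative data matrix
r = 5                              # factorization rank
W, H = rng.random((30, r)), rng.random((r, 20))
W_prev, H_prev = W.copy(), H.copy()
extrap, eps = 0.3, 1e-10           # fixed extrapolation weight (illustrative)

for _ in range(300):
    # Extrapolate each block, clip to stay nonnegative, then apply the
    # standard multiplicative update from the extrapolated point.
    We = np.maximum(W + extrap * (W - W_prev), eps)
    W_prev = W
    W = We * (X @ H.T) / (We @ (H @ H.T) + eps)
    He = np.maximum(H + extrap * (H - H_prev), eps)
    H_prev = H
    H = He * (W.T @ X) / ((W.T @ W) @ He + eps)

rel_err = float(np.linalg.norm(X - W @ H) / np.linalg.norm(X))
print(round(rel_err, 3))
```

Extrapolating from the previous iterate before each multiplicative step typically accelerates convergence over plain alternating updates, which is the effect BMMe exploits with adaptive parameters.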
Ali Saeizadeh,Douglas Schonholtz,Daniel Uvaydov,Raffaele Guida,Emrecan Demirors,Pedram Johari,Jorge M. Jimenez,Joseph S. Neimat,Tommaso Melodia
http://arxiv.org/abs/2401.06644v1
Compressor summary: SeizNet is a closed-loop system that uses deep learning and implantable sensors to predict drug-resistant epileptic seizures with high accuracy and specificity, providing a better alternative to traditional treatments.
Jan Cegin,Branislav Pecher,Jakub Simko,Ivan Srba,Maria Bielikova,Peter Brusilovsky
http://arxiv.org/abs/2401.06643v1
Compressor summary: The study explores how using taboo words, hints from previous solutions, and chaining on previous solutions can increase text diversity in LLM-generated data and improve downstream models' performance.
Kanishka Misra,Allyson Ettinger,Kyle Mahowald
http://arxiv.org/abs/2401.06640v1
Compressor summary: Experimental contexts can improve language models' performance in predicting semantic properties of novel concepts, but this ability is inconsistent and relies on controlling the input examples and instructions.
Shuai Wang,Liang Ding,Li Shen,Yong Luo,Bo Du,Dacheng Tao
http://arxiv.org/abs/2401.06628v1
Compressor summary: The text introduces an OOP-focused code generation benchmark and evaluation metric, pass@o, which reveals the limitations of current LLMs in OOP.
Yihong Liu,Chunlan Ma,Haotian Ye,Hinrich Schütze
http://arxiv.org/abs/2401.06620v1
Compressor summary: TransliCo is a framework that fine-tunes pretrained language models to improve crosslingual transfer by contrasting sentences with their transliterations in a unified script.
Jan Schneider,Pierre Schumacher,Simon Guist,Le Chen,Daniel Häufle,Bernhard Schölkopf,Dieter Büchler
http://arxiv.org/abs/2401.06604v1
Compressor summary: This paper evaluates how exploiting low-dimensional gradient subspaces can improve the training efficiency of deep policy gradient methods in reinforcement learning.
Shangding Gu
http://arxiv.org/abs/2401.06603v1
Compressor summary: The study proposes a teacher-student learning framework where a large language model (LLM) acts as a teacher and a reinforcement learning (RL) model serves as a student, enabling both agents to improve each other through feedback and cooperation in challenging tasks.
Pengfei Zhu,Qian Wang,Yu Wang,Jialu Li,Qinghua Hu
http://arxiv.org/abs/2401.06595v1
Compressor summary: The paper proposes a graph clustering method that adapts the weights of self-supervised learning tasks for different nodes and fuses their embeddings, achieving state-of-the-art results.
Seongyun Lee,Seungone Kim,Sue Hyun Park,Geewook Kim,Minjoon Seo
http://arxiv.org/abs/2401.06591v1
Compressor summary: The authors propose Prometheus-Vision, an evaluator model that uses a new feedback dataset to assess long-form responses generated by Vision-Language Models based on user-defined criteria.
Tsegaye Misikir Tashu,Eduard-Raul Kontos,Matthia Sabatelli,Matias Valdenegro-Toro
http://arxiv.org/abs/2401.06583v1
Compressor summary: The study evaluates four multilingual transformer models in creating cross-lingual document representations to overcome limitations in recommending documents in different languages.
Qian Wang,Weiqi Li,Chong Mou,Xinhua Cheng,Jian Zhang
http://arxiv.org/abs/2401.06578v1
Compressor summary: The paper introduces a method to generate 360-degree panoramic videos from text prompts using a modified text-to-video diffusion model and a new panorama dataset called WEB360.
Xu Huang,Zhirui Zhang,Xiang Geng,Yichao Du,Jiajun Chen,Shujian Huang
http://arxiv.org/abs/2401.06568v1
Compressor summary: The study examines how large language models use source and reference information to evaluate translations, finding that reference information improves accuracy while source information sometimes hinders it, indicating a need for better cross-lingual capability in LLMs.
Ali Safa,Wout Mommen,Lars Keuninckx
http://arxiv.org/abs/2401.06563v1
Compressor summary: The paper presents a low-cost hand gesture recognition system using a thermal sensor and a Spiking Neural Network, achieving high accuracy and outperforming deep learning approaches.
Yuqi Zhang,Liang Ding,Lefei Zhang,Dacheng Tao
http://arxiv.org/abs/2401.06561v1
Compressor summary: The study proposes a defense strategy called Intention Analysis Prompting (IAPrompt) to reduce the harmfulness of large language models without compromising their helpfulness, by triggering their self-correct and improve ability.
Yusen Zhang
http://arxiv.org/abs/2401.06559v1
Compressor summary: This paper argues that a unified benchmark framework is needed to accurately evaluate dynamic graph learning models and improve their performance in various applications.
Ziqiang Cui,Xing Tang,Yang Qiao,Bowei He,Liang Chen,Xiuqiang He,Chen Ma
http://arxiv.org/abs/2401.06557v1
Compressor summary: TAHyper is a novel method that uses hyperbolic space and treatment-aware relationship identification to estimate hidden confounders in social networks for individual treatment effect estimation.
Chuanji Shi,Yingying Zhang,Jiaotuan Wang,Qiqi Zhu
http://arxiv.org/abs/2401.06550v1
Compressor summary: The paper proposes an algorithm using multimodal deep learning to detect urban area-of-interest fence polygons from remote sensing images and multi-semantics data, improving accuracy for mobile Internet businesses.
Chenyang Wang,Junjun Jiang,Xingyu Hu,Xianming Liu,Xiangyang Ji
http://arxiv.org/abs/2401.06548v1
Compressor summary: CCIL is a continual learning method that mitigates catastrophic forgetting in data-free replay: it measures the inconsistency between samples inverted from the classification model and real data, which existing methods ignore, and reduces it with a novel loss function and class weight regularization.
Vandad Imani,Elaheh Moradi,Carlos Sevilla-Salcedo,Vittorio Fortino,Jussi Tohka
http://arxiv.org/abs/2401.06546v1
Compressor summary: The paper proposes NMFS-GA, a genetic algorithm to choose accurate and interpretable features for binary classification with noisy labels.
Ziying Song,Lin Liu,Feiyang Jia,Yadan Luo,Guoxin Zhang,Lei Yang,Li Wang,Caiyan Jia
http://arxiv.org/abs/2401.06542v1
Compressor summary: The text discusses the importance of evaluating 3D object detection methods for autonomous driving in terms of accuracy, latency, and robustness against environmental variations and weather changes.
Kaishuai Xu,Wenjun Hou,Yi Cheng,Jian Wang,Wenjie Li
http://arxiv.org/abs/2401.06541v1
Compressor summary: The IADDx framework generates medical dialogues that include a comprehensive differential diagnosis using retrieval-based intuitive association and graph-enhanced analytic reasoning, helping both clinicians and patients understand the diagnostic process.
Yutao Zhu,Peitian Zhang,Chenghao Zhang,Yifei Chen,Binyu Xie,Zhicheng Dou,Zheng Liu,Ji-Rong Wen
http://arxiv.org/abs/2401.06532v1
Compressor summary: This paper introduces INTERS, a new dataset for instruction tuning to improve large language models' performance in information retrieval tasks.
Elias Arbash,Margret Fuchs,Behnood Rasti,Sandra Lorenz,Pedram Ghamisi,Richard Gloaguen
http://arxiv.org/abs/2401.06528v1
Compressor summary: The paper presents PCB-Vision, a dataset of RGB and hyperspectral images for analyzing electronic waste composition to improve recycling efficiency and align with the UN Sustainable Development Goals.
Paloma Piot,Patricia Martín-Rodilla,Javier Parapar
http://arxiv.org/abs/2401.06526v1
Compressor summary: The paper presents MetaHate, a meta-collection of datasets for studying hate speech, to help develop better models for combating it online.
Subina Khanal,Seshu Tirupathi,Giulio Zizzo,Ambrish Rawat,Torben Bach Pedersen
http://arxiv.org/abs/2401.06524v1
Compressor summary: The paper proposes a one-step fine-tuning method to improve Transformers' performance in time series prediction for domains with limited data by incorporating source domain data gradually and using gradual unfreezing.
Yu Wang,Junxian Mu,Pengfei Zhu,Qinghua Hu
http://arxiv.org/abs/2401.06521v1
Compressor summary: MEDAF is a discriminative model that learns diverse representations to improve open set recognition and outperforms generative models with less computational cost.
Dmitry Ivanov,Omer Ben-Porat
http://arxiv.org/abs/2401.06514v1
Compressor summary: The paper proposes r-MDPs, a framework for balancing personalization and regulatory constraints in high-stakes fields like healthcare using deep reinforcement learning algorithms inspired by K-means clustering.
Yuanzhi Liang,Linchao Zhu,Yi Yang
http://arxiv.org/abs/2401.06509v1
Compressor summary: The authors propose a virtual setting using tabletop role-playing games to foster complex, context-rich interactions among agents and introduce AntEval, a framework to evaluate the informativeness and expressiveness of these interactions using novel metrics.
Chandler Timm Doloriel,Ngai-Man Cheung
http://arxiv.org/abs/2401.06506v1
Compressor summary: The paper proposes a novel deepfake detector using frequency masking in the self-supervised pre-training phase, which outperforms existing methods.
Chandler Timm C. Doloriel,Rhandley D. Cajote
http://arxiv.org/abs/2401.06503v1
Compressor summary: The paper proposes an Attention-Points Network that improves the detection of small oriented objects in aerial images using two losses: Guided-Attention Loss and Box-Points Loss.
Thibaud Leteno,Antoine Gourru,Charlotte Laclau,Christophe Gravier
http://arxiv.org/abs/2401.06495v1
Compressor summary: The paper investigates gender bias in BERT and DistilBERT, finding that bias is uniformly encoded by every attention head except a few in underrepresented classes, and that distillation may increase bias.
Tianyu Zheng,Shuyue Guo,Xingwei Qu,Jiawei Guo,Weixu Zhang,Xinrun Du,Chenghua Lin,Wenhao Huang,Wenhu Chen,Jie Fu,Ge Zhang
http://arxiv.org/abs/2401.06477v1
Compressor summary: Kun is a method for creating instruction-tuning datasets for large language models without manual annotation, using self-training, back-translation, and answer polishing over diverse unlabelled data sources; it improves data quality and retention, reduces annotation costs, and shows robustness and scalability with the Yi model across various benchmarks.
Eytan Kats,Jochen G. Hirsch,Mattias P. Heinrich
http://arxiv.org/abs/2401.06473v1
Compressor summary: The paper presents a self-supervised framework for learning voxel-wise coarse-to-fine representations that balances global and local features, improves downstream tasks with limited annotations, and outperforms baselines.
Yuwei Wang,Yi Zeng
http://arxiv.org/abs/2401.06471v1
Compressor summary: The study develops a computational model for concept learning based on spiking neural networks that mimic human brain mechanisms and achieves human-like concept representations.
Kaiyi Zhang,Ang Lv,Yuhan Chen,Hansen Ha,Tao Xu,Rui Yan
http://arxiv.org/abs/2401.06469v1
Compressor summary: Batch-ICL is an efficient and order-agnostic inference algorithm for in-context learning that consistently outperforms most example sequences and reduces computational resources.
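The order-agnostic idea can be illustrated with a toy: instead of one N-shot prompt, whose answer depends on example order, run N separate one-shot passes and aggregate their outputs; averaging is permutation-invariant by construction. The stub below only mimics a model's logits and is not the paper's actual aggregation over attention activations.

```python
import numpy as np

def one_shot_logits(example, query):
    # Hypothetical stand-in for a one-shot LLM forward pass: returns
    # deterministic pseudo-logits for 3 classes per (example, query) pair.
    rng = np.random.default_rng(hash((example, query)) % 2**32)
    return rng.normal(size=3)

def batch_icl_predict(examples, query):
    # Aggregate the N one-shot outputs by averaging, then pick a class.
    logits = np.mean([one_shot_logits(e, query) for e in examples], axis=0)
    return int(np.argmax(logits))

examples = ("ex_a", "ex_b", "ex_c")
q = "some query"
# Any permutation of the examples yields the same prediction.
print(batch_icl_predict(examples, q) == batch_icl_predict(examples[::-1], q))
```

Because the mean ignores ordering, the prediction cannot be hurt by an unlucky example sequence, which is the property the summary highlights.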
Minghao Wu,Thuy-Trang Vu,Lizhen Qu,George Foster,Gholamreza Haffari
http://arxiv.org/abs/2401.06468v1
Compressor summary: This paper investigates how to adapt large language models for document-level machine translation, finding that some specialized models outperform GPT-4 while others struggle with off-target translations.
Pedram Rostami,Ali Salemi,Mohammad Javad Dousti
http://arxiv.org/abs/2401.06466v1
Compressor summary: PersianMind is an open-source bilingual large language model that performs similarly to GPT-3.5-turbo in Persian by expanding LLaMa2's vocabulary and training it on a large Persian dataset.
Anna Hedström,Leander Weber,Sebastian Lapuschkin,Marina MC Höhne
http://arxiv.org/abs/2401.06465v1
Compressor summary: Smooth MPRT and Efficient MPRT are new adaptations of the Model Parameter Randomisation Test that address methodological caveats and improve reliability in eXplainable Artificial Intelligence evaluations.
Xiwei Xuan,Jorge Piazentin Ono,Liang Gou,Kwan-Liu Ma,Liu Ren
http://arxiv.org/abs/2401.06462v1
Compressor summary: AttributionScanner is a Visual Analytics system that helps evaluate machine learning models on images by finding interpretable subgroups with explainable features and enabling users to fix model issues using neural network regularization.
Jack D. Saunders,Alex A. Freitas
http://arxiv.org/abs/2401.06452v1
Compressor summary: The text introduces two new Automated Machine Learning systems for Positive-Unlabelled learning and evaluates them along with a previous system and other methods on various datasets.
Minjun Kim,Seungwoo Song,Youhan Lee,Haneol Jang,Kyungtae Lim
http://arxiv.org/abs/2401.06443v1
Compressor summary: The paper introduces a new bilingual dataset for visual question answering and proposes a framework to inject external knowledge into the system using graph embeddings.
Minxing Luo,Wentao Cheng,Jian Yang
http://arxiv.org/abs/2401.06442v1
Compressor summary: This paper introduces RotationDrag, a novel point-based image editing method that improves in-plane rotation accuracy by using feature maps of rotated images, and presents RotateBench, the first benchmark to evaluate this task on real and generated images.
Seitaro Ono,Yuka Ogino,Takahiro Toizumi,Atsushi Ito,Masato Tsukada
http://arxiv.org/abs/2401.06438v1
Compressor summary: The study presents a method to improve image recognition in low-light conditions by applying an adaptive image processing module and forecasting optimal parameters for it.
Thi Linh Hoang,Tuan Dung Pham,Viet Cuong Ta
http://arxiv.org/abs/2401.06436v1
Compressor summary: The paper presents a transformer-based GCN model with an improved encoder for node embedding and attention mechanism, which outperforms GCN in predicting social network ratings.
Changrong Xiao,Wenxing Ma,Sean Xin Xu,Kunpeng Zhang,Yufang Wang,Qi Fu
http://arxiv.org/abs/2401.06431v1
Compressor summary: The study shows that large language models like GPT-4 and fine-tuned GPT-3.5 can significantly improve automated essay scoring for second-language learners by providing accurate, consistent, generalizable, and interpretable feedback, as well as assisting human graders to perform better.
Huiyuan Fu,Kuilong Cui,Chuanming Wang,Mengshi Qi,Huadong Ma
http://arxiv.org/abs/2401.06430v1
Compressor summary: The paper proposes MDPR, a novel approach for person re-identification that extracts features from multiple perspectives using mutual distillation and fusion, achieving state-of-the-art results on widely used datasets.
Ji Liu,Dehua Tang,Yuanxian Huang,Li Zhang,Xiaocheng Zeng,Dong Li,Mingjie Lu,Jinzhang Peng,Yu Wang,Fan Jiang,Lu Tian,Ashish Sirasao
http://arxiv.org/abs/2401.06426v1
Compressor summary: The paper proposes a depth pruning method that combines a block pruning strategy with progressive training of the subnet, targeting efficient CNN models with depth-wise convolutions and inverted residual blocks, where traditional channel-wise pruning struggles; it also applies to vision transformers and outperforms existing depth pruning methods in efficiency and performance.
Geethen Singh,Glenn Moncrieff,Zander Venter,Kerry Cawse-Nicholson,Jasper Slingsby,Tamara B Robinson
http://arxiv.org/abs/2401.06421v1
Compressor summary: Conformal prediction is a model-agnostic framework for uncertainty quantification that can improve the reliability of AI systems for Earth Observation applications without requiring access to the underlying model or training dataset.
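The model-agnostic recipe is simple enough to sketch: calibrate a nonconformity-score threshold on held-out data, then output prediction *sets* whose marginal coverage is roughly 1 − α regardless of the underlying model. The probability table below is a hypothetical stand-in for any black-box classifier.

```python
import numpy as np

def predict_proba(x):
    # Hypothetical black-box classifier over 3 classes.
    return np.array([0.7, 0.2, 0.1]) if x < 0 else np.array([0.1, 0.3, 0.6])

# Held-out calibration set of (input, true label) pairs.
cal = [(-1, 0), (-2, 0), (1, 2), (2, 2), (-3, 0), (3, 1), (1, 2), (-1, 0)]
alpha = 0.2

# Nonconformity score: 1 - probability assigned to the true label.
scores = np.array([1 - predict_proba(x)[y] for x, y in cal])
k = int(np.ceil((len(cal) + 1) * (1 - alpha)))   # conformal quantile rank
q = np.sort(scores)[min(k, len(cal)) - 1]        # calibrated threshold

def prediction_set(x):
    # Include every class whose score falls under the threshold.
    p = predict_proba(x)
    return [c for c in range(3) if 1 - p[c] <= q]

print(prediction_set(-1), prediction_set(3))
```

Note that only predictions and labels are needed for calibration, which is why the method applies to Earth Observation models without access to weights or training data.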
Julie Kallini,Isabel Papadimitriou,Richard Futrell,Kyle Mahowald,Christopher Potts
http://arxiv.org/abs/2401.06416v1
Compressor summary: The study tests GPT-2's ability to learn synthetic impossible languages and finds that it struggles, challenging the claim that LLMs can learn any possible language.
Junuk Cha,Hansol Lee,Jaewon Kim,Nhat Nguyen Bao Truong,Jae Shin Yoon,Seungryul Baek
http://arxiv.org/abs/2401.06415v1
Compressor summary: The paper presents a novel pipeline to reconstruct 3D geometry of interacting people in clothing from a single image using priors for complete geometry and surface contacts, overcoming occlusion challenges.
Li Lucy,Suchin Gururangan,Luca Soldaini,Emma Strubell,David Bamman,Lauren Klein,Jesse Dodge
http://arxiv.org/abs/2401.06408v1
Compressor summary: The authors examine how different quality and language identification filters affect web text pretraining data, revealing implicit biases in data curation based on social and geographic contexts.
Lingchao Mao,Hairong Wang,Leland S. Hu,Nhan L Tran,Peter D Canoll,Kristin R Swanson,Jing Li
http://arxiv.org/abs/2401.06406v1
Compressor summary: The paper reviews how knowledge-informed machine learning, which integrates biomedical knowledge into data-driven models, can overcome challenges in cancer diagnosis and prognosis from multi-omics profiles and medical imaging, such as limited labeled samples, high-dimensional and heterogeneous data, and poor interpretability, and it discusses future directions for the field.
Taehee Kim,Yeongjae Cho,Heejun Shin,Yohan Jo,Dongmyung Shin
http://arxiv.org/abs/2401.06400v1
Compressor summary: CoQAH is a new method that uses a sequence of QA interactions between a language model and a VQA model to answer human-written questions for images, achieving state-of-the-art accuracy without finetuning.
Sudhansu Bala Das,Leo Raphael Rodrigues,Tapas Kumar Mishra,Bidyut Kr. Patra
http://arxiv.org/abs/2401.06398v1
Compressor summary: The paper proposes an algorithm to remove mistranslations from a parallel dataset for Indian languages and evaluates its impact on neural machine translation quality.
Bowen Shi,Peisen Zhao,Zichen Wang,Yuhang Zhang,Yaoming Wang,Jin Li,Wenrui Dai,Junni Zou,Hongkai Xiong,Qi Tian,Xiaopeng Zhang
http://arxiv.org/abs/2401.06397v1
Compressor summary: UMG-CLIP is a new model that improves vision-language understanding by aligning local image regions with text tokens at different levels of detail, achieving state-of-the-art performance on various tasks.
Xinyu Wang,Bohan Zhuang,Qi Wu
http://arxiv.org/abs/2401.06395v1
Compressor summary: ModaVerse is a new multi-modal language model that can understand and transform images, videos, and audio using natural language without complex latent feature alignments, making it more efficient and cost-effective.
Wenyuan Zhang,Xinghua Zhang,Shiyao Cui,Kun Huang,Xuebin Wang,Tingwen Liu
http://arxiv.org/abs/2401.06394v1
Compressor summary: The paper proposes an Adaptive Data Augmentation framework to address data imbalance issues in aspect-based sentiment analysis by enhancing tail quad patterns and aspect categories.
Zhenlong Yuan,Jiakai Cao,Zhaoxin Li,Hao Jiang,Zhaoqi Wang
http://arxiv.org/abs/2401.06385v1
Compressor summary: SD-MVS is a new method that uses semantic segmentation, pixel deformation, and refinement techniques to reconstruct 3D models of textureless areas with high quality and efficiency.
Matthew L. Daggitt,Wen Kokke,Robert Atkey,Natalia Slusarz,Luca Arnaboldi,Ekaterina Komendantskaya
http://arxiv.org/abs/2401.06379v1
Compressor summary: Vehicle is a tool that helps verify neural-symbolic programs by linking problem-space properties to embedding-space properties, enabling formal verification of a simple self-driving car example.
Gordon Banks,Gates Bierhuizen,Katherine McCrum,Ellen Wengert
http://arxiv.org/abs/2401.06375v1
Compressor summary: ProcessGPT is an AI model that helps design better business processes considering human cognitive limitations, benefiting both people with and without disabilities.
Haoxuan Ding,Junyu Gao,Yuan Yuan,Qi Wang
http://arxiv.org/abs/2401.06374v1
Compressor summary: The paper introduces SamLP, a license plate detector based on a vision foundation model (SAM) that leverages few-shot and zero-shot learning for diverse LP styles and appearances.
Xiaoyu Liu,Yueyi Zhang,Zhiwei Xiong,Wei Huang,Bo Hu,Xiaoyan Sun,Feng Wu
http://arxiv.org/abs/2401.06370v1
Compressor summary: The text proposes a graph relation distillation method for efficient biomedical instance segmentation that transfers knowledge from heavy to lightweight networks using instance and pixel relation graphs, achieving high performance with significantly reduced parameters and inference time.
Md Arafat Sultan,Aashka Trivedi,Parul Awasthy,Avirup Sil
http://arxiv.org/abs/2401.06356v1
Compressor summary: This study examines how different configuration parameters in knowledge distillation affect student performance, identifying an optimal configuration for various natural language processing tasks and student sizes.
Chang Yu,Junran Peng,Xiangyu Zhu,Zhaoxiang Zhang,Qi Tian,Zhen Lei
http://arxiv.org/abs/2401.06345v1
Compressor summary: The paper proposes a method to improve diffusion models' image generation by learning proper textual descriptions using quality and semantic guidance from pre-trained models.
Weizheng Wang,Le Mao,Baijian Yang,Guohua Chen,Byung-Cheol Min
http://arxiv.org/abs/2401.06344v1
Compressor summary: Hyper-STTN is a hypergraph-based method for predicting crowd trajectories that captures both pair-wise and group-wise interactions using spectral convolution and multimodal transformers.
Shengyi Qian,Weifeng Chen,Min Bai,Xiong Zhou,Zhuowen Tu,Li Erran Li
http://arxiv.org/abs/2401.06341v1
Compressor summary: The paper proposes a model that leverages large-scale vision language models to improve affordance grounding tasks, achieving better performance on in-the-wild object affordance grounding and handling unseen objects and actions.
Banafshe Felfeliyan,Yuyue Zhou,Shrimanti Ghosh,Jessica Kupper,Shaobo Liu,Abhilash Hareendranathan,Jacob L. Jaremko
http://arxiv.org/abs/2401.06331v1
Compressor summary: The study explores using Vision Language Processing models to predict osteoarthritis severity from X-ray images and reports, potentially improving diagnosis and paving the way for specialized AI in medicine.
Jiaxin Wang,Lingling Zhang,Jun Liu,Tianlin Guo,Wenjun Wu
http://arxiv.org/abs/2401.06327v1
Compressor summary: The paper introduces a new task called GRD that involves finding novel relations or clustering instances in existing relations using semi-factual examples and proposes a framework (SFGRD) that outperforms current models.
Wonjune Kang,Yun Wang,Shun Zhang,Arthur Hinsvark,Qing He
http://arxiv.org/abs/2401.06321v1
Compressor summary: The paper proposes a multi-task learning model for text-to-speech tasks that uses shared representations and pre-trained language embeddings to improve text normalization, part-of-speech tagging, and homograph disambiguation performance.
Yaowei Hu,Jacob Lear,Lu Zhang
http://arxiv.org/abs/2401.06318v1
Compressor summary: The paper proposes an algorithmic framework to integrate fairness and reinforcement learning in dynamic systems with sequential decisions.
Xingyu Zhou,Leheng Zhang,Xiaorui Zhao,Keze Wang,Leida Li,Shuhang Gu
http://arxiv.org/abs/2401.06312v1
Compressor summary: The paper proposes MIA-VSR, a feature-level masked processing framework for video super-resolution that reduces redundant computations and improves memory and computation efficiency.
Akshita Jha,Vinodkumar Prabhakaran,Remi Denton,Sarah Laszlo,Shachi Dave,Rida Qadri,Chandan K. Reddy,Sunipa Dev
http://arxiv.org/abs/2401.06310v1
Compressor summary: This study evaluates visual stereotypes in Text-to-Image models for 135 nationality-based identity groups and finds that they are often present, offensive, and similar across different attributes.
Shangqing Xu,Chao Zhang
http://arxiv.org/abs/2401.06301v1
Compressor summary: In-Context Reflection (ICR) is a method that selects and refines demonstrations for large language models to improve their learning efficiency and generalization across tasks.