This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-02-27, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2402.16508v1
Compressor summary: The paper presents a single encoder-decoder model for cross-lingual question answering (CLQA) that uses self-supervision from Wikipedia's link structure to perform retrieval and answer generation without auxiliary resources like machine translation.
http://arxiv.org/abs/2402.16506v1
Compressor summary: The paper proposes SCDM, a robust conditional diffusion model for semantic image synthesis with noisy labels, which enhances robustness by stochastically perturbing the semantic label maps through Label Diffusion and using a class-wise noise schedule.
http://arxiv.org/abs/2402.16505v1
Compressor summary: The Tulving Test uses memory tasks to evaluate a model of human recall, and the paper explores whether this model also applies to LLMs' memorization abilities.
http://arxiv.org/abs/2402.16499v1
Compressor summary: LLMArena is a framework for testing large language models' abilities in multi-agent dynamics using seven gaming environments and Trueskill scoring.
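As an aside, the TrueSkill scoring mentioned above is straightforward to reproduce. A minimal sketch with the `trueskill` PyPI package, assuming hypothetical agent names and a single match outcome (not from the paper):

```python
# Minimal sketch of TrueSkill-style ranking for LLM agents after pairwise
# games. The agent names and the match result below are hypothetical.
import trueskill

ratings = {name: trueskill.Rating() for name in ["llm_a", "llm_b"]}

def record_match(winner: str, loser: str) -> None:
    # Update both agents' skill estimates (mu, sigma) after one game.
    ratings[winner], ratings[loser] = trueskill.rate_1vs1(
        ratings[winner], ratings[loser]
    )

record_match("llm_a", "llm_b")
print({k: (round(v.mu, 2), round(v.sigma, 2)) for k, v in ratings.items()})
```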
http://arxiv.org/abs/2402.16486v1
Compressor summary: The text describes a novel AI method that can accurately identify both known and unknown military and civilian aircraft from low-resolution images, overcoming the limitations of traditional methods.
http://arxiv.org/abs/2402.16482v1
Compressor summary: The Lang2Sim framework uses functionalized language models to transform textual descriptions of material simulations into executable code, enabling interactive navigation and efficient programming.
http://arxiv.org/abs/2402.16479v1
Compressor summary: This paper proposes a binary edge feature branch for deep convolutional neural networks to improve their robustness against adversarial examples by incorporating shape-like features with texture features.
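To make the idea concrete, here is a minimal sketch of a shape-feature branch, assuming a Sobel-based binarization (the kernels and threshold are illustrative, not the paper's exact branch design):

```python
# Sketch: compute a binary edge map and concatenate it with the input so a
# CNN sees shape-like features alongside texture. Assumed design.
import torch
import torch.nn.functional as F

def binary_edge_map(gray: torch.Tensor, thresh: float = 0.3) -> torch.Tensor:
    # gray: (B, 1, H, W) in [0, 1]; returns a 0/1 edge map of the same shape.
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    mag = torch.sqrt(gx ** 2 + gy ** 2)
    return (mag > thresh).float()

x = torch.rand(2, 1, 32, 32)
edges = binary_edge_map(x)            # shape-like auxiliary features
both = torch.cat([x, edges], dim=1)   # joint input for a two-branch CNN
```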
http://arxiv.org/abs/2402.16473v1
Compressor summary: DCVSMNet is a fast stereo matching network that uses two small cost volumes to produce accurate results with some trade-offs in inference time.
http://arxiv.org/abs/2402.16472v1
Compressor summary: mEdIT is a multi-lingual text editing model that takes user instructions and uses pre-trained language models to perform tasks like error correction, simplification, and paraphrasing across diverse languages.
http://arxiv.org/abs/2402.16470v1
Compressor summary: The paper proposes HackAttend, a perturbation technique that attacks PLMs by altering attention scores in self-attention mechanisms, and introduces S-Attend, a smoothing technique that makes self-attention robust to such attacks.
http://arxiv.org/abs/2402.16463v1
Compressor summary: DOL-RM is an online task scheduling algorithm that optimizes performance by estimating rewards, costs, and arrival distributions under uncertainty.
http://arxiv.org/abs/2402.16459v1
Compressor summary: The paper proposes a backtranslation method to reveal the hidden intent of original prompts and defend language models against jailbreaking attacks.
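A minimal sketch of what such a defense loop could look like, with `llm` as a placeholder callable (prompt in, text out); the prompt wording and the refusal check are illustrative assumptions, not the paper's templates:

```python
# Sketch of a backtranslation-style defense: infer the intent behind a prompt
# from the model's own response, then re-check that inferred intent. All
# prompt strings and the refusal heuristic here are assumptions.
def backtranslation_defense(llm, user_prompt: str) -> str:
    response = llm(user_prompt)
    inferred = llm(
        "What request would most likely have produced the following text?\n"
        + response
    )
    recheck = llm(inferred)  # re-ask with the uncovered, plain-language intent
    if recheck.strip().lower().startswith(("i can't", "i cannot", "sorry")):
        return "Request declined: the inferred intent appears unsafe."
    return response
```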
http://arxiv.org/abs/2402.16458v1
Compressor summary: ID-XCB is a new technique that reduces biases in cyberbullying detection by ignoring swear words without harming performance, and it works well on different datasets.
http://arxiv.org/abs/2402.16457v1
Compressor summary: This paper introduces RetrievalQA, a benchmark to evaluate adaptive retrieval-augmented generation methods, and proposes Time-Aware Adaptive Retrieval (TA-ARE), which improves the efficiency and relevance of sourced information.
http://arxiv.org/abs/2402.16444v1
Compressor summary: ShieldLM is a safety detector for Large Language Models that aligns with human safety standards, supports customization, and provides explanations.
http://arxiv.org/abs/2402.16442v1
Compressor summary: The paper proposes a distributed bounding algorithm for subset selection problems with provable approximation guarantees and shows its effectiveness on large-scale datasets.
http://arxiv.org/abs/2402.16438v1
Compressor summary: The paper proposes a method to identify language-specific regions in Transformer architectures of large language models (LLMs) and demonstrates how to control their output language by activating or deactivating these regions.
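For intuition, deactivating a region can be as simple as zeroing selected hidden units with a forward hook. A toy sketch (the layer and neuron indices are placeholders; how the paper identifies the regions is not shown):

```python
# Sketch: silence a chosen set of neurons in one layer via a forward hook and
# observe how the model's output (e.g., its output language) shifts.
import torch

def silence_neurons(module: torch.nn.Module, idx):
    def hook(mod, inp, out):
        out[..., idx] = 0.0  # deactivate the selected hidden units
        return out
    return module.register_forward_hook(hook)

mlp = torch.nn.Linear(16, 64)                  # stand-in for an LLM MLP layer
handle = silence_neurons(mlp, [3, 7, 42])
print(mlp(torch.randn(2, 16))[:, [3, 7, 42]])  # zeros at the silenced units
handle.remove()                                # restore normal behavior
```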
http://arxiv.org/abs/2402.16435v1
Compressor summary: The authors propose a discriminator-free method for training one-dimensional and multivariate generative implicit models that learns complex data distributions and avoids mode-dropping issues.
http://arxiv.org/abs/2402.16431v1
Compressor summary: The paper proposes using code-style instructions to improve LLMs' robustness to adversarial samples and introduces a novel method for few-shot learning with both clean and adversarial contexts.
http://arxiv.org/abs/2402.16424v1
Compressor summary: COMAE is a method for improving zero-shot hashing by exploring locality relationships and utilizing continuous-value attributes, achieving better performance on large-scale retrieval scenarios.
http://arxiv.org/abs/2402.16421v1
Compressor summary: The paper presents a technique to create diverse instance segmentation datasets by inpainting desired object classes into masked regions with diffusion-based inpainting guided by object outlines, keeping mask annotations and shape features intact; the method can also be combined with text guidance and other image augmentation techniques.
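The general recipe is close to off-the-shelf diffusion inpainting; a rough sketch with Hugging Face `diffusers` (file names and prompt are placeholders, and the paper's outline-guidance component is not reproduced here):

```python
# Sketch: keep the instance mask fixed and synthesize a new object of the
# desired class inside it, so the existing mask annotation stays valid.
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
)
image = Image.open("scene.jpg").convert("RGB")       # placeholder inputs
mask = Image.open("instance_mask.png").convert("L")  # white = region to fill

augmented = pipe(prompt="a dog", image=image, mask_image=mask).images[0]
augmented.save("augmented.jpg")  # the mask annotation now labels the new dog
```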
http://arxiv.org/abs/2402.16420v1
Compressor summary: The paper proposes a method to generate training data using PaLM 2 and train smaller models to predict Sustainable Development Goals (SDGs) for university courses, achieving an F1-score of 0.786.
http://arxiv.org/abs/2402.16412v1
Compressor summary: The paper introduces TOTEM, a method to represent time series data with discrete vectors, enabling generalist training across tasks and domains without tuning.
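The core tokenization step is essentially vector quantization; a minimal sketch with a random codebook (TOTEM learns its codebook, e.g. VQ-VAE style, which is not shown; the window and codebook sizes are assumptions):

```python
# Sketch: map non-overlapping windows of a series to nearest-codebook ids,
# turning a continuous signal into discrete tokens for downstream models.
import torch

def tokenize(series: torch.Tensor, codebook: torch.Tensor, win: int = 8):
    # series: (T,) -> one token id per non-overlapping window of length win.
    n = series.shape[0] // win
    patches = series[: n * win].reshape(n, win)  # (n, win)
    dists = torch.cdist(patches, codebook)       # (n, K) pairwise distances
    return dists.argmin(dim=1)                   # nearest code per window

codebook = torch.randn(256, 8)  # K=256 codes of length 8 (assumed sizes)
tokens = tokenize(torch.randn(1000), codebook)
print(tokens[:10])
```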
http://arxiv.org/abs/2402.16407v1
Compressor summary: The paper proposes a simple depth-aware consistency method for NeRF that uses layered representations and constrained rendering to improve novel view synthesis when few input views are available.
http://arxiv.org/abs/2402.16406v1
Compressor summary: The text evaluates the use of large language models (LLMs) to generate clinical trial protocols and shows that combining them with retrieval-augmented generation can improve their quality significantly.
http://arxiv.org/abs/2402.16402v1
Compressor summary: The paper introduces Distributional Edge Layouts (DELs), a pre-processing method for Graph Neural Networks (GNNs) that samples edge layouts using Langevin dynamics and Boltzmann distribution, improving GNN performance on various tasks.
http://arxiv.org/abs/2402.16399v1
Compressor summary: The paper investigates how eye movement biometrics using machine learning are affected by input data variations in terms of temporal persistence, reliability, and efficacy.
http://arxiv.org/abs/2402.16392v1
Compressor summary: The paper introduces Placing Objects in Context (POC), a pipeline that can realistically add any object into any image, and shows how it can improve anomaly segmentation and learning new classes for semantic segmentation models.
http://arxiv.org/abs/2402.16389v1
Compressor summary: The paper introduces MoZIP, a new benchmark for evaluating large language models in the intellectual property domain, and presents MoZi, a multilingual IP-oriented model that outperforms other LLMs on the benchmark.
http://arxiv.org/abs/2402.16387v1
Compressor summary: This paper explores the generalization ability of different Temporal Graph Learning (TGL) algorithms and proposes Simplified-Temporal-Graph-Network with improved performance and lower complexity.
http://arxiv.org/abs/2402.16383v1
Compressor summary: The paper proposes a novel end-to-end deep learning framework for multi-view clustering that learns fused data representations and cluster assignments simultaneously.
http://arxiv.org/abs/2402.16382v1
Compressor summary: The paper proposes "Immunization conditions" as a framework for defending against harmful fine-tuning attacks on large language models (LLMs) by bad actors.
http://arxiv.org/abs/2402.16379v1
Compressor summary: The TER system helps improve machine translations by using human feedback to correct errors in large language models across different languages.
http://arxiv.org/abs/2402.16374v1
Compressor summary: The text discusses graph learning methods that handle distribution shifts in real-world data, categorizes them into different scenarios, and provides a survey of existing approaches and future directions.
http://arxiv.org/abs/2402.16370v1
Compressor summary: Step-by-step training improves object detection by initializing the backbone with a classic detector, then training the decoder from scratch, achieving real-time performance and accuracy without extra data.
http://arxiv.org/abs/2402.16369v1
Compressor summary: This paper surveys generative AI diffusion models, their techniques, applications, and challenges across various domains, highlighting their potential for creative tasks and data augmentation.
http://arxiv.org/abs/2402.16367v1
Compressor summary: The study analyzes how large language models process multiple languages using a Mixture of Experts architecture, discovering both non-language-specific and language-specific neurons that can be used to improve performance and guide model training.
http://arxiv.org/abs/2402.16366v1
Compressor summary: The paper introduces SPC-NeRF, a new compression technique for Neural Radiance Fields using spatial predictive coding, which achieves better efficiency and quality than existing methods.
http://arxiv.org/abs/2402.16364v1
Compressor summary: The paper introduces the Rendezvous (RVS) task and dataset for studying geospatial instructions with map knowledge that involve more complex spatial relations than previous navigation benchmarks.
http://arxiv.org/abs/2402.16363v1
Compressor summary: This paper surveys the current state of research on efficient Large Language Model inference, introducing a roofline model-based framework to analyze and compare various techniques, and provides valuable insights for practical implementation.
http://arxiv.org/abs/2402.16361v1
Compressor summary: LR-Drop is a new technique for Transformer-based language models that uses consistency training to regularize dropout at the output layer, leading to improved performance on various natural language understanding and generation tasks.
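The underlying recipe resembles R-Drop-style consistency training; a minimal sketch (LR-Drop's exact loss may differ, so treat this as the generic version: two stochastic forward passes, a task loss on each, and a symmetric KL term pulling the two output distributions together):

```python
# Sketch of consistency-regularized dropout: because dropout is stochastic,
# two forward passes of the same batch use different sub-networks, and the
# KL term penalizes disagreement between them.
import torch
import torch.nn.functional as F

def consistency_loss(model, x, y, alpha: float = 1.0):
    logits1 = model(x)  # model must be in train mode so dropout is active
    logits2 = model(x)
    ce = F.cross_entropy(logits1, y) + F.cross_entropy(logits2, y)
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    kl = (F.kl_div(p1, p2, log_target=True, reduction="batchmean")
          + F.kl_div(p2, p1, log_target=True, reduction="batchmean"))
    return ce + alpha * 0.5 * kl
```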
http://arxiv.org/abs/2402.16359v1
Compressor summary: The paper proposes a new RL method to efficiently explore and optimize complex data distributions using diffusion models.
http://arxiv.org/abs/2402.16358v1
Compressor summary: The paper introduces a data processing framework to improve foundation models' performance by refining their pretraining data quality using operators at different levels, probing, and evaluation tools.
http://arxiv.org/abs/2402.16356v1
Compressor summary: The study examines how text design on book covers affects our understanding of book genres using semantic information and visual design.
http://arxiv.org/abs/2402.16354v1
Compressor summary: The algorithm uses LLMs to segment trajectories and then merges the resulting segments to discover reusable skills for agents, improving their performance on new tasks.
http://arxiv.org/abs/2402.16352v1
Compressor summary: MathGenie is a new method to create diverse and reliable math problems from seed data, using augmentation, back-translation, and rationale-based verification, achieving state-of-the-art performance in mathematical reasoning.
http://arxiv.org/abs/2402.16350v1
Compressor summary: Impression-CLIP is a machine-learning model that uses CLIP to co-embed font images and their impressions for cross-modal retrieval, achieving better accuracy than existing methods and being robust to noise and missing tags.
http://arxiv.org/abs/2402.16349v1
Compressor summary: Controlled-GAIL (C-GAIL) is a new algorithm that uses control theory to improve the stability and efficiency of GAN-based imitation learning.
http://arxiv.org/abs/2402.16347v1
Compressor summary: The paper introduces CodeS, an open-source series of pre-trained language models designed for the text-to-SQL task, which outperforms existing closed-source models in accuracy and robustness while having smaller parameter sizes.
http://arxiv.org/abs/2402.16346v1
Compressor summary: The paper proposes a novel method to improve graph neural networks by injecting global topological invariance into pooling layers using persistent homology.
http://arxiv.org/abs/2402.16342v1
Compressor summary: The paper proposes a bi-level MDP framework for efficient and flexible rover mission contingency planning, addressing computational challenges in stochastic scenarios.
http://arxiv.org/abs/2402.16338v1
Compressor summary: BLO-SAM is a model that improves semantic segmentation by automatically identifying objects without manual prompts and reducing overfitting risk through bi-level optimization.
http://arxiv.org/abs/2402.16324v1
Compressor summary: The paper proposes a new framework for analyzing constrained Markov decision processes (CMDPs), deriving optimal regret bounds and improving sample complexity by operating in the primal space and resolving the primal LP online with adaptive resource capacities.
http://arxiv.org/abs/2402.16319v1
Compressor summary: The paper proposes a novel data-free method for compressing large language models' parameters by using rank-k approximation, achieving significant parameter reduction while preserving performance.
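The rank-k idea itself is compact: factor a weight matrix W (m x n) into two thin matrices so that k*(m+n) numbers replace m*n. A sketch via truncated SVD (the choice of k and of which layers to compress are assumptions; the paper's data-free selection rule is not shown):

```python
# Sketch: low-rank factorization of a weight matrix via truncated SVD.
import torch

def rank_k_factors(W: torch.Tensor, k: int):
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * S[:k]  # (m, k), singular values folded into U
    B = Vh[:k, :]         # (k, n)
    return A, B           # W is approximated by A @ B

W = torch.randn(768, 3072)
A, B = rank_k_factors(W, k=64)
err = torch.linalg.norm(W - A @ B) / torch.linalg.norm(W)
print(f"relative error at rank 64: {err:.3f}")
```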
http://arxiv.org/abs/2402.16318v1
Compressor summary: The paper proposes a method to improve multimodal learning with incomplete data by reducing modality dominance and decoupling modalities, achieving better performance on three benchmarks.
http://arxiv.org/abs/2402.16315v1
Compressor summary: Instruction-tuned Large Vision-Language Models struggle with fine-grained visual categorization and explanation due to a modality gap; the Finer benchmark aims to improve evaluation of these abilities.
http://arxiv.org/abs/2402.16313v1
Compressor summary: The Chain-of-Discussion framework uses multiple open-source LLMs to improve the correctness and comprehensiveness of answers for open-ended question answering tasks.
http://arxiv.org/abs/2402.16311v1
Compressor summary: The paper introduces a new SPS parsing method that uses large language models and self-training to adapt to different domains, improving performance over rule-based approaches.
http://arxiv.org/abs/2402.16310v1
Compressor summary: REPLAY is a new RNN architecture for location prediction that incorporates timestamp embeddings with adaptive bandwidths to capture time-varying temporal regularities in human mobility data.
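One way to picture "timestamp embeddings with adaptive bandwidths": smooth learnable hour-of-day embeddings with a Gaussian kernel whose bandwidth is itself learned, so each hour borrows from its temporal neighbors. A sketch under those assumptions (shapes and the circular-distance choice are mine, not necessarily REPLAY's):

```python
# Sketch: per-hour learnable bandwidths control how much each hour's
# embedding is mixed with neighboring hours (circular distance over 24h).
import torch
import torch.nn as nn

class SmoothedHourEmbedding(nn.Module):
    def __init__(self, dim: int = 32, n_hours: int = 24):
        super().__init__()
        self.emb = nn.Embedding(n_hours, dim)
        self.log_bw = nn.Parameter(torch.zeros(n_hours))  # learned bandwidths
        hours = torch.arange(n_hours).float()
        diff = (hours[:, None] - hours[None, :]).abs()
        self.register_buffer("dist", torch.minimum(diff, n_hours - diff))

    def forward(self, hour: torch.Tensor) -> torch.Tensor:
        bw = self.log_bw.exp()[hour]                              # (B,)
        w = torch.exp(-self.dist[hour] ** 2 / (2 * bw[:, None] ** 2))
        w = w / w.sum(dim=1, keepdim=True)                        # (B, 24)
        return w @ self.emb.weight                                # (B, dim)

emb = SmoothedHourEmbedding()
print(emb(torch.tensor([0, 13, 23])).shape)  # torch.Size([3, 32])
```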
http://arxiv.org/abs/2402.16305v1
Compressor summary: The paper proposes a training-free method to improve text-image alignment in Diffusion Probabilistic Models by using discriminative Vision-Language Models and Score Distillation Sampling, achieving near state-of-the-art performance on T2I-Compbench.
http://arxiv.org/abs/2402.16302v1
Compressor summary: Graph Diffusion Policy Optimization (GDPO) is a new method that uses reinforcement learning to optimize graph diffusion models for arbitrary objectives, achieving state-of-the-art performance in various graph generation tasks.
http://arxiv.org/abs/2402.16300v1
Compressor summary: The paper proposes a new selective regression method called conformalized selective regression that uses model-specific biases to measure uncertainty and evaluate its performance.
http://arxiv.org/abs/2402.16298v1
Compressor summary: The paper proposes a new transformer-based multi-view network for breast cancer classification that leverages inter-view correlations using a novel dynamic attention block.
http://arxiv.org/abs/2402.16297v1
Compressor summary: The paper proposes a new Bayesian model for count time series that can adapt to changing dynamics over time, and shows that it performs better than existing models in predicting future values.
http://arxiv.org/abs/2402.16291v1
Compressor summary: The proposed multi-scale Attention Pyramid module (mAPm) enhances object detection by integrating dilated convolutions, global self-attention, and refined up-sampling into the Feature Pyramid Network (FPN), achieving significant improvements in scale-variant tasks like Rice Leaf Disease detection.
http://arxiv.org/abs/2402.16288v1
Compressor summary: PerLTQA is a dataset for question answering that incorporates personalized memories, focusing on social interactions and events, to enhance dialogues using a novel framework for memory classification, retrieval, and synthesis.
http://arxiv.org/abs/2402.16280v1
Compressor summary: The paper proposes a meta-learning based framework for nucleus instance segmentation using few-shot learning and structural guidance, which achieves high performance with minimal annotations.
http://arxiv.org/abs/2402.16278v1
Compressor summary: The paper proposes a self-matching training method to improve concept subsumption prediction in ontologies using InME and CoME embeddings, which capture global and local information in annotation axioms.
http://arxiv.org/abs/2402.16269v1
Compressor summary: The paper proposes DOCP, an AI tool that uses LLMs to help users create and solve optimization models for business problems described in natural language.
http://arxiv.org/abs/2402.16268v1
Compressor summary: The authors propose Foundation Model Transparency Reports based on 6 design principles and 100 indicators to ensure responsible development and deployment of AI models.
http://arxiv.org/abs/2402.16267v1
Compressor summary: The paper proposes using natural language to express the objective of infrared-visible image fusion, improving performance by encoding texts into a multi-modal embedding space and constructing a language-driven fusion model with a supervised loss function.
http://arxiv.org/abs/2402.16261v1
Compressor summary: The text proposes a multi-task framework with a dual-encoder architecture to act as a universal retriever for persona, knowledge, and response selection in conversational retrieval systems, improving efficiency and performance.
http://arxiv.org/abs/2402.16249v1
Compressor summary: SeqTrack3D is a novel Sequence-to-Sequence tracker that combines point clouds and bounding boxes to improve 3D single object tracking performance, especially in scenes with sparse points.
http://arxiv.org/abs/2402.16248v1
Compressor summary: The paper proposes a novel copy mechanism model for generating diverse and coherent paragraphs from given topics, and introduces an improved prefix tuning method and a new Chinese dataset for this task.
http://arxiv.org/abs/2402.16247v1
Compressor summary: The text introduces a new AI challenge called CLAP, where a 'joiner' agent learns communication strategies by imitating or translating existing interactions in a target community.
http://arxiv.org/abs/2402.16246v1
Compressor summary: The paper presents a system that uses drones and deep learning to collect and analyze urban traffic data in real-time on mobile devices.
http://arxiv.org/abs/2402.16242v1
Compressor summary: The text proposes a new method (HSONet) to improve change detection models by addressing imbalance and missingness issues in learning hard cases using equilibrium optimization and scene context.
http://arxiv.org/abs/2402.16237v1
Compressor summary: The authors' algorithm finds level sets in continuous search spaces without discretization, using a confidence-based acquisition function, with theoretical convergence guarantees and superior performance on various datasets.
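A confidence-based level-set acquisition is easy to sketch with a GP posterior: a point stays ambiguous while its confidence interval straddles the threshold, and the most ambiguous candidate is queried next. A rough sketch under standard choices (the GP model and beta are generic, not the paper's):

```python
# Sketch: query the candidate whose confidence interval overlaps the target
# level the most, i.e. the point we are least sure is above or below it.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def next_query(gp, candidates, threshold: float, beta: float = 2.0):
    mu, sigma = gp.predict(candidates, return_std=True)
    ambiguity = beta * sigma - np.abs(mu - threshold)
    return candidates[np.argmax(ambiguity)]

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(10, 1))
y = np.sin(3 * X).ravel()
gp = GaussianProcessRegressor().fit(X, y)
print(next_query(gp, rng.uniform(-1, 1, size=(200, 1)), threshold=0.5))
```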
http://arxiv.org/abs/2402.16230v1
Compressor summary: GARNNs are interpretable graph attentive recurrent neural networks that predict future blood glucose levels in diabetics from sensor and self-reported data; they explain variable importance, generate feature maps, and outperform existing methods in accuracy and interpretability.