arxiv compressed, 2024-08-05

This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-08-05 generated by the compressor, my personal LLM-based project.


Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting

http://arxiv.org/abs/2408.01423v1

Compressor summary: The text proposes a new Prompt Recursive Search (PRS) framework that improves large language models' NLP performance by adjusting prompts based on problem complexity and structure.


Mission Impossible: A Statistical Perspective on Jailbreaking LLMs

http://arxiv.org/abs/2408.01420v1

Compressor summary: The paper takes a statistical perspective on how large language models can be jailbroken into harmful or undesired behaviors, such as leaking information or spreading fake news, and proposes E-RLHF, a new alignment strategy that increases the likelihood of safe responses.


DebateQA: Evaluating Question Answering on Debatable Knowledge

http://arxiv.org/abs/2408.01419v1

Compressor summary: DebateQA is a new dataset for evaluating LLM chatbots' ability to answer debatable questions with multiple perspectives, using two metrics: Perspective Diversity and Dispute Awareness.


Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs

http://arxiv.org/abs/2408.01417v1

Compressor summary: The paper introduces ICCA, a framework to evaluate if large language models adapt their communication efficiency during interactions like humans do, and finds that current models don't show this ability without prompting.


The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability

http://arxiv.org/abs/2408.01416v1

Compressor summary: The paper proposes a unified perspective on interpretability research using causal mediation analysis and suggests focusing on discovering new mediators with better trade-offs between human-interpretability and compute-efficiency.


Conditional LoRA Parameter Generation

http://arxiv.org/abs/2408.01415v1

Compressor summary: COND P-DIFF is a method that generates high-performance neural network parameters based on task conditions using an autoencoder and a conditional latent diffusion model.


Derivation of Back-propagation for Graph Convolutional Networks using Matrix Calculus and its Application to Explainable Artificial Intelligence

http://arxiv.org/abs/2408.01408v1

Compressor summary: The paper derives the backpropagation algorithm for graph convolutional networks using matrix calculus, shows its effectiveness in node classification and link prediction tasks, and discusses its potential for explainability and sensitivity analysis.


Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer

http://arxiv.org/abs/2408.01402v1

Compressor summary: LPDT is a new method for offline reinforcement learning that uses pre-trained language models and low-rank adaptation to improve performance and task distinction with prompts.
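
As an aside, the low-rank adaptation (LoRA) technique this summary mentions is easy to sketch. The snippet below is a generic illustration of standard LoRA with made-up toy dimensions, not code from the paper: the frozen weight `W` is augmented by a trainable low-rank correction `B @ A`.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=4):
    """Linear layer with a LoRA update: W x + (alpha/r) * B A x.

    W (d_out x d_in) stays frozen; only the low-rank factors
    A (r x d_in) and B (d_out x r) would be trained.
    """
    scaling = alpha / r
    return W @ x + scaling * (B @ (A @ x))

# Toy dimensions: d_in=8, d_out=6, rank r=4.
rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 4
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01  # small init, as in standard LoRA
B = np.zeros((d_out, r))                   # B starts at zero, so the adapter
x = rng.standard_normal(d_in)              # is a no-op before training
y = lora_forward(x, W, A, B, r=r)
assert np.allclose(y, W @ x)  # zero-initialized adapter leaves output unchanged
```

Because `B` is initialized to zero, fine-tuning starts exactly at the pre-trained model and only the rank-`r` factors add trainable parameters.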


Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features

http://arxiv.org/abs/2408.01394v1

Compressor summary: The paper proposes a method to improve multilingual neural machine translation by exploiting semantic and linguistic features from multiple languages using disentangling learning and linguistic encoder tasks.


NOLO: Navigate Only Look Once

http://arxiv.org/abs/2408.01384v1

Compressor summary: NOLO learns a video navigation policy from context videos without finetuning or re-training, using optical flow and offline reinforcement learning.


Explaining a probabilistic prediction on the simplex with Shapley compositions

http://arxiv.org/abs/2408.01382v1

Compressor summary: Shapley compositions use the Aitchison geometry to explain multiclass probabilistic predictions by quantifying the contribution of each feature's value, satisfying axiomatic properties like linearity and efficiency.


Coalitions of Large Language Models Increase the Robustness of AI Agents

http://arxiv.org/abs/2408.01380v1

Compressor summary: The text discusses using a group of specialized large language models to perform tasks more efficiently than one single model, reducing the need for fine-tuning and improving robustness.


Adaptive Recruitment Resource Allocation to Improve Cohort Representativeness in Participatory Biomedical Datasets

http://arxiv.org/abs/2408.01375v1

Compressor summary: The text discusses how participatory biomedical studies can improve representation in datasets by using a computational approach to allocate recruitment resources and shows its effectiveness in simulated studies.


Hybrid Coordinate Descent for Efficient Neural Network Learning Using Line Search and Gradient Descent

http://arxiv.org/abs/2408.01374v1

Compressor summary: The paper introduces a new algorithm that combines line search and gradient methods for faster and more efficient parameter updates in neural networks.
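
For context, the general idea of combining gradient descent with a line search can be sketched as follows. This is a minimal, generic backtracking (Armijo) line-search step, not the paper's specific hybrid algorithm:

```python
import numpy as np

def backtracking_gd_step(f, grad_f, x, lr0=1.0, beta=0.5, c=1e-4):
    """One gradient-descent step whose length comes from a line search.

    Starting from step size lr0, the step is shrunk by factor beta
    until the Armijo sufficient-decrease condition holds.
    """
    g = grad_f(x)
    lr = lr0
    while f(x - lr * g) > f(x) - c * lr * (g @ g):
        lr *= beta
    return x - lr * g

# Minimize the quadratic f(x) = ||x||^2 from a toy starting point.
f = lambda x: x @ x
grad_f = lambda x: 2 * x
x = np.array([3.0, -4.0])
for _ in range(20):
    x = backtracking_gd_step(f, grad_f, x)
assert f(x) < 1e-6
```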


Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification

http://arxiv.org/abs/2408.01372v1

Compressor summary: The MorpMamba model combines spatial-spectral tokens, morphology blocks, and self-attention to achieve efficient and effective hyperspectral image classification, surpassing CNN and Transformer models.


EVIT: Event-based Visual-Inertial Tracking in Semi-Dense Maps Using Windowed Nonlinear Optimization

http://arxiv.org/abs/2408.01370v1

Compressor summary: The paper presents an event-based tracker that localizes against semi-dense maps built from other sensors, and shows that adding inertial signals improves performance under challenging lighting and dynamics.


Transformers are Universal In-context Learners

http://arxiv.org/abs/2408.01367v1

Compressor summary: This paper shows that deep transformers can handle an unlimited number of context tokens and approximate continuous in-context mappings with high precision using a fixed embedding dimension, number of heads, and MLP layers.


Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs

http://arxiv.org/abs/2408.01355v1

Compressor summary: The paper introduces Hallu-PI, a benchmark to evaluate hallucination in large language models on perturbed images with various types of hallucinations and tasks.


HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction

http://arxiv.org/abs/2408.01332v1

Compressor summary: HMDN is a click-through rate prediction model that improves recommendation performance by efficiently capturing hierarchical relationships among diverse data distributions.


UnifiedNN: Efficient Neural Network Training on the Cloud

http://arxiv.org/abs/2408.01331v1

Compressor summary: UnifiedNN is a framework for efficiently training multiple neural network models simultaneously on the cloud, reducing memory and training time costs without sacrificing accuracy.


FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only

http://arxiv.org/abs/2408.01323v1

Compressor summary: FANNO is a framework that uses a large language model to create diverse and high-quality instruction datasets without the need for manual annotations or costly API calls.


A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks

http://arxiv.org/abs/2408.01319v1

Compressor summary: The paper explores the applications, advantages, limitations, and future directions of multimodal large language models (MLLMs) that integrate diverse data types in AI systems.


TopoNAS: Boosting Search Efficiency of Gradient-based NAS via Topological Simplification

http://arxiv.org/abs/2408.01311v1

Compressor summary: TopoNAS is a model-agnostic method that simplifies the search space for one-shot Neural Architecture Search, reducing time and memory usage while maintaining accuracy.


Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models

http://arxiv.org/abs/2408.01308v1

Compressor summary: DefinitionEMB uses definitions from Wiktionary to create isotropic and meaningful token embeddings for PLMs that maintain robustness during fine-tuning and improve performance on various tasks.


Decentralized Smoothing ADMM for Quantile Regression with Non-Convex Sparse Penalties

http://arxiv.org/abs/2408.01307v1

Compressor summary: This paper proposes a new method for analyzing IoT data called DSAD, which improves identification of significant predictors and ensures uniform model performance across distributed nodes.


Optimal Mixed Integer Linear Optimization Trained Multivariate Classification Trees

http://arxiv.org/abs/2408.01297v1

Compressor summary: The paper presents novel cut-based MILO formulations for designing optimal binary classification trees, which balance classification accuracy and tree complexity using minimal infeasible subsystems as cutting planes.


Feature Clock: High-Dimensional Effects in Two-Dimensional Plots

http://arxiv.org/abs/2408.01294v1

Compressor summary: Feature Clock is a new tool that simplifies the explanation of high-dimensional data in two-dimensional plots, making it easier to understand complex relationships.


Underwater Object Detection Enhancement via Channel Stabilization

http://arxiv.org/abs/2408.01293v1

Compressor summary: The paper presents a novel method for underwater object detection using Detectron2, image enhancement, and channel stabilization techniques to improve accuracy in detecting marine trash.


TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

http://arxiv.org/abs/2408.01291v1

Compressor summary: TexGen is a novel framework for generating high-quality 3D textures from textual descriptions using multi-view sampling, attention guidance, and noise resampling.


Contribution-based Low-Rank Adaptation with Pre-training Model for Real Image Restoration

http://arxiv.org/abs/2408.01099v1

Compressor summary: The paper introduces CoLoRA, an efficient parameter tuning method that adapts to different image restoration tasks via contribution-based low-rank adaptation, and PROD, a pre-training strategy that uses random-order degradations to improve the performance and robustness of pre-trained models.


An Encoding--Searching Separation Perspective on Bi-Encoder Neural Search

http://arxiv.org/abs/2408.01094v1

Compressor summary: The paper proposes a new perspective on the bi-encoder architecture for neural search by separating the encoding and searching operations to address its limitations and improve performance.


Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions

http://arxiv.org/abs/2408.01091v1

Compressor summary: The text introduces a new benchmark for testing large multimodal models' ability to handle self-contradictory instructions and proposes a method to improve their performance in recognizing conflicting commands.


General-purpose Dataflow Model with Neuromorphic Primitives

http://arxiv.org/abs/2408.01090v1

Compressor summary: Neuromorphic dataflow is a tailored dataflow model for neuromorphic hardware that allows general-purpose programs to run efficiently and flexibly on spiking neural networks.


Prototypical Partial Optimal Transport for Universal Domain Adaptation

http://arxiv.org/abs/2408.01089v1

Compressor summary: The paper proposes a novel approach called mini-batch Prototypical Partial Optimal Transport (m-PPOT) for universal domain adaptation, which partially aligns two distributions and distinguishes "known" and "unknown" samples using reweighted losses.


Bridging Information Gaps in Dialogues With Grounded Exchanges Using Knowledge Graphs

http://arxiv.org/abs/2408.01088v1

Compressor summary: The text studies how large language models can help dialogue systems bridge information gaps by understanding natural language expressions and connecting them to internal knowledge, using a new corpus called BridgeKG.


Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts

http://arxiv.org/abs/2408.01084v1

Compressor summary: Adaptive contrastive decoding improves open-domain question answering by handling noisy contexts better than previous methods.
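
The contrastive-decoding idea this paper builds on can be illustrated in a few lines. The sketch below is a generic version (contrasting next-token logits computed with and without the retrieved context, on a made-up toy vocabulary), not the paper's adaptive method:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def contrastive_decode(logits_with_ctx, logits_without_ctx, alpha=1.0):
    """Contrast next-token logits with and without retrieved context.

    Tokens whose score rises when the context is added are boosted;
    tokens the model would emit anyway are penalized. alpha controls
    the contrast strength (alpha=0 recovers plain decoding).
    """
    adjusted = (1 + alpha) * logits_with_ctx - alpha * logits_without_ctx
    return softmax(adjusted)

# Toy vocabulary of 4 tokens: the context shifts mass toward token 2.
with_ctx = np.array([1.0, 0.5, 3.0, 0.2])
without_ctx = np.array([1.0, 0.5, 1.0, 0.2])
p = contrastive_decode(with_ctx, without_ctx, alpha=1.0)
assert p.argmax() == 2  # the context-supported token wins
```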


FCDFusion: a Fast, Low Color Deviation Method for Fusing Visible and Infrared Image Pairs

http://arxiv.org/abs/2408.01080v1

Compressor summary: The paper introduces FCDFusion, a fast and accurate image fusion method that preserves color information without color space transformations and with low computational cost.


PhysMamba: Leveraging Dual-Stream Cross-Attention SSD for Remote Physiological Measurement

http://arxiv.org/abs/2408.01077v1

Compressor summary: PhysMamba is a dual-stream time-frequency interactive model that uses Cross-Attention State Space Duality to improve information exchange and feature complementarity for robust remote heart rate monitoring in noisy real-world environments.


Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning

http://arxiv.org/abs/2408.01076v1

Compressor summary: Our method uses text embeddings to capture semantic similarity and integrate semantic guidance within and across tasks for continual learning with DNNs, improving performance on general and fine-grained datasets.


A Survey on Self-play Methods in Reinforcement Learning

http://arxiv.org/abs/2408.01072v1

Compressor summary: This paper explains self-play in reinforcement learning, its preliminaries, classifications, applications, challenges, and future directions.


Leveraging Large Language Models for Mobile App Review Feature Extraction

http://arxiv.org/abs/2408.01063v1

Compressor summary: The study proposes a new method to improve feature extraction from mobile app reviews using encoder-only Transformer models with extended pre-training and instance selection techniques.


From Stem to Stern: Contestability Along AI Value Chains

http://arxiv.org/abs/2408.01051v1

Compressor summary: The workshop will create a community of researchers, synthesize a roadmap for contestable AI, and facilitate interdisciplinary dialogue on opportunities and challenges in AI value chains.


QUDSELECT: Selective Decoding for Questions Under Discussion Parsing

http://arxiv.org/abs/2408.01046v1

Compressor summary: QUDSELECT is a new approach to QUD parsing that jointly trains models to predict anchor sentences and generate questions while considering theoretical criteria using selective decoding and instruction-tuning.


UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents

http://arxiv.org/abs/2408.01038v1

Compressor summary: UNER is a query-aware entity extraction head that collaborates with multi-modal document transformers to improve named entity recognition (NER) in visually-rich documents, addressing complex layouts, reading orders, and discontinuous entities.


POA: Pre-training Once for Models of All Sizes

http://arxiv.org/abs/2408.01031v1

Compressor summary: The POA framework trains one large model that can adapt to different sizes and tasks by randomly sampling sub-networks for self-distillation, achieving state-of-the-art performance on various vision tasks.


Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments

http://arxiv.org/abs/2408.01024v1

Compressor summary: The SemGro framework uses pretrained language models to plan tasks by grounding semantic skills in different domains, using iterative skill decomposition and reasoning capabilities.


GNN-MolKAN: Harnessing the Power of KAN to Advance Molecular Representation Learning with GNNs

http://arxiv.org/abs/2408.01018v1

Compressor summary: The paper proposes GNN-MolKAN and GNN-MolKAN+, new GNNs that use KAN architecture to improve molecular representations for property prediction and drug design, with benefits in performance, efficiency, and few-shot learning.


IBB Traffic Graph Data: Benchmarking and Road Traffic Prediction Model

http://arxiv.org/abs/2408.01016v1

Compressor summary: The paper introduces a new traffic dataset and a model that improves traffic prediction by considering temporal and spatial relationships in the data.


EIUP: A Training-Free Approach to Erase Non-Compliant Concepts Conditioned on Implicit Unsafe Prompts

http://arxiv.org/abs/2408.01014v1

Compressor summary: The authors propose a method to purify text inputs for text-to-image models, which uses attention mechanisms and erasure prompts to suppress non-compliant image features and generate safer images.


Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs

http://arxiv.org/abs/2408.01008v1

Compressor summary: TT-LoRA is a new method for compressing large language models without losing performance, making them suitable for low-resource devices.


Enhancing Financial Market Predictions: Causality-Driven Feature Selection

http://arxiv.org/abs/2408.01005v1

Compressor summary: The FinSen dataset combines news articles and stock market data from 197 countries to improve financial forecasting accuracy and reliability using sentiment analysis and calibration techniques.


Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models

http://arxiv.org/abs/2408.01003v1

Compressor summary: Piculet is a training-free method that uses multiple specialized models to provide better input representations for multimodal language models, reducing hallucinations and improving performance.


Adaptive Two-Stage Cloud Resource Scaling via Hierarchical Multi-Indicator Forecasting and Bayesian Decision-Making

http://arxiv.org/abs/2408.01000v1

Compressor summary: HARMONY is a system that uses hierarchical attention and Bayesian decision-making to efficiently allocate resources in cloud computing, saving costs and improving performance.


FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation

http://arxiv.org/abs/2408.00998v1

Compressor summary: This paper presents a novel method to adapt a text-to-image diffusion model for image-to-image translation using reference images, enabling high-quality and versatile content creation with minimal effort.


A Safe Exploration Strategy for Model-free Task Adaptation in Safety-constrained Grid Environments

http://arxiv.org/abs/2408.00997v1

Compressor summary: The paper introduces a framework for model-free reinforcement learning agents that helps them explore grid environments safely by learning to identify and avoid potentially unsafe states using a binary classifier.


IncidentNet: Traffic Incident Detection, Localization and Severity Estimation with Sparse Sensing

http://arxiv.org/abs/2408.00996v1

Compressor summary: IncidentNet is a deep learning model that accurately detects, locates, and estimates the severity of traffic incidents using sparse sensor data from urban intersections.


Fairness in Large Language Models in Three Hours

http://arxiv.org/abs/2408.00992v1

Compressor summary: This tutorial covers recent advances in fairness considerations for large language models, including case studies, bias analysis, evaluation strategies, and resources.


On the Resilience of Multi-Agent Systems with Malicious Agents

http://arxiv.org/abs/2408.00989v1

Compressor summary: This paper studies how malicious agents affect multi-agent systems and proposes methods to improve their resilience, finding that a hierarchical structure and additional reviewers or challengers enhance system performance.


A SAT-based approach to rigorous verification of Bayesian networks

http://arxiv.org/abs/2408.00986v1

Compressor summary: The paper introduces a verification framework for Bayesian networks that uses Boolean logic literals to check their properties and improve safety in machine learning applications.


Reconstructing Richtmyer-Meshkov instabilities from noisy radiographs using low dimensional features and attention-based neural networks

http://arxiv.org/abs/2408.00985v1

Compressor summary: A trained transformer network with self-attention layers can accurately recover Richtmyer-Meshkov instabilities from noisy radiographic images.


Cross-domain Named Entity Recognition via Graph Matching

http://arxiv.org/abs/2408.00981v1

Compressor summary: The paper proposes a graph matching approach to improve cross-domain named entity recognition (NER) by fusing label graphs into BERT embeddings, enhancing the model's ability to adapt from general to specific domains.


Automatic Extraction of Relationships among Motivations, Emotions and Actions from Natural Language Texts

http://arxiv.org/abs/2408.00966v1

Compressor summary: The paper presents a graph-based method to analyze relationships among motivations, emotions, and actions in natural language texts, using a large food review dataset and nurture beliefs.


Integrating ESG and AI: A Comprehensive Responsible AI Assessment Framework

http://arxiv.org/abs/2408.00965v1

Compressor summary: The ESG-AI framework is a novel approach that helps investors assess the environmental and social impacts of AI applications and evaluate a company's commitment to responsible AI.


MIS-ME: A Multi-modal Framework for Soil Moisture Estimation

http://arxiv.org/abs/2408.00963v1

Compressor summary: The paper introduces MIS-ME, a software tool that predicts soil moisture using smartphone images and weather forecasts, outperforming traditional methods.


PERSOMA: PERsonalized SOft ProMpt Adapter Architecture for Personalized Language Prompting

http://arxiv.org/abs/2408.00960v1

Compressor summary: PERSOMA is a new method that uses soft prompt embeddings to capture and adapt to users' interaction history for personalized natural language systems.