This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-08-19, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2408.08872v1
Compressor summary: xGen-MM is a framework for developing large multimodal models with various applications, evaluation metrics, and safety features.
http://arxiv.org/abs/2408.08869v1
Compressor summary: PEDAL is a hybrid self-ensembling approach that uses diverse exemplar prompts and LLM aggregation to improve text generation accuracy while reducing inference cost.
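For readers curious what "diverse exemplar prompts plus LLM aggregation" might look like in practice, here is a minimal sketch under stated assumptions: the `llm` callable, prompt templates, and aggregation instruction are hypothetical stand-ins for illustration, not PEDAL's actual implementation.

```python
# A minimal sketch of prompt-based self-ensembling in the spirit of PEDAL.
# The `llm` callable is a hypothetical stand-in for any text-generation API;
# the prompt wording below is illustrative, not the paper's exact design.
from typing import Callable, List

def self_ensemble(llm: Callable[[str], str], question: str,
                  exemplar_sets: List[str]) -> str:
    # 1. Generate one candidate answer per diverse exemplar prompt.
    candidates = [llm(f"{exemplars}\n\nQuestion: {question}\nAnswer:")
                  for exemplars in exemplar_sets]
    # 2. Ask the LLM to aggregate the candidates into a single answer,
    #    instead of sampling many completions per prompt.
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return llm(f"Candidate answers:\n{numbered}\n\n"
               f"Combine them into one final answer to: {question}")
```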
http://arxiv.org/abs/2408.08855v1
Compressor summary: DPA is an unsupervised domain adaptation method for vision-language models that uses dual prototypes, robust self-training, and textual-visual alignment to improve performance on downstream tasks.
http://arxiv.org/abs/2408.08852v1
Compressor summary: GeoTransformer is a new model that uses geospatial attention to incorporate urban information and improve predictions of GDP and ride-share demand.
http://arxiv.org/abs/2408.08848v1
Compressor summary: The paper introduces PsychoLex, a set of resources to improve LLMs' performance in psychological tasks, and presents the PsychoLexLLaMA model that outperforms general models in this domain.
http://arxiv.org/abs/2408.08841v1
Compressor summary: The paper proposes FLEXTAF-Single and FLEXTAF-Vote, two methods that use flexible tabular formats to improve table reasoning performance using Large Language Models (LLMs).
http://arxiv.org/abs/2408.08837v1
Compressor summary: Shuffle coding is a method for compressing unordered sequences of objects using bits-back coding, which works well for various data structures like graphs and molecules.
http://arxiv.org/abs/2408.08827v1
Compressor summary: The paper proposes AINet, a novel network for robust RGBT tracking that efficiently interacts features of all modalities and layers using two fusion mambas with dynamic order adjustment.
http://arxiv.org/abs/2408.08824v1
Compressor summary: The LEVIS framework helps identify verifiable input spaces for neural networks and assess their robustness in safety-critical applications using novel techniques.
http://arxiv.org/abs/2408.08823v1
Compressor summary: Key points:
- The paper explores how group symmetries affect binary classification tasks using a novel framework based on Neyman-Pearson optimality
- Smaller or more appropriate symmetry groups can improve generalisation and sample efficiency, contrary to common intuition
- The paper develops a theoretical foundation for designing group equivariant neural networks that align with the data probability distributions
- The paper shows that optimal performance is associated with symmetry subgroups of the likelihood ratio, not the largest possible groups

Summary: The paper proposes a new framework for designing group equivariant neural networks that match the symmetry groups and subgroups of the data probability distributions for optimal binary classification.
http://arxiv.org/abs/2408.08822v1
Compressor summary: PFDiff is a training-free method that improves the efficiency of existing fast ODE solvers for image generation by skipping timesteps and using gradient replacement and foresight updates.
http://arxiv.org/abs/2408.08815v1
Compressor summary: The paper examines how well balancing strategies work for counterfactual estimation with time series data, and suggests they may not always be effective.
http://arxiv.org/abs/2408.08812v1
Compressor summary: The CAT framework balances reward and caution to improve safety in transfer RL.
http://arxiv.org/abs/2408.08808v1
Compressor summary: The paper introduces a novel data pipeline to create diverse, domain-specific benchmarks for evaluating large language models in various applications, improving their usefulness and alignment with human preferences.
http://arxiv.org/abs/2408.08805v1
Compressor summary: CIKMar is a small but effective dialogue system for education that uses the Gemma language model and dual-encoder ranking to provide relevant answers, although it tends to favor theoretical explanations.
http://arxiv.org/abs/2408.08803v1
Compressor summary: FR-KAN is a better alternative to MLPs for text classification: paired with transformer-based encoders, it improves accuracy and training speed while using fewer parameters.
http://arxiv.org/abs/2408.08799v1
Compressor summary: The paper introduces a new representation learning framework for geometric trees that captures their hierarchical structure and spatial constraints using a unique message passing neural network and self-supervised training targets.
http://arxiv.org/abs/2408.08793v1
Compressor summary: The paper proposes an Orthogonal Compatible Aligned (OCA) approach to update visual retrieval systems without re-indexing or backfilling, preserving compatibility with old models and achieving state-of-the-art accuracy.
http://arxiv.org/abs/2408.08788v1
Compressor summary: NO-GAT is a new GNN model that leverages structural information and overlaid neighbors to improve node representations and attention coefficients.
http://arxiv.org/abs/2408.08781v1
Compressor summary: The paper investigates how much the prompting of LLMs as judges influences their alignment with human judgments, comparing different levels of instruction and a prompt-free method based on perplexity.
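The prompt-free baseline scores text with perplexity, which has a standard definition: the exponential of the mean negative log-likelihood per token. A minimal sketch, assuming per-token log-probabilities are already available from the model; this is the textbook quantity, not the paper's exact scoring pipeline.

```python
# Perplexity from per-token log-probabilities: exp of the mean negative
# log-likelihood. Lower perplexity = text the model finds more natural.
# The log-prob values below are illustrative.
import math
from typing import Sequence

def perplexity(token_logprobs: Sequence[float]) -> float:
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

print(perplexity([-0.5, -1.2, -0.3, -0.9]))  # ~2.06
```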
http://arxiv.org/abs/2408.08780v1
Compressor summary: The proposed ensemble prompt framework improves in-context learning performance by describing the selection criteria of multiple examples, but the improvement mainly comes from the ensemble format rather than the descriptive content.
http://arxiv.org/abs/2408.08779v1
Compressor summary: Decomposed Automation Correction (DAC) is a new approach that improves text-to-SQL performance by decomposing the task into entity linking and skeleton parsing, and then correcting SQL based on the differences between the generated results and the initial query.
http://arxiv.org/abs/2408.08776v1
Compressor summary: NEAR is a zero-cost proxy for Neural Architecture Search that uses the effective rank of pre- and post-activation matrices to identify optimal networks without training.
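The effective-rank quantity NEAR builds on has a standard definition (Roy & Vetterli, 2007): the exponential of the Shannon entropy of the normalized singular-value spectrum. A minimal sketch on a random layer follows; how NEAR combines pre- and post-activation scores across candidate architectures is not reproduced here.

```python
# Effective rank of an activation matrix (Roy & Vetterli, 2007) -- the
# quantity NEAR-style zero-cost proxies build on.
import numpy as np

def effective_rank(activations: np.ndarray) -> float:
    """exp of the Shannon entropy of the normalized singular values."""
    s = np.linalg.svd(activations, compute_uv=False)
    p = s / s.sum()                      # normalize spectrum to a distribution
    p = p[p > 0]                         # guard against log(0)
    return float(np.exp(-(p * np.log(p)).sum()))

# Example: score a random layer on a mini-batch of inputs, without training.
rng = np.random.default_rng(0)
x = rng.standard_normal((128, 64))       # batch of pre-activations
print(effective_rank(x), effective_rank(np.maximum(x, 0)))  # pre vs. post ReLU
```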
http://arxiv.org/abs/2408.08774v1
Compressor summary: The study compares six speckle noise reduction techniques for Synthetic Aperture Radar images and recommends the Lee or Kuan filter depending on the application's needs.
http://arxiv.org/abs/2408.08770v1
Compressor summary: The pessimistic iterative planning (PIP) framework finds robust memory-based policies for robust partially observable Markov decision processes by alternating between selecting an adversarial POMDP and computing a finite-state controller for it; its rFSCNet algorithm trains recurrent neural networks with supervision policies.
http://arxiv.org/abs/2408.08769v1
Compressor summary: LOL (LOwer Layer Matters) is a novel contrastive decoding framework that mitigates hallucination in large language models by fusing lower and final layers and using contextual guidance to enhance factual encoding.
http://arxiv.org/abs/2408.08766v1
Compressor summary: The paper proposes a new method, VF-NeRF, that improves surface reconstruction of indoor scenes by using Vector Fields as an implicit representation, which can better model planar surfaces and sharp corners.
http://arxiv.org/abs/2408.08761v1
Compressor summary: SYMPOL is a novel method for learning interpretable, tree-based policies in reinforcement learning using a policy gradient approach.
http://arxiv.org/abs/2408.08754v1
Compressor summary: The paper introduces a new framework for signed graph neural networks that improves explainability and prediction accuracy using positional encoding, transformer architecture, and nearest neighbor-based decision process.
http://arxiv.org/abs/2408.08753v1
Compressor summary: The paper proposes PCP-MAE, a method for point cloud self-supervised learning that predicts centers of masked patches and uses them to reconstruct points, improving efficiency and performance over Point-MAE.
http://arxiv.org/abs/2408.08736v1
Compressor summary: The paper proposes a Task-Aware Dynamic Transformer (TADT) for efficient image super-resolution at arbitrary scales by adapting the feature extraction based on input images and upsampling scales.
http://arxiv.org/abs/2408.08732v1
Compressor summary: The paper proposes two algorithms to learn probabilities in probabilistic logic programs using symbolic equations, and shows they perform better than existing methods.
http://arxiv.org/abs/2408.08724v1
Compressor summary: ChatZero is a zero-shot dialogue generation model that uses cross-lingual code-switching and unsupervised contrastive learning to generate responses in low-resource languages.
http://arxiv.org/abs/2408.08713v1
Compressor summary: KarSein is a novel network that optimizes CTR prediction by adaptively modeling high-order feature interactions efficiently and accurately.
http://arxiv.org/abs/2408.08707v1
Compressor summary: LLMs improve beam prediction in mmWave communication by converting time series data into text and using Prompt-as-Prefix for contextual enrichment, outperforming traditional LSTM models.
http://arxiv.org/abs/2408.08706v1
Compressor summary: The paper proposes a tailored behavior policy to improve the efficiency and unbiasedness of evaluating multiple target policies in reinforcement learning.
http://arxiv.org/abs/2408.08704v1
Compressor summary: The text introduces RadVUQA, a benchmark to assess Large Vision-Language Models' abilities in radiological tasks, revealing weaknesses in multimodal comprehension and quantitative reasoning.
http://arxiv.org/abs/2408.08703v1
Compressor summary: The paper proposes a novel framework (TsCA) that uses conditional transport theory and homology to model visual-semantic interactions in CZSL, addressing calibration bias, generalization, and consistency issues in recognizing novel compositions.
http://arxiv.org/abs/2408.08700v1
Compressor summary: The paper introduces HyCoT, a transformer-based autoencoder for pixelwise hyperspectral image compression that outperforms existing models with less computational complexity and faster training.
http://arxiv.org/abs/2408.08698v1
Compressor summary: The NFDI4DS project seeks to improve data accessibility and interoperability in AI by connecting digital artifacts to FAIR principles using a new ontology and knowledge graph.
http://arxiv.org/abs/2408.08696v1
Compressor summary: Token Recycling is a training-free technique that accelerates inference in large language models by reusing candidate tokens, achieving significant speedup with minimal storage.
http://arxiv.org/abs/2408.08694v1
Compressor summary: The study develops a machine learning workflow using a BERT-based language model and sentiment analysis to measure the effectiveness of student extracurricular activities based on emotional responses.
http://arxiv.org/abs/2408.08693v1
Compressor summary: The paper introduces a new evaluation paradigm for large language models in medical scenarios, showing their limitations and potential for improvement.
http://arxiv.org/abs/2408.08690v1
Compressor summary: The paper proposes a decentralized algorithm for two-sided matching markets without prior preference knowledge or structural assumptions, and analyzes its performance.
http://arxiv.org/abs/2408.08688v1
Compressor summary: The paper evaluates multi-agent workflows for preference-optimization (PO) dataset generation using different prompting strategies and LLM configurations, finding that the LLM Feedback Loop with Llama as generator and Gemma as reviewer achieves high win rates.
http://arxiv.org/abs/2408.08685v1
Compressor summary: The paper proposes LLM4RGNN, a framework that uses large language models to improve the robustness of graph neural networks against topology attacks by identifying malicious edges and adding missing ones.
http://arxiv.org/abs/2408.08684v1
Compressor summary: The article discusses challenges and solutions for deploying efficient and personalized Vision Transformer and Large Language Model AI on mobile devices using model pruning techniques.
http://arxiv.org/abs/2408.08682v1
Compressor summary: Large language models can effectively compress point cloud data without text description or alignment operations, outperforming existing methods.
http://arxiv.org/abs/2408.08681v1
Compressor summary: The paper proposes a mean field theory to explain how weights from small neural networks can be transferred to large ones, and tests it on simple MLPs and LLMs like GPT-3 and Llama-3.1.
http://arxiv.org/abs/2408.08677v1
Compressor summary: Neural Reward Machines (NRM) are a neurosymbolic framework that enables agents to reason and learn in non-Markovian RL tasks using semi-supervised symbol grounding, outperforming deep RL methods.
http://arxiv.org/abs/2408.08676v1
Compressor summary: This study shows how Large Language Models can be fine-tuned to control spacecraft in the Kerbal Space Program using language inputs and outputs, potentially expanding their applications for space operations.
http://arxiv.org/abs/2408.08670v1
Compressor summary: ALaST is a method that adapts the importance of layers during ViT fine-tuning and reduces computational cost, memory load, and training time by adjusting their compute budgets.
http://arxiv.org/abs/2408.08668v1
Compressor summary: The paper proposes a risk-aware path planning algorithm that uses CVaR minimization to create more robust and less conservative paths for SSP problems in high-risk industries.
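CVaR itself is a standard risk measure: the expected cost in the worst (1 − α) tail of the cost distribution. A minimal sample-based sketch follows; the paper's SSP planning algorithm is not reproduced here, and the two "paths" are synthetic.

```python
# Sample-based CVaR of a cost distribution -- the risk measure the planner
# minimizes instead of the mean cost.
import numpy as np

def cvar(costs: np.ndarray, alpha: float = 0.95) -> float:
    """Expected cost in the worst (1 - alpha) tail of the samples."""
    var = np.quantile(costs, alpha)          # value-at-risk threshold
    tail = costs[costs >= var]               # worst-case outcomes
    return float(tail.mean())

# Example: comparing two candidate paths by tail risk, not mean cost.
rng = np.random.default_rng(1)
path_a = rng.normal(10.0, 1.0, 10_000)       # steady path
path_b = rng.normal(9.0, 4.0, 10_000)        # cheaper on average, riskier
print(path_a.mean(), cvar(path_a))           # ~10.0, ~12.1
print(path_b.mean(), cvar(path_b))           # ~9.0, ~17.2 -> heavier tail
```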
http://arxiv.org/abs/2408.08665v1
Compressor summary: QMambaBSR is a novel network for burst super-resolution that uses inter-frame querying, intra-frame scanning, adaptive upsampling, and sub-pixel extraction to reconstruct high-quality images with rich details.
http://arxiv.org/abs/2408.08661v1
Compressor summary: MIA-Tuner is a novel method that uses instructions to guide LLMs to detect their own pre-training data, improving detection performance and privacy safeguards.
http://arxiv.org/abs/2408.08656v1
Compressor summary: The paper evaluates and reduces format bias in large language models, which affects their performance across different formats of tasks like multiple-choice questions, lists, and mappings.
http://arxiv.org/abs/2408.08652v1
Compressor summary: TextCAVs is a method that creates concept activation vectors using text descriptions instead of image examples, enabling interactive and cost-effective explanations for deep learning models.
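For context, a classic concept activation vector (Kim et al., 2018) is the weight direction of a linear probe separating a concept's activations from random ones; TextCAVs' contribution is obtaining the concept side from text descriptions rather than image examples. The sketch below shows only the classic probe step on synthetic activations, not the paper's text pathway.

```python
# A minimal classic-CAV sketch: the concept direction is the weight vector of
# a linear probe on a layer's activations. The random data is illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
concept_acts = rng.normal(1.0, 1.0, (100, 32))   # activations for concept inputs
random_acts = rng.normal(0.0, 1.0, (100, 32))    # activations for random inputs

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 100 + [0] * 100)
probe = LogisticRegression().fit(X, y)
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])  # unit concept direction

# Concept sensitivity of a new activation: its projection onto the CAV.
h = rng.normal(0.5, 1.0, 32)
print(float(h @ cav))
```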
http://arxiv.org/abs/2408.08651v1
Compressor summary: The text discusses biases in language models, their impact on answer choices, and proposes two methods to reduce bias and improve accuracy.
http://arxiv.org/abs/2408.08650v1
Compressor summary: The paper proposes an end-to-end model for photo-sharing multi-modal dialogue generation that integrates a large language model with an image perceptron and generator, enabling better text and image alignment and gradient propagation.
http://arxiv.org/abs/2408.08648v1
Compressor summary: The paper proposes using classical and default logic to represent arguments identified by argument mining in an argument map, enabling automated reasoning on the argumentation.
http://arxiv.org/abs/2408.08640v1
Compressor summary: Math-PUMA is a method to improve multimodal large language models' math problem-solving by aligning textual and visual information using KL divergence and instruction tuning.
http://arxiv.org/abs/2408.08637v1
Compressor summary: AthenIA is a solution that optimizes magazine supply for 20,000 points of sale in France using a four-step pipeline and novel quantile regression method.
http://arxiv.org/abs/2408.08633v1
Compressor summary: The paper uses computer vision to study historical printed ornaments, introducing three tasks and a dataset for evaluation of state-of-the-art models.
http://arxiv.org/abs/2408.08632v1
Compressor summary: Key points:
- The paper reviews 180 benchmarks and evaluation methods for MLLMs
- It covers perception, understanding, cognition, reasoning, domains, capabilities, and other modalities
- It argues that evaluation is crucial to support MLLM development

Summary: The paper presents a comprehensive survey of 180 benchmarks and evaluation methods for multimodal large language models (MLLMs), covering various aspects and applications, and emphasizing the importance of evaluation for their development.
http://arxiv.org/abs/2408.08631v1
Compressor summary: The Jekyll & Hyde framework ensembles role-playing and neutral prompts to improve LLMs' reasoning, selecting the better solution from the two prompts with a robust evaluator that reduces position bias.
http://arxiv.org/abs/2408.08629v1
Compressor summary: This paper reviews uncertainty-aware methods for machine learning in structural dynamics, emphasizing Bayesian neural networks and their applications.
http://arxiv.org/abs/2408.08624v1
Compressor summary: Key points:
- Clinical question answering systems can help clinicians but need realistic data
- RealMedQA is a new dataset of clinical questions generated by humans and an LLM
- The authors compare QA models on BioASQ and RealMedQA and show the LLM is more efficient
- They release their code and data to encourage further research

Summary: The authors introduce RealMedQA, a realistic dataset of clinical questions for question answering systems, and show that an LLM can generate better QA pairs than existing methods.
http://arxiv.org/abs/2408.08622v1
Compressor summary: DeepDFA is a new method to find Deterministic Finite Automata from traces using a differentiable and discrete model that is interpretable, efficient, and robust.
http://arxiv.org/abs/2408.08610v1
Compressor summary: The paper proposes a novel high-speed and high-quality generative dataset distillation method based on Stable Diffusion, achieving significant improvement in image generation per class and placing third in the ECCV 2024 challenge.
http://arxiv.org/abs/2408.08604v1
Compressor summary: The paper presents a new bi-directional deep contextual video compression scheme (DCVC-B) for B-frames that significantly improves their compression performance compared to traditional codecs and recent advancements.
http://arxiv.org/abs/2408.08601v1
Compressor summary: The paper proposes VPIP, a framework that uses visual task prompts to handle multiple low-level vision tasks with different input-target domains, improving image reconstruction quality and performance over existing methods.
http://arxiv.org/abs/2408.08593v1
Compressor summary: Key points:
- Radio map (RM) technology can reduce communication costs for 6G networks by using location information
- Existing NN-based methods have suboptimal performance due to misalignment between generative and discriminative modeling
- RadioDiff is a new method that uses denoised diffusion, an attention U-Net, and decoupled diffusion to improve RM construction quality
- RadioDiff achieves state-of-the-art performance in accuracy, structural similarity, and peak signal-to-noise ratio

Summary: RadioDiff is a novel method for constructing radio maps that uses advanced neural networks to exploit location information and achieve high accuracy and quality in 6G network applications.
http://arxiv.org/abs/2408.08590v1
Compressor summary: This paper investigates how auto-regressive Language Models perform syllogistic reasoning and discovers a transferable circuit involving middle-term suppression for deriving valid conclusions, but finds that these mechanisms are influenced by world knowledge and not general logical principles.
http://arxiv.org/abs/2408.08583v1
Compressor summary: GrassNet is a novel graph neural network that uses structured state space models to design and learn arbitrary graph spectral filters, overcoming limitations of traditional polynomial methods in spectral graph learning.
http://arxiv.org/abs/2408.08570v1
Compressor summary: EraW-Net is a novel method for estimating driver attention in scenes across two fields of view using a W-shaped architecture, dynamic adaptive filtering, and global context sharing.
http://arxiv.org/abs/2408.08568v1
Compressor summary: The paper presents a new learning-based framework for matching non-rigid point clouds using semantic features from large vision models, which improves generalization, robustness, and performance on various challenging datasets.
http://arxiv.org/abs/2408.08567v1
Compressor summary: S$^3$Attention is a novel attention structure that balances information preservation and computation reduction by using a smoothing block and a matrix sketching method to handle long sequences in linear complexity.
http://arxiv.org/abs/2408.08566v1
Compressor summary: The paper describes the second edition of a shared task on summarizing biomedical research articles, which attracted more participants and saw increased use of large language models.
http://arxiv.org/abs/2408.08561v1
Compressor summary: The study proposes a method that combines two techniques to generate Chinese landscape paintings with high quality and fidelity, outperforming other models.
http://arxiv.org/abs/2408.08560v1
Compressor summary: The text discusses a machine learning method to improve breast cancer screening using both digital breast tomosynthesis (DBT) and full-field digital mammography (FFDM) images, which could reduce reliance on FFDM and increase accuracy in detecting lesions.
http://arxiv.org/abs/2408.08554v1
Compressor summary: ABQ-LLM is a novel quantization algorithm and inference framework that enables efficient arbitrary-precision quantized inference on GPUs, addressing the challenges of low-bit quantization and limited integer computing units in large language models.
http://arxiv.org/abs/2408.08551v1
Compressor summary: The Multi-view Mixture-of-Experts Model (MvP) is a new method for detecting personality traits by analyzing user posts from multiple perspectives, improving performance and addressing previous limitations.
http://arxiv.org/abs/2408.08550v1
Compressor summary: The paper introduces a hierarchical framework of optimal transports using string diagrams and proposes a new algorithm to solve safety problems on them using cost matrix compositions.
http://arxiv.org/abs/2408.08545v1
Compressor summary: SelectLLM is a novel algorithm that efficiently selects a subset of large language models to overcome their individual limitations and achieve competitive performance on complex tasks with reduced latency.
http://arxiv.org/abs/2408.08544v1
Compressor summary: Sign language is a vital communication tool for the deaf-mute community that involves hand gestures and body movements, and various tasks are being studied to help hearing people understand it better.
http://arxiv.org/abs/2408.08541v1
Compressor summary: The paper explores non-canonical tokenizations in large language models and shows that aggregating their probabilities can improve model performance.
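The underlying observation: a string's probability under a language model is the sum over every token sequence that decodes to it, while standard scoring uses only the canonical tokenization. A toy enumeration sketch, with a made-up vocabulary and an independence assumption standing in for a real LM.

```python
# Toy illustration of marginalizing over tokenizations: a string's probability
# is the sum over all segmentations, not just the canonical one.
from typing import Dict, List, Tuple

VOCAB: Dict[str, float] = {"un": 0.1, "u": 0.05, "n": 0.05,
                           "likely": 0.2, "like": 0.1, "ly": 0.1}

def tokenizations(s: str) -> List[Tuple[str, ...]]:
    """Enumerate all ways to segment s into vocabulary tokens."""
    if not s:
        return [()]
    out = []
    for tok in VOCAB:
        if s.startswith(tok):
            out += [(tok,) + rest for rest in tokenizations(s[len(tok):])]
    return out

def seq_prob(tokens: Tuple[str, ...]) -> float:
    # Stand-in for an LM score: product of independent token probabilities.
    p = 1.0
    for t in tokens:
        p *= VOCAB[t]
    return p

canonical = max(tokenizations("unlikely"), key=seq_prob)
marginal = sum(seq_prob(t) for t in tokenizations("unlikely"))
print(canonical, seq_prob(canonical), marginal)  # marginal >= canonical prob
```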
http://arxiv.org/abs/2408.08535v1
Compressor summary: CommunityKG-RAG is a new framework that uses community structures within Knowledge Graphs to improve the accuracy and relevance of information retrieval for fact-checking, without needing extra training.
http://arxiv.org/abs/2408.08531v1
Compressor summary: The paper evaluates using data from cybersecurity exercises to predict students' performance and suggests automated tools to aid instructors in helping struggling students.
http://arxiv.org/abs/2408.08529v1
Compressor summary: The proposed encryption method enhances vision transformer performance when working with encrypted images.
http://arxiv.org/abs/2408.08527v1
Compressor summary: The text introduces a new framework called Focus on Focus (FoF) that improves glioma grading by enhancing molecular-pathology representation and aligning histopathology features with molecular biomarkers using paired training and pathology-only inference.
http://arxiv.org/abs/2408.08526v1
Compressor summary: The conditional cascaded diffusion model (cCDM) is a machine learning approach for inverse design that leverages diffusion models' strengths to predict higher-resolution solutions from lower ones, but its performance depends on the amount of high-resolution training data available.
http://arxiv.org/abs/2408.08524v1
Compressor summary: GS-ID is a framework that decomposes light sources in images using priors, environmental and direct components, learnable environment maps, and Spherical Gaussians for realistic relighting on Gaussian Splatting.
http://arxiv.org/abs/2408.08518v1
Compressor summary: The paper proposes VCPro, a framework to protect key concepts in images with less visible perturbations using adversarial techniques.
http://arxiv.org/abs/2408.08508v1
Compressor summary: The paper proposes Degree Debiased Signed Graph Neural Network (DD-SGNN) to address fairness issues related to signed graphs and evaluates its effectiveness on four real-world datasets.
http://arxiv.org/abs/2408.08506v1
Compressor summary: The paper proposes Ex3, a method to generate high-quality long novels using structure information extraction and fine-tuning of a large language model, improving over existing approaches.
http://arxiv.org/abs/2408.08502v1
Compressor summary: The paper proposes an Image-to-Image diffusion classifier that uses image translation to predict distinguishable labels for adversarial robustness, reducing computational costs and improving performance compared to existing methods.
http://arxiv.org/abs/2408.08499v1
Compressor summary: Stochastic optimization for performative shifts requires regularization to achieve optimal models.
http://arxiv.org/abs/2408.08495v1
Compressor summary: FunEditor is a diffusion model that learns atomic editing functions to perform complex edits simultaneously, faster, and more accurately than existing methods.
http://arxiv.org/abs/2408.08493v1
Compressor summary: The paper proposes a novel unlearning framework that enables parallel unlearning among models with inheritance, using a new graph and algorithm that leverage the Fisher Information Matrix to reduce computational overhead and achieve effective unlearning.
http://arxiv.org/abs/2408.08488v1
Compressor summary: The paper proposes a novel method, PITN, that combines physics-informed neural networks with adversarial contrastive learning to estimate cuffless blood pressure accurately with limited data.
http://arxiv.org/abs/2408.08484v1
Compressor summary: The paper proposes an unsupervised learning framework with heuristics for solving the Maximum Minimal Cut Problem, a challenging combinatorial optimization problem, by using graph neural networks and tree transformations.
http://arxiv.org/abs/2408.08470v1
Compressor summary: The paper proposes a contextual bandit approach to select an assistant model for large language models, improving performance under resource constraints.
http://arxiv.org/abs/2408.08463v1
Compressor summary: The paper proposes a composability framework for analyzing understanding in various subjects, including AIs, and explores the role of learning ability and catalysts in enhancing output quality.