This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-09-17, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2409.10516v1
Compressor summary: RetrievalAttention speeds up attention computation by using approximate nearest neighbor search and reducing data access to exploit sparsity, achieving sub-linear time complexity and lower GPU memory requirements.
http://arxiv.org/abs/2409.10504v1
Compressor summary: The DIctionary Label Attention module disentangles dense embeddings into sparse ones, making medical code predictions more accurate and interpretable by uncovering thousands of learned medical concepts.
http://arxiv.org/abs/2409.10502v1
Compressor summary: The paper shows that Transformers can learn to solve Sudoku and Zebra puzzles when trained on logical steps, and that they have a hidden reasoning engine within their weights.
http://arxiv.org/abs/2409.10499v1
Compressor summary: The paper proposes a partial distribution matching method using a Wasserstein adversarial network and shows its effectiveness in point set registration and domain adaptation tasks.
http://arxiv.org/abs/2409.10489v1
Compressor summary: The paper presents an efficient PyTorch implementation of the Spectral Transform Unit (STU) that beats the Transformer and other state space models in different sequence prediction tasks like language, robotics, and simulated systems.
http://arxiv.org/abs/2409.10488v1
Compressor summary: The paper examines if vision-language models can recognize physical states of objects over time and suggests improvements for better performance.
http://arxiv.org/abs/2409.10482v1
Compressor summary:
http://arxiv.org/abs/2409.10476v1
Compressor summary: The paper proposes improving DDIM inversion for image editing by disentangling the guidance scale and using a better scale (0.5) derived theoretically, leading to better performance and efficiency.
http://arxiv.org/abs/2409.10473v1
Compressor summary: Masked Conditional Diffusion (MacDiff) uses diffusion models and random masking to learn effective representations for human skeleton understanding, achieving state-of-the-art performance on benchmarks and improving fine-tuning in scarce labeled data scenarios.
http://arxiv.org/abs/2409.10463v1
Compressor summary: This paper compares MLPs and KANs for modeling complex relationships with a focus on low-data regimes, introducing an effective technique to design MLPs with individualized activation functions that achieve higher predictive accuracy.
http://arxiv.org/abs/2409.10452v1
Compressor summary: SGAAE is an explainable graph generative model that extracts node representations from signed networks based on polarization and archetypes, and shows high performance in signed link prediction tasks.
http://arxiv.org/abs/2409.10445v1
Compressor summary: DeWi is a novel learning assistance method for insect pest classification that uses a one-stage, alternating training strategy with a triplet margin loss and data augmentation to improve the discrimination and generalization of several Convolutional Neural Networks, achieving the best performance on two insect pest classification benchmarks.
http://arxiv.org/abs/2409.10432v1
Compressor summary: The paper proposes a machine learning method to infer energy-preserving reduced-order models from PDEs using data without requiring fully discrete operators or intrusive knowledge of the PDEs.
http://arxiv.org/abs/2409.10403v1
Compressor summary: The paper presents a method to improve disease diagnosis by using structured knowledge from external sources and enhancing language models' performance and interpretability on three datasets.
http://arxiv.org/abs/2409.10388v1
Compressor summary: MI-RNN is a modified RNN that predicts over time intervals and can solve unsteady PDEs more accurately without numerical derivatives.
http://arxiv.org/abs/2409.10385v1
Compressor summary: Mamba-ST is an efficient State-Space Model that performs style transfer by simulating cross-attention layers without extra modules, improving quality and reducing computational burden compared to transformers and diffusion models.
http://arxiv.org/abs/2409.10372v1
Compressor summary: The paper presents a new method that uses large language models and reinforcement learning to create more cooperative team behaviors in simulations.
http://arxiv.org/abs/2409.10370v1
Compressor summary: This study develops a novel approach using graph convolutional networks (GCNs) and molecular descriptors to predict the toxicity of per- and polyfluoroalkyl substances (PFAS), which are persistent environmental pollutants with known health concerns.
http://arxiv.org/abs/2409.10365v1
Compressor summary: Counterfactual contrastive learning creates positive pairs for medical imaging that capture relevant domain variations, improving generalisation and performance on both in-distribution and out-of-distribution data.
http://arxiv.org/abs/2409.10357v1
Compressor summary: This paper explores how using 2D or 3D joint coordinates as training data affects the quality of speech-to-gesture deep generative models and compares the results with human gestures.
http://arxiv.org/abs/2409.10353v1
Compressor summary: This paper reviews diffusion models' applications in image restoration tasks, discussing their techniques, challenges, and future directions.
http://arxiv.org/abs/2409.10341v1
Compressor summary: The authors propose a method to detect sexism and misogyny in German online comments using text embeddings, achieving competitive results in a challenge and showing potential for scalability.
http://arxiv.org/abs/2409.10340v1
Compressor summary: The paper proposes DOSAGE, a novel algorithm for finding densest overlapping subgraphs in hypergraphs (which extend graphs by allowing a single hyperedge to connect multiple nodes), and shows that it improves node classification performance.
http://arxiv.org/abs/2409.10338v1
Compressor summary: The paper proposes a method to distinguish between large language models using a small number of binary questions, which could be useful for detecting model leaks.
http://arxiv.org/abs/2409.10329v1
Compressor summary: InfoDisent is a hybrid model that combines post-hoc and intrinsic methods to better understand and interpret the decisions made by pre-trained image classification networks.
http://arxiv.org/abs/2409.10327v1
Compressor summary: The paper proposes a method to perform real-time relighting using a CNN renderer for direct illumination and a hash grid-based renderer for indirect illumination, trained with distillation from a pre-trained teacher model.
http://arxiv.org/abs/2409.10141v1
Compressor summary: PSHuman uses cross-scale diffusion and parametric models to reconstruct detailed and photorealistic 3D human meshes from monocular RGB images, addressing challenges like self-occlusions and clothing topology.
http://arxiv.org/abs/2409.10132v1
Compressor summary: StruEdit is a method to update large language models' answers with current knowledge by editing structured reasoning triplets, improving accuracy and speed.
http://arxiv.org/abs/2409.10111v1
Compressor summary: This study evaluates whether instance incremental or batch incremental learning is better for real-world fraud detection problems with delayed labels, finding that batch incremental models perform similarly or better in terms of predictive performance and interpretability.
http://arxiv.org/abs/2409.10104v1
Compressor summary: The study explores how Transfer Learning can help train AI models in small data contexts for quality control of CFRP tape laying in aerospace manufacturing using optical sensors.
http://arxiv.org/abs/2409.10096v1
Compressor summary: The paper proposes a framework for robust risk-aware reinforcement learning using dynamic distortion risk measures, neural networks, and actor-critic algorithms to handle uncertainty and environmental dynamics.
http://arxiv.org/abs/2409.10094v1
Compressor summary: The paper proposes a diffusion-based OoD detection framework that uses a novel similarity metric in feature and probability spaces to measure distribution disparities between original and generated images, achieving better performance than existing methods.
http://arxiv.org/abs/2409.10090v1
Compressor summary: MotionCom is a novel image composition method that uses a large vision language model and video diffusion prior for automatic integration of objects into new scenes with realistic motion and interaction.
http://arxiv.org/abs/2409.10085v1
Compressor summary: This paper proposes learning a latent ground metric for optimal transport distances in machine learning and signal processing applications using Riemannian geometry.
http://arxiv.org/abs/2409.10080v1
Compressor summary: The paper proposes a new framework called DAE-Fuse that uses a two-phase autoencoder to generate sharp and natural fused images from different imaging modalities.
http://arxiv.org/abs/2409.10077v1
Compressor summary: The paper proposes LLM-DER, a framework that uses large language models to enrich entity information and evaluate plausibility for complex domain-specific entity recognition in Chinese.
http://arxiv.org/abs/2409.10075v1
Compressor summary: Steinmetz Neural Networks use real-valued subnetworks with coupled outputs to process complex-valued data, and Analytic Neural Networks enforce analytic signal representations for better generalization error bounds and performance.
http://arxiv.org/abs/2409.10070v1
Compressor summary: The paper proposes using semantic information from goal-oriented human-human dialogues to improve summarization and introduces a new dataset version for research on this topic.
http://arxiv.org/abs/2409.10069v1
Compressor summary: The paper introduces a domain-agnostic method for unsupervised anomaly detection using conditional perturbators and a discriminator that generates diverse and hard-to-distinguish synthetic anomalies.
http://arxiv.org/abs/2409.10068v1
Compressor summary: The STVNN model uses joint spatiotemporal convolutions to process multivariate time series, addressing instabilities in traditional methods like PCA and improving stability for dynamic data settings.
http://arxiv.org/abs/2409.10064v1
Compressor summary: MindGuard is a mobile mental healthcare system using an LLM to provide personalized screening and intervention conversations, addressing the low treatment rate due to stigma and improving accessibility in mental healthcare.
http://arxiv.org/abs/2409.10053v1
Compressor summary: The paper proposes a new editing method for large language models that preserves activation magnitudes and improves safety benchmarks by rotating activations instead of adding steering vectors.
http://arxiv.org/abs/2409.10046v1
Compressor summary: This study develops machine learning models to characterize and predict lightning-ignited wildfires globally, showing that climate change increases their risk and highlighting the importance of tailored models for different types of fires.
http://arxiv.org/abs/2409.10045v1
Compressor summary: The paper introduces a new machine learning technique to model and predict wireless channel dynamics using compressed representations of channel state information.
http://arxiv.org/abs/2409.10044v1
Compressor summary: The paper introduces a benchmark dataset to evaluate different types of uncertainty in LLMs, highlighting the need for improved metrics that better guide prompt optimization.
http://arxiv.org/abs/2409.10041v1
Compressor summary: DENSER uses wavelets to improve 3D Gaussian splatting for dynamic urban scene reconstruction, outperforming existing methods on the KITTI dataset.
http://arxiv.org/abs/2409.10038v1
Compressor summary: Diagram of Thought (DoT) is a framework for modeling iterative reasoning in large language models using directed acyclic graphs, enhancing logical consistency and soundness while improving reasoning capabilities.
http://arxiv.org/abs/2409.10028v1
Compressor summary: AttnMod modifies cross attention in diffusion models to create new art styles that are not achievable with standard prompts.
http://arxiv.org/abs/2409.10021v1
Compressor summary: The paper presents a novel hotspot detection framework for VLSI fabrication that integrates a lithography simulator with an object detection network using cross-attention blocks, generalizing better to real-world scenarios and outperforming previous methods on real-world data.
http://arxiv.org/abs/2409.10016v1
Compressor summary: The paper introduces AceParse, a comprehensive dataset for parsing diverse structured texts in academic literature, and presents AceParser, a multimodal model that outperforms the previous state-of-the-art in this task.
http://arxiv.org/abs/2409.10011v1
Compressor summary: HALO is a framework that detects and mitigates hallucinations in medical question-answering systems by using multiple queries, retrieving context from external sources, and scoring relevance to improve the accuracy of large language models.
http://arxiv.org/abs/2409.10007v1
Compressor summary: SelECT-SQL is a new in-context learning approach that combines chain-of-thought prompting, self-correction, and ensemble methods to improve Text-to-SQL conversion accuracy using large language models like GPT-3.5-Turbo, achieving state-of-the-art results on challenging benchmarks.
http://arxiv.org/abs/2409.09990v1
Compressor summary: SHIRE is a framework that uses human intuition encoded in PGMs to improve sample efficiency and explainability of Deep RL policies in robotic tasks.
http://arxiv.org/abs/2409.09989v1
Compressor summary: This paper reviews the evolution of sentiment analysis in NLP, its challenges, and future trends.
http://arxiv.org/abs/2409.09984v1
Compressor summary: The paper analyzes the effect of increasing batch sizes or decaying learning rates on the GSAM algorithm's ability to find flat local minima in deep neural networks.
http://arxiv.org/abs/2409.09980v1
Compressor summary: The study shows that using machine learning, especially Random Forests, can help predict household nutrition in countries facing hunger crises by analyzing various factors, but better data is needed for more accurate results.
http://arxiv.org/abs/2409.09969v1
Compressor summary: The paper proposes a novel omni-directional image synthesis method that uses a pre-trained VQGAN model and reduces training time by using two stages: global coarse image creation and local refinement.
http://arxiv.org/abs/2409.09958v1
Compressor summary: The paper proposes a method for multi-objective reinforcement learning that adapts to preferences and safety constraints from demonstrations without explicit input.
http://arxiv.org/abs/2409.09957v1
Compressor summary: This paper reviews deep learning methods for graph anomaly detection (GAD), discussing challenges, methodologies, and datasets, and provides a taxonomy of 13 fine-grained categories.
http://arxiv.org/abs/2409.09953v1
Compressor summary: UAAN is a novel network that detects out-of-distribution actions in videos using both appearance and motion features, outperforming existing methods.
http://arxiv.org/abs/2409.09951v1
Compressor summary: Optimal ablation is a new method for measuring the importance of model components in machine learning models, which has advantages over existing methods and can improve interpretability tasks.
http://arxiv.org/abs/2409.09947v1
Compressor summary: The authors propose a new way to evaluate machine-generated legal analysis by identifying gaps between human and machine outputs and create a detector with an annotated dataset to measure these gaps.
http://arxiv.org/abs/2409.09945v1
Compressor summary: This study analyzes how synthetic opioids and heroin spread in the U.S. from 2013 to 2020 using a graph convolutional neural network model that accounts for spatial connections between counties.
http://arxiv.org/abs/2409.09944v1
Compressor summary: The paper proposes a fast forward artificial neural network model that uses the three-phase voltages and currents of induction motors to detect and classify common electrical faults early, protecting vital electrical components and preventing abnormal event progression, and tests it by interfacing with a real motor.
http://arxiv.org/abs/2409.09931v1
Compressor summary: This study shows that a machine learning model can accurately predict properties of solid materials, even when trained only on some aspects and tested on unseen configurations.
http://arxiv.org/abs/2409.09930v1
Compressor summary: MissNet is a method to accurately impute missing values in multivariate time series data by exploiting temporal dependency and inter-correlation using adaptive networks.
http://arxiv.org/abs/2409.09927v1
Compressor summary: The authors evaluate five data contamination detection methods on four state-of-the-art LLMs using eight challenging datasets, finding significant limitations and inconsistencies in current approaches.
http://arxiv.org/abs/2409.09920v1
Compressor summary: This paper proposes a deep learning-based surrogate model that uses multiple forward transitions in latent space to improve long-term predictions for two-phase reservoir simulations.
http://arxiv.org/abs/2409.09916v1
Compressor summary: SFR-RAG is a small language model that uses external context to generate accurate and relevant answers, outperforming larger models like GPT-4o despite having fewer parameters.
http://arxiv.org/abs/2409.09915v1
Compressor summary: The paper presents a method to recognize hand gestures using forearm ultrasound and deep neural networks on low-resource devices like Raspberry Pi with high accuracy and low latency.
http://arxiv.org/abs/2409.09905v1
Compressor summary: The authors propose a novel method that uses large language models to reveal latent personality dimensions without relying on explicit questionnaires and show that their approach can predict Big Five traits more accurately than previous methods.