This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-09-05, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2409.02919v1
Compressor summary: HiPrompt is a new method for generating high-resolution images with diffusion models; it improves quality by providing both global and local guidance through hierarchical prompts and by conditioning noise components on different prompt levels.
http://arxiv.org/abs/2409.02917v1
Compressor summary: The paper proposes a new technique called uncertainty-aware conditional NeRF to improve visualization of surgical scenes by addressing challenges such as sparse views and photometric inconsistencies.
http://arxiv.org/abs/2409.02908v1
Compressor summary: This paper reveals a theoretical issue with masked diffusion models (MDMs) and proposes a faster sampler that addresses it, showing that MDMs are not superior to auto-regressive models (ARMs).
http://arxiv.org/abs/2409.02901v1
Compressor summary: TML uses algebraic topology techniques to analyze complex data structures and reveal insights hidden from traditional machine learning methods.
http://arxiv.org/abs/2409.02897v1
Compressor summary: The authors propose a method to improve the trustworthiness of large language models by generating accurate answers with sentence-level citations in LQAC tasks.
http://arxiv.org/abs/2409.02889v1
Compressor summary: The paper introduces LongLLaVA, a hybrid MLLM that balances efficiency and effectiveness for multi-modal tasks, with improved performance and reduced costs compared to existing models.
http://arxiv.org/abs/2409.02883v1
Compressor summary: The authors developed a multi-stream deep learning framework to improve the reliability and accuracy of detecting mild cognitive impairment using the Rey Complex Figure Test, which could be useful in clinical settings.
http://arxiv.org/abs/2409.02882v1
Compressor summary: The paper introduces FewSTAB, a benchmarking system to assess and improve the robustness of few-shot image classifiers against spurious bias using pre-trained vision-language models and existing test data.
http://arxiv.org/abs/2409.02877v1
Compressor summary: This paper proposes configurable foundation models using functional modules called bricks, which can improve efficiency and scalability of large language models by allowing dynamic configuration based on instructions.
http://arxiv.org/abs/2409.02869v1
Compressor summary: The paper introduces LITE, a lightweight deep learning architecture for Time Series Classification that is faster, consumes fewer resources, and performs well on multivariate time series data.
http://arxiv.org/abs/2409.02867v1
Compressor summary: The paper explores how using balanced authentic and synthetic data affects face recognition accuracy and fairness, finding that diffusion-based models improve accuracy while pre-trained generative methods have little impact on fairness.
http://arxiv.org/abs/2409.02866v1
Compressor summary: The Hybrid-Segmentor model uses self-attention to detect and segment different types of cracks in infrastructure with high accuracy and generalization capabilities.
http://arxiv.org/abs/2409.02864v1
Compressor summary: BRAD is a prototype digital assistant that can handle various bioinformatics tasks, such as question-answering, software pipeline execution, and automation of workflows.
http://arxiv.org/abs/2409.02851v1
Compressor summary: Human-VDM is a novel method for generating high-quality 3D humans from a single RGB image using Video Diffusion Models and Gaussian Splatting, addressing inconsistent view issues in existing methods.
http://arxiv.org/abs/2409.02850v1
Compressor summary: The paper shows that using sampling with replacement in few-shot learning leads to misleading confidence intervals and proposes methods to improve them.
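A toy numpy illustration of the failure mode, mine rather than the paper's code: an interval obtained by resampling episodes with replacement from one fixed task pool shrinks with the number of sampled episodes instead of the number of distinct tasks, and so understates the true variability.
```python
# Toy illustration (not from the paper): compare the true spread of the
# mean accuracy across fresh task pools with the spread claimed by
# resampling episodes WITH replacement from a single fixed pool.
import numpy as np

rng = np.random.default_rng(0)
POP_MEAN, POP_STD, POOL_SIZE, EPISODES = 0.7, 0.1, 200, 5000

# True uncertainty: how much the estimate moves across fresh pools of tasks.
true_means = [rng.normal(POP_MEAN, POP_STD, POOL_SIZE).mean()
              for _ in range(2000)]

# Claimed uncertainty: resample episodes with replacement from ONE pool.
pool = rng.normal(POP_MEAN, POP_STD, POOL_SIZE)
claimed_means = [rng.choice(pool, EPISODES, replace=True).mean()
                 for _ in range(2000)]

width = lambda xs: np.subtract(*np.percentile(xs, [97.5, 2.5]))
print(f"true 95% interval width:    {width(true_means):.4f}")    # ~0.028
print(f"claimed 95% interval width: {width(claimed_means):.4f}")  # ~0.006
```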
http://arxiv.org/abs/2409.02846v1
Compressor summary: The paper proposes MaDis-Stereo, a Transformer-based stereo matching model that uses Masked Image Modeling and knowledge distillation to improve performance on data-scarce datasets like ETH3D and KITTI 2015.
http://arxiv.org/abs/2409.02841v1
Compressor summary: The report presents a machine learning-based normalization system for historic German texts using Transformer models, achieving state-of-the-art accuracy with limited data.
http://arxiv.org/abs/2409.02840v1
Compressor summary: The R2GQA system helps students understand legal regulations by retrieving, reading, and generating answers from documents using advanced techniques and a new Vietnamese dataset.
http://arxiv.org/abs/2409.02836v1
Compressor summary: The study analyzes different types of statements and emotions in cryptocurrency discussions using advanced natural language processing techniques, finding distinct patterns in, and interplay between, predictive statements and expressions of hope and regret across five cryptocurrencies.
http://arxiv.org/abs/2409.02834v1
Compressor summary: The paper introduces a Chinese multimodal math dataset to evaluate and enhance large multimodal models' mathematical reasoning skills, and proposes a new model (Math-LMM) that improves performance on various problem types.
http://arxiv.org/abs/2409.02828v1
Compressor summary: The paper proposes a novel method called ExpLLM that uses large language models to generate a chain of thought for accurate facial expression recognition, outperforming current methods and even GPT-4o in recognizing micro-expressions.
http://arxiv.org/abs/2409.02825v1
Compressor summary: The paper compares different feature extraction and matching methods for digital surface models generation from satellite images, showing that traditional methods can be competitive with deep learning ones.
http://arxiv.org/abs/2409.02813v1
Compressor summary: The paper presents MMMU-Pro, a challenging benchmark for multimodal models that tests their ability to integrate visual and textual information by embedding questions within images, and shows that current models perform significantly worse on it.
http://arxiv.org/abs/2409.02802v1
Compressor summary: The paper proposes a self-ensemble method that reduces the variance of classification margins to improve the adversarial robustness of time series classifiers, strengthening Randomized Smoothing's provable robustness radius on datasets where it struggles while limiting computational overhead.
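A hedged sketch of the general recipe; the `model` interface, the Gaussian noise scale, and the probability-averaging ensemble step are assumptions on my part, not the paper's exact algorithm.
```python
# Monte Carlo randomized smoothing for a time-series classifier, with a
# self-ensemble flavored step: averaging softmax probabilities over the
# noise draws reduces the variance of the classification margin.
import numpy as np

def smoothed_predict(model, x, sigma=0.1, n_samples=100):
    # `model` is an assumed interface: batch of series -> (batch, n_classes) logits.
    noise = np.random.normal(0.0, sigma, size=(n_samples,) + x.shape)
    logits = model(x[None, ...] + noise)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    ensemble_probs = probs.mean(axis=0)   # variance-reduced margin estimate
    return ensemble_probs.argmax(), ensemble_probs
```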
http://arxiv.org/abs/2409.02795v1
Compressor summary: The paper surveys preference alignment strategies for large language models, breaking them into four components and providing examples that clarify their strengths and challenges.
http://arxiv.org/abs/2409.02792v1
Compressor summary: ULE is a new approach that improves robustness of deep neural networks by training two models in parallel: a student model that learns spurious correlations and a teacher model that unlearns the student's mistakes.
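One plausible instantiation of the two-model setup in PyTorch; the divergence term below is my assumption about how "unlearning the student's mistakes" could look, not the paper's exact objective.
```python
import torch.nn.functional as F

def ule_step(student, teacher, opt_s, opt_t, x, y, lam=0.5):
    # Student: plain ERM, so it freely absorbs spurious correlations.
    s_logits = student(x)
    opt_s.zero_grad()
    F.cross_entropy(s_logits, y).backward()
    opt_s.step()

    # Teacher: fit the labels while pushing its predictive distribution
    # away from the student's (possibly shortcut-driven) predictions.
    t_logits = teacher(x)
    kl = F.kl_div(F.log_softmax(t_logits, dim=-1),
                  F.softmax(s_logits.detach(), dim=-1),
                  reduction="batchmean")
    loss_t = F.cross_entropy(t_logits, y) - lam * kl
    opt_t.zero_grad()
    loss_t.backward()
    opt_t.step()
```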
http://arxiv.org/abs/2409.02772v1
Compressor summary: The paper shows that many causal representation learning methods align representations to data symmetries, not necessarily causal ones, and proposes a unified method that can use different assumptions based on relevant invariances for applications like treatment effect estimation.
http://arxiv.org/abs/2409.02760v1
Compressor summary: The paper proposes an active learning approach for learning non-monotonic preferences in MCS problems using max-margin optimization and information amount measurement.
http://arxiv.org/abs/2409.02751v1
Compressor summary: Pre-training followed by fine-tuning is the strongest semi-supervised learning approach for sentiment analysis and natural language inference tasks, while self-training offers no additional advantage.
http://arxiv.org/abs/2409.02747v1
Compressor summary: The paper proposes two new techniques for offline RL in non-Markovian environments, improving on previous algorithms by using a formal language pseudometric and Count-Min-Sketch.
http://arxiv.org/abs/2409.02728v1
Compressor summary: The paper introduces a method to create small, task-focused subgraphs from large graphs using GNNs and the graph information bottleneck (GIB) principle, which reduces communication costs while preserving essential information.
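A hedged PyTorch sketch of the GIB idea as I read it: score nodes, keep a sparse soft subgraph, and trade task fit against a compression penalty; the scorer and penalty form are assumptions, not the paper's architecture.
```python
import torch
import torch.nn as nn

class SubgraphSelector(nn.Module):
    """Scores each node; the sigmoid score gates it into the subgraph."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

    def forward(self, node_feats):                 # (num_nodes, in_dim)
        keep = torch.sigmoid(self.scorer(node_feats)).squeeze(-1)
        return keep                                # soft node mask in [0, 1]

def gib_objective(task_loss, keep, beta=0.01):
    # Information-bottleneck style trade-off: predict well (task_loss)
    # while keeping the transmitted subgraph small (compression term).
    return task_loss + beta * keep.mean()
```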
http://arxiv.org/abs/2409.02727v1
Compressor summary: This study explores optimal design and pooling strategies for embedding models based on large language models (LLMs) through large-scale experiments, and proposes a new pooling method, Multi-Layers Trainable Pooling, which improves performance on text similarity and retrieval tasks.
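A hedged PyTorch sketch of the multi-layer trainable pooling idea: learn a softmax-weighted mix of several hidden layers, then mean-pool over tokens; the exact layer set and the projection head are my assumptions.
```python
import torch
import torch.nn as nn

class MultiLayerTrainablePooling(nn.Module):
    def __init__(self, n_layers: int, hidden: int):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.zeros(n_layers))  # softmax-ed
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, hidden_states, attention_mask):
        # hidden_states: sequence of (batch, seq, hidden) tensors, one per layer.
        stack = torch.stack(tuple(hidden_states), dim=0)       # (L, B, S, H)
        w = torch.softmax(self.layer_weights, dim=0)
        mixed = (w[:, None, None, None] * stack).sum(dim=0)    # (B, S, H)
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (mixed * mask).sum(1) / mask.sum(1).clamp(min=1)
        return self.proj(pooled)                               # (B, H)
```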
http://arxiv.org/abs/2409.02725v1
Compressor summary: The study investigates how using specific quality metrics for scientific papers to refine a pre-training dataset affects BERT's performance on biomedical language understanding tasks.
http://arxiv.org/abs/2409.02712v1
Compressor summary: This study proposes a data filtering approach using cross-lingual sentence representations to improve machine translation quality for low-resource English-Marathi language pairs.
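A minimal sketch of similarity-based filtering, assuming a multilingual encoder (LaBSE via the sentence-transformers library here); the paper's exact model and threshold may differ.
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/LaBSE")

def filter_pairs(en_sents, mr_sents, threshold=0.75):
    # Normalized embeddings make the dot product a cosine similarity.
    en_emb = model.encode(en_sents, normalize_embeddings=True)
    mr_emb = model.encode(mr_sents, normalize_embeddings=True)
    sims = (en_emb * mr_emb).sum(axis=1)
    return [pair for pair, s in zip(zip(en_sents, mr_sents), sims)
            if s >= threshold]

# Usage (hypothetical lists of raw English-Marathi sentence pairs):
# clean = filter_pairs(en_sents, mr_sents)
```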
http://arxiv.org/abs/2409.02711v1
Compressor summary: PostNL developed an AI system called SuperTracy to improve parcel tracking communication using generative AI technologies like Retrieval-Augmented Generation.
http://arxiv.org/abs/2409.02708v1
Compressor summary: Meta-SP is a new algorithm for multi-task linear models that learns an invariant low-rank subspace shared by different tasks and outperforms other methods.
http://arxiv.org/abs/2409.02697v1
Compressor summary: The paper proposes a method to improve a deep reinforcement learning agent for job shop scheduling by training it on search trajectories, achieving state-of-the-art results with machine-learning-enhanced search.
http://arxiv.org/abs/2409.02686v1
Compressor summary: The paper explores why large language models struggle with reasoning tasks, proposes a causal framework to understand their limitations, and introduces Deconfounded Causal Adaptation (DCA), a novel fine-tuning method that improves their performance with minimal parameters.
http://arxiv.org/abs/2409.02683v1
Compressor summary: The paper introduces three new metrics for evaluating handwriting generation models that consider style, content, and diversity, and shows they are better than existing metrics like FID.
http://arxiv.org/abs/2409.02681v1
Compressor summary: The study uses a mixed RNN model with LSTM and GRU architectures to predict monthly fire spot counts in the Amazon using satellite data, showing improved accuracy and capturing seasonal patterns.
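A hedged Keras sketch of a mixed recurrent model combining LSTM and GRU branches for monthly count forecasting; the window length, layer sizes, and merge scheme are my assumptions.
```python
import tensorflow as tf
from tensorflow.keras import layers

WINDOW = 12  # predict next month's fire spots from the past 12 months

inputs = layers.Input(shape=(WINDOW, 1))
lstm_branch = layers.LSTM(32)(inputs)
gru_branch = layers.GRU(32)(inputs)
merged = layers.Concatenate()([lstm_branch, gru_branch])
outputs = layers.Dense(1, activation="relu")(merged)  # counts are non-negative

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```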
http://arxiv.org/abs/2409.02672v1
Compressor summary: The paper proposes a novel method for disentangled representation learning that combines mutual information and independence constraints within a GAN framework, improving the quality of controllable generation and explainability.
http://arxiv.org/abs/2409.02667v1
Compressor summary: The article presents a semi-automatic method to create domain-specific translation memories from Turkish cardiology journals for various language applications.
http://arxiv.org/abs/2409.02664v1
Compressor summary: The paper proposes a novel method to improve deepfake detection by repurposing vision-language models and manipulating their input without tuning the model parameters.
http://arxiv.org/abs/2409.02657v1
Compressor summary: PoseTalk is a system that generates lip-synchronized talking head videos with free head poses using both audio and text inputs, addressing loss-imbalance issues and achieving better pose diversity and realness.
http://arxiv.org/abs/2409.02653v1
Compressor summary: The paper proposes Skip-and-Play, a depth-based pose control method for text-to-image generation that reduces shape dependency on depth maps while preserving the pose.
http://arxiv.org/abs/2409.02647v1
Compressor summary: The text proposes a learning-based monitoring approach that detects and counters rendering errors in automotive digital displays by checking telltales, separating correctly rendered from corrupted content.
http://arxiv.org/abs/2409.02638v1
Compressor summary: MADiff is a method that predicts future hand waypoints in egocentric videos using diffusion models with motion-aware denoising and scene understanding, achieving real-time performance and reasonable results.
http://arxiv.org/abs/2409.02634v1
Compressor summary: The paper introduces Loopy, an end-to-end audio-only conditioned video diffusion model that generates natural and high-quality human video without spatial motion templates.
http://arxiv.org/abs/2409.02632v1
Compressor summary: The paper presents an exploratory agent that evaluates and optimizes procedurally generated game levels for player exploration based on motivations and a fitness function.
http://arxiv.org/abs/2409.02628v1
Compressor summary: Epistemic uncertainty collapse occurs in deep learning models as complexity increases due to implicit ensembling, challenging the assumption of better uncertainty quantification with larger models.
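A toy numpy illustration of the averaging mechanism, mine rather than the paper's experiment: if each forward pass internally averages m components, the variance across stochastic forward passes (a common epistemic-uncertainty proxy) scales like 1/m and collapses as the implicit ensemble grows.
```python
import numpy as np

rng = np.random.default_rng(0)

def estimated_epistemic_uncertainty(m, mc_samples=10_000):
    # One scalar prediction per stochastic pass: the mean of m noisy
    # internal components that the model implicitly ensembles.
    passes = rng.normal(size=(mc_samples, m)).mean(axis=1)
    return passes.var()

for m in (1, 4, 16, 64, 256):
    print(f"m={m:4d}  proxy uncertainty={estimated_epistemic_uncertainty(m):.4f}")
```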
http://arxiv.org/abs/2409.02617v1
Compressor summary: The paper introduces a new dataset to evaluate how well large language models understand various types of visual data, using text prompts and questions related to images.
http://arxiv.org/abs/2409.02611v1
Compressor summary: The paper proposes a novel Graph-of-Thought guided compositional reasoning model called GoT-CQA for answering questions based on chart content, which performs well in complex reasoning tasks.
http://arxiv.org/abs/2409.02608v1
Compressor summary:
http://arxiv.org/abs/2409.02604v1
Compressor summary:
http://arxiv.org/abs/2409.02598v1
Compressor summary:
http://arxiv.org/abs/2409.02596v1
Compressor summary:
http://arxiv.org/abs/2409.02588v1
Compressor summary:
http://arxiv.org/abs/2409.02584v1
Compressor summary:
http://arxiv.org/abs/2409.02574v1
Compressor summary:
http://arxiv.org/abs/2409.02569v1
Compressor summary:
http://arxiv.org/abs/2409.02566v1
Compressor summary:
http://arxiv.org/abs/2409.02562v1
Compressor summary:
http://arxiv.org/abs/2409.02561v1
Compressor summary:
http://arxiv.org/abs/2409.02555v1
Compressor summary:
http://arxiv.org/abs/2409.02549v1
Compressor summary:
http://arxiv.org/abs/2409.02546v1
Compressor summary:
http://arxiv.org/abs/2409.02545v1
Compressor summary:
http://arxiv.org/abs/2409.02543v1
Compressor summary:
http://arxiv.org/abs/2409.02530v1
Compressor summary:
http://arxiv.org/abs/2409.02529v1
Compressor summary:
http://arxiv.org/abs/2409.02522v1
Compressor summary:
http://arxiv.org/abs/2409.02519v1
Compressor summary:
http://arxiv.org/abs/2409.02512v1
Compressor summary:
http://arxiv.org/abs/2409.02494v1
Compressor summary:
http://arxiv.org/abs/2409.02492v1
Compressor summary:
http://arxiv.org/abs/2409.02490v1
Compressor summary:
http://arxiv.org/abs/2409.02486v1
Compressor summary:
http://arxiv.org/abs/2409.02482v1
Compressor summary:
http://arxiv.org/abs/2409.02481v1
Compressor summary:
http://arxiv.org/abs/2409.02465v1
Compressor summary:
http://arxiv.org/abs/2409.02449v1
Compressor summary:
http://arxiv.org/abs/2409.02448v1
Compressor summary:
http://arxiv.org/abs/2409.02446v1
Compressor summary:
http://arxiv.org/abs/2409.02438v1
Compressor summary:
http://arxiv.org/abs/2409.02431v1
Compressor summary:
http://arxiv.org/abs/2409.02429v1
Compressor summary:
http://arxiv.org/abs/2409.02428v1
Compressor summary:
http://arxiv.org/abs/2409.02426v1
Compressor summary:
http://arxiv.org/abs/2409.02416v1
Compressor summary:
http://arxiv.org/abs/2409.02413v1
Compressor summary:
http://arxiv.org/abs/2409.02410v1
Compressor summary:
http://arxiv.org/abs/2409.02404v1
Compressor summary:
http://arxiv.org/abs/2409.02393v1
Compressor summary:
http://arxiv.org/abs/2409.02392v1
Compressor summary:
http://arxiv.org/abs/2409.02389v1
Compressor summary:
http://arxiv.org/abs/2409.02387v1
Compressor summary:
http://arxiv.org/abs/2409.02385v1
Compressor summary:
http://arxiv.org/abs/2409.02384v1
Compressor summary:
http://arxiv.org/abs/2409.02376v1
Compressor summary:
http://arxiv.org/abs/2409.02375v1
Compressor summary:
http://arxiv.org/abs/2409.02374v1
Compressor summary:
http://arxiv.org/abs/2409.02370v1
Compressor summary:
http://arxiv.org/abs/2409.02363v1
Compressor summary:
http://arxiv.org/abs/2409.02361v1
Compressor summary:
http://arxiv.org/abs/2409.02347v1
Compressor summary:
http://arxiv.org/abs/2409.02343v1
Compressor summary: