This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-07-19, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2407.13771v1
Compressor summary: The paper proposes a method to merge scene understanding models trained on different domains without accessing their training data, using linear model parameter merging and Gaussian prior-based buffer merging.
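The linear half of that recipe is simple enough to sketch. Below is a minimal example of linear parameter merging in PyTorch, assuming the two models share an architecture; `merge_state_dicts` and `alpha` are illustrative names, not the paper's API, and the Gaussian prior-based buffer merging is omitted.

```python
import torch

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Convex combination of two architecture-compatible state dicts:
    alpha * A + (1 - alpha) * B, applied parameter by parameter."""
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}

# usage: model.load_state_dict(merge_state_dicts(model_a.state_dict(),
#                                                model_b.state_dict(), alpha=0.5))
```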
http://arxiv.org/abs/2407.13768v1
Compressor summary: The paper presents two simple plug-in methods that mitigate the imbalance caused by class overlap and distribution margin in medical class incremental learning (CIL), which suffers from catastrophic forgetting when training on new classes, and reports better results than state-of-the-art methods on three benchmark datasets.
http://arxiv.org/abs/2407.13766v1
Compressor summary: The paper introduces Multi-Image Visual Question Answering (MIQA) as a new task, proposes a public benchmark called "Visual Haystacks," and presents MIRAGE, a novel framework that improves efficiency and accuracy for large multimodal models in this task.
http://arxiv.org/abs/2407.13765v1
Compressor summary: The authors propose a formal approach using structural causal models (SCM) to analyze and design probing experiments for language models, showing how it helps understand their unsupervised learning of latent causal concepts from text.
http://arxiv.org/abs/2407.13764v1
Compressor summary: Our method reconstructs dynamic scenes from monocular videos by exploiting low-dimensional structure of 3D motion and using data-driven priors to consolidate supervisory signals.
http://arxiv.org/abs/2407.13759v1
Compressor summary: The text describes a method for generating realistic long sequences of city views based on language input and map data, using video diffusion and an autoregressive framework with temporal imputation to maintain quality and consistency.
http://arxiv.org/abs/2407.13757v1
Compressor summary: The paper explores how retrieval-enhanced generative models are vulnerable to black-box attacks that manipulate ranking results and affect user cognition and decision-making.
http://arxiv.org/abs/2407.13755v1
Compressor summary: Random Latent Exploration (RLE) is a new exploration technique for deep reinforcement learning that combines bonus-based and noise-based strategies, adding structured random rewards to encourage agent exploration in certain states.
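A rough sketch of the core mechanic, purely illustrative and not the paper's exact algorithm: sample a random latent at each episode start and add a structured bonus that rewards states aligned with it (all names and dimensions below are assumptions).

```python
import numpy as np

FEATURE_DIM, LATENT_DIM = 64, 8
rng = np.random.default_rng(0)
W = rng.normal(size=(LATENT_DIM, FEATURE_DIM))  # fixed random projection

def episode_latent():
    """Draw a fresh random latent direction at every episode start."""
    z = rng.normal(size=LATENT_DIM)
    return z / np.linalg.norm(z)

def shaped_reward(env_reward, state_features, z, beta=0.1):
    # structured random bonus: larger in states aligned with this episode's latent
    return env_reward + beta * float(z @ W @ state_features)
```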
http://arxiv.org/abs/2407.13753v1
Compressor summary: The study explores using facial expressions and emotions as objective biomarkers for depression by analyzing video data of people with or without the disorder.
http://arxiv.org/abs/2407.13752v1
Compressor summary: LogoSticker is a two-phase logo-insertion pipeline that pre-trains an actor-critic relation and learns the decoupled identity of logos (which are hard for diffusion models due to their unique patterns and textual elements), enabling accurate and harmonious logo synthesis in varied contexts that surpasses existing customization methods and DALL·E 3.
http://arxiv.org/abs/2407.13748v1
Compressor summary: The paper presents a general approach to weakly supervised 3D object detection that uses RGB images and 2D boxes, combining prior injection, a 2D projection constraint, and a 3D geometry constraint to produce high-quality 3D bounding boxes without 3D annotations on two datasets.
http://arxiv.org/abs/2407.13746v1
Compressor summary: The study proposes a new surrogate loss function for multi-label learning that accounts for label correlations and has optimal consistency guarantees, as well as adapting standard classification losses to multi-label settings.
http://arxiv.org/abs/2407.13745v1
Compressor summary: MaRINeR is a method that uses a nearby mapping image to improve the quality of novel views in Computer Vision and Robotics tasks by matching deep features and transferring details hierarchically.
http://arxiv.org/abs/2407.13744v1
Compressor summary: The paper discusses how natural language processing models are becoming more generalist and suggests evaluating their strengths and weaknesses based on their ability to approximate specialist functions from natural language specifications.
http://arxiv.org/abs/2407.13743v1
Compressor summary: The paper introduces an optimistic Q-learning algorithm for average-reward reinforcement learning under a relaxed assumption on the time to visit frequent states, and proves a regret bound of $O(H^5 S\sqrt{AT})$.
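The paper's algorithm targets the average-reward setting; the toy sketch below instead shows the familiar discounted, tabular form of optimism (optimistic initialization plus a count-based bonus), just to make the mechanism concrete. Every name and constant is illustrative.

```python
import numpy as np
from collections import defaultdict

class OptimisticQ:
    """Tabular Q-learning with optimistic initialization and a count-based bonus."""
    def __init__(self, n_actions, q0=1.0, c=1.0, lr=0.1, gamma=0.99):
        self.q = defaultdict(lambda: np.full(n_actions, q0))  # optimistic init
        self.n = defaultdict(lambda: np.zeros(n_actions))     # visit counts
        self.c, self.lr, self.gamma = c, lr, gamma

    def act(self, s):
        bonus = self.c / np.sqrt(self.n[s] + 1.0)  # shrinks as (s, a) is visited
        return int(np.argmax(self.q[s] + bonus))

    def update(self, s, a, r, s_next):
        self.n[s][a] += 1
        target = r + self.gamma * self.q[s_next].max()
        self.q[s][a] += self.lr * (target - self.q[s][a])
```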
http://arxiv.org/abs/2407.13739v1
Compressor summary: The paper presents long-context Granite code models that improve context handling up to 128K tokens using continual pretraining and instruction-tuned fine-tuning, with no performance drop on regular tasks.
http://arxiv.org/abs/2407.13734v1
Compressor summary: The tutorial covers RL-based methods for optimizing diffusion models to generate samples that maximize desired metrics in biology applications.
http://arxiv.org/abs/2407.13732v1
Compressor summary: The authors study different surrogate loss functions for learning to defer and show their consistency properties under various conditions and hypothesis sets.
http://arxiv.org/abs/2407.13729v1
Compressor summary: The text describes a new benchmark for testing multi-modal language models based on the game Baba Is You, where agents have to manipulate both objects and rules to win.
http://arxiv.org/abs/2407.13722v1
Compressor summary: This paper proposes a general framework to derive better $H$-consistency bounds for surrogate losses by relaxing previous assumptions on the relationship between zero-one estimation error and surrogate loss estimation error.
http://arxiv.org/abs/2407.13719v1
Compressor summary: HazeCLIP improves real-world dehazing performance using a language-guided adaptation framework based on CLIP model's ability to distinguish between hazy and clean images.
http://arxiv.org/abs/2407.13715v1
Compressor summary: The study proposes an Open World Compositional Zero-Shot Learning model with self-attention and external knowledge to predict realistic compositions of attributes and objects.
http://arxiv.org/abs/2407.13711v1
Compressor summary: The paper proposes a method to improve uncertainty estimates in deep neural networks by placing a prior on function space instead of weight space, using structured and interpretable inductive biases.
http://arxiv.org/abs/2407.13709v1
Compressor summary: This paper explores how the choice of reference policy affects Direct Preference Optimization (DPO) for instruction fine-tuning of large language models, and provides guidance on optimal settings and similarities between reference policies and target models.
http://arxiv.org/abs/2407.13708v1
Compressor summary: The paper evaluates OOD detection methods in digital pathology using proper protocols and exploring advanced ML settings.
http://arxiv.org/abs/2407.13702v1
Compressor summary: ANHALTEN is a new German dataset for cross-lingual transfer in reference-free hallucination detection that shows the benefits of few-shot learning with minimal annotations.
http://arxiv.org/abs/2407.13696v1
Compressor summary: The text discusses the importance of standardized benchmark agreement testing (BAT) for evaluating language models and introduces BenchBench, a python package to facilitate BAT and improve its robustness and validity.
http://arxiv.org/abs/2407.13680v1
Compressor summary: HPix is a new method that uses GANs to create detailed vector maps from satellite images, overcoming limitations of existing techniques.
http://arxiv.org/abs/2407.13677v1
Compressor summary: PASTA is a transformer-based model that can generate realistic and diverse 3D objects by composing cuboidal primitives and synthesizing high quality meshes, using various inputs and manipulating object parts.
http://arxiv.org/abs/2407.13666v1
Compressor summary: The paper proposes a new data-driven approach for uncertainty quantification in regression that improves confidence intervals for both LASSO and neural network predictors by estimating bias terms from training data.
http://arxiv.org/abs/2407.13664v1
Compressor summary: DFCL is a framework that integrates machine learning and operation research for optimal budget allocation in marketing, addressing technical challenges like uncertainty, counterfactual computation, and computational cost.
http://arxiv.org/abs/2407.13660v1
Compressor summary: CogniVoice is a new framework that uses speech data and its textual transcriptions to detect mild cognitive impairment (MCI) and estimate mental state scores in multiple languages.
http://arxiv.org/abs/2407.13657v1
Compressor summary: FuLG is a large Romanian corpus extracted from CommonCrawl using a new filtering methodology, and its quality is compared against other Romanian corpora.
http://arxiv.org/abs/2407.13647v1
Compressor summary: The paper proposes a progressive learning framework that improves the reasoning capabilities of large language models using weaker models without needing external supervision or human-annotated data.
http://arxiv.org/abs/2407.13640v1
Compressor summary: The paper proposes a multi-mode synchronization learning (MMSL) strategy to improve person re-identification accuracy under extreme capture conditions by applying diverse data augmentation techniques without altering the original image structure.
http://arxiv.org/abs/2407.13638v1
Compressor summary: The study applies NLP and ML to automate medical coding with explainable, lightweight network models trained on a public database, achieving high accuracy in code prediction.
http://arxiv.org/abs/2407.13632v1
Compressor summary: Data Alchemy is a method to improve cross-site analysis and tumor classification in histopathology images using stain normalization and template learning, without changing network weights or requiring site-specific fine-tuning.
http://arxiv.org/abs/2407.13623v1
Compressor summary: The optimal vocabulary size depends on the compute budget, a factor often overlooked in language model scaling research; neglecting it leads to under-fitting and suboptimal performance.
http://arxiv.org/abs/2407.13622v1
Compressor summary: The paper by Dong & Yang (2023) explores the sample complexity of obtaining an optimal policy for misspecified sparse linear bandits in reinforcement learning, showing that a novel elimination-based algorithm can achieve suboptimal guarantees with a polynomial number of samples.
http://arxiv.org/abs/2407.13621v1
Compressor summary: The paper studies how differential privacy works in neural network learning and shows that it can protect user data while maintaining accurate predictions.
http://arxiv.org/abs/2407.13609v1
Compressor summary: This paper presents a training-free method to generate high-quality images from textual descriptions by resolving token conflicts and improving pixel relationships using attention redistribution.
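One plausible form of such redistribution, sketched under the assumption that the denoiser's per-token cross-attention maps are accessible; the floor-and-renormalize rule and all names here are illustrative, not the paper's exact procedure.

```python
import numpy as np

def redistribute_attention(attn, token_idx, floor=0.2):
    """attn: (pixels, tokens) cross-attention map. Lift an under-attended token
    to at least `floor` of each pixel's attention mass, then renormalize rows."""
    attn = attn.copy()
    attn[:, token_idx] = np.maximum(attn[:, token_idx], floor * attn.sum(axis=1))
    return attn / attn.sum(axis=1, keepdims=True)
```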
http://arxiv.org/abs/2407.13608v1
Compressor summary: The paper describes the dzNLP team's approach to Multi-label Country-level Dialect Identification using various machine learning techniques and achieving high precision but low recall.
http://arxiv.org/abs/2407.13605v1
Compressor summary: The paper proposes a physics-guided neural network and a data-aware framework to improve urban flow prediction by addressing the limitations of existing physics-guided machine learning methods.
http://arxiv.org/abs/2407.13603v1
Compressor summary: Sentence Transformers perform better than TF-IDF features in detecting writers' stances on COVID-19 vaccine, digital transformation, and women empowerment, as shown by the team dzStance's results in a stance detection competition.
http://arxiv.org/abs/2407.13597v1
Compressor summary: The text introduces a new challenge in summarization called planning-like tasks that involve generating actions to achieve specific goals and proposes a dataset and a baseline method for this problem.
http://arxiv.org/abs/2407.13596v1
Compressor summary: EarthMarker is a novel visual prompting model that improves multi-granularity remote sensing imagery interpretation by leveraging natural and RS domain-specific knowledge, cross-domain phased learning, and a new dataset called RSVP.
http://arxiv.org/abs/2407.13594v1
Compressor summary: The paper proposes a formal framework for mechanistic interpretability of neural networks using abstract interpretation and demonstrates it on a Transformer-based model solving 2-SAT problems.
http://arxiv.org/abs/2407.13592v1
Compressor summary: MeshFeat is a new encoding technique for neural fields on meshes that uses multi-resolution feature grids and simplifies the mesh structure to speed up inference and maintain reconstruction quality for texture and BRDF representation.
http://arxiv.org/abs/2407.13588v1
Compressor summary: The paper tackles miscalibration in CLIP-based model adaptation for out-of-distribution samples and proposes a simple, model-agnostic solution to scale logit ranges.
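Because the fix is model-agnostic and post-hoc, it is easy to picture; here is a minimal sketch of logit-range scaling, where `target_range` (e.g., the range observed on in-distribution data) and the exact rescaling rule are assumptions rather than the paper's formula.

```python
import torch

def rescale_logit_range(logits, target_range):
    """Rescale each sample's logit span to a target range; a positive
    multiplicative factor leaves the predicted class unchanged."""
    span = (logits.max(dim=-1, keepdim=True).values
            - logits.min(dim=-1, keepdim=True).values)
    return logits * (target_range / span.clamp_min(1e-8))
```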
http://arxiv.org/abs/2407.13584v1
Compressor summary: The paper proposes a framework to improve text-to-3D generation quality by analyzing and addressing issues like limited detail, low fidelity, and oversaturation in the generated 3D assets.
http://arxiv.org/abs/2407.13579v1
Compressor summary: ZeroMMT trains multimodal machine translation systems from multimodal English data alone by adapting a text-only MT model with visually conditioned masked language modelling and a KL-divergence objective on the MMT outputs; it matches state-of-the-art disambiguation performance on the CoMMuTE benchmark, extends to other languages, and allows controlling the trade-off between disambiguation and translation fidelity without extra data.
http://arxiv.org/abs/2407.13578v1
Compressor summary: This study evaluates the reliability and effectiveness of large language models as knowledge bases using new metrics and finds that current models have significant limitations in factuality and consistency, regardless of model size or fine-tuning methods.
http://arxiv.org/abs/2407.13571v1
Compressor summary: The system allows users to search for unknown ASL signs by submitting a video and receiving the most likely matches to improve ASL dictionary lookup and annotation efficiency.
http://arxiv.org/abs/2407.13565v1
Compressor summary: The paper introduces dzFinNlp's intent detection system for financial chatbots, using various machine learning and deep learning models, achieving high scores on a benchmark dataset.
http://arxiv.org/abs/2407.13561v1
Compressor summary: The text discusses a study on using an AI system to improve tourism services in Tibet by generating more accurate information about the region's complex topography and historical sites.
http://arxiv.org/abs/2407.13559v1
Compressor summary: Qalam is a novel foundation model for Arabic OCR and HWR that achieves high accuracy and handles diacritics and high-resolution inputs well.
http://arxiv.org/abs/2407.13555v1
Compressor summary: PetFace is a large dataset for animal face identification with detailed annotations and benchmarks, helping to advance automated animal recognition methods.
http://arxiv.org/abs/2407.13541v1
Compressor summary: The paper analyzes the crowding problem in self-supervised learning (SSL) features and proposes a learnable regulator called Dynamic Semantic Adjuster (DSA) to improve feature separation and aggregation for complex downstream tasks.
http://arxiv.org/abs/2407.13538v1
Compressor summary: EnergyDiff is a generative AI framework for creating realistic time series data for energy systems, improving on temporal dependencies and marginal distributions.
http://arxiv.org/abs/2407.13531v1
Compressor summary: ItemKNN implementations in the RecBole and LensKit libraries were compared on four datasets; RecBole performed better on most metrics until LensKit's similarity-matrix calculation was modified, after which the two performed near-identically.
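Both libraries implement the same textbook recipe, which makes the comparison meaningful; a compact numpy sketch of ItemKNN scoring (cosine item-item similarity with top-k neighbour truncation), illustrative rather than either library's actual code:

```python
import numpy as np

def item_knn_scores(R, k=20):
    """R: (users, items) interaction matrix. Returns user-item scores."""
    norms = np.linalg.norm(R, axis=0, keepdims=True) + 1e-8
    S = (R / norms).T @ (R / norms)   # item-item cosine similarity
    np.fill_diagonal(S, 0.0)
    # keep only each item's k most similar neighbours
    drop = np.argsort(S, axis=1)[:, :-k]
    np.put_along_axis(S, drop, 0.0, axis=1)
    return R @ S                      # weighted sum over each user's history
```

The finding that changing the similarity-matrix calculation closed the gap suggests the divergence lived in exactly this step.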
http://arxiv.org/abs/2407.13526v1
Compressor summary: The authors propose a sparse Mixture-of-Experts model with Logistic Regressors for interpretable outcome prediction from partial process traces, selecting input features automatically during training.
http://arxiv.org/abs/2407.13524v1
Compressor summary: The text introduces a new approach called Low-confidence Pseudo Label Distillation (LPLD) to improve source-free domain adaptive object detection by better utilizing low-confidence pseudo labels from Region Proposal Network (RPN).
http://arxiv.org/abs/2407.13522v1
Compressor summary: The paper introduces Indic-QA, a large context-grounded question-answering dataset for 11 Indian languages, to evaluate multilingual LLMs' performance in non-English QA tasks.
http://arxiv.org/abs/2407.13520v1
Compressor summary: EaDeblur-GS is a method that uses event cameras and Gaussian splatting to improve 3D reconstruction from blurry images with complex motion.
http://arxiv.org/abs/2407.13518v1
Compressor summary: The authors propose using symbolic regression to generate transition dynamics models for robotics, which improves sample efficiency and extrapolation quality in model-based reinforcement learning.
http://arxiv.org/abs/2407.13513v1
Compressor summary: Deep RL agents for dynamic algorithm configuration (DAC), the challenge of setting hyperparameters per instance, generalize poorly when training instances are biased; the paper improves generalization by selecting representative training instances based on time series features, with empirical gains on DAC benchmarks.
http://arxiv.org/abs/2407.13511v1
Compressor summary: The paper compares the performance of commercial and open-source large language models in a natural language processing challenge, finding that open-source models are competitive in some settings but need more data for better results.
http://arxiv.org/abs/2407.13492v1
Compressor summary: The framework extracts disease-related knowledge from text, creating annotated datasets for Rett syndrome and Alzheimer's disease, while benchmarking and probing language models' semantic relation detection capabilities.
http://arxiv.org/abs/2407.13490v1
Compressor summary: The paper tackles constrained text generation by combining Constraint Programming (CP) and Machine Learning (ML): a Large Language Model (LLM) supplies meaningful words while CP enforces structural constraints, yielding faster and better results than standard NLP methods.
http://arxiv.org/abs/2407.13488v1
Compressor summary: The study introduces MUSE, a simple but robust baseline for detecting out-of-context misinformation by comparing image-text pairs with external evidence, and shows its effectiveness on two datasets.
http://arxiv.org/abs/2407.13481v1
Compressor summary: Large language models struggle to suggest missing elements in long lists due to attention overflow, which can be mitigated by iterative loops but at a cost of novelty loss.
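The iterative-loop mitigation can be sketched as windowed querying with vote aggregation; `llm_suggest` below is a hypothetical callable standing in for the actual model call, and the window and pass counts are arbitrary.

```python
from collections import Counter

def suggest_missing(items, llm_suggest, window=50, passes=3):
    """Query the model over sliding windows so no single prompt overflows
    attention, then aggregate candidate missing elements by vote."""
    votes = Counter()
    for _ in range(passes):
        for i in range(0, len(items), window):
            for cand in llm_suggest(items[i:i + window]):  # hypothetical LLM call
                if cand not in items:
                    votes[cand] += 1
    return [cand for cand, _ in votes.most_common()]
```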
http://arxiv.org/abs/2407.13469v1
Compressor summary: The paper proposes using lightweight adapter modules in machine translation models to achieve multiple latency levels without training separate models, and demonstrates improved performance over existing methods.
http://arxiv.org/abs/2407.13463v1
Compressor summary: The text describes an automated system using large language models that can match cancer patients to clinical trials more accurately and efficiently than human experts.
http://arxiv.org/abs/2407.13449v1
Compressor summary: The study measures how similar different image generation models are by creating linear maps between their latent spaces and finds that they learn similar representations, especially for gender in CelebA models.
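The measurement itself is a small amount of linear algebra; a minimal sketch, assuming paired latents for the same inputs are available (the function name and the R² readout are illustrative):

```python
import numpy as np

def fit_latent_map(Z_src, Z_tgt):
    """Fit a least-squares linear map M with Z_src @ M ~ Z_tgt and report
    the fit quality (R^2) as a proxy for representational similarity."""
    M, *_ = np.linalg.lstsq(Z_src, Z_tgt, rcond=None)
    resid = Z_tgt - Z_src @ M
    r2 = 1.0 - (resid ** 2).sum() / ((Z_tgt - Z_tgt.mean(0)) ** 2).sum()
    return M, r2
```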
http://arxiv.org/abs/2407.13442v1
Compressor summary: Vision-language models (VLMs), which pair a visual encoder with an LLM, are prone to hallucination that reduces their trustworthiness; the paper introduces the BEAF benchmark, which manipulates scene information through image editing, and new metrics (TU, IG, SB, ID) that reveal previously unreported aspects of VLM hallucination.
http://arxiv.org/abs/2407.13435v1
Compressor summary: The paper proposes a cheap way to collect more TTS training data for low-resource languages like Hindi and Tamil by using volunteers instead of professional voice artists, which improves out-of-vocabulary word pronunciation without affecting voice quality or in-domain performance.
http://arxiv.org/abs/2407.13431v1
Compressor summary: The paper proposes a new OoD testing protocol for trajectory prediction models that homogenizes datasets and tasks, introduces a polynomial-based algorithm yielding smaller and faster models with near state-of-the-art in-distribution performance and improved OoD robustness, and studies how two augmentation strategies affect model generalization.
http://arxiv.org/abs/2407.13429v1
Compressor summary: The paper proposes a method to train acquirers for multivariate time series using conditional mutual information to improve performance and reduce costs.
http://arxiv.org/abs/2407.13426v1
Compressor summary: WiNet estimates scale-wise wavelet coefficients for displacement/velocity fields using the wavelet transform, enabling fast and explainable image registration with low memory usage.
http://arxiv.org/abs/2407.13421v1
Compressor summary: The paper proposes a method to improve deep learning-based image classification by using CycleGAN to learn and disregard image styles, enhancing generalization ability.
http://arxiv.org/abs/2407.13419v1
Compressor summary: The study examines how different factors affect the compositionality of large language models and identifies challenges for improving their abilities to learn compositional strategies.
http://arxiv.org/abs/2407.13417v1
Compressor summary: The GDDS method detects surface defects on photovoltaic modules using a single-domain generalized approach that addresses distribution shift and uses the normalized Wasserstein distance for similarity measurement, improving accuracy and speed.
http://arxiv.org/abs/2407.13399v1
Compressor summary: The text introduces a new algorithm, $\chi^2$-Preference Optimization ($\chi$PO), which improves sample-efficiency in offline language model alignment by mitigating overoptimization using the $\chi^2$-divergence.
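For reference, the divergence doing the work, in standard notation (not the paper's exact statement): the $\chi^2$-divergence upper-bounds KL, so regularizing with it penalizes responses far from the reference distribution more aggressively.

```latex
\chi^2(P \,\|\, Q) \;=\; \mathbb{E}_{Q}\!\left[\left(\frac{dP}{dQ} - 1\right)^{2}\right] \;\ge\; \mathrm{KL}(P \,\|\, Q)
```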
http://arxiv.org/abs/2407.13394v1
Compressor summary: PICASSO is a novel framework that can learn parametric CAD sketches from precise or hand-drawn images by using self-supervised rendering techniques and geometric cues.
http://arxiv.org/abs/2407.13390v1
Compressor summary: GeometrySticker is a method for embedding binary messages into the geometry components of NeRF models to protect copyright, while preserving effectiveness against recolorization.
http://arxiv.org/abs/2407.13382v1
Compressor summary: The paper proposes a neuro-symbolic approach to find object configurations in images, using first-order logic and language-vision models, and shows its applicability to real-world scenarios.
http://arxiv.org/abs/2407.13379v1
Compressor summary: The study introduces a new method using U-Net architecture and conditional GAN to remove cloud shadows from solar images for better space weather prediction.
http://arxiv.org/abs/2407.13377v1
Compressor summary: This paper introduces SummaryMixing, a linear-complexity context encoder for self-supervised learning that reduces pre-training time and resources while maintaining or improving performance on downstream tasks.
http://arxiv.org/abs/2407.13372v1
Compressor summary: The paper proposes a unified model that efficiently restores degraded images using joint embedding, gated reweighting, and contextualized attention.
http://arxiv.org/abs/2407.13368v1
Compressor summary: The paper proposes an improved method for robots to understand and interact with objects in open world settings by combining affordance representation, vision-language models, and human feedback.
http://arxiv.org/abs/2407.13364v1
Compressor summary: The GAE algorithm combines AE and MDP homomorphisms to explore dynamical systems' state spaces more efficiently for scientific discovery using geometric structures.
http://arxiv.org/abs/2407.13362v1
Compressor summary: The paper introduces Geometry Guided Self-Distillation (GGSD), which improves open vocabulary 3D scene understanding by leveraging geometric priors from 2D data and enhancing representation learning with self-distillation.
http://arxiv.org/abs/2407.13358v1
Compressor summary: The paper introduces a new NLP model that learns embeddings for authors and documents with a focus on capturing their writing style, and shows its effectiveness on three datasets.
http://arxiv.org/abs/2407.13343v1
Compressor summary: The paper proposes techniques to improve translation for indigenous languages with little data using large language models and specific prompting methods.
http://arxiv.org/abs/2407.13342v1
Compressor summary: The paper proposes a novel method to reconstruct surfaces with fine-grained details using neural signed distance functions and a non-linear implicit filter.
http://arxiv.org/abs/2407.13338v1
Compressor summary: The paper proposes a novel SLAM framework for dynamic environments that leverages continual learning, forgetting, and object identification to improve robustness.
http://arxiv.org/abs/2407.13337v1
Compressor summary: The paper presents a new deep learning framework for long-term 3D point tracking that generalizes well and outperforms prior methods without test-time fine-tuning.
http://arxiv.org/abs/2407.13335v1
Compressor summary: The Object-level Attention Transformer (OAT) is a model that predicts human visual search behavior by using object features and a new positional encoding to generate gaze scanpaths in cluttered scenes.
http://arxiv.org/abs/2407.13331v1
Compressor summary: LIAR is an efficient and effective compression technique for large language models that preserves high accuracy by using linear interpolation to reconstruct pruned weights.
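A minimal sketch of the reconstruction idea for a single weight row, under the assumption that pruned positions are rebuilt by interpolating between the nearest retained weights; this illustrates linear-interpolation recovery, not necessarily LIAR's exact scheme.

```python
import numpy as np

def interpolate_pruned_row(w, mask):
    """w: 1-D weight row; mask: True where the weight was retained.
    Pruned entries are filled by linear interpolation between retained ones
    (edge positions clamp to the nearest retained value)."""
    idx = np.arange(len(w))
    return np.where(mask, w, np.interp(idx, idx[mask], w[mask]))
```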
http://arxiv.org/abs/2407.13329v1
Compressor summary: The authors classify citation intents using ensemble strategies over language models together with explainable-AI techniques, show that section titles improve performance, and provide a web application for the task.
http://arxiv.org/abs/2407.13328v1
Compressor summary: DACCA is a novel method for domain-adaptive lane detection that uses cross-domain contrastive loss and feature aggregation to improve feature learning and knowledge transfer across domains, achieving superior performance on six datasets.
http://arxiv.org/abs/2407.13326v1
Compressor summary: This study applies RISC-V's vector extension RVV to optimize common ANN algorithms for high-performance computing using a parameterized vector block model.
http://arxiv.org/abs/2407.13322v1
Compressor summary: The paper proposes a novel Test-Time Adaptation framework for remote photoplethysmography estimation, which adapts to various domain information and heart rate distributions using synthetic signals and spectral-based entropy minimization.
http://arxiv.org/abs/2407.13313v1
Compressor summary: The paper studies how dataset characteristics, such as varsortability and R2-sortability, affect the performance of causal discovery algorithms for time-dependent processes using various types of real and simulated data.
http://arxiv.org/abs/2407.13309v1
Compressor summary: The paper proposes a new method to render high dynamic range videos from low dynamic range videos by completing missing exposure information and improving temporal consistency, resulting in state-of-the-art performance.
http://arxiv.org/abs/2407.13304v1
Compressor summary: The paper introduces a 3D shape completion dataset for agricultural vision systems, pairing RGB-D frames (with camera intrinsics and segmentation) with high-precision point clouds of sweet peppers captured in lab and greenhouse conditions, plus a public benchmark challenge, so that autonomous robots can estimate complete fruit shapes despite the occlusions common in cluttered agricultural environments.
http://arxiv.org/abs/2407.13303v1
Compressor summary: The paper proposes a semi-supervised learning framework for neural networks that uses unlabeled Wi-Fi fingerprints to improve indoor localization performance and can handle hybrid databases and continual expansion.
http://arxiv.org/abs/2407.13301v1
Compressor summary: The study proposes Chain-of-Diagnosis (CoD), a method to improve the interpretability and controllability of large language models for medical diagnosis by creating a diagnostic chain that resembles a physician's thought process.
http://arxiv.org/abs/2407.13300v1
Compressor summary: Our method filters low-quality error correction data to prevent overcorrection and improve automatic speech recognition performance in out-of-domain settings using Japanese language models.
http://arxiv.org/abs/2407.13297v1
Compressor summary: SpeciaLex is a benchmark to assess language models' ability to follow specialized lexicon constraints for various tasks and audiences.
http://arxiv.org/abs/2407.13288v1
Compressor summary: The paper proposes a new indoor localization method using linked neural networks trained in a hierarchical stage-wise way, achieving the most accurate results with the UJIIndoorLoc database.
http://arxiv.org/abs/2407.13285v1
Compressor summary: The paper presents a computer-vision system that detects and warns about foreign objects in an olive grinder to prevent quality issues and machinery damage.
http://arxiv.org/abs/2407.13281v1
Compressor summary: The paper proposes an auditing framework to check the consistency of machine learning algorithms' explanations, but shows that it requires a large number of queries and highlights the importance of locality in explainability.
http://arxiv.org/abs/2407.13279v1
Compressor summary: The paper analyzes how using discounted reward as a proxy for total reward affects deep reinforcement learning and proposes conditions to align optimal policies of these two objectives.
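In symbols, the two objectives being aligned, written in standard notation rather than the paper's exact statement: the discounted proxy and the total (undiscounted) return.

```latex
J_\gamma(\pi) \;=\; \mathbb{E}_\pi\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_t\right],
\qquad
J(\pi) \;=\; \mathbb{E}_\pi\!\left[\sum_{t=0}^{T} r_t\right]
```

The question is when a policy maximizing $J_\gamma$ also maximizes $J$, and the paper gives conditions under which the two optima coincide.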
http://arxiv.org/abs/2407.13278v1
Compressor summary: The paper surveys deep learning models for time series analysis, introduces the Time Series Library (TSLib) benchmark with 24 models, 30 datasets, and five analysis tasks, and evaluates 12 representative deep time series models across those tasks.
http://arxiv.org/abs/2407.13268v1
Compressor summary: This paper introduces a new multi-task learning approach for truth inference in crowdsourcing, which improves the accuracy and effectiveness of worker behavior models by focusing on item features rather than hidden ground truth variables.
http://arxiv.org/abs/2407.13252v1
Compressor summary: The text proposes a new method for protecting privacy in text-to-image models by detecting if an image was used to train them based on the preservation of image structures during the diffusion process.
http://arxiv.org/abs/2407.13251v1
Compressor summary: MotifCAR is a novel graph anomaly detection method that uses motifs and GANs to create realistic, valid, proximal, and sparse counterfactual graphs for improved performance.
http://arxiv.org/abs/2407.13248v1
Compressor summary: The paper explores how LLMs struggle with storytelling, especially creating suspense and diversity, and proposes a framework to improve their narrative skills.
http://arxiv.org/abs/2407.13244v1
Compressor summary: The paper introduces PM-LLM-Benchmark, a comprehensive benchmark for evaluating open-source large language models in process mining tasks, and discusses the challenges and limitations of such a benchmark.
http://arxiv.org/abs/2407.13241v1
Compressor summary: The paper introduces NODER, a novel framework that uses neural ODEs to model complex dynamics in medical image sequences and achieves state-of-the-art 3D image regression performance with reduced computational cost and practical applicability.
http://arxiv.org/abs/2407.13238v1
Compressor summary: A new Transformer-based deep learning model for tabular data uses stochastic competition to promote generalization capacity and outperforms gradient boosted decision trees on various datasets.
http://arxiv.org/abs/2407.13237v1
Compressor summary: The paper proposes LESR, a method that uses large language models to generate task-related state representation codes for better reinforcement learning performance.
http://arxiv.org/abs/2407.13228v1
Compressor summary:
http://arxiv.org/abs/2407.13222v1
Compressor summary:
http://arxiv.org/abs/2407.13221v1
Compressor summary:
http://arxiv.org/abs/2407.13219v1
Compressor summary:
http://arxiv.org/abs/2407.13218v1
Compressor summary:
http://arxiv.org/abs/2407.13217v1
Compressor summary:
http://arxiv.org/abs/2407.13214v1
Compressor summary:
http://arxiv.org/abs/2407.13211v1
Compressor summary:
http://arxiv.org/abs/2407.13205v1
Compressor summary:
http://arxiv.org/abs/2407.13200v1
Compressor summary:
http://arxiv.org/abs/2407.13195v1
Compressor summary:
http://arxiv.org/abs/2407.13194v1
Compressor summary:
http://arxiv.org/abs/2407.13193v1
Compressor summary:
http://arxiv.org/abs/2407.13188v1
Compressor summary:
http://arxiv.org/abs/2407.13185v1
Compressor summary:
http://arxiv.org/abs/2407.13184v1
Compressor summary:
http://arxiv.org/abs/2407.13182v1
Compressor summary:
http://arxiv.org/abs/2407.13181v1
Compressor summary:
http://arxiv.org/abs/2407.13178v1
Compressor summary:
http://arxiv.org/abs/2407.13174v1
Compressor summary:
http://arxiv.org/abs/2407.13168v1
Compressor summary:
http://arxiv.org/abs/2407.13164v1
Compressor summary:
http://arxiv.org/abs/2407.13159v1
Compressor summary:
http://arxiv.org/abs/2407.13158v1
Compressor summary:
http://arxiv.org/abs/2407.13153v1
Compressor summary:
http://arxiv.org/abs/2407.13147v1
Compressor summary:
http://arxiv.org/abs/2407.13146v1
Compressor summary:
http://arxiv.org/abs/2407.13143v1
Compressor summary:
http://arxiv.org/abs/2407.13142v1
Compressor summary:
http://arxiv.org/abs/2407.13141v1
Compressor summary:
http://arxiv.org/abs/2407.13139v1
Compressor summary:
http://arxiv.org/abs/2407.13133v1
Compressor summary:
http://arxiv.org/abs/2407.13123v1
Compressor summary:
http://arxiv.org/abs/2407.13122v1
Compressor summary:
http://arxiv.org/abs/2407.13120v1
Compressor summary:
http://arxiv.org/abs/2407.13115v1
Compressor summary:
http://arxiv.org/abs/2407.13113v1
Compressor summary:
http://arxiv.org/abs/2407.13108v1
Compressor summary:
http://arxiv.org/abs/2407.13101v1
Compressor summary:
http://arxiv.org/abs/2407.13097v1
Compressor summary:
http://arxiv.org/abs/2407.13095v1
Compressor summary:
http://arxiv.org/abs/2407.13094v1
Compressor summary:
http://arxiv.org/abs/2407.13091v1
Compressor summary:
http://arxiv.org/abs/2407.13089v1
Compressor summary:
http://arxiv.org/abs/2407.13078v1
Compressor summary:
http://arxiv.org/abs/2407.13069v1
Compressor summary:
http://arxiv.org/abs/2407.13068v1
Compressor summary: