This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-02-13, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2402.07901v1
Compressor summary: The paper proposes a faster, more efficient attention mechanism for transformers using a factorable form of attention that reduces computational and memory complexity while maintaining full representational capacity and all-to-all relationships between tokens.
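The summary does not reveal the paper's specific factorization, but the generic trick behind factorable attention is to replace softmax(QK^T)V with phi(Q)(phi(K)^T V) and evaluate it right-to-left, cutting the cost from quadratic to linear in sequence length. A minimal sketch under that assumption (the feature map `phi` and all names here are illustrative, not the paper's method):

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Factorized attention: softmax(Q K^T) V is approximated by
    phi(Q) @ (phi(K)^T @ V), computed right-to-left, so the cost is
    O(n * d^2) instead of O(n^2 * d)."""
    Qp, Kp = phi(Q), phi(K)           # (n, d) non-negative feature maps
    KV = Kp.T @ V                     # (d, d) summary of all keys/values
    Z = Qp @ Kp.sum(axis=0)           # (n,) normalizer over all tokens
    return (Qp @ KV) / Z[:, None]

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_attention(Q, K, V)       # (1024, 64), linear in n
```

Because the small (d, d) key-value summary is shared by every query, each token still aggregates information from all others, just through a compressed intermediate.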
http://arxiv.org/abs/2402.07900v1
Compressor summary: Adding a random mask to an imaging system reduces optical aberrations and improves image quality by making deconvolution less sensitive to noise.
http://arxiv.org/abs/2402.07899v1
Compressor summary: The study trains six model architectures on five datasets containing subsets of a child's linguistic input to examine their ability to form meaningful syntactic and semantic representations, finding that they consistently match previous results.
http://arxiv.org/abs/2402.07896v1
Compressor summary: The paper introduces Direct Principle Feedback, a simplified form of Constitutional AI suited to controlling language models at inference time, where existing control methods often fall short; on the Pink Elephant Problem, where an LLM must avoid mentioning a forbidden entity and discuss a preferred one instead, the method performs well against comparable models and GPT-4.
http://arxiv.org/abs/2402.07895v1
Compressor summary: The study presents a visual machine learning method for plant disease detection using real-world camera data, showing improved accuracy with a sequential CNN model.
http://arxiv.org/abs/2402.07894v1
Compressor summary: YOLO Phantom is a small, efficient object detection model that works well in low-light and occluded scenarios for IoT applications.
http://arxiv.org/abs/2402.07891v1
Compressor summary: DiffUse is an efficient method for choosing between text generation models by clustering embeddings and reducing the need for preference annotations.
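The summary only states that DiffUse clusters embeddings to cut preference-annotation cost; one plausible reading (the k-means choice, the representative-selection rule, and all names here are assumptions) is to cluster the differences between the two models' output embeddings and send just one example per cluster to annotators:

```python
import numpy as np
from sklearn.cluster import KMeans

def select_for_annotation(emb_a, emb_b, n_annotations=10, seed=0):
    """Cluster the differences between two models' output embeddings and
    return one representative index per cluster, so only n_annotations
    pairs need human preference labels instead of all of them."""
    diffs = emb_a - emb_b                                  # (n_examples, dim)
    km = KMeans(n_clusters=n_annotations, n_init=10, random_state=seed).fit(diffs)
    reps = []
    for c in range(n_annotations):
        idx = np.flatnonzero(km.labels_ == c)
        dist = np.linalg.norm(diffs[idx] - km.cluster_centers_[c], axis=1)
        reps.append(int(idx[dist.argmin()]))               # closest to centroid
    return reps

rng = np.random.default_rng(1)
emb_a = rng.standard_normal((500, 384))                    # model A's output embeddings
emb_b = rng.standard_normal((500, 384))                    # model B's, same prompts
print(select_for_annotation(emb_a, emb_b))                 # 10 indices to annotate
```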
http://arxiv.org/abs/2402.07890v1
Compressor summary: The paper introduces MAIDCRL, a semi-centralized reinforcement learning method for multi-agent control using convolutional layers, which improves performance and speed on both homogeneous and heterogeneous StarCraft scenarios.
http://arxiv.org/abs/2402.07877v1
Compressor summary: WildfireGPT is a prototype LLM agent that uses climate projections and scientific literature to provide detailed, domain-specific insights on wildfire risks for various end users.
http://arxiv.org/abs/2402.07876v1
Compressor summary: Language Feedback Models (LFMs) improve instruction following by identifying desirable actions from large language models and generalize to unseen environments.
http://arxiv.org/abs/2402.07875v1
Compressor summary: The paper explores how the implicit bias of policy gradient in reinforcement learning affects extrapolation to unseen initial states and suggests selecting initial states wisely for better performance.
http://arxiv.org/abs/2402.07871v1
Compressor summary: This paper analyzes how Mixture of Experts models can be optimized by adjusting a new hyperparameter called granularity, leading to more efficient and better performing language models than dense Transformers.
http://arxiv.org/abs/2402.07868v1
Compressor summary: The paper presents a new Bayesian Experimental Design method that optimizes risk-sensitive policies using nested sequential Monte Carlo estimators, outperforming existing methods on dynamical systems.
http://arxiv.org/abs/2402.07865v1
Compressor summary: The authors evaluate, analyze, and improve visually-conditioned language models (VLMs) for visual dialogue and related tasks, providing a unified framework, code, and checkpoints.
http://arxiv.org/abs/2402.07859v1
Compressor summary: The paper presents Lissard, a benchmark to test language models' ability to handle long sequences with repetitive rules, and shows that existing models perform worse on these tasks as the sequence length increases.
http://arxiv.org/abs/2402.07858v1
Compressor summary: The text discusses using multi-spatial-scale neuroimaging features to help identify patients with mood disorders who may not respond to standard treatments and to find better alternatives.
http://arxiv.org/abs/2402.07851v1
Compressor summary: The paper trains neural networks on historical Indian rainfall data from 1901 to 2022 and finds their predictions more accurate than both NWP forecasts and persistence estimates, especially for three-day forecasts, suggesting that NWP forecasts can be improved by using more diverse data and better neural network architectures.
http://arxiv.org/abs/2402.07846v1
Compressor summary: The paper proposes a new generative model for discrete distributions using normalizing flows, which gradually assign categories and avoid discretization issues, and can represent complex dependencies in structured data.
http://arxiv.org/abs/2402.07845v1
Compressor summary: This paper shows that modularity can be used to optimize graph neural networks (GNNs) without ground-truth comparisons and investigates its limitations on synthetic datasets with different information partitioning scenarios.
http://arxiv.org/abs/2402.07841v1
Compressor summary: The paper studies how well membership inference attacks can guess if a text is part of the training data for large language models, finding that these attacks are mostly ineffective due to factors like dataset size and fuzzy boundaries between members and non-members.
http://arxiv.org/abs/2402.07839v1
Compressor summary: Intra-Fusion is a novel neural network pruning method that uses fusion and Optimal Transport to create a more effective sparse model without the need for fine-tuning, and it can also reduce training time.
http://arxiv.org/abs/2402.07834v1
Compressor summary: The study proposes Temporal Koopman Networks (TKNets) for addressing the challenging problem of generalizing predictive models to evolving domains using Koopman theory.
http://arxiv.org/abs/2402.07827v1
Compressor summary: Aya is a multilingual language model that performs well on various tasks across 101 languages and introduces new evaluation methods to assess its performance.
http://arxiv.org/abs/2402.07822v1
Compressor summary: The paper applies Local Optima Network analysis to compare the fitness landscapes of three genetic encodings for robot morpho-evolution and locomotion tasks, providing insights for designing better algorithms.
http://arxiv.org/abs/2402.07821v1
Compressor summary: This work proposes a novel notion of multi-class calibration called projected smooth calibration, which provides strong guarantees for downstream binary classification tasks and can be computed efficiently in polynomial time.
http://arxiv.org/abs/2402.07819v1
Compressor summary: The text introduces a new large-scale 3D grocery dataset (3DGrocery100) for computer vision applications, addressing the lack of fine-grained and real-world data in this domain.
http://arxiv.org/abs/2402.07818v1
Compressor summary: The paper proposes stagewise DP zeroth-order methods for LLM pretraining that balance privacy, utility, and scalability, and reduces trainable parameters using data-free pruning.
http://arxiv.org/abs/2402.07817v1
Compressor summary: Contextual word embeddings are sensitive to context but benefit from additional supervision; the paper injects a lexicon (Wiktionary) as an alternative supervision source and achieves new state-of-the-art results on the Word-In-Context task.
http://arxiv.org/abs/2402.07814v1
Compressor summary: PBADet is a new method for detecting human body parts and their associations with individuals using multi-scale features without anchors, achieving better accuracy and efficiency than existing methods.
http://arxiv.org/abs/2402.07812v1
Compressor summary: RATP is a method that uses Monte-Carlo Tree Search to improve the thought generation of large language models by leveraging external knowledge and addressing privacy, hallucination, and context handling issues.
http://arxiv.org/abs/2402.07808v1
Compressor summary: Estimating a distribution of simulator parameters that yields data-consistent simulations is ill-posed, since many source distributions can match the data; the authors therefore maximize entropy to retain uncertainty, using the Sliced-Wasserstein distance on sample-based tasks to recover high-entropy source distributions without sacrificing fidelity, and apply the method to infer parameters of a neuron model from experimental datasets.
http://arxiv.org/abs/2402.07799v1
Compressor summary: The paper proposes a general, metric-agnostic approach to planning environment redesign that can handle various objectives and outperforms existing approaches on benchmarks.
http://arxiv.org/abs/2402.07790v1
Compressor summary: Because accuracy ignores the uncertainty needed to interpret binary classifier scores as probabilities, the paper introduces the Local Calibration Score, a refined metric for detecting score distortions, and recommends local regressions as recalibration tools and visualization facilitators, demonstrating the approach on a Random Forest classifier for credit default prediction in the sensitive domain of finance.
http://arxiv.org/abs/2402.07788v1
Compressor summary: The study proposes a multi-intent attribute-aware matching model (MIM) that leverages attributes from both queries and items to improve text matching in searching platforms.
http://arxiv.org/abs/2402.07787v1
Compressor summary: The paper introduces EMGF, a framework that efficiently integrates diverse linguistic and structural features for improved Aspect-based Sentiment Analysis using multi-anchor triplet learning and orthogonal projection.
http://arxiv.org/abs/2402.07785v1
Compressor summary: HYPO is a novel framework for machine learning models to learn domain-invariant features across different environments by using a hyperspherical space and a prototypical learning objective, which improves out-of-distribution generalization.
http://arxiv.org/abs/2402.07776v1
Compressor summary: The text proposes a novel framework for detecting fake news that integrates human expertise, logical predicates, and generalizable rules to achieve explainability, generalizability, and controllability in the detection process.
http://arxiv.org/abs/2402.07772v1
Compressor summary: The paper presents a method for integrating nondifferentiable optimization problems with uncertain parameters and fairness/robustness properties into machine learning models using the Predict-Then-Optimize paradigm.
http://arxiv.org/abs/2402.07767v1
Compressor summary: The paper proposes three methods to automatically convert toxic text into non-toxic text while keeping its meaning and fluency, using a dataset with expert-annotated detoxified versions of toxic sentences.
http://arxiv.org/abs/2402.07757v1
Compressor summary: The text describes a synthetic graph navigation task to study stepwise inference in autoregressive Transformer models, revealing various phenomena related to reasoning and generalization.
http://arxiv.org/abs/2402.07754v1
Compressor summary: The paper introduces Diffusion-of-Thought, a new model that combines diffusion and Chain-of-Thought techniques to improve reasoning in text processing tasks.
http://arxiv.org/abs/2402.07745v1
Compressor summary: The paper studies predictive churn due to model updates and proposes using predictive multiplicity measures to examine expected churn over the Rashomon set of prospective models.
http://arxiv.org/abs/2402.07744v1
Compressor summary: The paper introduces $\mathbf{UA}^2$ principles for aligning agents with human intentions, environmental dynamics, and self-constraints to improve their performance in realistic environments.
http://arxiv.org/abs/2402.07742v1
Compressor summary: The text proposes adding images to clarifying questions in conversational search systems to improve multimodal query clarification, introduces a new dataset (Melon) and a model (Marto) for this task, and shows significant improvements in retrieval performance.
http://arxiv.org/abs/2402.07739v1
Compressor summary: The paper proposes task-conditioned adapters for multi-task policy learning in autonomous agents, enabling them to flexibly adapt their perception modules based on current tasks without finetuning pre-trained models and using example demonstrations when the task is unknown.
http://arxiv.org/abs/2402.07738v1
Compressor summary: UniLP is a novel link prediction model that adapts to different graphs without targeted training by combining heuristic and parametric approaches with In-context Learning.
http://arxiv.org/abs/2402.07726v1
Compressor summary: USLNet is an unsupervised model that translates and generates sign language from text and video data without parallel sign language data, using reconstruction modules and a cross-modality back-translation procedure.
http://arxiv.org/abs/2402.07721v1
Compressor summary: LoRA-drop is a method to improve resource efficiency in fine-tuning large pre-trained models by analyzing and retaining LoRA output for important layers.
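As a rough illustration: a LoRA adapter contributes scale * x @ A @ B to a layer's output, so layers can be scored by the size of that contribution on sample inputs and LoRA retained only where it matters. A minimal sketch assuming a norm-based criterion (the paper's exact analysis may differ):

```python
import torch

def lora_output(x, A, B, scale=1.0):
    """LoRA's additive contribution: scale * (x @ A) @ B, with A of shape
    (d_in, r) and B of shape (r, d_out) for a small rank r."""
    return scale * (x @ A) @ B

def rank_layers_by_lora_output(xs, loras):
    """Score each layer by the mean norm of its LoRA output on sampled
    inputs; a LoRA-drop-style pruning would keep only the top layers."""
    scores = {name: lora_output(xs[name], A, B).norm(dim=-1).mean().item()
              for name, (A, B) in loras.items()}
    return sorted(scores, key=scores.get, reverse=True)

d, r = 64, 4
torch.manual_seed(0)
loras = {f"layer{i}": (torch.randn(d, r), torch.randn(r, d)) for i in range(6)}
xs = {name: torch.randn(32, d) for name in loras}   # sampled inputs per layer
keep = rank_layers_by_lora_output(xs, loras)[:3]    # retain LoRA on 3 layers
```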
http://arxiv.org/abs/2402.07712v1
Compressor summary: The paper studies how large language models like ChatGPT deteriorate when trained on their own generated data and proposes an adaptive regularization strategy to prevent this.
http://arxiv.org/abs/2402.07710v1
Compressor summary: Deep learning methods, especially CNNs, are widely used for analyzing structured grid data like images, but face challenges with sparse and unstructured 3D point clouds from LiDAR and 3D sensors.
http://arxiv.org/abs/2402.07708v1
Compressor summary: The paper proposes a pipeline for automatic segmentation, mesh model creation, and statistical shape modelling of the left atrial appendage in patients with atrial fibrillation using deep learning methods.
http://arxiv.org/abs/2402.07703v1
Compressor summary: Within the online convex optimization (OCO) framework for online sequential decision-making with delays, the paper proposes three families of delayed algorithms based on approximate solutions for different types of feedback, providing regret bounds and demonstrating efficiency under various norms.
http://arxiv.org/abs/2402.07692v1
Compressor summary: BE-CBO is a new Bayesian optimization method that uses neural networks to learn constraints and efficiently explore the boundary between feasible and infeasible regions of the design space.
http://arxiv.org/abs/2402.07689v1
Compressor summary: Our new backdoor attack on NLP systems uses the repositioning of two words as its trigger, maintains a high success rate, and is robust against the ONION defense.
http://arxiv.org/abs/2402.07688v1
Compressor summary: CyberMetric is a dataset with 10,000 questions to benchmark LLMs in cybersecurity, showing they often perform better than human experts.
http://arxiv.org/abs/2402.07685v1
Compressor summary: CMIL is a new weakly supervised ReID framework that leverages contrastive losses and outperforms baselines on three datasets, introducing the WL-MUDD dataset.
http://arxiv.org/abs/2402.07682v1
Compressor summary: The paper proposes auxiliary tasks that improve the biaffine parser's performance on semantic dependency parsing by introducing interdependence between arcs while preserving its O(n^2) complexity.
http://arxiv.org/abs/2402.07681v1
Compressor summary: The study finds that large language models perform well in translating legal texts, despite lower scores on automatic evaluation metrics, suggesting the importance of human evaluation methods.
http://arxiv.org/abs/2402.07680v1
Compressor summary: AYDIV is a new framework that aligns LiDAR and camera data to improve long-distance object detection for autonomous driving systems.
http://arxiv.org/abs/2402.07677v1
Compressor summary: GBOT is a novel graph-based tracking approach for augmented reality assembly guidance that handles occlusions, complex assembly states, and multiple objects in real time.
http://arxiv.org/abs/2402.07658v1
Compressor summary: This study shows that Large Language Models can significantly improve the accuracy of Automatic Speech Recognition in medical transcription by enhancing various aspects of transcript quality, including word errors, medical concepts, speaker diarization, and semantic coherence.
http://arxiv.org/abs/2402.07645v1
Compressor summary: The authors developed a tool using a large language model to extract and label factors from electronic health records that are associated with difficult-to-treat depression, achieving good performance on real and synthetic data.
http://arxiv.org/abs/2402.07642v1
Compressor summary: This paper proposes a new metric, c-flow, for evaluating how well object detectors in automated driving can avoid safety-critical mistakes by using optical flow and without needing extra labels.
http://arxiv.org/abs/2402.07633v1
Compressor summary: The paper proposes a novel approach for weakly supervised instance segmentation using MaskIoU heads, Complete Instances Mining strategy, and Anti-noise strategy to refine proposals and improve robustness, achieving state-of-the-art performance on PASCAL VOC 2012 and MS COCO datasets.
http://arxiv.org/abs/2402.07632v1
Compressor summary: The text discusses how AI's overconfidence or underconfidence can affect human trust, acceptance of AI suggestions, and collaboration outcomes, and suggests that aligning AI's expressed confidence with its actual performance and calibrating human trust are crucial for enhancing human-AI collaboration.
http://arxiv.org/abs/2402.07630v1
Compressor summary: The paper presents a question-answering framework for textual graphs that integrates GNNs, LLMs, and RAG, and introduces a benchmark and outperforms baselines on various tasks.
http://arxiv.org/abs/2402.07625v1
Compressor summary: Our method improves language models' math skills by using meta-prompted models to select high-quality math content from the AutoMathText dataset, achieving significant token efficiency gains.
http://arxiv.org/abs/2402.07616v1
Compressor summary: The Anchor-based LLM (AnLLM) uses a new self-attention network and inference strategy to compress sequence information into an anchor token, reducing cache and improving inference efficiency for large language models.
http://arxiv.org/abs/2402.07610v1
Compressor summary: The paper explores how multi-time bootstrapping self-alignment can improve large language models' performance by exploiting data diversity from in-context learning and proposes Step-On-Feet Tuning (SOFT) to enhance zero or one-shot capabilities.
http://arxiv.org/abs/2402.07598v1
Compressor summary: The paper introduces a new algorithm for distributional reinforcement learning that approximates return distributions using a generative model and proves its minimax-optimality, along with new theoretical results and experimental comparisons.
http://arxiv.org/abs/2402.07596v1
Compressor summary: The Sheet Music Transformer is a new model for optical music recognition that can handle complex musical scores without relying on monophonic strategies.
http://arxiv.org/abs/2402.07594v1
Compressor summary: The paper introduces a new method for inferring ordinary differential equations (ODEs) from noisy data using neural networks, and shows its effectiveness on various systems.
http://arxiv.org/abs/2402.07585v1
Compressor summary: This paper reviews ML serving architectural design choices and quality characteristics, focusing on energy efficiency for achieving green AI.
http://arxiv.org/abs/2402.07577v1
Compressor summary: The paper proposes a new method for neural topic modeling that balances between document-level contrastive learning and evidence lower bound optimization to improve topic coherence and downstream performance.
http://arxiv.org/abs/2402.07570v1
Compressor summary: The General Time Transformer (GTT) is a foundation model for zero-shot multivariate time series forecasting that uses a channel-wise framework to predict next curve shapes based on past ones, achieving superior results on unseen datasets and surpassing supervised baselines.
http://arxiv.org/abs/2402.07568v1
Compressor summary: The paper explores how subgraph information and margin theory can improve the generalization performance of graph isomorphism algorithms, such as $1$-WL, and message-passing graph neural networks (MPNNs).
http://arxiv.org/abs/2402.07545v1
Compressor summary: The authors propose TransAxx, a framework that enables fast support for approximate arithmetic on ViT models and uses MCTS to generate approximate accelerators for them, achieving significant trade-offs between accuracy and power.
http://arxiv.org/abs/2402.07543v1
Compressor summary: Fine-tuning language models with explanations improves their performance and enables them to solve tasks they couldn't before, especially for smaller models.
http://arxiv.org/abs/2402.07536v1
Compressor summary: BreakGPT is a large language model that improves financial breakout detection accuracy using a multi-stage structure.
http://arxiv.org/abs/2402.07526v1
Compressor summary: A Morse sequence is a sequence of expansions and fillings that represents the gradient vector field of a discrete Morse function on a simplicial complex.
http://arxiv.org/abs/2402.07519v1
Compressor summary: The authors propose a method to debias pretrained language models across multiple dimensions using structured knowledge and a large generative model, improving performance on various tasks and languages.
http://arxiv.org/abs/2402.07514v1
Compressor summary: Physics-informed machine learning combines data and physical models for better regression, with faster convergence rates depending on the physical error.
http://arxiv.org/abs/2402.07513v1
Compressor summary: The study investigates biases in speech recognition systems for casual Portuguese conversations using Whisper and MMS methods and shows that oversampling techniques reduce stereotypical biases.
http://arxiv.org/abs/2402.07510v1
Compressor summary: The paper explores privacy and security issues in systems of communicating AI agents, focusing on the potential use of steganography for secret collusion, and proposes a framework to test and monitor these risks.
http://arxiv.org/abs/2402.07507v1
Compressor summary: The paper proposes a method to predict traffic speeds using sparse GPS data and topographical features, outperforming existing methods in regions with limited data coverage.
http://arxiv.org/abs/2402.07506v1
Compressor summary: NeuralSentinel is a tool that validates AI models by combining attack and defence strategies, explainability concepts, and an easy-to-use interface, which was tested on a skin cancer image detector in a Hackathon event.
http://arxiv.org/abs/2402.07502v1
Compressor summary: The paper proposes a new deep learning method for detecting tables in documents using word relations and shows it is more accurate and efficient than existing methods.
http://arxiv.org/abs/2402.07501v1
Compressor summary: The paper proposes CLE-TFE, a model that uses contrastive learning and graph data augmentation to improve encrypted traffic classification by jointly training packet-level and flow-level tasks with less computational overhead.
http://arxiv.org/abs/2402.07496v1
Compressor summary: The text discusses the importance of studying possible attacks on Deep Neural Network models used in critical tasks and visualizing the effectiveness of different defenses against adversarial example attacks.
http://arxiv.org/abs/2402.07487v1
Compressor summary: The article explains score-based diffusion models using stochastic differential equations (SDE), covering sampling and score matching methods, with proofs and examples for both beginners and practitioners.
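For orientation, the forward/reverse SDE pair at the heart of such tutorials is standard:

```latex
\begin{align}
  \mathrm{d}x &= f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w
    && \text{(forward, noising)} \\
  \mathrm{d}x &= \bigl[f(x,t) - g(t)^2\,\nabla_x \log p_t(x)\bigr]\,\mathrm{d}t
    + g(t)\,\mathrm{d}\bar{w}
    && \text{(reverse, sampling)}
\end{align}
```

Sampling integrates the reverse SDE backward in time using a learned score network $s_\theta(x,t) \approx \nabla_x \log p_t(x)$, which score matching fits from data.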
http://arxiv.org/abs/2402.07483v1
Compressor summary: The text describes the development and deployment of Tree-RAG, a question answering system using a large language model that incorporates a tree structure to represent entity hierarchies in private enterprise documents.
http://arxiv.org/abs/2402.07480v1
Compressor summary: The text discusses a novel evasion attack detector for Deep Learning models that uses Graph Convolutional Neural Networks (GCN) to analyze neuron activations and model topology, aiming to improve cybersecurity against such threats.
http://arxiv.org/abs/2402.07477v1
Compressor summary: The text introduces F-RLP, a new framework that combines language models and food-specific data to create better food recommendations.
http://arxiv.org/abs/2402.07470v1
Compressor summary: RGPT is a framework that uses adaptive boosting to create a specialized text classification LLM by recurrently ensembling base learners, which significantly outperforms existing models and humans.
http://arxiv.org/abs/2402.07465v1
Compressor summary: The paper proposes a novel score-based method to solve high-dimensional Fokker-Planck equations, overcoming the curse of dimensionality and the numerical errors that afflict existing methods.
http://arxiv.org/abs/2402.07462v1
Compressor summary: HALO is a regulatory paradigm for artificial intelligence that uses hormetic analysis to ensure safe and optimal limits of AI behaviors by modeling them as allostatic opponent processes, solving the value-loading problem and weak-to-strong generalization problem.
http://arxiv.org/abs/2402.07458v1
Compressor summary: The paper studies a binary prediction setting with a calibration distance measure that evaluates deviation from perfect calibration and proves an $O(\sqrt{T})$ bound on it for a forecasting algorithm.
http://arxiv.org/abs/2402.07456v1
Compressor summary: OS-Copilot is a framework that helps create digital agents that can interact with various elements in an operating system, such as the web, files, and applications, and improve their skills on different tasks.
http://arxiv.org/abs/2402.07453v1
Compressor summary: We analyze the impact of bandit feedback on multiclass classification loss and show that it increases the optimal mistake bound by a factor of at most $k$, where $k$ is the number of labels, compared to full information. We also reveal nearly optimal bounds for the gap between randomized and deterministic learners, and adaptive and oblivious adversaries in bandit feedback settings, which differ significantly from the full information scenario.
http://arxiv.org/abs/2402.07452v1
Compressor summary: The paper proposes a method for detecting out-of-distribution samples in breast ultrasound images and improving classification accuracy with triplet state augmentation and balanced sphere loss.
http://arxiv.org/abs/2402.07448v1
Compressor summary: The study introduces AraSpider, an Arabic version of Spider dataset, improves Arabic NLP with multilingual translation models, and highlights the importance of context, back translation, and data sharing in NLP research.
http://arxiv.org/abs/2402.07446v1
Compressor summary: The study evaluates the quality of web-mined corpora for low-resource languages using similarity measures and shows that NMT models can perform well with high-quality portions of these corpora.
http://arxiv.org/abs/2402.07443v1
Compressor summary: FlashAttention algorithm optimizes Transformer's self-attention by reducing I/O complexity, and this paper investigates its optimal performance for various memory hierarchies.
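The I/O saving comes from never materializing the full $n \times n$ score matrix: FlashAttention streams over key/value blocks while maintaining a running row-wise max and softmax normalizer, so the result stays exact. A minimal numpy sketch of that online-softmax recurrence (the real kernel also tiles queries and keeps blocks in SRAM; the block size and names here are illustrative):

```python
import numpy as np

def streaming_attention(Q, K, V, block=128):
    """Exact softmax attention computed one key/value block at a time,
    carrying a running max m and normalizer l instead of the n x n matrix."""
    n, d = Q.shape
    O = np.zeros((n, d))
    m = np.full(n, -np.inf)                        # running row-wise max
    l = np.zeros(n)                                # running softmax denominator
    for s in range(0, K.shape[0], block):
        S = Q @ K[s:s+block].T / np.sqrt(d)        # scores for this block only
        m_new = np.maximum(m, S.max(axis=1))
        alpha = np.exp(m - m_new)                  # rescale old accumulators
        P = np.exp(S - m_new[:, None])
        l = alpha * l + P.sum(axis=1)
        O = alpha[:, None] * O + P @ V[s:s+block]
        m = m_new
    return O / l[:, None]

n, d = 512, 64
rng = np.random.default_rng(2)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
S = Q @ K.T / np.sqrt(d)                           # reference, quadratic memory
P = np.exp(S - S.max(axis=1, keepdims=True))
assert np.allclose(streaming_attention(Q, K, V), (P / P.sum(axis=1, keepdims=True)) @ V)
```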
http://arxiv.org/abs/2402.07442v1
Compressor summary: This paper introduces a natural language-based text command control system for game agents that can understand and execute free-form commands using a large language model and behavior trees in a Pokémon game simulation.
http://arxiv.org/abs/2402.07432v1
Compressor summary: The study proposes a new evaluation method for Referring Expression Generation (REG) models that considers referential success and alternative suggestions, improving on previous ratings-based evaluations.
http://arxiv.org/abs/2402.07431v1
Compressor summary: SALAD is an AI-powered app that helps foreigners learn Japanese by providing translations, speech recognition, audio, vocabulary tracking, grammar explanations, and songs using daily data to improve fluency and confidence in communication with native speakers.
http://arxiv.org/abs/2402.07429v1
Compressor summary: The paper presents a Particle Filter SLAM method that combines encoded data, fiber optic gyro information, and lidar technology to enable precise estimation of vehicle motion and environmental perception for simultaneous localization and mapping in robotics.
http://arxiv.org/abs/2402.07422v1
Compressor summary: The paper introduces NRAM, a new algorithm for news recommendation with attention mechanism, which could greatly enhance personalization of news content on digital platforms.
http://arxiv.org/abs/2402.07419v1
Compressor summary: The paper presents a method to compute causal effects from observational image data using conditional generative models and diffusion techniques, and applies it to evaluate conditional generative models on the CelebA dataset.
http://arxiv.org/abs/2402.07418v1
Compressor summary: The text introduces a framework called SemTra that uses multi-modal models and a pretrained language model to adapt semantic skills from user input snippets for cross-domain long-horizon tasks.
http://arxiv.org/abs/2402.07417v1
Compressor summary: This study examines how well vision-language models can estimate uncertainty across different settings and shows that temperature scaling improves their calibration, even with few examples.
http://arxiv.org/abs/2402.07415v1
Compressor summary: SHIFT is a system that adapts object detection models based on contextual information and computational resources, improving energy efficiency and latency in autonomous systems.
http://arxiv.org/abs/2402.07412v1
Compressor summary: The text proposes a novel representation learning approach for reinforcement learning that generates auxiliary rewards based on the transition distance between states, improving learning efficiency and stability in manipulation tasks.
http://arxiv.org/abs/2402.07411v1
Compressor summary: The paper introduces Potential-Based Intrinsic Motivation (PBIM), which preserves optimal policies and prevents suboptimal behavior in complex environments by converting intrinsic motivation rewards into a potential-based form.
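The potential-based form referenced here is the classic shaping result of Ng et al.: adding a reward of the form $F(s, s') = \gamma\Phi(s') - \Phi(s)$ provably leaves optimal policies unchanged. A tiny sketch, with an assumed distance-to-goal potential purely for illustration:

```python
def potential_based_shaping(phi, gamma):
    """Build F(s, s') = gamma * phi(s') - phi(s); adding F to the
    environment reward preserves optimal policies (Ng et al., 1999).
    PBIM, per the summary, casts intrinsic rewards in this form."""
    def F(s, s_next):
        return gamma * phi(s_next) - phi(s)
    return F

# Hypothetical 1-D example: potential = negative distance to a goal at 10.
shaping = potential_based_shaping(phi=lambda s: -abs(10.0 - s), gamma=0.99)
print(shaping(3.0, 4.0))   # > 0: shaped reward encourages moving toward the goal
```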
http://arxiv.org/abs/2402.07410v1
Compressor summary: This paper investigates the safety objectives of CLIP models, focusing on their resilience to visual factor variations, uncertainty estimations, and anomalous input detection, by testing 83 CLIP models and 127 ImageNet classifiers under various conditions.
http://arxiv.org/abs/2402.07405v1
Compressor summary: The paper introduces Toisón de Oro, a bilingual framework for financial natural language processing in Spanish and English, which includes a large curated dataset, a finetuned LLM, and an evaluation benchmark to address the gap in Spanish finance NLP research.
http://arxiv.org/abs/2402.07404v1
Compressor summary: The study introduces a novel framework that combines the Analytic Hierarchy Process and GPT-4 to automate and enhance cybersecurity decision-making processes using AI-driven virtual experts.
http://arxiv.org/abs/2402.07403v1
Compressor summary: The paper proposes two new network structures, B-UNet and B-CE-UNet, for improving lung trachea segmentation by adding branch loss and central line loss to learn fine branch features and uncertainty estimation for confidence.
http://arxiv.org/abs/2402.07401v1
Compressor summary: The study investigates how Large Language Models can generate more faithful explanations for fact-checking using a Multi-Agent Debate Refinement framework that improves credibility and trustworthiness.
http://arxiv.org/abs/2402.07398v1
Compressor summary: VisLingInstruct optimizes instructions and visual features for MMLMs to improve zero-shot performance in multi-modal tasks, achieving significant gains on TextVQA and HatefulMemes datasets.
http://arxiv.org/abs/2402.07386v1
Compressor summary: Chain-of-Layer is a method for automatically constructing taxonomies from entities using in-context learning and ranking filters to minimize errors.
http://arxiv.org/abs/2402.07384v1
Compressor summary: This paper studies how small objects in images affect the performance of multimodal large language models in answering visual questions and identifies four factors that limit their perception.
http://arxiv.org/abs/2402.07376v1
Compressor summary: The text introduces uOCF, an unsupervised method for learning 3D object representations from real images, which improves generalization and enables applications like segmentation and scene manipulation.
http://arxiv.org/abs/2402.07371v1
Compressor summary: The paper proposes a domain adaptation framework that combines supervised simulated and unsupervised real-world atmospheric turbulence correction to improve image quality and downstream vision tasks.
http://arxiv.org/abs/2402.07370v1
Compressor summary: The paper proposes SAMAE, a self-supervised face swapping method that enhances model training by masking facial regions, using disentangled features, and addressing shape misalignment issues.
http://arxiv.org/abs/2402.07369v1
Compressor summary: The paper proposes Diff-RNTraj, a diffusion model that generates road network-constrained trajectories with road-related information to address privacy concerns and scale limitations in existing trajectory data.
http://arxiv.org/abs/2402.07368v1
Compressor summary: The study examines how well LLM-based SRMs generalize from data and respond to different demographic groups, finding in-context learning helps some groups while hurting others.
http://arxiv.org/abs/2402.07356v1
Compressor summary: The paper introduces a new pair of Gaussian processes that allows extending classical theorems in high-dimensional statistics and machine learning to non-identically-distributed rows.
http://arxiv.org/abs/2402.07352v1
Compressor summary: The paper introduces Data Distribution-based Curriculum Learning (DDCL), a new approach to ordering training samples from easy to hard, which improves the performance and speed of classification for different classifiers and datasets.
http://arxiv.org/abs/2402.07350v1
Compressor summary: The paper explores the idea of antagonistic AI systems that challenge users and may have benefits, such as helping them build resilience or healthier relationships, while discussing ethical considerations for their responsible design.
http://arxiv.org/abs/2402.07344v1
Compressor summary: The study explores new offline reinforcement learning methods to optimize laboratory test scheduling for ICU patients using a preprocessed dataset.
http://arxiv.org/abs/2402.07340v1
Compressor summary: The paper studies how one-layer graph neural networks can recover correct vertex alignments between two noisy graphs with random geometric structure and features, outperforming direct assignment methods in high noise levels.
http://arxiv.org/abs/2402.07338v1
Compressor summary: The text discusses the importance of considering semantics when detecting image manipulations that spread misinformation through social media.