This page contains one-sentence summaries of cs.AI/ML/CV/CL arXiv papers announced on 2024-02-14, generated by Compressor, my personal LLM-based summarization project.
http://arxiv.org/abs/2402.08682v1
Compressor summary: IM-3D is a text-to-3D model that uses video generators to improve multi-view generation and produces high-quality 3D outputs efficiently with reduced artifacts.
http://arxiv.org/abs/2402.08680v1
Compressor summary: MARINE is a training-free and API-free framework that reduces object hallucinations in large vision-language models by enriching visual context and using classifier-free guidance.
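Classifier-free guidance, which MARINE repurposes for grounding, follows a standard blending rule; a minimal generic sketch (the toy logits and guidance scale below are illustrative assumptions, not MARINE's actual values):

```python
def classifier_free_guidance(cond_logits, uncond_logits, scale):
    """Standard CFG blend: uncond + scale * (cond - uncond).

    scale = 1.0 recovers the purely conditional prediction;
    scale > 1.0 amplifies the direction favored by the condition.
    """
    return [u + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]

# Toy example with two vocabulary entries.
cond = [2.0, 0.5]
uncond = [1.0, 1.0]
print(classifier_free_guidance(cond, uncond, 1.0))  # → [2.0, 0.5]
print(classifier_free_guidance(cond, uncond, 2.0))  # → [3.0, 0.0]
```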
http://arxiv.org/abs/2402.08679v1
Compressor summary: The paper introduces COLD-Attack, a framework built on the COLD text generation algorithm for controllable attacks on large language models (LLMs) that can enforce requirements such as fluency, stealthiness, sentiment, and coherence, and demonstrates its effectiveness across different LLMs and scenarios.
http://arxiv.org/abs/2402.08678v1
Compressor summary: Graph Mamba Networks (GMNs) are a new class of graph neural networks based on selective state space models that overcome the limitations of message-passing and graph transformers while achieving excellent performance with reduced computational cost.
http://arxiv.org/abs/2402.08676v1
Compressor summary: The text analyzes the convergence of approximate message passing dynamics for non-separable multivariate nonlinearities and its application to a convex optimization problem in multi-class classification.
http://arxiv.org/abs/2402.08672v1
Compressor summary: The text proposes a method for evaluating and selecting models in changing environments using rolling windows, generalization error estimation, and tournaments.
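The rolling-window idea for estimating generalization error under changing environments can be sketched generically; this toy helper and the "models" it compares are assumptions for illustration, not the paper's actual procedure (which additionally uses tournaments):

```python
def rolling_window_errors(series, predict, window, horizon=1):
    """Rolling-window generalization error estimate (illustrative sketch):
    slide a fixed-size training window over the series, predict `horizon`
    steps ahead from each window, and average the squared errors.
    """
    errors = []
    for start in range(len(series) - window - horizon + 1):
        train = series[start:start + window]
        target = series[start + window + horizon - 1]
        errors.append((predict(train) - target) ** 2)
    return sum(errors) / len(errors)

# Two toy "models": repeat the last value vs. predict the window mean.
last_value = lambda w: w[-1]
window_mean = lambda w: sum(w) / len(w)

data = [1, 2, 3, 4, 5, 6, 7, 8]  # trending series
e_last = rolling_window_errors(data, last_value, window=3)
e_mean = rolling_window_errors(data, window_mean, window=3)
assert e_last < e_mean  # on a trend, the last-value model wins the comparison
```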
http://arxiv.org/abs/2402.08671v1
Compressor summary: The paper proposes a new image matching method (SAM) that performs well in estimating poses, compares it with SDF methods, and shows that correspondences in textured regions are crucial for accurate pose estimation.
http://arxiv.org/abs/2402.08670v1
Compressor summary: The proposed Rec-GPT4V: VST scheme uses large vision-language models for multimodal recommendations by leveraging user history, generating item image summaries, and querying user preferences over candidate items.
http://arxiv.org/abs/2402.08667v1
Compressor summary: The paper proposes a new method for training Denoising Diffusion Models that improves score estimation at low noise levels by using knowledge of the target score.
http://arxiv.org/abs/2402.08666v1
Compressor summary: The authors use large language models to create more diverse and realistic text-to-SQL questions and show that this improves the performance and generalization of parsers.
http://arxiv.org/abs/2402.08657v1
Compressor summary: The paper proposes PIN, a learnable spatial prompt that unlocks object localization in caption-based vision-language models without supervised training or new output heads.
http://arxiv.org/abs/2402.08654v1
Compressor summary: The paper introduces Continuous 3D Words, input tokens that allow users to fine-tune various abstract attributes like illumination and shape in text-to-image models.
http://arxiv.org/abs/2402.08653v1
Compressor summary: SAGMAN is a spectral framework that analyzes the stability of graph neural networks (GNNs) by examining distance distortions between input and output manifolds using spectral graph embedding and probabilistic graphical models.
http://arxiv.org/abs/2402.08646v1
Compressor summary: The paper proposes a probabilistic framework for symbolic reasoning that incorporates neuroscience findings and formal logic concepts to advance machine intelligence.
http://arxiv.org/abs/2402.08645v1
Compressor summary: The paper investigates the "dissipating inputs" phenomenon in plain neural nets, which causes convergence failure, and proposes a new hypothesis and architecture for deep learning without residual connections.
http://arxiv.org/abs/2402.08644v1
Compressor summary: Tandem transformers combine a small autoregressive model and a large model in block mode to enhance prediction accuracy and speed up inference in language models.
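The small-drafts/large-verifies interplay resembles speculative decoding; a toy draft-and-verify sketch under that assumption (the integer "tokens" and stand-in models are illustrative, and Tandem's exact block mechanism differs):

```python
def draft_and_verify(prompt, small_next, large_next, block_size, steps):
    """Toy block-mode decoding: a small model drafts `block_size` tokens
    autoregressively, then a large model checks them in order, keeping
    the matching prefix and correcting the first mismatch.
    """
    out = list(prompt)
    for _ in range(steps):
        draft, ctx = [], out[:]
        for _ in range(block_size):       # small model drafts a block
            tok = small_next(ctx)
            draft.append(tok)
            ctx.append(tok)
        for tok in draft:                 # large model verifies token by token
            if large_next(out) == tok:
                out.append(tok)           # accepted: cheap token
            else:
                out.append(large_next(out))  # rejected: use large model's token
                break
    return out

# Toy models over integer "tokens": small continues last+1, large agrees
# except it caps values at 3.
small = lambda ctx: ctx[-1] + 1
large = lambda ctx: min(ctx[-1] + 1, 3)
print(draft_and_verify([0], small, large, block_size=2, steps=2))  # → [0, 1, 2, 3, 3]
```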
http://arxiv.org/abs/2402.08643v1
Compressor summary: The paper proposes a new loss function to improve the quality of reconstructed text in learned image compression, showing significant enhancements in experiments.
http://arxiv.org/abs/2402.08638v1
Compressor summary: The paper introduces SemRel, a new dataset of semantic relatedness annotations across 14 languages from Africa and Asia, to study the broader phenomenon of semantic relatedness and its implications for NLP tasks and Large Language Models.
http://arxiv.org/abs/2402.08635v1
Compressor summary: The paper presents a new Bangla Sign Language dataset with 60 sign words, annotated by professionals and tested with machine learning models for word-level recognition.
http://arxiv.org/abs/2402.08631v1
Compressor summary: The paper proposes a multi-perspective evaluation framework for black-box knowledge editing of large language models and introduces postEdit, a novel method that improves style retention and privacy protection.
http://arxiv.org/abs/2402.08622v1
Compressor summary: The paper proposes a method to transfer the appearance of a Neural Radiance Field (NeRF) onto a different 3D geometry using semantic image analogies and correspondence transfer, achieving multi-view consistent results that users prefer over traditional methods.
http://arxiv.org/abs/2402.08621v1
Compressor summary: The paper analyzes online convex optimization algorithms in different settings and presents general meta-algorithms to convert them between feedback types with comparable regret bounds.
http://arxiv.org/abs/2402.08609v1
Compressor summary: The paper shows that using Soft MoE modules in value-based networks improves the scalability and performance of reinforcement learning models.
http://arxiv.org/abs/2402.08601v1
Compressor summary: The paper proposes a training-free approach for non-rigid image editing with Stable Diffusion that improves identity preservation without compromising editability by optimizing text, performing latent inversion, and using timestep-aware text injection sampling.
http://arxiv.org/abs/2402.08595v1
Compressor summary: Graph neural networks struggle with counting patterns like cycles in graphs, but a new approach using homomorphism counts of all structures in the target pattern can improve their expressive power without increasing complexity.
http://arxiv.org/abs/2402.08594v1
Compressor summary: The text proposes a Bayesian method for improving prompt tuning by considering the correlation among source tasks and outperforms existing methods on various NLP benchmarks.
http://arxiv.org/abs/2402.08593v1
Compressor summary: Graph Feature Preprocessor is a software library that detects money laundering and fraud patterns in financial transaction graphs, operating in a streaming manner with multicore parallelism to efficiently mine subgraph patterns and enrich transactions with features that improve gradient-boosting models, outperforming standard graph neural networks in accuracy, throughput, and latency on synthetic AML and real Ethereum phishing datasets.
http://arxiv.org/abs/2402.08586v1
Compressor summary: The paper proposes a method to find adversarial examples for tree ensembles faster by identifying the consistent features that are perturbed.
http://arxiv.org/abs/2402.08583v1
Compressor summary: Link-MoE is a new graph machine learning model that adapts to different node pairs and improves link prediction accuracy by using various GNNs as experts.
http://arxiv.org/abs/2402.08582v1
Compressor summary: FESS Loss combines contrastive learning and Dice loss for accurate and refined segmentation of medical images, especially in low-data situations.
http://arxiv.org/abs/2402.08581v1
Compressor summary: The paper introduces FactCloze, a model for correcting factual errors in summaries, and SummDSC, a more faithful summary dataset generated by data distillation.
http://arxiv.org/abs/2402.08573v1
Compressor summary: The text discusses a local learning algorithm called "dual propagation" that improves learning efficiency and stability, but has limitations in biological plausibility and nudging symmetry.
http://arxiv.org/abs/2402.08571v1
Compressor summary: The paper presents MGNet, a network that accurately locates and segments glass-like objects, which are hard to detect due to their transparency and vague boundaries, by improving spatial relationship extraction and semantic mining and by introducing an uncertainty-aware loss that yields high-confidence segmentation maps.
http://arxiv.org/abs/2402.08567v1
Compressor summary: Infectious jailbreak exploits large language models in multi-agent environments by spreading harmful behaviors exponentially fast through adversarial images.
http://arxiv.org/abs/2402.08565v1
Compressor summary: The text reviews AI applications in Systematic Literature Reviews, focusing on screening and extraction phases, and evaluates 21 leading tools using a framework of traditional and AI features.
http://arxiv.org/abs/2402.08563v1
Compressor summary: This paper proposes a new method for solving inverse problems in PDEs using denoising diffusion restoration models that improve estimation by exploiting eigenvalues and eigenfunctions of the Laplacian operator.
http://arxiv.org/abs/2402.08562v1
Compressor summary: The paper introduces MoE-LoRA with Layer-wise Expert Allocation (MoLA), a novel method to improve the efficiency of LoRA by dynamically allocating experts in different layers, achieving better performance on various NLP and commonsense QA tasks.
http://arxiv.org/abs/2402.08552v1
Compressor summary: The paper proposes TDPO-R, an algorithm that combines temporal inductive bias and critic active neuron reset to reduce reward overoptimization in diffusion model alignment.
http://arxiv.org/abs/2402.08540v1
Compressor summary: This paper compares generative and non-generative models for design space construction, finding that non-generative models can be more cost-effective and produce high-quality designs with fewer invalid options.
http://arxiv.org/abs/2402.08539v1
Compressor summary: The study uses innovative preprocessing strategies on ADNI dataset to detect Alzheimer's disease early using machine learning models with high accuracy.
http://arxiv.org/abs/2402.08530v1
Compressor summary: The paper introduces a new method for reinforcement learning that separates transition structure and reward, using a distributional successor measure that describes the consequences of behavior, and can learn from data using generative model techniques.
http://arxiv.org/abs/2402.08529v1
Compressor summary: APEN is a framework that improves point cloud neural networks by approximating part-based symmetry using functions with finer equivariance, leading to better generalization for classification and segmentation tasks.
http://arxiv.org/abs/2402.08526v1
Compressor summary: The text proposes a new scenario (instance-incremental learning) and dataset (Concept-1K) to study catastrophic forgetting in large neural networks, showing that existing methods are insufficient to address this issue.
http://arxiv.org/abs/2402.08522v1
Compressor summary: The paper explores how agents can audit platforms for fairness using different collaboration and sampling strategies, finding that uncoordinated collaboration generally improves audit accuracy.
http://arxiv.org/abs/2402.08514v1
Compressor summary: This paper proposes a method to generate counterfactual paths for Markov Decision Processes that remain influenced by the observed path, addressing an overlooked issue in existing approaches.
http://arxiv.org/abs/2402.08511v1
Compressor summary: AmEx-MCTS improves Monte-Carlo tree search by separating value updates, visit count updates, and the selected path, allowing exclusion of already explored regions for better search performance in complex problems.
http://arxiv.org/abs/2402.08506v1
Compressor summary: P-Mamba is a novel method for efficient and accurate segmentation of the left ventricle in pediatric echocardiography that uses two encoder branches to improve noise suppression and capture global dependencies.
http://arxiv.org/abs/2402.08502v1
Compressor summary: The paper proposes a safe reinforcement learning method for autonomous vessels that follows temporal logic constraints of COLREGS to avoid collisions.
http://arxiv.org/abs/2402.08498v1
Compressor summary: The Counterfire corpus is a novel dataset of enriched counterarguments to Reddit posts generated by various language models, which show strong paraphrasing ability, use of evidence, and style integration, though human-written arguments remain richer and more diverse.
http://arxiv.org/abs/2402.08496v1
Compressor summary: The text summarizes a systematic review of data-to-text generation research, highlighting research gaps, future directions, challenges, and various aspects of the field.
http://arxiv.org/abs/2402.08493v1
Compressor summary: The paper proposes a new regularization method for linear inverse problems with sparsity constraints that enhances group sparsity and approximates the $l_0$ norm more closely than existing methods.
http://arxiv.org/abs/2402.08492v1
Compressor summary: ChatGPT has lower accuracy and consistency than experienced endoscopists in assessing colonoscopy images using the Boston Bowel Preparation Scale, but shows promise for future fine-tuning.
http://arxiv.org/abs/2402.08491v1
Compressor summary: The study proposes a deep reinforcement learning framework to identify efficient cellular reprogramming strategies using artificial neural networks.
http://arxiv.org/abs/2402.08480v1
Compressor summary: The study introduces generalized propagation for weighted and directed graphs, proposes GPNNs for better attention mechanisms, and extends Ollivier-Ricci Curvature to CURC for analyzing propagation patterns in graph neural networks.
http://arxiv.org/abs/2402.08479v1
Compressor summary: This paper proposes a semi-supervised method to improve interpretability of black box models using entailment alignment between explanations and answers.
http://arxiv.org/abs/2402.08473v1
Compressor summary: The paper explores the embedding space of vision-language models using a new optimization method and finds that they can overgeneralize, failing systematic evaluations despite high zero-shot performance.
http://arxiv.org/abs/2402.08472v1
Compressor summary: The paper proposes using large language models like GPT-4 in STNWeb, a web tool for visualizing optimization algorithms, to create reports and plots that make the tool more accessible and useful for researchers.
http://arxiv.org/abs/2402.08470v1
Compressor summary: ST-GTrend is a spatio-temporal graph neural network that uses graph attention and a parallel algorithm to separate aging and fluctuation terms when estimating the long-term performance loss rate of PV inverters from time series data, outperforming existing methods on three large-scale datasets and speeding up trend analysis by 7.92 times.
http://arxiv.org/abs/2402.08467v1
Compressor summary: The study shows how ChatGPT can create convincing fake news about the war in Ukraine that people and tools can't easily spot.
http://arxiv.org/abs/2402.08466v1
Compressor summary: The text discusses the need for human oversight in AI training under a management-based regulatory paradigm, which can improve AI performance, fairness, and explainability.
http://arxiv.org/abs/2402.08450v1
Compressor summary: The paper introduces Subgraphormer, an architecture that combines Subgraph GNNs and Graph Transformers, improving performance on various datasets by leveraging attention and positional encodings in product graphs.
http://arxiv.org/abs/2402.08441v1
Compressor summary: The paper proposes two methods to control the latent space configuration of autoencoders, enabling more stable and interpretable training and allowing similarity estimation in supervised autoencoders without decoders or classifiers.
http://arxiv.org/abs/2402.08439v1
Compressor summary: The Jena Facial Palsy Toolbox is a user-friendly tool that simplifies advanced computer vision analysis of subtle facial movements like blinking for medical professionals without programming skills.
http://arxiv.org/abs/2402.08437v1
Compressor summary: The paper proposes a novel loss function based on geometric constraints for camera calibration, using a multitask learning framework that combines neural networks and mathematical properties, and introduces a new dataset with realistic conditions.
http://arxiv.org/abs/2402.08427v1
Compressor summary: RiCL is a self-supervised learning framework that uses contrastive learning to pre-train radar object detectors, enabling them to learn with fewer data and achieve better performance in adverse conditions.
http://arxiv.org/abs/2402.08423v1
Compressor summary: The text proposes eMem-NDT, a method to predict vehicle behavior in autonomous driving using an interpretable neural decision tree that clusters and aligns historical vehicle behavior features.
http://arxiv.org/abs/2402.08421v1
Compressor summary: The text proposes an offline multi-agent reinforcement learning scheme for digital twin-based wireless networks that uses distributional RL and conservative Q-learning to handle uncertainty and jointly trains policies in a centralized manner.
http://arxiv.org/abs/2402.08409v1
Compressor summary: The study presents a deep-learning framework that fuses low-field MRI features with 7T-like features to improve brain image segmentation in a 7T-absent environment, achieving superior results and adaptability.
http://arxiv.org/abs/2402.08406v1
Compressor summary: Bayesian optimization with Markov Decision Processes enables better optimization of black-box functions with constraints and transitions by using reinforcement learning to plan ahead.
http://arxiv.org/abs/2402.08405v1
Compressor summary: The paper introduces Watershed Classifiers, a novel non-parametric approach to regularize 1NN classifiers using a greedy method, leading to arbitrary boundary learning and good generalization on dense datasets.
http://arxiv.org/abs/2402.08403v1
Compressor summary: The paper proposes a model of human decision-making integrating three theories and applies it to conversational AI, aiming to understand ChatGPT's intelligence and its implications for our world.
http://arxiv.org/abs/2402.08401v1
Compressor summary: The paper proposes LOSS-GAT, a graph-based semi-supervised and one-class approach for fake news detection that uses only a small set of labeled data and improves performance over baseline models.
http://arxiv.org/abs/2402.08400v1
Compressor summary: The paper proposes an adaptive hierarchical certification method for image semantic segmentation that relaxes the certification level within a multi-level hierarchy to provide more meaningful information and lower abstain rate.
http://arxiv.org/abs/2402.08397v1
Compressor summary: A hybrid video compression framework with deep learning-based techniques improves the performance of the Enhanced Compression Model, achieving significant BD-rate savings.
http://arxiv.org/abs/2402.08392v1
Compressor summary: This paper studies how Large Language Models can be used as Minecraft agents, explores their strengths and weaknesses in builder and architect roles, and proposes a platform for online interaction and comparison with previous works.
http://arxiv.org/abs/2402.08384v1
Compressor summary: Dynamic Regularization (DReg) is a method to improve deep learning models' confidence and performance by fitting labels for in-distribution samples and applying regularization to outliers, resulting in a more reliable and calibrated model.
http://arxiv.org/abs/2402.08383v1
Compressor summary: LE-PDE-UQ is a method that uses latent vectors to integrate uncertainty quantification into deep learning-based surrogate models for PDEs, outperforming strong baselines in accuracy and long-term predictions.
http://arxiv.org/abs/2402.08382v1
Compressor summary: Punctuation restoration improves structure understanding in natural language and enhances structure-aware representations for various linguistic tasks.
http://arxiv.org/abs/2402.08373v1
Compressor summary: Dynamic Strategies (DyStrat) is a novel method for multi-step forecasting that adapts to different datasets and outperforms fixed strategies.
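DyStrat adaptively chooses among fixed multi-step strategies; one such fixed strategy, recursive forecasting, can be sketched generically (the toy one-step model below is an illustrative assumption, and the adaptive selection itself is not shown):

```python
def recursive_forecast(history, one_step_model, horizon):
    """Recursive multi-step strategy: predict one step at a time and
    feed each prediction back into the context as if it were observed.
    """
    ctx = list(history)
    preds = []
    for _ in range(horizon):
        p = one_step_model(ctx)
        preds.append(p)
        ctx.append(p)
    return preds

# Toy one-step model: continue the most recent difference.
model = lambda ctx: ctx[-1] + (ctx[-1] - ctx[-2])
print(recursive_forecast([1, 2, 3], model, horizon=3))  # → [4, 5, 6]
```

The alternative fixed strategy, "direct" forecasting, would instead train a separate model per horizon step; the point of an adaptive method is that neither choice dominates across datasets.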
http://arxiv.org/abs/2402.08371v1
Compressor summary: The paper proposes a hybrid recommendation system combining Collaborative and Content-based filtering with a Genetic Algorithm to suggest suitable courses for students based on multiple criteria, using real data from University of Cordoba's Computer Science Degree.
http://arxiv.org/abs/2402.08369v1
Compressor summary: The text proposes a skill-based imitation learning framework that uses a vision-language model to learn skills from videos and adapts to environmental changes for one-shot imitation of complex tasks.
http://arxiv.org/abs/2402.08367v1
Compressor summary: The paper proposes using Radial Basis Function instead of Fourier-based feature mapping for Physics-Informed Neural Networks, showing improved performance in various problems.
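A radial-basis feature mapping of the kind contrasted with Fourier features can be written in a few lines; this is a minimal generic sketch (the Gaussian form, centers, and `gamma` width are common-choice assumptions, not necessarily the paper's exact parameterization):

```python
import math

def rbf_features(x, centers, gamma):
    """Map a scalar input to radial-basis features
    phi_i(x) = exp(-gamma * (x - c_i)^2), an alternative to
    Fourier features of the form sin/cos(2*pi*b_i*x).
    """
    return [math.exp(-gamma * (x - c) ** 2) for c in centers]

centers = [0.0, 0.5, 1.0]
feats = rbf_features(0.5, centers, gamma=4.0)
# The feature at the matching center is maximal (exactly 1).
assert max(feats) == feats[1] == 1.0
```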
http://arxiv.org/abs/2402.08365v1
Compressor summary: NeuRes is a neuro-symbolic SAT solver that can prove unsatisfiability and uses a novel architecture combining Graph Neural Networks and Pointer Networks to select node pairs for resolution proofs.
http://arxiv.org/abs/2402.08360v1
Compressor summary: The paper introduces VQA-IN, a method to train multimodal language models for domain-specific visual tasks using smaller versions of large language models.
http://arxiv.org/abs/2402.08359v1
Compressor summary: The study presents a novel localization method using semi-dense 2D-3D matching points to improve camera pose estimation accuracy, especially in noisy or sparse scenarios.
http://arxiv.org/abs/2402.08348v1
Compressor summary: The paper introduces CAP2QA, a method to reduce visual hallucination in synthetic image-text data for question-answering tasks by using image-aligned instructive QA dataset.
http://arxiv.org/abs/2402.08345v1
Compressor summary: Conditional Information Gain Trellis (CIGT) is a method for executing parts of a deep convolutional neural network using routing mechanisms based on differentiable information gain-based cost functions, which reduces computational burden and improves classification accuracy with fewer parameters.
http://arxiv.org/abs/2402.08341v1
Compressor summary: This study examines how different input prompts affect the personality traits of large language models (LLMs), revealing that more parameters result in a broader range of traits and fine-tuning influences their behavior.
http://arxiv.org/abs/2402.08333v1
Compressor summary: The paper presents a segmentation method for histopathology images that requires minimal user input and helps bridge the gap between pathologists and machines in clinical settings.
http://arxiv.org/abs/2402.08327v1
Compressor summary: The paper introduces M2KR, a framework for training and evaluating multi-modal retrievers in KB-VQA tasks, and presents PreFLMR, a pre-trained model that achieves state-of-the-art results.
http://arxiv.org/abs/2402.08324v1
Compressor summary: The authors present a new method for neural networks to handle uncertain inputs by approximating non-linearities with local linearization, which allows them to predict output uncertainties and outperform other methods like moment matching.
http://arxiv.org/abs/2402.08321v1
Compressor summary: The paper proposes a hybrid regularizer for exploration by optimization in online decision-making problems, improving regret bounds in both stochastic and adversarial environments.
http://arxiv.org/abs/2402.08320v1
Compressor summary: Vision-based gait recognition relies more on anthropometric information than motion patterns, and existing benchmarks may contain spurious correlations.
http://arxiv.org/abs/2402.08318v1
Compressor summary: The study uses word embedding to analyse how fairy tales from Portugal, Italy and Germany differ in their references to values, revealing a possible shared cultural understanding across European societies.
http://arxiv.org/abs/2402.08316v1
Compressor summary: CrossGaze is a novel gaze estimation method that uses existing computer vision models and attention modules to predict where people are looking without specialized architectures, achieving competitive results on Gaze360 benchmark.
http://arxiv.org/abs/2402.08313v1
Compressor summary: The paper uses physics-informed neural networks to solve Fisher's equation for traveling waves under large reaction rates, improving the method with a residual weighting scheme and an input-based network architecture.
http://arxiv.org/abs/2402.08310v1
Compressor summary: The approach generates detailed 3D representations from historic sketches and can be guided by text inputs, helping experts recreate lost artifacts.
http://arxiv.org/abs/2402.08309v1
Compressor summary: The authors propose a novel document vectorization method using an ensemble of LLMs to detect spear-phishing emails by analyzing their persuasion principles and achieving a 91% F1 score with limited training data.
http://arxiv.org/abs/2402.08303v1
Compressor summary: ChatCell is a natural language-based tool that leverages large language models to facilitate single-cell analysis and improve the accessibility of this pivotal field.
http://arxiv.org/abs/2402.08300v1
Compressor summary: The text proposes using Birkhoff's aesthetic measure to objectively evaluate AI generated music and improve its quality and recommendations.
http://arxiv.org/abs/2402.08298v1
Compressor summary: The paper discusses the importance of rigorous and convincing experimentation in artificial intelligence research, especially metaheuristic optimization, and encourages critical assessment and reflection at both individual and community levels.
http://arxiv.org/abs/2402.08296v1
Compressor summary: The paper introduces a GNN-based preconditioner within a multi-level Domain Decomposition framework to enhance the efficiency and scalability of numerical simulations using GPU computations.
http://arxiv.org/abs/2402.08294v1
Compressor summary: The paper proposes a robust model for ranking fetal ultrasound images based on semantic image quality and uncertainty, and compares it to existing methods.
http://arxiv.org/abs/2402.08290v1
Compressor summary: This paper examines how data poisoning can undermine the effectiveness of counterfactual explanations in analyzing and improving black-box systems.
http://arxiv.org/abs/2402.08284v1
Compressor summary: XAI techniques can help forensic experts solve complex cases by explaining the reasons behind their conclusions and applying logical approaches to crime scene investigations.
http://arxiv.org/abs/2402.08280v1
Compressor summary: Pix2Code is a framework that uses program synthesis and both symbolic and neural representations to learn abstract concepts from images in an unsupervised way, enabling generalizable and interpretable visual relational reasoning.
http://arxiv.org/abs/2402.08277v1
Compressor summary: This paper proposes a method to improve the reliability of large language models for answering questions based on evidence by fine-tuning them with synthetic data and quality filters.
http://arxiv.org/abs/2402.08269v1
Compressor summary: The study investigates how activation patterns in neural networks affect their optimization, finding that a lower batch functional dimension is favored, which may help reduce overfitting.
http://arxiv.org/abs/2402.08268v1
Compressor summary: The paper presents an approach for training large neural networks on long video and language sequences, where current models struggle with temporal information, combining a curated dataset of diverse videos and books with the RingAttention technique to achieve new benchmarks in retrieval and video understanding, and releases open-source 7B-parameter models for multimodal training.
http://arxiv.org/abs/2402.08267v1
Compressor summary: The paper proposes a new training method for image coding for machines that improves both recognition and compression performance by adding auxiliary loss to the encoder.
http://arxiv.org/abs/2402.08265v1
Compressor summary: The paper proposes a method to improve text-to-image diffusion models by aligning them with user preferences using a fine-grained reward perspective that considers the initial steps of the generation process and introduces temporal discounting.
http://arxiv.org/abs/2402.08259v1
Compressor summary: The paper surveys table reasoning with large language models (LLMs), analyzing techniques, advantages, and future research directions.
http://arxiv.org/abs/2402.08255v1
Compressor summary: The study introduces a new ANN architecture, ABEL-Spline, that addresses distal interference and catastrophic forgetting in continual learning by ensuring uniformly trainable models without exponential growth in size.
http://arxiv.org/abs/2402.08251v1
Compressor summary: The authors propose a neural network model combining a YOLOv5 backbone, a BI-FPN neck, and a transformer-encoder prediction head with a sliding window and attention mechanism to detect small objects in thermal drone images, achieving high accuracy and real-time speed on a public dataset and an embedded computer.
http://arxiv.org/abs/2402.08250v1
Compressor summary: The authors review 55 articles on debiasing methods for biomedical NLP and computer vision, discussing each method's strengths and weaknesses and suggesting further approaches to address bias, so that AI systems can improve clinical practice without perpetuating social inequities.
http://arxiv.org/abs/2402.08249v1
Compressor summary: SepRep-Net is a framework for adapting multiple models to a new domain without source data, using separate pathways that are optimized together and reparameterized for inference.
http://arxiv.org/abs/2402.08244v1
Compressor summary: APALU is a novel trainable activation function that improves learning performance in various deep-learning tasks by maintaining stability, efficiency, and adaptability.
http://arxiv.org/abs/2402.08242v1
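The paper's exact functional form for APALU is not given in the summary; as a generic illustration of what a "trainable activation function" means, here is a minimal PReLU-style sketch in numpy, where the slope of the negative branch is a learnable parameter updated by gradient descent (the class name, update rule, and learning rate are illustrative, not from the paper):

```python
import numpy as np

class PReLU:
    """A minimal trainable activation: max(x, alpha*x) with learnable alpha.
    Shown only to illustrate the concept; APALU's actual form differs."""

    def __init__(self, alpha=0.25):
        self.alpha = alpha

    def forward(self, x):
        self.x = x  # cache input for the backward pass
        return np.where(x > 0, x, self.alpha * x)

    def backward(self, grad_out, lr=0.01):
        # gradient w.r.t. the input, using the current alpha
        grad_in = grad_out * np.where(self.x > 0, 1.0, self.alpha)
        # alpha receives gradient only through the negative part of the input
        d_alpha = np.sum(grad_out * np.where(self.x > 0, 0.0, self.x))
        self.alpha -= lr * d_alpha  # one gradient-descent step on alpha
        return grad_in
```

Because alpha is updated alongside the network weights, the activation's shape adapts to the task during training, which is the property the APALU paper builds on.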
Compressor summary:
Key points:
- ML/AI methods can amplify biases and prejudices in robots and AI applications
- A culture of modularity prevents accountability for harms caused by AI
- The authors propose a framework to build organizational equity capabilities and detect/address fairness issues in AI development projects
- They adapt an Agile process based on Scrum as a primary example
Summary: The authors suggest a framework to integrate equity practices into AI development projects, using Scrum as an example, to prevent and mitigate biases and harms caused by ML/AI systems.
http://arxiv.org/abs/2402.08236v1
Compressor summary: BERT4FCA is a novel method for predicting links in bipartite networks using formal concept analysis and BERT, which improves performance over previous FCA-based methods and other classical methods.
http://arxiv.org/abs/2402.08229v1
Compressor summary: The paper proposes a stochastic intervention model for causal graph discovery that minimizes the number of interventions needed, and presents approximation algorithms and experimental results.
http://arxiv.org/abs/2402.08228v1
Compressor summary: This paper investigates how different GNN architectures affect Out-of-Distribution generalization on graphs and proposes a new model that leverages robust properties of self-attention and decoupling.
http://arxiv.org/abs/2402.08227v1
Compressor summary: The text discusses a new method called Instance-Obfuscated Inference (IOI) that protects decision privacy in natural language understanding tasks using pre-trained models, while maintaining seamless operation and low overhead.
http://arxiv.org/abs/2402.08225v1
Compressor summary: LLM-TTA improves OOD robustness of NLP models by using LLM-generated augmentations without retraining or labeling, enhancing performance in sentiment, toxicity, and news classification tasks.
http://arxiv.org/abs/2402.08219v1
Compressor summary: BBox-Adapter is a novel method to adapt black-box LLMs for specific tasks using a ranking-based NCE loss and an online adaptation mechanism, improving performance and reducing costs.
http://arxiv.org/abs/2402.08211v1
Compressor summary: The study investigates how Transformer models achieve complex tasks by analyzing their self-attention mechanism, which resembles gating mechanisms in the human brain.
http://arxiv.org/abs/2402.08209v1
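The self-attention mechanism the study analyzes is standard scaled dot-product attention; a minimal single-head numpy sketch (no masking, no multi-head split) shows why its softmax rows can be read as a soft gating over tokens:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.
    X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # pairwise token affinities
    weights = softmax(scores, axis=-1)     # each row sums to 1: a soft gate over tokens
    return weights @ V, weights
```

Each output token is a convex combination of the value vectors, with the weight rows acting as the gating pattern the paper relates to mechanisms in the human brain.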
Compressor summary: The paper proposes an iterative method to quickly identify harmful data instances for data cleansing by using a thresholding bandit algorithm, with theoretical and empirical evidence of its effectiveness.
http://arxiv.org/abs/2402.08208v1
Compressor summary: The paper discusses the challenges of using AI algorithms in autonomous driving systems, focusing on generalization issues and overconfidence risks, and proposes methods to improve their safety and reliability.
http://arxiv.org/abs/2402.08207v1
Compressor summary:
Key points:
- Road network extraction is important for high-definition maps and localization
- Existing methods struggle to merge Euclidean and non-Euclidean data domains
- The paper proposes a unified representation (RoadNet Sequence) and a non-autoregressive sequence-to-sequence model
- The approach outperforms existing alternatives on the nuScenes dataset
Summary: The paper introduces a novel way to represent road network data and a non-autoregressive model that efficiently and accurately generates high-definition maps from Euclidean and non-Euclidean data.
http://arxiv.org/abs/2402.08202v1
Compressor summary: The text discusses a novel approach to improve fraud detection by adaptively oversampling critical samples in the kernel space based on their distance to the decision boundary and surrounding sample density.
http://arxiv.org/abs/2402.08200v1
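The general idea (oversample minority samples more aggressively the closer they sit to the decision boundary) can be sketched with SMOTE-style interpolation in plain numpy. This is an illustrative sketch only, not the paper's kernel-space method, and all names and the weighting scheme are assumptions:

```python
import numpy as np

def boundary_oversample(X_min, decision_scores, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by interpolating between
    minority points, sampling seeds with probability inversely proportional
    to their distance from the decision boundary.
    X_min: (n, d) minority samples; decision_scores: (n,) signed scores,
    where small |score| means close to the boundary."""
    rng = np.random.default_rng(seed)
    w = 1.0 / (np.abs(decision_scores) + 1e-6)  # near-boundary points weighted up
    w /= w.sum()
    synth = []
    for _ in range(n_new):
        i = rng.choice(len(X_min), p=w)                 # pick a boundary-critical seed
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]                   # its k nearest minority neighbours
        j = rng.choice(nbrs)
        lam = rng.random()                              # interpolation factor in [0, 1)
        synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synth)
```

The paper additionally conditions on surrounding sample density and works in the kernel space; this sketch covers only the boundary-distance weighting half of the idea.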
Compressor summary: The paper proposes a method to generate spurious features using large-scale text-to-image diffusion models and a new similarity loss, which can create consistent and visually similar spurious images across different classifiers.
http://arxiv.org/abs/2402.08195v1
Compressor summary: OIFTrack is a one-stream Transformer tracker that optimizes information flow between target template and search tokens to improve discriminative capability and achieve outstanding performance in challenging benchmarks.
http://arxiv.org/abs/2402.08193v1
Compressor summary: GEnBP is a fusion of Ensemble Kalman filter and Gaussian belief propagation methods that efficiently infers high-dimensional models by handling complex dependence structures and distributed computing with low-rank local messages.
http://arxiv.org/abs/2402.08187v1
Compressor summary: GraphDeepONet is a new model that uses graph neural networks to improve deep learning-based prediction of solutions for partial differential equations, handling irregular grids and enabling time extrapolation.
http://arxiv.org/abs/2402.08185v1
Compressor summary: The paper presents a novel strategy that uses low-resolution data for global weather prediction and shows its effectiveness, efficiency, and potential in climate change studies.
http://arxiv.org/abs/2402.08184v1
Compressor summary: The study introduces a framework for transfer learning in MARL using unified state spaces and evaluates it on StarCraft scenarios, showing improved performance compared to learning from scratch or without curriculum.
http://arxiv.org/abs/2402.08183v1
Compressor summary: The authors propose an unsupervised visual sentence representation learning framework that uses visually-grounded text perturbations and achieves comparable performance to existing methods in semantic textual similarity, with cross-lingual transferability.
http://arxiv.org/abs/2402.08182v1
Compressor summary: VCoTTA is a variational Bayesian method that reduces error propagation in Continual Test-Time Adaptation by measuring uncertainties and updating the student model with priors from both source and teacher models.
http://arxiv.org/abs/2402.08180v1
Compressor summary: The paper presents a general framework for online structured prediction with surrogate losses, improving regret bounds for multiclass classification and extending it to other problems using randomized decoding.
http://arxiv.org/abs/2402.08178v1
Compressor summary: The paper proposes a benchmark system to compare language-oriented task planners for home-service embodied agents, testing them on different datasets and simulators.
http://arxiv.org/abs/2402.08174v1
Compressor summary: The paper proposes a method called HPLC that uses landmarks, or high-degree nodes, to represent positional information in graphs and improve link prediction performance.
http://arxiv.org/abs/2402.08170v1
Compressor summary: The LLaGA model combines Large Language Models and Graph Neural Networks to analyze graph-structured data effectively, adapting graphs into sequences and token embeddings with a versatile projector.
http://arxiv.org/abs/2402.08156v1
Compressor summary: The paper proposes a method to enable efficient social learning while preserving individual privacy using differential privacy and log-linear rules for information exchange.
http://arxiv.org/abs/2402.08155v1
Compressor summary: The authors propose a method (CMA-R) to analyze how neural models detect rumors on Twitter by identifying the causal effects of tweets and words, and show that it agrees with human judgments and improves interpretability.
http://arxiv.org/abs/2402.08145v1
Compressor summary: The paper presents a new method for sequential decision-making systems to adapt to non-stationary stochastic environments using relational representations, exploration, and probabilistic model learning.
http://arxiv.org/abs/2402.08138v1
Compressor summary: H2O-SDF uses a new learning method with Object Surface Field to reconstruct 3D indoor scenes with accurate room layouts and detailed object surfaces, overcoming previous limitations.
http://arxiv.org/abs/2402.08134v1
Compressor summary:
Key points:
- SymNMF approximates a symmetric matrix with a product of two nonnegative matrices
- Two randomized algorithms for SymNMF are developed: one using matrix sketching and one using leverage score sampling
- Both methods are applied to graph clustering tasks on large data sets and achieve speed-ups while preserving quality
Summary: The paper proposes two fast and scalable algorithms for SymNMF, a technique that factors symmetric matrices, and demonstrates their effectiveness on graph clustering problems.
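The SymNMF problem itself (find H >= 0 minimizing ||A - H Hᵀ||_F for symmetric nonnegative A) has a well-known dense multiplicative update; a minimal sketch of that baseline, not the paper's randomized sketching or leverage-score algorithms:

```python
import numpy as np

def symnmf(A, k, iters=200, beta=0.5, eps=1e-9, seed=0):
    """Dense SymNMF baseline: approximate symmetric nonnegative A (n x n)
    with H @ H.T, where H (n x k) stays elementwise nonnegative."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))
    for _ in range(iters):
        AH = A @ H
        HHtH = H @ (H.T @ H)
        # damped multiplicative update; beta=0.5 is the usual choice
        H *= (1 - beta) + beta * AH / (HHtH + eps)
    return H
```

The O(n^2 k) cost of the A @ H product per iteration is exactly what the paper's randomized variants attack for large graphs.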