This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-02-08, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2402.05111v1
Compressor summary: Edu-ConvoKit is an open-source library for analyzing education conversation data with pre-processing, annotation, and analysis features.
http://arxiv.org/abs/2402.05110v1
Compressor summary: MIPS is a new method that uses neural networks to synthesize Python code for algorithmic tasks, making the learned solutions more interpretable and trustworthy.
http://arxiv.org/abs/2402.05109v1
Compressor summary: The paper introduces Hydra heads, a sequentially dependent replacement for standard draft heads in speculative decoding, which improves accuracy and throughput compared to existing methods.
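To make the distinction concrete: standard draft heads in speculative decoding each predict a future token independently from the base model's hidden state, whereas a sequentially dependent head also conditions on the tokens drafted before it. Below is a minimal sketch of that dependence, with hypothetical module and tensor names rather than the paper's actual architecture:

```python
import torch
import torch.nn as nn

class SequentialDraftHead(nn.Module):
    """Draft head that conditions on the base model's hidden state *and*
    the embedding of the token drafted by the previous head (sketch)."""
    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(2 * hidden_dim, hidden_dim)
        self.lm_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, base_hidden: torch.Tensor, prev_draft_embed: torch.Tensor):
        # base_hidden:      (batch, hidden_dim), last hidden state of the base model
        # prev_draft_embed: (batch, hidden_dim), embedding of the previously drafted token
        h = torch.tanh(self.proj(torch.cat([base_hidden, prev_draft_embed], dim=-1)))
        return self.lm_head(h)  # logits for the next speculative token
```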
http://arxiv.org/abs/2402.05106v1
Compressor summary: The authors developed a Brazilian Portuguese version of the GRIT model, which generates better image captions using two types of visual features.
http://arxiv.org/abs/2402.05099v1
Compressor summary: Hydragen is a hardware-aware attention implementation that significantly improves efficiency for large language models with shared prefixes.
http://arxiv.org/abs/2402.05098v1
Compressor summary: The paper compares different diffusion models, proposes a new exploration strategy for off-policy methods, and provides open-source code for future research.
http://arxiv.org/abs/2402.05073v1
Compressor summary: NITO is a novel deep learning approach for topology optimization that offers faster, more efficient, and domain-agnostic solutions compared to existing methods.
http://arxiv.org/abs/2402.05070v1
Compressor summary: The text discusses the challenge of creating AI systems that serve diverse human values and proposes a roadmap with different types of pluralistic models and benchmarks to address this issue.
http://arxiv.org/abs/2402.05054v1
Compressor summary: The paper introduces LGM, a novel framework that generates high-resolution 3D models from text or single-view images using multi-view Gaussian features and an asymmetric U-Net backbone.
http://arxiv.org/abs/2402.05052v1
Compressor summary: The paper proposes a general nonparametric method for learning causal representations from multiple distributions and shows that under certain conditions, it can recover the underlying causal graph and latent variables.
http://arxiv.org/abs/2402.05048v1
Compressor summary: The paper proposes a framework (VADER) to assess how well AI definitions are suited for regulation, highlighting issues with current AI regulation proposals affecting non-AI works.
http://arxiv.org/abs/2402.05045v1
Compressor summary: The paper introduces a new method for fusing multi-modal and multi-resolution remote sensor data without pixel-level labels, which improves efficiency by using binary fuzzy measures instead of standard fuzzy measures.
http://arxiv.org/abs/2402.05044v1
Compressor summary: SALAD-Bench is a safety benchmark for evaluating the robustness of large language models against various attacks and defense methods.
http://arxiv.org/abs/2402.05039v1
Compressor summary: This paper studies how to use graph explanations to improve GNNs' performance, sample complexity, and robustness.
http://arxiv.org/abs/2402.05035v1
Compressor summary: Key points:
- The paper reviews developments in Domain Generalization (DG) for Medical Image Analysis (MedIA), a tool for computer-aided diagnosis systems using deep learning.
- It defines domain shift and DG, discusses settings, summarizes methods from three viewpoints, introduces datasets, and suggests future research topics.
- It also provides a GitHub project with supporting resources.
Summary: The paper gives an overview of Domain Generalization for Medical Image Analysis, which deals with the performance drop of deep learning models across different medical data distributions. It covers definitions, settings, methods, datasets, and future directions, and provides a GitHub project as a resource.
http://arxiv.org/abs/2402.05034v1
Compressor summary: The paper examines how well BERT and other models capture historical changes in English by testing them on fill-in-the-blank questions with sentences from different time periods.
http://arxiv.org/abs/2402.05033v1
Compressor summary: Key points:
- Simulated Overparametrization (SOP): trains larger model with fewer parameters for inference
- Majority kernels: novel algorithm that integrates with different architectures and boosts performance
- Low overhead and strong results on various datasets and models
Summary: The paper proposes SOP, a method to train overparameterized models with fewer parameters for inference, and majority kernels, an algorithm that improves performance across architectures with minimal cost.
http://arxiv.org/abs/2402.05025v1
Compressor summary: The proposed white-box method optimizes hyper-parameters for neural networks by minimizing the strong convexity of the loss to improve flatness and generalization.
http://arxiv.org/abs/2402.05015v1
Compressor summary: The text discusses using large language models to improve Bayesian optimization in molecular space discovery, but only if they are pretrained or finetuned with relevant data.
http://arxiv.org/abs/2402.05013v1
Compressor summary: Shallow autoencoders lose the sparse structure of input data during gradient descent, but adding denoising and multi-layer decoding improves compression for sparse data.
http://arxiv.org/abs/2402.05011v1
Compressor summary: The paper proposes a new lossless graph condensation method that uses curriculum learning and expanding window matching to better transfer knowledge from the original graph to the condensed one, reducing computational cost for training GNNs.
http://arxiv.org/abs/2402.05008v1
Compressor summary: EfficientViT-SAM is a faster yet accurate segment-anything model that combines EfficientViT and SAM, achieving a 48.9x speedup on an A100 GPU without compromising performance.
http://arxiv.org/abs/2402.05007v1
Compressor summary: FairDebugger is a system that uses machine unlearning to find and explain unfair outcomes in tree-based classifiers.
http://arxiv.org/abs/2402.05002v1
Compressor summary: The paper introduces new randomized strategies for sequential learning problems with incomplete feedback and shows their effectiveness in contextual and non-contextual settings with stochastic outcomes, using a real-world example of classifier monitoring.
http://arxiv.org/abs/2402.05000v1
Compressor summary: The paper introduces pedagogically aligned large language models that use reinforcement learning and human feedback to guide students towards solving complex problems in education, outperforming supervised fine-tuning.
http://arxiv.org/abs/2402.04987v1
Compressor summary: The paper explores how to construct aggregation sets for better event-level prediction using one-dimensional clustering and the PriorBoost algorithm, which improves homogeneity of samples and considers label differential privacy.
http://arxiv.org/abs/2402.04982v1
Compressor summary: The paper proposes a method that combines explainable AI with adaptive learning for energy consumption prediction models, using SHAP clustering to provide insights and balance complexity and performance.
http://arxiv.org/abs/2402.04979v1
Compressor summary: The authors propose a client-server-based augmented reality app that uses synthetic data to enable object pose estimation on edge devices like HoloLens 2 and iPad, without relying on real photographs.
http://arxiv.org/abs/2402.04978v1
Compressor summary: The study proposes a cooperative reasoning scheme between Knowledge Graph and Large Language Models to improve their performance in knowledge-based reasoning tasks and enhance transparency.
http://arxiv.org/abs/2402.04971v1
Compressor summary: The paper proposes a novel method using differentiable neural networks to find local Nash equilibria in complex signaling games with multiple senders and self-interested receivers, showing improvements over existing approaches.
http://arxiv.org/abs/2402.04967v1
Compressor summary: The paper shows that text helps multimodal hate meme detection across domains, while images hinder it.
http://arxiv.org/abs/2402.04964v1
Compressor summary: Key points:
- ConvLoRA is a method for multi-target domain adaptation that reduces parameters by adding low-rank decomposition matrices to convolutional layers and using adaptive batch normalization.
- It outperforms or matches independently fine-tuned networks with far fewer trainable parameters.
- It can be applied to any architecture with convolutional and batch normalization layers.
Summary: ConvLoRA is a simple and effective method for multi-target domain adaptation that uses low-rank decomposition and adaptive batch normalization to reduce parameters and improve performance.
http://arxiv.org/abs/2402.04958v1
Compressor summary: Test-time adaptation with selective channel adaptation improves robustness to label distribution shifts and reduces failure risks in deep neural networks.
http://arxiv.org/abs/2402.04957v1
Compressor summary: Large language models like ChatGPT and LLaMA can generate wrong answers confidently; researchers propose a method to correct their overconfidence using a knowledge base.
http://arxiv.org/abs/2402.04953v1
Compressor summary: The article explores how a Kalman filter enhances pose estimation accuracy in 4D deformation models using two data sets.
http://arxiv.org/abs/2402.04938v1
Compressor summary: The paper proposes a way to automate quality assurance in AAA game development, which is currently mostly manual and done by human beta testers.
http://arxiv.org/abs/2402.04933v1
Compressor summary: BCoR is an online RL approach for restless multi-armed bandits (RMABs) that combines Bayesian modeling and Thompson sampling to handle contextual and non-stationary settings, improving performance in public health interventions.
http://arxiv.org/abs/2402.04930v1
Compressor summary: The paper introduces a new class of diffusion models that use correlated noise to improve the training process and generation quality for computer graphics tasks.
http://arxiv.org/abs/2402.04929v1
Compressor summary: The paper proposes DM-SFDA, a method that uses diffusion models to generate source domain images from target features, improving domain adaptation performance.
http://arxiv.org/abs/2402.04924v1
Compressor summary: The paper proposes a new graph condensation method, CTRL, that reduces errors in training trajectories and improves performance on various graph datasets and tasks.
http://arxiv.org/abs/2402.04918v1
Compressor summary: The text explores how to improve ChatGPT's performance in identifying discourse relations using different prompting techniques, but finds that it still struggles with the task even with advanced methods.
http://arxiv.org/abs/2402.04915v1
Compressor summary: Moco is a meta optimizer that learns to adapt its solution construction procedure based on features extracted from the current search state, improving performance on combinatorial optimization problems like TSP and MIS.
http://arxiv.org/abs/2402.04914v1
Compressor summary: The paper proposes a new benchmark for evaluating generative models' ability to control fine-grained linguistic attributes in text generation and analyzes various language models' performance on it.
http://arxiv.org/abs/2402.04906v1
Compressor summary: The authors propose a new method called Conformal Monte Carlo (CMC) meta-learners that can estimate the uncertainty in the treatment effect and help make personalized decisions based on this information.
http://arxiv.org/abs/2402.04902v1
Compressor summary: L4Q is an algorithm that combines parameter-efficient fine-tuning and quantization-aware training for large language models, improving accuracy with sub-4-bit precision.
http://arxiv.org/abs/2402.04898v1
Compressor summary: Key points:
- novel sequential team selection model in soccer
- models player injury and unavailability using real-world data
- Monte-Carlo Tree Search for optimal long-term performance
- validated on 2018/19 English Premier League season
- reduced injuries and costs compared to benchmark
Summary: The paper proposes a soccer team selection model that uses real-world data and Monte-Carlo Tree Search to reduce injuries and costs while maintaining performance.
http://arxiv.org/abs/2402.04892v1
Compressor summary: The paper proposes a general framework for verifying various properties of AI systems using Weighted Model Integration, which can handle different models and properties without strong distributional assumptions.
http://arxiv.org/abs/2402.04883v1
Compressor summary: The paper proposes a cascade framework that uses depth information for effective feature lifting and 3D object localization in camera-based 3D detection, improving performance on NuScenes benchmark and other detectors.
http://arxiv.org/abs/2402.04878v1
Compressor summary: The paper proposes a texture-agnostic object detection method that focuses on shape features by randomizing textures during training with CAD models, addressing the challenge of textureless and metallic objects in robotics.
http://arxiv.org/abs/2402.04875v1
Compressor summary: The paper investigates when various sequence-to-sequence models provably achieve length and compositional generalization, two essential forms of out-of-distribution generalization.
http://arxiv.org/abs/2402.04869v1
Compressor summary: The paper proposes a framework for incorporating causality into reinforcement learning, using interventions for causal structure learning during exploration and policy guidance during exploitation, and evaluates it on a simulated fault alarm environment.
http://arxiv.org/abs/2402.04858v1
Compressor summary: The paper introduces CodeIt, a self-improvement method for large language models that uses program sampling, hindsight relabeling, and prioritized experience replay to solve tasks requiring human-level reasoning, achieving state-of-the-art performance on the Abstraction and Reasoning Corpus.
http://arxiv.org/abs/2402.04856v1
Compressor summary: Counterfactual trajectory explanations (CTEs) are explanations for reinforcement learning reward functions that show how different actions affect outcomes and can help users understand and evaluate learned rewards better.
http://arxiv.org/abs/2402.04855v1
Compressor summary: DPCNet is a novel image deraining method that uses spatial and frequency information from two separate feature extraction blocks and an adaptive fusion module to outperform existing methods and provide visually pleasing results.
http://arxiv.org/abs/2402.04852v1
Compressor summary: aLLM4TS adapts Large Language Models for time-series representation learning by using multi-patch prediction, which captures temporal dynamics better than traditional methods, and achieves superior performance in various downstream tasks.
http://arxiv.org/abs/2402.04841v1
Compressor summary: Key points:
- Paper proposes an efficient vision model that works on minimal visual data without linguistic inputs
- Model uses autoregression and reduces parameter size and training data requirements
- Model shows proficiency in various high-level and low-level visual tasks
Summary: The paper introduces a new vision model that can understand visual data with little data and no language, using an autoregressive architecture to improve efficiency and adaptability.
http://arxiv.org/abs/2402.04838v1
Compressor summary: The study proposes PaDeLLM-NER, a method to reduce NER generation latency using parallel decoding in large language models without additional modules or architecture changes.
http://arxiv.org/abs/2402.04836v1
Compressor summary: The paper studies the expressiveness of invariant geometric deep learning models, introduces a new model called GeoNGNN that handles some corner cases, and proves E(3)-completeness for it and for three existing models.
http://arxiv.org/abs/2402.04835v1
Compressor summary: SARI is a novel framework for noisy partial label learning that uses pseudo-labeling and neural network classification to achieve state-of-the-art results in various settings.
http://arxiv.org/abs/2402.04833v1
Compressor summary: The paper suggests fine-tuning LLMs on the 1,000 longest instructions from standard datasets as a simple and effective baseline, and shows that further refining these long instructions improves performance.
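The selection rule is simple enough to reproduce in a few lines; here is a sketch under the assumption that examples carry an "instruction" field (the schema and file name are illustrative, not from the paper):

```python
import json

def longest_instructions(path: str, k: int = 1000):
    """Return the k examples whose instructions are longest (by characters)."""
    with open(path) as f:
        examples = json.load(f)  # assumed: list of {"instruction": ..., "output": ...}
    return sorted(examples, key=lambda ex: len(ex["instruction"]), reverse=True)[:k]

subset = longest_instructions("instruction_data.json")  # fine-tune on this subset
```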
http://arxiv.org/abs/2402.04832v1
Compressor summary: Structured d-DNNF is more succinct than SDD and less tractable in terms of transformations, while OBDD supports more tractable transformations but is less succinct.
http://arxiv.org/abs/2402.04830v1
Compressor summary: dSGP4 is a differentiable version of SGP4 that enables fast and precise orbital propagation for space applications, integrating with machine learning techniques to further improve precision.
http://arxiv.org/abs/2402.04829v1
Compressor summary: The paper proposes using NeRF as a non-distant environment emitter for inverse rendering, improving accuracy over the common distant environment map approach.
http://arxiv.org/abs/2402.04824v1
Compressor summary: The authors evaluate how well neural agents can adapt their language grounding and coordination strategies to different Follower behaviors in a collaborative reference game, using PPO reinforcement learning with an additional communicative effort signal.
http://arxiv.org/abs/2402.04823v1
Compressor summary: The paper proposes Constrained Deep Generative Models (C-DGMs) that ensure synthetic data complies with given constraints by integrating a Constraint Layer with the model, improving utility and detection.
http://arxiv.org/abs/2402.04821v1
Compressor summary: The paper proposes EMNN, an improved equivariant method for 3D mesh tasks that simplifies geometric deep learning by incorporating face information and hierarchy, achieving better results with less complexity and pre-processing.
http://arxiv.org/abs/2402.04814v1
Compressor summary: Key points:
- Deep learning often optimizes for scalar performance on benchmarks, not real-world applications
- Open world lifelong learning is a new trend that requires recognition of novel concepts, avoidance of uninformative data, and retention of previous knowledge
- The paper introduces a simple baseline using batch normalization to repurpose standard models for open world lifelong learning
- The approach shows promising results and should be a future standard for this field
Summary: The paper proposes a batch normalization-based baseline for open world lifelong learning, a challenging real-world task that requires adaptability and knowledge retention.
http://arxiv.org/abs/2402.04812v1
Compressor summary: The paper proposes a machine learning method for analyzing employee satisfaction surveys in Dutch, identifying key aspects and using pre-trained language models.
http://arxiv.org/abs/2402.04798v1
Compressor summary: The Spiking-PhysFormer model combines artificial neural networks and spiking neural networks to measure cardiac activity and physiological signals from facial videos with less power consumption than existing methods.
http://arxiv.org/abs/2402.04794v1
Compressor summary: The paper introduces a new scalable framework for multi-view subspace clustering, using kernel feature maps to reduce computation time, and shows its effectiveness on large networks.
http://arxiv.org/abs/2402.04792v1
Compressor summary: OAIF improves direct alignment from preferences methods by providing online feedback from a large language model annotator.
http://arxiv.org/abs/2402.04788v1
Compressor summary: The paper introduces a new benchmark, MLLM-as-a-Judge, to evaluate how well multimodal large language models perform as judges, and finds they still have limitations and biases.
http://arxiv.org/abs/2402.04787v1
Compressor summary: The authors propose a framework that uses Bayesian networks to generate and compare explanations for natural language inference tasks with LLM-generated explanations, aiming to understand how LLMs solve problems.
http://arxiv.org/abs/2402.04783v1
Compressor summary: The paper analyzes how periodic activation functions improve performance in vision tasks and provides theoretical evidence for their better behavior compared to ReLU-activated networks.
http://arxiv.org/abs/2402.04779v1
Compressor summary: StableMask improves the decoder-only Transformer by refining the causal mask to balance attention distributions, encode absolute positional information, and support efficient extrapolation and integration with other techniques.
http://arxiv.org/abs/2402.04764v1
Compressor summary: The paper proposes Code as Reward (VLM-CaR), a framework that generates dense reward functions from VLMs through code generation, enabling faster and more accurate training of RL agents.
http://arxiv.org/abs/2402.04762v1
Compressor summary: The text proposes a CNN-based color detection method for computer vision that improves robustness in various lighting conditions and outperforms existing methods.
http://arxiv.org/abs/2402.04756v1
Compressor summary: The paper introduces a network that uses contrastive learning to improve nuclei boundary denoising in semi-supervised segmentation, addressing challenges in pathological images due to color and morphological variations.
http://arxiv.org/abs/2402.04754v1
Compressor summary: LACE is a continuous diffusion model for controllable layout generation that incorporates aesthetic constraints and outperforms existing methods.
http://arxiv.org/abs/2402.04744v1
Compressor summary: This paper studies the problem of high-sparsity regions in N:M structured sparsity models and proposes a method to reduce induced noise and improve performance by using decay mechanisms on gradient flows.
http://arxiv.org/abs/2402.04732v1
Compressor summary: Key points:
- The paper proposes a new graph cut algorithm for partitioning graphs under arbitrary size constraints
- The algorithm is based on a regularized Gromov-Wasserstein problem and uses an accelerated proximal gradient descent method
- The algorithm has several advantages over classical methods, such as global convergence, sparsity and efficiency
Summary: The paper presents a new graph partitioning algorithm that can handle arbitrary size constraints by formulating the problem as a regularized Gromov-Wasserstein optimization and solving it with an efficient gradient-based method.
http://arxiv.org/abs/2402.04717v1
Compressor summary: InstructScene is a new method for generating 3D indoor scenes from natural language instructions, improving controllability and fidelity with a semantic graph prior and a layout decoder.
http://arxiv.org/abs/2402.04710v1
Compressor summary: The paper proposes a novel interpretable causal Graph Neural Network framework that combines retrieval-based causal learning and Graph Information Bottleneck theory to improve both explanation and prediction.
http://arxiv.org/abs/2402.04699v1
Compressor summary: Key points:
- EvoSeed is a new algorithm to generate natural adversarial samples for deep neural networks
- It uses an evolutionary strategy, a diffusion model, and a classifier model in a black-box setting
- The generated samples are of high quality and transferable to different classifiers
Summary: EvoSeed is an evolutionary algorithm that creates natural adversarial samples for deep neural networks by using a diffusion model and a classifier model in a black-box way, resulting in high-quality and transferable samples.
http://arxiv.org/abs/2402.04686v1
Compressor summary: The paper analyzes camera calibration in robotics and computer vision, proposing a modified method that considers distance-dependent focal length to improve accuracy.
http://arxiv.org/abs/2402.04678v1
Compressor summary: xLLM is a framework that improves the faithfulness and accuracy of natural language explanations for large language models' decisions by optimizing an evaluator that quantifies faithfulness.
http://arxiv.org/abs/2402.04677v1
Compressor summary: The paper studies how neural summarization models convert source information into summaries by analyzing the source sentences of reference and system summaries on two datasets.
http://arxiv.org/abs/2402.04672v1
Compressor summary: The paper proposes G-NAS, a method that uses Differentiable NAS and Generalizable loss to train object detectors on one source domain and generalize to multiple target domains with complex feature imbalance issues.
http://arxiv.org/abs/2402.04671v1
Compressor summary: The paper introduces a collaborative semantic scene completion framework for autonomous vehicles using vehicle-to-vehicle communication to overcome occlusion and short-range perception challenges, improving performance on geometric and visual metrics.
http://arxiv.org/abs/2402.04668v1
Compressor summary: This paper provides an overview of individualized treatment effects methods for electronic health records data, discussing challenges and future research directions in this emerging field.
http://arxiv.org/abs/2402.04655v1
Compressor summary: The paper proposes a method called Distance-Aware Calibration (DAC) to improve confidence calibration in vision-language models fine-tuned with prompt learning, especially for open-vocabulary tasks.
http://arxiv.org/abs/2402.04653v1
Compressor summary: The authors propose a new method to improve machine learning techniques for solving inverse problems by embedding the solution into higher dimensions and jointly designing and learning the regularizer.
http://arxiv.org/abs/2402.04648v1
Compressor summary: OV-NeRF improves semantic field learning for 3D scenes using pre-trained vision and language models, addressing noisy and view-inconsistent semantics issues with single-view and cross-view strategies.
http://arxiv.org/abs/2402.04647v1
Compressor summary: The Latent Plan Transformer (LPT) is a model that uses latent space to connect a Trajectory Generator and the final return, enabling improved decisions and planning with suboptimal trajectories in tasks without step-wise rewards.
http://arxiv.org/abs/2402.04646v1
Compressor summary: The paper presents a new prior for block sparse learning that adapts to data and reduces sensitivity to pre-defined block information, leading to better performance than existing methods.
http://arxiv.org/abs/2402.04644v1
Compressor summary: LEVI is a novel method to improve fine-tuning generalization by adaptively ensembling pre-trained and task-specific models, addressing limitations in both data sources.
http://arxiv.org/abs/2402.04640v1
Compressor summary: The paper presents an enhanced approach to determine not only the general data domain but also its specific attributes using image embeddings and generative models, leveraging the large LAION-5B dataset.
http://arxiv.org/abs/2402.04636v1
Compressor summary: The study shows that large language models can perform simultaneous machine translation by generating a "wait" token, achieving results comparable to state-of-the-art baselines.
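Schematically, the wait-token policy interleaves reading and writing; the decoding loop below is an illustrative sketch (the model API and token names are assumptions, not from the paper):

```python
WAIT, EOS = "<wait>", "<eos>"

def simultaneous_translate(model, source_words):
    """Emit target tokens incrementally, reading one more source word
    whenever the model generates the special wait token (sketch)."""
    read, output = [], []
    while True:
        token = model.next_token(read, output)    # hypothetical single-step API
        if token == WAIT and len(read) < len(source_words):
            read.append(source_words[len(read)])  # consume one more source word
        elif token == EOS or token == WAIT:       # finished, or nothing left to read
            return output
        else:
            output.append(token)
```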
http://arxiv.org/abs/2402.04632v1
Compressor summary: The paper introduces a new representation called GSN that combines generalized radiance fields with distilled semantic features, enabling multi-view segmentation of unseen scenes.
http://arxiv.org/abs/2402.04630v1
Compressor summary: DVDet is a detector that uses conditional context prompts and hierarchical textual descriptors to align visual embeddings with fine-grained text descriptions of object parts, improving open-vocabulary detection performance.
http://arxiv.org/abs/2402.04627v1
Compressor summary: The study evaluates strategies for fine-tuning an LLM for question answering over life science KGs, using data augmentation and semantic clues in queries to overcome the scarcity of training data.
http://arxiv.org/abs/2402.04625v1
Compressor summary: Noise Map Guidance (NMG) is a new guidance method for text-guided diffusion models that improves real-image editing by preserving quality and context without requiring optimization.
http://arxiv.org/abs/2402.04624v1
Compressor summary: MEMORYLLM is a self-updating language model that can memorize new knowledge and maintain its performance over time.
http://arxiv.org/abs/2402.04621v1
Compressor summary: Randomly shuffling features among nodes of the same class improves graph neural network performance by reducing the dependence between graph topology and features.
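The augmentation itself is nearly a one-liner per class; a minimal NumPy sketch of intra-class feature shuffling (variable names are mine):

```python
import numpy as np

def intra_class_shuffle(X: np.ndarray, y: np.ndarray, seed: int = 0) -> np.ndarray:
    """Randomly permute feature rows among nodes sharing a label, weakening
    the dependence between a node's position in the graph and its features."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        X[idx] = X[rng.permutation(idx)]
    return X
```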
http://arxiv.org/abs/2402.04618v1
Compressor summary: The paper proposes a new adaptation of MBConv blocks for U-Net architectures to improve semantic segmentation by extracting more detailed spatial information.
http://arxiv.org/abs/2402.04617v1
Compressor summary: InfLLM is a memory-based method that allows large language models to process and understand long sequences without training, improving their performance on long-distance dependency tasks.
http://arxiv.org/abs/2402.04616v1
Compressor summary: TinyLLM is a novel knowledge distillation approach that leverages multiple large teacher models to train a small student model with diverse reasoning skills and contextual understanding.
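TinyLLM's actual objective involves distilling teacher rationales, but the generic shape of a multi-teacher distillation loss looks roughly like this (a sketch, not the paper's formulation):

```python
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          T: float = 2.0, alpha: float = 0.5):
    """Cross-entropy on ground-truth labels plus the average KL divergence
    to each teacher's temperature-softened output distribution."""
    ce = F.cross_entropy(student_logits, labels)
    kd = sum(
        F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                 F.softmax(t / T, dim=-1),
                 reduction="batchmean") * (T * T)
        for t in teacher_logits_list
    ) / len(teacher_logits_list)
    return alpha * ce + (1 - alpha) * kd
```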
http://arxiv.org/abs/2402.04615v1
Compressor summary: ScreenAI is a vision-language model that understands UIs and infographics, using a unique mixture of datasets and text annotations to improve performance on various tasks.
http://arxiv.org/abs/2402.04614v1
Compressor summary: The text discusses the trade-off between faithfulness and plausibility in self-explanations generated by large language models, emphasizing the importance of faithfulness for high-stakes decision-making and suggesting ways to improve it without losing plausibility.
http://arxiv.org/abs/2402.04609v1
Compressor summary: Key points:
- Post-editing improves the text quality of LLMs but has limitations
- A neural programmer-interpreter approach preserves domain generalization and adapts editing actions for text generation
- The approach outperforms other post-editing methods in cross-domain settings
Summary: The paper proposes a neural programmer-interpreter that enhances LLM text quality by adapting editing actions to text generation tasks, while maintaining domain generalization.
http://arxiv.org/abs/2402.04601v1
Compressor summary: Key points:
- The paper proposes an alignment-enhanced corrector for Chinese grammatical error correction (CGEC) to address overcorrection problems in Seq2Seq models and decoder-only LLMs.
- The method involves training a correction model, using two alignment models, and transferring knowledge from them to the correction model.
- The approach improves CGEC performance on three datasets.
Summary: The paper presents an alignment-enhanced corrector for CGEC that uses two alignment models and knowledge transfer to reduce overcorrection in both Seq2Seq and decoder-only LLMs, leading to better CGEC results.
http://arxiv.org/abs/2402.04599v1
Compressor summary: JEANIE is a method for aligning 3D skeleton sequences by adjusting temporal and camera views to improve few-shot action recognition and clustering.
http://arxiv.org/abs/2402.04597v1
Compressor summary: The paper proposes a new hybrid metaheuristic algorithm to generate test data for software product families, which outperforms existing methods in quality but takes longer to run.
http://arxiv.org/abs/2402.04596v1
Compressor summary: The paper proposes a new spiking neural network architecture for continual multi-label learning that is computationally efficient and robust to data imbalance.
http://arxiv.org/abs/2402.04588v1
Compressor summary: This paper introduces UltraLink, an open-source multilingual supervised fine-tuning dataset that considers both language-specific and language-agnostic abilities of large language models to improve their cross-lingual transfer capabilities.
http://arxiv.org/abs/2402.04587v1
Compressor summary: The study proposes a new method using a self-supervised masked autoencoder and a sparse masked boundary prompt to accurately segment teeth in CBCT dental images with limited labeled data.
http://arxiv.org/abs/2402.04583v1
Compressor summary: The paper compares different algorithms for converting color images to grayscale using a psychological experiment with participants imagining a "colorless world" and evaluates their effectiveness based on visual quality, information preservation, and selection times.
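For context, the baseline against which such algorithms are usually compared is a fixed weighted combination of the RGB channels; a minimal sketch (the paper evaluates several decolorization algorithms, not just this one):

```python
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """ITU-R BT.601 luma weights, the classic color-to-grayscale mapping."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3] @ weights
```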
http://arxiv.org/abs/2402.04579v1
Compressor summary: The text proposes a collective method for generating counterfactual explanations that considers the current density of individuals, which improves upon classical approaches by using optimal transport.
http://arxiv.org/abs/2402.04578v1
Compressor summary: The paper proposes a self-organizing agent system (S-Agents) for flexible collaboration in open-ended settings, inspired by human organizational behavior.
http://arxiv.org/abs/2402.04573v1
Compressor summary: PCAda is a meta-learning approach for evolving domain adaptation that fine-tunes classifier heads with progressive class prototypes and uses conservative sparse attention to prevent interference with historical knowledge.
http://arxiv.org/abs/2402.04567v1
Compressor summary: OIL-AD is an unsupervised method for detecting anomalies in decision-making sequences using offline imitation learning and two features derived from Q function and state value function.
http://arxiv.org/abs/2402.04563v1
Compressor summary: The paper proposes an attention-guided visualization method for ViT models that explains their decisions and localizes objects with high performance using only class labels.
http://arxiv.org/abs/2402.04559v1
Compressor summary: The paper examines whether large language model agents can simulate human trust behaviors in Trust Games and finds that they can, with potential implications for scenarios where trust is important.
http://arxiv.org/abs/2402.04558v1
Compressor summary: The paper introduces a dynamic mask-aware transformer (DMAT) for human de-occlusion, using an expanded convolution head, a multi-head attention mechanism, and an amodal loss to improve the model's performance on AHP dataset.
http://arxiv.org/abs/2402.04555v1
Compressor summary: The paper proposes a probabilistic label fusion and instance refinement method to improve instance-aware semantic mapping from object detection generated by foundation models, achieving better zero-shot performance on real-world datasets.
http://arxiv.org/abs/2402.04554v1
Compressor summary: BirdNeRF is a novel method that uses aerial imagery to reconstruct large-scale scenes faster and with better visual fidelity than traditional approaches, by decomposing the images into smaller sub-scenes and using a projection-guided re-rendering strategy.
http://arxiv.org/abs/2402.04553v1
Compressor summary: The paper proposes a novel method to accelerate stochastic gradient descent using curvature information and preconditioners based on connected Lie groups, which improves convergence and performance across various tasks and architectures.
http://arxiv.org/abs/2402.04542v1
Compressor summary: The study proposes a method to improve multilingual models by using native scripts for each language and aligning their representations in code-switched texts, achieving better results on Nepali-English and Hindi-English datasets.
http://arxiv.org/abs/2402.04541v1
Compressor summary: The authors created a large dataset of five types of brightness illusions and tested data-driven neural network approaches for classifying and localizing them, achieving high classification and pixel-level accuracy.
http://arxiv.org/abs/2402.04539v1
Compressor summary: The paper proposes a reinforcement learning method that uses diverse past trajectories as guidance to learn faster and more efficiently, even with sparse and deceptive rewards.
http://arxiv.org/abs/2402.04538v1
Compressor summary: The Triplet Graph Transformer (TGT) is a new model that enables direct communication between neighboring pairs in graphs and achieves state-of-the-art results on various molecular property prediction and optimization tasks.
http://arxiv.org/abs/2402.04523v1
Compressor summary: The SumRec framework uses chat summaries to personalize information recommendations based on speakers' interests, preferences, and experiences.
http://arxiv.org/abs/2402.04520v1
Compressor summary: The paper studies the efficiency of modern Hopfield models for memory retrieval and shows a phase transition behavior based on pattern norms, with efficient variants possible under SETH assumptions.
http://arxiv.org/abs/2402.04519v1
Compressor summary: BioDrone is a bionic drone-based visual benchmark for single object tracking (SOT) that evaluates the robustness of SOT methods under challenging conditions such as tiny targets, fast motion, camera shake, and drastic changes between frames.
http://arxiv.org/abs/2402.04513v1
Compressor summary: The paper proposes online cascade learning, a method to use lower-capacity models and a deferral policy to answer queries about data streams with the help of large language models, achieving high accuracy while reducing inference costs by up to 90%.
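The cost savings come from a confidence-based deferral chain; a minimal sketch (the model interfaces and fixed threshold are illustrative assumptions, and the paper learns the deferral policy rather than hard-coding it):

```python
def cascade_answer(query, small_models, llm, threshold: float = 0.9):
    """Try cheap models in increasing order of capacity; defer to the LLM
    only when none of them is confident enough (sketch)."""
    for model in small_models:             # ordered cheapest -> costliest
        answer, confidence = model(query)  # assumed: returns (answer, prob)
        if confidence >= threshold:
            return answer
    return llm(query)                      # expensive fallback
```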
http://arxiv.org/abs/2402.04505v1
Compressor summary: Sheaves are mathematical tools used to model discourse ambiguities in natural language processing and improve contextual models.
http://arxiv.org/abs/2402.04504v1
Compressor summary: The Text2Street framework generates controllable street-view images from text by using a lane-aware road topology generator, an object layout generator, and a multiple control image generator.
http://arxiv.org/abs/2402.04497v1
Compressor summary: This paper shows that the time complexity of training large language models (LLMs) is almost-linear for some parameter regimes, and provides a complete characterization of their fine-grained complexity.
http://arxiv.org/abs/2402.04494v1
Compressor summary: This paper trains a large transformer model on a huge chess dataset to achieve strong chess performance without complex heuristics or explicit search algorithms.
http://arxiv.org/abs/2402.04492v1
Compressor summary: The ColorSwap dataset helps evaluate and improve multimodal models' ability to match objects with their colors by providing image-caption pairs with swapped color words.
http://arxiv.org/abs/2402.04489v1
Compressor summary: The text discusses the trade-offs between privacy and fairness in machine learning, showing that differential privacy amplifies bias but can be mitigated by counterfactual data augmentation.
http://arxiv.org/abs/2402.04482v1
Compressor summary: BEBLID is a learned binary image descriptor that improves matching accuracy and efficiency for computer vision applications on devices with limited hardware and energy resources.