This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-02-12, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2402.06627v1
Compressor summary: This text discusses how language models can manipulate their outputs to optimize objectives, creating negative side effects such as increased social media toxicity, due to feedback loops from interactions with the external world.
http://arxiv.org/abs/2402.06625v1
Compressor summary: The text explores how iterative prompting can improve the accuracy and truthfulness of large language models, and proposes new prompting variants to address existing challenges.
http://arxiv.org/abs/2402.06619v1
Compressor summary: The paper introduces Aya, an initiative to create and share multilingual instruction-following datasets and resources to enable more diverse language models in artificial intelligence.
http://arxiv.org/abs/2402.06617v1
Compressor summary: FaBERT is a pre-trained Persian BERT model that excels in various NLP tasks, thanks to its exposure to diverse and cleaned texts from the HmBlogs corpus.
http://arxiv.org/abs/2402.06614v1
Compressor summary: The paper explores how to predict the next state of an unknown dynamical system using novel measures and dimensions without assuming any parametric constraints.
http://arxiv.org/abs/2402.06611v1
Compressor summary: The paper presents a method to predict fresh concrete properties during mixing using CNNs and stereoscopic images, which could help reduce CO2 emissions in concrete production.
http://arxiv.org/abs/2402.06608v1
Compressor summary: The study presents a three-step approach (translate, infer, compile) that generates plans from natural language requests using an LLM and a classical planner; its logically interpretable intermediate representation reduces LLM errors, and it achieves high accuracy on task PDDL generation across seven domains.
http://arxiv.org/abs/2402.06606v1
Compressor summary: RQP-SGD is a new method to train machine learning models with privacy and low memory requirements for IoT devices.
http://arxiv.org/abs/2402.06599v1
Compressor summary: The text investigates how well multimodal language models generalize to different tasks and scenarios, finding that they struggle with distribution shifts and proposing in-context learning as a possible solution.
http://arxiv.org/abs/2402.06596v1
Compressor summary: AndroidArena is an environment and benchmark for evaluating large language models on a modern operating system, revealing their weaknesses in understanding, reasoning, exploration, and reflection.
http://arxiv.org/abs/2402.06592v1
Compressor summary: The text introduces a new neural network architecture for ASR systems that enhances recognizing uncommon words using contextual information flow and shallow fusion with a context language model.
http://arxiv.org/abs/2402.06590v1
Compressor summary: The paper explores how predictive representations, such as the successor representation, can serve as foundational elements for adaptive behavior and intelligence, drawing on reinforcement learning theory and cognitive neuroscience.
http://arxiv.org/abs/2402.06584v1
Compressor summary: G-SciEdBERT is a contextualized language model for German science responses that improves automated scoring accuracy compared to G-BERT.
http://arxiv.org/abs/2402.06581v1
Compressor summary: The paper proposes and compares two ensembling techniques for semantic segmentation that fuse features from different backbones, improving performance in challenging conditions with limited data.
http://arxiv.org/abs/2402.06580v1
Compressor summary: The authors propose a Single Architecture Ensemble method that learns the optimal number and depth of exits per input in a single neural network, improving accuracy and calibration while reducing computational resources.
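The general early-exit mechanism that work like this builds on can be sketched in a few lines; this is a minimal illustration of confidence-thresholded exits, not the paper's Single Architecture Ensemble method, and the threshold value is illustrative:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(exit_logits, threshold=0.9):
    """Return (exit_index, class_index) from the first exit head whose top
    softmax probability exceeds `threshold`; fall back to the last exit."""
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        if max(probs) >= threshold:
            return i, probs.index(max(probs))
    probs = softmax(exit_logits[-1])
    return len(exit_logits) - 1, probs.index(max(probs))
```

An easy input exits at a shallow head, saving compute, while a hard one falls through to the deepest exit.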
http://arxiv.org/abs/2402.06578v1
Compressor summary: The paper introduces a new theory for understanding how coupling-based normalizing flows like RealNVP work and helps choose the best coupling functions for different applications.
http://arxiv.org/abs/2402.06570v1
Compressor summary: HyperDistill is a method that uses a morphology-conditioned hypernetwork and policy distillation to learn efficient robot policies that can generalize to different robot shapes.
http://arxiv.org/abs/2402.06563v1
Compressor summary: The study uses machine learning to analyze and fill in missing data in electronic patient records, showing how missingness is linked to health care practices and how addressing it can improve clinical decisions.
http://arxiv.org/abs/2402.06560v1
Compressor summary: Video Annotator (VA) is a framework that allows domain experts to directly label and manage video data for machine learning model development, improving efficiency, usability, and effectiveness of video classifiers.
http://arxiv.org/abs/2402.06559v1
Compressor summary: DiffusionES is a method that optimizes non-differentiable objectives using gradient-free optimization and trajectory denoising, achieving state-of-the-art performance in autonomous driving and complex behavior generation.
http://arxiv.org/abs/2402.06557v1
Compressor summary: The paper proposes QBBN, a model that combines logic and probability to reason with human language, using LBP for efficient inference.
http://arxiv.org/abs/2402.06552v1
Compressor summary: The paper proposes a reinforcement learning scheme for path planning that hides the goal from observers, overcoming limitations of existing methods with a local perception model, graph neural networks, and adaptive deception bonuses.
http://arxiv.org/abs/2402.06549v1
Compressor summary: The study examines how large language models like GPT-4 can improve climate activism tasks, such as hate speech and stance detection, by enhancing zero-shot settings with retrieval augmentation and re-ranking.
http://arxiv.org/abs/2402.06544v1
Compressor summary: The paper introduces a unified calibration framework for Large Language Models to improve their reliability and correctness in long-form generation tasks, and proposes two self-consistency methods along with various techniques to enhance calibration performance.
http://arxiv.org/abs/2402.06539v1
Compressor summary: The paper presents a hybrid convolutional network that performs semantic segmentation and depth estimation from a single image, two tasks usually done separately but needed together in applications like robotics and autonomous navigation; by separating the features relevant to each task, it achieves results comparable to state-of-the-art methods.
http://arxiv.org/abs/2402.06537v1
Compressor summary: The paper proposes an unsupervised method to detect out-of-distribution data using feature density estimation via normalizing flows, which can be applied to any pretrained model and achieves strong results on image classification tasks.
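The general recipe behind density-based OOD detection (fit a density to in-distribution features, flag low-likelihood inputs) can be sketched with a toy 1-D Gaussian standing in for the normalizing flow the paper uses; all names and the threshold here are illustrative:

```python
import math

def fit_gaussian(features):
    # Fit a 1-D Gaussian to in-distribution feature values.
    n = len(features)
    mu = sum(features) / n
    var = sum((x - mu) ** 2 for x in features) / n
    return mu, var

def log_density(x, mu, var):
    # Log-likelihood of x under N(mu, var).
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def is_ood(x, mu, var, log_threshold):
    # Flag inputs whose log-density falls below a threshold chosen on
    # held-out in-distribution data (e.g. a low percentile).
    return log_density(x, mu, var) < log_threshold
```

A real system would fit the density on features from a pretrained backbone, which is what lets the method wrap any frozen model.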
http://arxiv.org/abs/2402.06532v1
Compressor summary: GABO is a method for optimizing surrogate objective functions using a Lipschitz-bounded source critic model to guide the optimization in reliable regions, and it performs better than existing baselines on various offline optimization tasks.
http://arxiv.org/abs/2402.06531v1
Compressor summary: The paper proposes a method to transfer semantic labels from a labeled point cloud to an unlabeled one using an octree structure, which also detects changes between the clouds.
http://arxiv.org/abs/2402.06530v1
Compressor summary: The study presents a novel method for early myocardial infarction (MI) detection using multi-view echocardiography and a one-class classification algorithm that improves accuracy and offers more precise and efficient diagnostic tools.
http://arxiv.org/abs/2402.06529v1
Compressor summary: This paper proposes introspective planning to help large language models create uncertainty-aware plans for robotic tasks, improving success rates and safety without fine-tuning.
http://arxiv.org/abs/2402.06521v1
Compressor summary: The paper presents a new method for reconstructing 3D façade details using point clouds and a pre-defined 3D model library, improving on conventional approaches without making rectangularity assumptions.
http://arxiv.org/abs/2402.06512v1
Compressor summary: LIFTED is a multimodal mixture-of-experts approach for clinical trial outcome prediction that transforms different modalities into natural language descriptions and integrates them using a sparse Mixture-of-Experts framework, improving performance over the best baseline.
http://arxiv.org/abs/2402.06509v1
Compressor summary: The paper investigates how model uncertainty relates to human clarification-seeking behavior and proposes a method to generate clarification questions based on model uncertainty estimation to improve dialogue system performance.
http://arxiv.org/abs/2402.06506v1
Compressor summary: The paper proposes a method to classify 3D building models with facade details using deep neural networks that fuse geometric features with point cloud data, improving performance and advancing semantic segmentation.
http://arxiv.org/abs/2402.06503v1
Compressor summary: ACTER is an algorithm that generates actionable and diverse counterfactual sequences for explaining and preventing failure in reinforcement learning.
http://arxiv.org/abs/2402.06501v1
Compressor summary: The paper proposes using interactive machine learning to enhance human-AI collaboration in complex Command and Control operations, addressing three research focus areas: planning, team optimization, and scalability.
http://arxiv.org/abs/2402.06500v1
Compressor summary: The paper proposes a model and algorithm for detecting root causes of anomalies in threshold-based IT systems using causal discovery and subgraph traversal, with an agent-based extension for relaxing causality assumptions.
http://arxiv.org/abs/2402.06499v1
Compressor summary: The study proposes BarlowTwins-CXR, a self-supervised learning method that improves chest X-ray image analysis and abnormality localization by addressing domain inconsistency issues in cross-domain transfer learning.
http://arxiv.org/abs/2402.06497v1
Compressor summary: The authors develop a pixel-level iris segmentation model using Segment Anything Model (SAM) and fine-tune it with Focal Loss, achieving high accuracy on three datasets.
http://arxiv.org/abs/2402.06494v1
Compressor summary: The paper proposes using deep learning models (2D and 3D U-Net) to automate the segmentation of the target volume in complex radiotherapy treatments, improving accuracy and efficiency compared to manual contouring.
http://arxiv.org/abs/2402.06492v1
Compressor summary: The SQ-Transformer model improves compositionality in language understanding tasks by clustering word embeddings and encouraging systematic attention patterns, especially when the training data is low-complexity.
http://arxiv.org/abs/2402.06487v1
Compressor summary: The paper explains the basics of mathematical logic for legal professionals working with rule-based AI, discussing its limitations and interactions in logical, computational, and mathematical aspects using traffic regulations as examples.
http://arxiv.org/abs/2402.06475v1
Compressor summary: RS-CapRet is a Vision and Language method for remote sensing tasks that uses a large language model and image encoders pre-trained with contrastive learning to generate captions and retrieve images based on textual queries.
http://arxiv.org/abs/2402.06465v1
Compressor summary: The paper investigates how to privately estimate low-dimensional structures in datasets using different types of singular value gaps and provides new upper and lower bounds for the required number of points.
http://arxiv.org/abs/2402.06461v1
Compressor summary: SeqRF is a method that improves the speed and quality of sampling in continuous-time generative models by learning a linear path to straighten the probability flow and reduce global truncation error.
http://arxiv.org/abs/2402.06457v1
Compressor summary: V-STaR improves large language models by using both correct and incorrect solutions during self-improvement to train a verifier that selects better solutions at inference time, leading to significant test accuracy improvements on various tasks.
http://arxiv.org/abs/2402.06452v1
Compressor summary: The study introduces a new algorithmic framework that simultaneously constructs and evaluates combinations of decision trees for prediction, unlike existing methods like bagging and boosting that do not directly evaluate the combination performance.
http://arxiv.org/abs/2402.06446v1
Compressor summary: ControlUDA is a diffusion-assisted framework that uses text-to-image models and target prior to generate realistic images for unsupervised domain adaptation in semantic segmentation under adverse weather conditions.
http://arxiv.org/abs/2402.06445v1
Compressor summary: Graph neural networks can learn classical algorithms without iterating through them, by training on their equilibria instead.
http://arxiv.org/abs/2402.06443v1
Compressor summary: The paper proposes a multi-task explainable neural model for detecting misinformation that generates explanations as text summaries.
http://arxiv.org/abs/2402.06441v1
Compressor summary: The paper introduces a new neural network architecture that combines ResNet and Taylor series, improving time series analysis accuracy and opening up new possibilities for research and applications.
http://arxiv.org/abs/2402.06436v1
Compressor summary: The study compares two types of image-to-image translation networks for estimating 2D-3D correspondences and finds that diffusion models perform better than GANs for 6D object pose estimation.
http://arxiv.org/abs/2402.06434v1
Compressor summary: Confounded datasets cause problems in continual learning that are not addressed by standard methods; training jointly on all tasks can help overcome these challenges.
http://arxiv.org/abs/2402.06423v1
Compressor summary: CurveFormer++ is a single-stage Transformer method for 3D lane detection from perspective images without image feature transformation, using curve propagation and attention mechanisms.
http://arxiv.org/abs/2402.06420v1
Compressor summary: This workshop gathers experts on open-domain dialogue research, focusing on simulating human intelligence in conversations, and includes a research track and shared task with live human evaluation.
http://arxiv.org/abs/2402.06414v1
Compressor summary: The paper proposes using cryptographic techniques like Zero-Knowledge Proofs to ensure fairness, transparency, and privacy in generative AI models for domains like medicine and law.
http://arxiv.org/abs/2402.06402v1
Compressor summary: HTrMRL is a meta-reinforcement learning method that leverages past experiences to improve performance, efficiency, and generalization in new tasks.
http://arxiv.org/abs/2402.06390v1
Compressor summary: Emerging deep-learning techniques like NeRFs and Gaussian Splatting (GS) are revolutionizing computer graphics, while deepfake methods raise ethical concerns but offer potential for creating realistic avatars when combined with other technologies.
http://arxiv.org/abs/2402.06389v1
Compressor summary: The paper presents a prompting-free generative method to create personalized painterly content based on users' aesthetic preferences using semantic injection and a genetic algorithm.
http://arxiv.org/abs/2402.06385v1
Compressor summary: The authors propose using facial expressions and head movements for Human-AI interaction instead of text chats, and present three approaches to model non-verbal cues in real-time.
http://arxiv.org/abs/2402.06380v1
Compressor summary: The authors develop algorithms for learning Gaussian trees and polytrees from data, with optimal and efficient methods for distribution and structure learning, and provide theoretical and empirical results.
http://arxiv.org/abs/2402.06379v1
Compressor summary: The text proposes a technique called Learning Using Privileged Information, which improves tumor segmentation on digital mammograms by using an auxiliary model that accesses more data than the main model, leading to better results and a higher F1 score.
http://arxiv.org/abs/2402.06378v1
Compressor summary: The FDVM-Net is a frequency-domain based network that corrects exposure abnormalities in endoscopic images by reconstructing their frequency domain, using a C-SSM block to capture local and long-range dependencies, and achieves state-of-the-art results in speed and accuracy.
http://arxiv.org/abs/2402.06377v1
Compressor summary: Reinforcement learning and particle filter are integrated to optimize geosteering decisions by processing real-time well-log data and estimating the well's location relative to stratigraphic layers.
http://arxiv.org/abs/2402.06367v1
Compressor summary: The paper proposes a transformer event encoder with point process loss to handle irregular sampling patterns in electronic health records, and shows its effectiveness in various benchmark and real-world datasets.
http://arxiv.org/abs/2402.06359v1
Compressor summary: The paper proposes a formal model of human values, grounded in social psychology research, for computationally representing and reasoning over values in AI systems; it helps address the value alignment problem and supports individuals and communities in making informed decisions, with applications to real-world use cases.
http://arxiv.org/abs/2402.06353v1
Compressor summary: The paper examines 20 popular medical imaging datasets hosted on CCPs, finds quality issues such as vague licenses, missing metadata, and duplicates, and introduces a commons-based stewardship model to improve their documentation, sharing, and maintenance.
http://arxiv.org/abs/2402.06348v1
Compressor summary: The paper presents a new framework for restless multi-armed bandits that ensures fair exposure to each arm by proportional allocation of pulls based on their stationary reward distribution, achieving sublinear regret in single and multiple pull scenarios.
http://arxiv.org/abs/2402.06342v1
Compressor summary: The paper explores how to improve document-level neural machine translation by giving more weight to target language context in the source sentences.
http://arxiv.org/abs/2402.06341v1
Compressor summary: RareBench is a benchmark and dataset to evaluate large language models' ability to diagnose rare diseases, showing promising results compared to specialist physicians.
http://arxiv.org/abs/2402.06332v1
Compressor summary: The paper introduces InternLM-Math, an open-source math reasoning model that can reason, verify, prove, and augment math problems in various settings, achieving state-of-the-art performance on several benchmarks.
http://arxiv.org/abs/2402.06331v1
Compressor summary: This paper evaluates Open Set Recognition methods, which handle uncertainty by distinguishing between known and unknown classes, and provides guidelines for their assessment.
http://arxiv.org/abs/2402.06329v1
Compressor summary: The study presents a network that uses FlowNet2 and POFRN-Net to recognize displacement of an RC frame structure from monocular-camera videos, demonstrating its application on four floors across three videos.
http://arxiv.org/abs/2402.06330v1
Compressor summary: This article surveys recent advances in continual graph learning, focusing on overcoming catastrophic forgetting and improving performance continuously.
http://arxiv.org/abs/2402.06326v1
Compressor summary: TIGPrompt is a versatile framework that bridges temporal and semantic gaps in TIG models by using temporally-aware prompts for various tasks.
http://arxiv.org/abs/2402.06323v1
Compressor summary: The paper proves that random over-parameterized neural networks with a flat prior and an underlying narrow teacher network can generalize well by biasing towards simpler functions, reducing sample complexity.
http://arxiv.org/abs/2402.06318v1
Compressor summary: TimEHR is a novel GAN model that generates time series data from EHRs by treating them as images and using two conditional GANs to handle missingness patterns and values, achieving better performance than existing methods.
http://arxiv.org/abs/2402.06315v1
Compressor summary: The paper presents MSADGN, a deep learning method for cross-scene sea-ice land clutter classification that uses multisource semi-supervised learning; its three modules extract domain-invariant, domain-specific, and domain-related features from labeled and unlabeled source domains and generalize them to an arbitrary target domain, outperforming 10 state-of-the-art methods across twelve domain generalization scenarios.
http://arxiv.org/abs/2402.06295v1
Compressor summary: The study presents an interpretable multimodal deep neural network approach to predict antimicrobial multidrug resistance in ICU patients using electronic health records.
http://arxiv.org/abs/2402.06293v1
Compressor summary: ProFITi is a new model for forecasting irregular time series with missing values, using conditional normalizing flows to learn joint distributions without fixed-shape assumptions.
http://arxiv.org/abs/2402.06288v1
Compressor summary: The paper proposes a new refinement strategy for creating Level of Detail 3 (LoD3) building models using lower-level models and laser point clouds, overcoming challenges in standardization and accuracy, and enabling more applications.
http://arxiv.org/abs/2402.06287v1
Compressor summary: The text introduces a new taxonomy for studying human-machine interaction in high-stake tasks using machine learning systems.
http://arxiv.org/abs/2402.06276v1
Compressor summary: The study presents an active learning method for time-series models that considers safety constraints and dynamically explores the input space using Gaussian processes.
http://arxiv.org/abs/2402.06268v1
Compressor summary: YAMLE is an open-source framework that helps machine learning researchers and practitioners to easily prototype, experiment, and reproduce their models using PyTorch libraries.
http://arxiv.org/abs/2402.06266v1
Compressor summary: MORL algorithms face challenges with scalarization methods and value function interference, leading to suboptimal policies in certain environments.
http://arxiv.org/abs/2402.06264v1
Compressor summary: This study develops LLaVA-Docent, a multi-modal large language model that enhances art appreciation education using technology.
http://arxiv.org/abs/2402.06262v1
Compressor summary: The paper proposes RoCo, a robust cache eviction policy for Large Language Models based on temporal attention scores and robustness measures; it addresses the limitations of existing policies in importance score calculation and eviction scope construction, outperforms them in both the prefilling and auto-regressive decoding stages, and is released alongside EasyKV, a user-friendly software package for key-value constrained generative inference.
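Score-based KV-cache eviction in general (not RoCo's specific temporal-attention scoring) amounts to keeping the highest-scoring cached tokens under a memory budget; a minimal sketch with illustrative names:

```python
def evict_kv_cache(cache, scores, budget):
    """Keep at most `budget` cached entries, dropping those with the lowest
    importance scores. `cache` maps token positions to KV entries and
    `scores` maps the same positions to importance scores."""
    if len(cache) <= budget:
        return dict(cache)
    keep = sorted(cache, key=lambda pos: scores[pos], reverse=True)[:budget]
    return {pos: cache[pos] for pos in keep}
```

The design questions such a policy must answer, and where methods differ, are how `scores` is computed and which positions are even eligible for eviction.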
http://arxiv.org/abs/2402.06255v1
Compressor summary: The paper proposes Prompt Adversarial Tuning (PAT), a method to train and embed a defense control mechanism as a prefix to user prompts, protecting Large Language Models from producing harmful information by reducing the success rate of advanced attacks while maintaining high answer accuracy.
http://arxiv.org/abs/2402.06251v1
Compressor summary: The study uses deep learning to automatically identify insomnia patients by extracting optimal features from EEG signals and achieves high accuracy without sleep stage annotation.
http://arxiv.org/abs/2402.06249v1
Compressor summary: This paper proposes a clustering-based defense mechanism to counter adversarial patch attacks in image classification tasks by isolating and neutralizing anomalous image segments, achieving better accuracy than existing methods.
http://arxiv.org/abs/2402.06244v1
Compressor summary: The paper proposes a new training procedure called CRMT that improves the robustness of multi-modal models by regulating essential components and reducing modality preference influence.
http://arxiv.org/abs/2402.06223v1
Compressor summary: The paper introduces a causal model for multimodal data that reveals how contrastive representation learning identifies coupled variables and suggests linear independent component analysis as an effective tool for learning disentangled representations.
http://arxiv.org/abs/2402.06221v1
Compressor summary: ResumeFlow is a tool that uses large language models like GPT-4 to quickly tailor resumes to specific job postings, making the process easier and more accurate for applicants.
http://arxiv.org/abs/2402.06220v1
Compressor summary: The paper proposes a meta-SCM method to identify and use causal factors from different NLP tasks, improving zero-shot capabilities and reducing spurious correlations.
http://arxiv.org/abs/2402.06213v1
Compressor summary: UAD is a novel method for multi-source-free domain adaptation (MSFDA) that uses knowledge distillation to adapt models from multiple institutions without accessing their data and achieves better performance on image-based diagnosis tasks.
http://arxiv.org/abs/2402.06212v1
Compressor summary: The authors propose a method to reduce halos in image enhancement algorithms by considering how our eyes perceive light and dark variations.
http://arxiv.org/abs/2402.06204v1
Compressor summary: The paper investigates if large language models are good at both generating and evaluating answers, finding that they perform worse in evaluation tasks and sometimes give unfaithful evaluations.
http://arxiv.org/abs/2402.06198v1
Compressor summary: GS-CLIP uses 3D Gaussian Splatting and a pre-trained vision-language model to enhance 3D representation for object identification and classification.
http://arxiv.org/abs/2402.06196v1
Compressor summary: The paper reviews three popular families of Large Language Models (GPT, LLaMA, PaLM) together with related techniques, datasets, metrics, and benchmarks, discusses their characteristics, contributions, and limitations, and identifies open challenges and directions in this rapidly evolving field.
http://arxiv.org/abs/2402.06191v1
Compressor summary: The BSCCM dataset is a large collection of images of white blood cells with surface protein measurements, aiming to aid in developing and testing computational imaging algorithms for biomedical purposes.
http://arxiv.org/abs/2402.06190v1
Compressor summary: LoGoNet is a new neural network architecture that uses self-supervised learning to improve medical image segmentation with less data and cost.
http://arxiv.org/abs/2402.06188v1
Compressor summary: S3L is a self-supervised framework that learns high-quality features from whole slide images for biomedical diagnostic tasks.
http://arxiv.org/abs/2402.06187v1
Compressor summary: Premier-TACO is a multitask feature representation learning method that improves few-shot policy learning efficiency in sequential decision-making tasks by pretraining with multitask offline datasets and using a novel negative example sampling strategy.
http://arxiv.org/abs/2402.06185v1
Compressor summary: The study presents SpinePose, an artificial intelligence tool that accurately and reliably predicts spinopelvic parameters from X-ray images without manual entry.
http://arxiv.org/abs/2402.06184v1
Compressor summary: The paper explores the fractal nature of the boundary between stable and divergent neural network training, using hyperparameter values as a function input.
http://arxiv.org/abs/2402.06171v1
Compressor summary: Mixup augments data by combining instances and labels, and its unique geometric configuration in deep networks enhances calibration by aligning activations along decision boundaries.
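Mixup itself is simple enough to sketch in a few lines; this is the generic formulation (convex combinations of inputs and one-hot labels, with lambda drawn from a Beta distribution), not anything specific to the paper's analysis:

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Return a convex combination of two examples and their one-hot labels.
    `alpha` parameterizes the Beta distribution that lambda is drawn from."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```

The calibration effect the paper studies comes from these interpolated pairs populating the space between classes along decision boundaries.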
http://arxiv.org/abs/2402.06165v1
Compressor summary: The study proposes a contrastive learning framework that uses supervised and self-supervised signals to detect facial action units (AUs) without relying on pixel-level information, improving performance over existing methods.
http://arxiv.org/abs/2402.06160v1
Compressor summary: The paper proposes a new approach to estimate predictive uncertainty using evidential deep learning and shows that existing methods have spurious uncertainty issues, which are resolved by modeling a consistent target distribution with a mixture of Dirichlet distributions.
http://arxiv.org/abs/2402.06155v1
Compressor summary: The authors propose and evaluate model editing with canonical examples, a method that improves language models' performance on various tasks by finetuning selected sense vectors using a small number of examples for desired behaviors.
http://arxiv.org/abs/2402.06152v1
Compressor summary: The text describes a new algorithm that uses infrared imaging and color processing to accurately identify targets in electric power construction monitoring videos with low false recognition rates.
http://arxiv.org/abs/2402.06150v1
Compressor summary: The paper proposes a probabilistic framework to learn domain-invariant representations from insufficient data by measuring discrepancy between mixture distributions and aligning probabilistic embeddings.
http://arxiv.org/abs/2402.06149v1
Compressor summary: HeadStudio is a framework that uses 3D Gaussian splatting to create realistic and animated avatars from text prompts, achieving high-quality results and interactive control.
http://arxiv.org/abs/2402.06137v1
Compressor summary: This paper improves the privacy analysis of two classical differentially private mechanisms using Gaussian noise under certain assumptions, and proposes a new adaptive mechanism for high privacy scenarios.
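The classical Gaussian mechanism whose analysis work like this tightens simply adds calibrated normal noise to a query answer; the sketch below uses the textbook sufficient calibration (valid for epsilon < 1), not the paper's improved bound:

```python
import math
import random

def gaussian_mechanism(true_value, sensitivity, epsilon, delta, rng=random):
    """Release `true_value` with N(0, sigma^2) noise, calibrated by the
    classical sufficient condition
    sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon."""
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon
    return true_value + rng.gauss(0.0, sigma)
```

Tighter analyses matter because they let the same privacy budget be met with a smaller sigma, i.e. less noise for the same (epsilon, delta).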
http://arxiv.org/abs/2402.06136v1
Compressor summary: SIR is a method to decompose shadows for inverse rendering on indoor scenes using multi-view data, improving realism in material estimation under unknown light positions.
http://arxiv.org/abs/2402.06135v1
Compressor summary: The paper proposes HOME-GCL, a novel method that learns representations for multiple categories of map entities (e.g., road segments and land parcels) using a heterogeneous graph and contrastive learning.
http://arxiv.org/abs/2402.06132v1
Compressor summary: The paper investigates user clicking patterns in interactive segmentation and proposes a new evaluation strategy based on adversarial attacks to assess the robustness of models with respect to click positions.
http://arxiv.org/abs/2402.06128v1
Compressor summary: ATP is a plug-and-play node-wise propagation optimization strategy that adapts to different topological roles of nodes in web-scale graphs, improving efficiency and performance of scalable graph neural networks.
http://arxiv.org/abs/2402.06126v1
Compressor summary: The paper introduces Learn-To-be-Efficient (LTE), an algorithm to train large language models with more structured activation sparsity for efficiency and improved performance.
http://arxiv.org/abs/2402.06125v1
Compressor summary: The study introduces a new algorithm to generate text guided by specific rhetorical relations without fine-tuning the language model, and evaluates it using both automatic and human methods.
http://arxiv.org/abs/2402.06121v1
Compressor summary: iDEM is a fast, scalable, and simulation-free algorithm that generates independent samples from unnormalized probability distributions using only the energy function and its gradient.
http://arxiv.org/abs/2402.06120v1
Compressor summary: The paper introduces a framework based on group and symmetry principles to evaluate large language models' reasoning capabilities, focusing on arithmetic reasoning and four group properties.
http://arxiv.org/abs/2402.06119v1
Compressor summary: The Continuum Physical Dataset (ContPhy) is a new benchmark for testing AI's ability to reason about diverse physical properties and dynamics, especially for soft-bodied objects, and it introduces an oracle model that combines particle-based physics and large language models.
http://arxiv.org/abs/2402.06118v1
Compressor summary: The paper introduces ViGoR, a framework that uses fine-grained reward modeling to improve visual grounding of large vision language models with human evaluations and automated methods.
http://arxiv.org/abs/2402.06117v1
Compressor summary: The paper presents a new method for motion deblurring that uses pixel adaptive and feature attentive design, content-aware global-local filtering, and non-uniform sampling to handle large blur variations and achieve better performance than existing approaches.
http://arxiv.org/abs/2402.06110v1
Compressor summary: The study compares two machine learning models for simulating carbon storage and proposes faster, more accurate methods using surrogate models and data assimilation techniques.
http://arxiv.org/abs/2402.06107v1
Compressor summary: CHEESE is a framework that detects cheating in online exams using multiple features and instances, achieving high performance compared to existing methods.