This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-09-11, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2409.06704v1
Compressor summary: GeoCalib is a deep neural network that uses 3D geometry to estimate camera parameters from a single image, outperforming existing methods in accuracy and robustness.
http://arxiv.org/abs/2409.06703v1
Compressor summary: LEIA is a novel NeRF-based method to represent and interpolate dynamic 3D objects without relying on heuristics or motion information.
http://arxiv.org/abs/2409.06694v1
Compressor summary: Key points:
- Cancer is a disease of uncontrolled cell growth
- TCRs are proteins that help recognize antigens, including cancer-related ones
- TCR-based immunotherapies use sequencing technologies to find potent anti-cancer TCRs
- DANCE is a method that generates images from TCR sequences using CGR and kaleidoscopic images
- The study classifies TCRs based on their target cancer cells using deep learning vision models

Summary: The paper introduces DANCE, a method that converts TCR protein sequences into chaos-enhanced kaleidoscopic images for visual analysis and classification of their target cancer cells using deep learning.
http://arxiv.org/abs/2409.06692v1
Compressor summary: The paper proposes a hybrid fact-checking approach for knowledge graphs that combines different methods to achieve better performance than existing approaches.
http://arxiv.org/abs/2409.06691v1
Compressor summary: The authors propose a distributional soft preference labeling method to improve Direct Preference Optimization (DPO) by using weighted geometric averaging, which improves performance on standard benchmarks for alignment research.
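To make the idea concrete, here is a minimal sketch (my own illustration, not the paper's code) of a DPO-style loss with a soft preference label p; the weighted geometric averaging of the two preference directions shows up as a convex combination of the two log-losses:

```python
import math


def soft_dpo_loss(logp_chosen, logp_rejected, p, beta=0.1):
    """DPO loss with a distributional soft label p in (0, 1).

    A hard label (p = 1) recovers standard DPO; p = 0.5 expresses no
    preference. The convex combination in log space is equivalent to a
    weighted geometric average sigma(m)^p * sigma(-m)^(1-p) of the two
    per-direction likelihoods. Function name and interface are assumptions.
    """
    margin = beta * (logp_chosen - logp_rejected)  # implicit reward margin
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return -(p * math.log(sigmoid(margin)) + (1 - p) * math.log(sigmoid(-margin)))
```

Note the symmetry: swapping the two responses while flipping the label p to 1 - p leaves the loss unchanged, which is the sanity check one would expect from a soft labeling scheme.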
http://arxiv.org/abs/2409.06685v1
Compressor summary: GigaGS is a novel 3D Gaussian Splatting method that efficiently and effectively reconstructs large-scale scene surfaces with high quality by applying partitioning and consistency constraints.
http://arxiv.org/abs/2409.06683v1
Compressor summary: The proposed method improves object pose distribution estimation in robotics by using CAD models and correspondence distributions, leading to faster convergence and better performance.
http://arxiv.org/abs/2409.06679v1
Compressor summary: E2LLM is a novel approach that improves large language models' ability to process long contexts while reducing computational complexity and leveraging pretrained models.
http://arxiv.org/abs/2409.06671v1
Compressor summary: The study uses advanced AI models like YOLOv8 to accurately diagnose diseases in sweet orange leaves, potentially transforming disease detection in agriculture and promoting sustainable farming practices.
http://arxiv.org/abs/2409.06669v1
Compressor summary: The paper introduces a new dynamic router mechanism for Mixture-of-Experts models that adapts the number of experts assigned to each input token based on its importance, improving performance on NLP tasks.
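One plausible instantiation of such a dynamic router (my assumption; the paper's importance criterion may differ) is cumulative-probability routing, where a token keeps acquiring experts until enough router mass is covered, so sharply-routed tokens get one expert and ambiguous ones get several:

```python
import numpy as np


def dynamic_route(router_logits, threshold=0.7, max_experts=4):
    """Assign a variable number of experts per token (illustrative sketch).

    router_logits: array of shape (tokens, experts). For each token, experts
    are added in order of decreasing router probability until their
    cumulative probability reaches `threshold`, capped at `max_experts`.
    """
    probs = np.exp(router_logits - router_logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)  # softmax per token
    assignments = []
    for p in probs:
        order = np.argsort(-p)  # experts by decreasing probability
        chosen, mass = [], 0.0
        for e in order[:max_experts]:
            chosen.append(int(e))
            mass += p[e]
            if mass >= threshold:
                break
        assignments.append(chosen)
    return assignments
```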
http://arxiv.org/abs/2409.06666v1
Compressor summary: LLaMA-Omni is a new model that enables real-time, high-quality speech interaction with large language models without transcription and with low latency.
http://arxiv.org/abs/2409.06662v1
Compressor summary: Key points:
- novel method for recovering human motion from monocular video
- uses Gravity-View (GV) coordinate system to reduce ambiguity and error accumulation
- outperforms state-of-the-art methods in accuracy and speed

Summary: The paper proposes a new method that estimates human motion from monocular video using a gravity-aligned coordinate system, which improves accuracy and speed over existing methods.
http://arxiv.org/abs/2409.06648v1
Compressor summary: The paper proposes a new image vectorization method that considers depth ordering, convexification, and curvature-based inpainting to create scalable shape layers for better editing and semantic vectorization.
http://arxiv.org/abs/2409.06644v1
Compressor summary: EyeCLIP is a multi-modal visual-language foundation model that uses partial text data to improve early detection of eye diseases by leveraging large unlabeled and labeled data through a pretraining strategy.
http://arxiv.org/abs/2409.06639v1
Compressor summary: TeXBLEU is a new metric for evaluating mathematical expressions in LaTeX format that performs better than traditional metrics and has high correlation with human evaluation.
http://arxiv.org/abs/2409.06633v1
Compressor summary: The authors propose SaRA, a method to improve image and video generation tasks by re-utilizing ineffective parameters in pre-trained diffusion models and fine-tuning them with a sparse weight matrix.
http://arxiv.org/abs/2409.06624v1
Compressor summary: The paper studies how to choose the right mixture of extra languages and learning rate for continuous pre-training large language models to improve their Chinese ability and adapt to other domains, and deploys a 70B version on a chat system.
http://arxiv.org/abs/2409.06622v1
Compressor summary: The study examines how well LLMs capture syntactic and semantic information in Italian using synthetic data from BLMs, finding that abstract linguistic concepts are not well-represented in pre-trained sentence embeddings.
http://arxiv.org/abs/2409.06620v1
Compressor summary: The paper presents a unified framework for text-to-3D content generation that uses multi-view guidance and a novel densification algorithm to produce realistic 3D models efficiently.
http://arxiv.org/abs/2409.06618v1
Compressor summary: Key points:
- Apply self-supervised learning techniques to seafloor imagery dataset (BenthicNet)
- Study performance for complex hierarchical multi-label classification task
- Show benefits of using in-domain data for pre-training over ImageNet

Summary: The authors use self-supervised learning on a large seafloor imagery dataset to improve hierarchical multi-label classification and show that pre-training with in-domain data outperforms ImageNet.
http://arxiv.org/abs/2409.06617v1
Compressor summary: The paper proposes a selective feature extraction method for Multiple Object Tracking that reduces overhead and improves accuracy in occlusion scenarios.
http://arxiv.org/abs/2409.06612v1
Compressor summary: The study proposes label-free evaluation metrics for SSL encoders using unlabelled data and investigates their correlation with linear probe accuracy across different SSL methods.
http://arxiv.org/abs/2409.06609v1
Compressor summary: This paper discusses challenges and solutions for using machine learning to analyze magnetic resonance spectroscopic imaging data, focusing on improving precision and error characterization.
http://arxiv.org/abs/2409.06603v1
Compressor summary: The proposed GRTN network uses a multi-fusion gated recurrent Transformer to achieve state-of-the-art video denoising performance with minimal delay, by selectively fusing relevant information from previous frames.
http://arxiv.org/abs/2409.06601v1
Compressor summary: The Skepticism Modeling (SM) approach improves large language models' (LLMs) uncertainty estimation by combining token and logits information, pre-training, and fine-tuning with doubt-emotion-aware data.
http://arxiv.org/abs/2409.06595v1
Compressor summary: The paper introduces GroUSE, a benchmark to evaluate judge models in RAG systems, and finds that existing judges have limitations, while finetuning Llama-3 improves its performance.
http://arxiv.org/abs/2409.06590v1
Compressor summary: Key points:
- Paper proposes a new multi-scale feature fusion network for single image super-resolution using convolutional and Transformer networks
- Model fuses global and local information through two-branch architecture
- Model uses modular connection to supplement low-pixel images with shallow and deep features
- Model outperforms other lightweight models with same parameters

Summary: The paper introduces a novel network that combines convolutional and Transformer networks for super-resolution, fusing global and local image features in two branches and supplementing low-pixel images with shallow and deep features, achieving better results than other lightweight models.
http://arxiv.org/abs/2409.06585v1
Compressor summary: Key points:
- Hip replacement procedures improve quality of life and mobility
- A temporal graph convolutional neural network (TG-CNN) model predicts hip replacement risk one year in advance using primary care medical event codes
- The model achieves high accuracy and calibration, and outperforms four baselines

Summary: The study developed a TG-CNN model that can accurately predict hip replacement need a year ahead by analysing primary care data, potentially improving patient care and health service efficiency.
http://arxiv.org/abs/2409.06583v1
Compressor summary: The authors propose a teacher-student framework using channel augmentation for 3D semi-supervised object detection, which improves performance on the KITTI dataset.
http://arxiv.org/abs/2409.06567v1
Compressor summary: The paper explores how well multilingual pretrained language models capture abstract linguistic representations by using synthetic data and a new multiple-choice task focused on subject-verb agreement in various languages, finding that these models have language-specific differences and syntactic structure is not shared even among closely related languages.
http://arxiv.org/abs/2409.06559v1
Compressor summary: Learn2Aggregate is a machine learning framework that uses graph neural networks to selectively aggregate constraints in Chvátal-Gomory cuts for faster and stronger mixed integer linear programming solutions.
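For background, a Chvátal-Gomory cut aggregates constraints with nonnegative multipliers and rounds down, yielding a valid inequality for the integer hull; the learned component here chooses which constraints enter the aggregation. A minimal sketch of the rounding step (function name and interface are my own):

```python
import math


def cg_cut(A, b, lam):
    """Chvátal-Gomory cut from an aggregation multiplier vector lam >= 0.

    Given constraints A x <= b with x >= 0 integral, the inequality
    floor(lam^T A) x <= floor(lam^T b) is valid for all integer-feasible x.
    Selecting which entries of lam are nonzero (i.e. which constraints to
    aggregate) is the decision the framework learns with a GNN.
    """
    m, n = len(A), len(A[0])
    coeffs = [math.floor(sum(lam[i] * A[i][j] for i in range(m))) for j in range(n)]
    rhs = math.floor(sum(lam[i] * b[i] for i in range(m)))
    return coeffs, rhs
```

For example, aggregating the single constraint 2x <= 3 with multiplier 0.5 gives the cut x <= 1, which removes the fractional vertex x = 1.5 without cutting off any integer point.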
http://arxiv.org/abs/2409.06550v1
Compressor summary: The article presents LIMA, a framework for text analysis with deep neural networks, supporting over 60 languages and integrating with other platforms using Universal Dependencies.
http://arxiv.org/abs/2409.06542v1
Compressor summary: The paper proposes four adaptive learning rates for gradient descent based on terminal attractor theory and terminal sliding mode theory to improve its convergence speed, and evaluates them in simulations.
http://arxiv.org/abs/2409.06540v1
Compressor summary: The authors propose a numerical representation of narratives based on Greimas' Actantial Model, which can be used to analyze news articles and understand how different texts present the same topic with different structures.
http://arxiv.org/abs/2409.06525v1
Compressor summary: MENSA is a deep learning method for predicting the time until a patient with ALS loses various physical functions, improving on existing approaches by jointly learning covariate representations and event dependencies.
http://arxiv.org/abs/2409.06522v1
Compressor summary: The paper presents methods to make data-driven weather forecasting models more interpretable using the Koopman operator, while addressing the challenges of applying it to large-scale atmospheric problems.
http://arxiv.org/abs/2409.06520v1
Compressor summary: Our method automatically calibrates hyperspectral cameras on small aircraft using only spectral imagery and GPS/INS trajectory data, achieving accuracy similar to manual calibration.
http://arxiv.org/abs/2409.06518v1
Compressor summary: The paper explores how large language models represent knowledge about Olympic medal counts and finds they excel at reporting total medals but struggle with ranking information, unlike humans.
http://arxiv.org/abs/2409.06509v1
Compressor summary: The authors propose a method to make neural networks more human-like by transferring knowledge from a teacher model trained to imitate human judgments, improving their performance on various tasks and generalization abilities.
http://arxiv.org/abs/2409.06506v1
Compressor summary: Key points:
- Discrete Laplacian operator is important for 3D geometry processing but hard to define on point clouds
- Previous methods used local triangulation, which is not robust or accurate
- Proposed method uses KNN graph and GNNs to learn Laplacian operator
- Novel training scheme imitates ground-truth Laplacian behavior on probe functions
- Method reduces error by an order of magnitude and handles sparse point clouds well
- Method enables geometry processing on point clouds with learned Laplacian operator

Summary: The paper proposes a method to learn the discrete Laplacian operator on point clouds using GNNs and a novel training scheme, achieving high accuracy and generalization, and enabling geometry processing applications on point clouds.
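As a point of reference, the simplest hand-crafted operator on the same KNN graph is the combinatorial graph Laplacian; the paper replaces its fixed weights with ones predicted by a GNN trained to imitate the ground-truth operator on probe functions. A minimal NumPy sketch of the baseline (my illustration, not the paper's code):

```python
import numpy as np


def knn_graph_laplacian(points, k=8):
    """Combinatorial Laplacian L = D - W on a symmetrized KNN graph.

    points: array of shape (n, 3). Uses uniform edge weights, the crude
    baseline a learned operator improves on; rows of L sum to zero, so L
    annihilates constant functions, as any Laplacian should.
    """
    n = len(points)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self from neighbor search
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(d2[i])[:k]:
            W[i, j] = W[j, i] = 1.0  # symmetrize the KNN relation
    return np.diag(W.sum(1)) - W
```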
http://arxiv.org/abs/2409.06485v1
Compressor summary: The paper proposes a Re-Balancing Contrastive Decoding (RBD) method that improves attention distribution in Visual-Language Models (VLMs) to reduce textual bias and enhance visual information, mitigating hallucinations.
http://arxiv.org/abs/2409.06477v1
Compressor summary: The paper applies MPC, rollout, and RL to computer chess, using a new architecture for move selection that incorporates multiple chess engines and improves their performance, particularly for position evaluation.
http://arxiv.org/abs/2409.06471v1
Compressor summary: This paper proposes a weakly supervised learning method for improving camera pose accuracy using satellite images, without requiring accurate GPS labels for training.
http://arxiv.org/abs/2409.06468v1
Compressor summary: The paper proposes a context-balanced learning objective for contextual adapters in end-to-end speech recognition models to address data imbalance issues and improve performance on rare words.
http://arxiv.org/abs/2409.06445v1
Compressor summary: The paper introduces GenieRedux, an improved model that uses reinforcement learning agents for data generation, enhancing its ability to adapt and perform well in complex environments.
http://arxiv.org/abs/2409.06443v1
Compressor summary: The paper proposes a novel query selection method for compressing DETR models using knowledge distillation, which improves performance and reduces size without high computational costs.
http://arxiv.org/abs/2409.06442v1
Compressor summary: The authors use generative models to create a fashion image dataset tailored to users' preferences and needs, and discuss the importance of expert evaluation for such datasets.
http://arxiv.org/abs/2409.06439v1
Compressor summary: The paper introduces E2Tree, a method to explain random forests in both classification and regression tasks, by showing relationships between response variables, predictors, and their associations using dissimilarity measures.
http://arxiv.org/abs/2409.06437v1
Compressor summary: The note proves, using information-theoretic arguments, that the Gaussian maximum likelihood estimator is consistent in linear auto-regressive models and comes close to optimal performance.
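As a concrete instance of the setting (my illustration, not from the note): in the Gaussian AR(1) model the MLE coincides with least squares, and consistency means the estimate converges in probability to the true coefficient:

```latex
y_t = a\,y_{t-1} + \varepsilon_t,\quad \varepsilon_t \sim \mathcal{N}(0,\sigma^2),
\qquad
\hat a_n = \frac{\sum_{t=2}^{n} y_t\,y_{t-1}}{\sum_{t=2}^{n} y_{t-1}^2}
\;\xrightarrow{\;p\;}\; a \quad (n \to \infty,\ |a| < 1).
```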
http://arxiv.org/abs/2409.06411v1
Compressor summary: LD-DPO is a method to reduce verbosity in language models by decoupling length preference from other preferences, leading to more concise and human-aligned responses.
http://arxiv.org/abs/2409.06407v1
Compressor summary: The paper introduces a taxonomy of uncertainties in NeRF and GS methods for 3D scene reconstruction and proposes techniques to estimate and capture these uncertainties.
http://arxiv.org/abs/2409.06402v1
Compressor summary: The paper proposes that symmetry breaking is important for neural network optimization and introduces a metric to measure it, which can help improve network design and performance.
http://arxiv.org/abs/2409.06386v1
Compressor summary: The paper proposes new coarse-grained sense inventories for natural language processing tasks by semantically matching WordNet and Cambridge dictionaries and shows their advantages in semantic coherence, resource dependency, and usability.
http://arxiv.org/abs/2409.06385v1
Compressor summary: The paper proposes a method to improve text-to-image person retrieval by suppressing noise labels and using attention-weighted selective mask to handle noisy image-text pairings.
http://arxiv.org/abs/2409.06381v1
Compressor summary: The paper proposes a new network method to help decipher ancient Chinese writing system by finding similarities between different font styles.