arxiv compressed, 2024-07-29

This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-07-29, generated by the compressor, my personal LLM-based summarization project.


Floating No More: Object-Ground Reconstruction from a Single Image

http://arxiv.org/abs/2407.18914v1

Compressor summary: ORG is a new method that improves 3D object reconstruction from single images by considering the object's relationship with the ground surface, leading to better shadow rendering and pose manipulation.


SOAP-RL: Sequential Option Advantage Propagation for Reinforcement Learning in POMDP Environments

http://arxiv.org/abs/2407.18913v1

Compressor summary: The paper proposes two algorithms, PPOEM and SOAP, to learn temporally consistent options in POMDPs without supervision, and shows that SOAP outperforms other baselines in various environments.


Do We Really Need Graph Convolution During Training? Light Post-Training Graph-ODE for Efficient Recommendation

http://arxiv.org/abs/2407.18910v1

Compressor summary: LightGODE is a new method that improves the efficiency and effectiveness of recommender systems by using post-training graph ordinary differential equations (ODEs) instead of computationally expensive graph convolutions.


SHIC: Shape-Image Correspondences with no Keypoint Supervision

http://arxiv.org/abs/2407.18907v1

Compressor summary: SHIC is a method to learn 3D templates from objects without manual supervision by using features from open-ended foundation models and non-photorealistic image generators.


Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence

http://arxiv.org/abs/2407.18899v1

Compressor summary: The paper proposes a novel active domain adaptation method called LFTL that leverages learnt knowledge from previous models without accessing source data, and uses contrastive active sampling and visual persistence-guided adaptation to improve performance.


Small Molecule Optimization with Large Language Models

http://arxiv.org/abs/2407.18897v1

Compressor summary: The authors present Chemlactica and Chemma, large language models for generative molecular drug design, and a novel optimization algorithm that uses them to optimize molecules for arbitrary properties with high performance on various benchmarks.


Embedding And Clustering Your Data Can Improve Contrastive Pretraining

http://arxiv.org/abs/2407.18887v1

Compressor summary: The paper proposes using k-means clustering to partition training data into semantic clusters for large-scale contrastive pretraining, leading to improved performance on the MSMARCO dataset.
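The clustering step this summary describes can be sketched generically (illustrative only, not the authors' code; the toy "embeddings" and batch construction are my assumptions): embed the corpus, cluster the embeddings with plain Lloyd's k-means, then draw each contrastive batch from within a single cluster so in-batch negatives are topically related.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means: returns a cluster label for each row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy "document embeddings": two well-separated semantic blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 8)), rng.normal(5, 0.1, (20, 8))])
labels = kmeans(X, k=2)

# Contrastive batches are then sampled within one semantic cluster,
# making in-batch negatives harder (topically related) examples.
cluster_batches = {j: np.flatnonzero(labels == j) for j in set(labels.tolist())}
```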


An Accelerated Multi-level Monte Carlo Approach for Average Reward Reinforcement Learning with General Policy Parametrization

http://arxiv.org/abs/2407.18878v1

Compressor summary: The paper proposes a new reinforcement learning method that converges faster than existing methods without needing prior knowledge.


Generative Adversarial Networks for Imputing Sparse Learning Performance

http://arxiv.org/abs/2407.18875v1

Compressor summary: The paper proposes a GAIN-based framework to impute sparse learner-performance data, enhancing personalized instruction in intelligent tutoring systems (ITSs).


Downlink CCM Estimation via Representation Learning with Graph Regularization

http://arxiv.org/abs/2407.18865v1

Compressor summary: The paper presents an algorithm for estimating the downlink channel covariance matrix using uplink data and a nonlinear mapping function with optimal Lipschitz continuity.


Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment

http://arxiv.org/abs/2407.18854v1

Compressor summary: The text proposes MARNet, a network that enhances image classification models by aligning and blending multimodal information, improving resistance to visual noise and performance.


Repairing Networks of $\mathcal{EL_\perp}$ Ontologies using Weakening and Completing -- Extended version

http://arxiv.org/abs/2407.18848v1

Compressor summary: The paper proposes a framework to repair ontology networks by combining basic operations like debugging, weakening, and completing, while considering autonomy levels of the involved ontologies and alignments.


Enhancing material property prediction with ensemble deep graph convolutional networks

http://arxiv.org/abs/2407.18847v1

Compressor summary: The study evaluates ensemble strategies in deep graph networks for material property prediction tasks and shows that they improve precision for key properties in inorganic materials.


QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning

http://arxiv.org/abs/2407.18841v1

Compressor summary: The paper proposes a method called QT-TDM, which uses a Transformer model for short-term planning and an autoregressive discrete Q-function for long-term return estimation in continuous control tasks.


The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning

http://arxiv.org/abs/2407.18840v1

Compressor summary: The paper presents a new method to compare RL algorithms across different environments using a single hyperparameter setting, which is robust to noise and low-cost, and applies it to study exploration methods in continuous control.


Scalable Group Choreography via Variational Phase Manifold Learning

http://arxiv.org/abs/2407.18839v1

Compressor summary: The text proposes a new method for generating realistic and adaptable group dance motions from music using a phase-based variational generative model that works with any number of dancers.


Deep Companion Learning: Enhancing Generalization Through Historical Consistency

http://arxiv.org/abs/2407.18821v1

Compressor summary: Deep Companion Learning (DCL) is a method to improve deep neural networks by training a companion model that provides targeted supervision based on previous versions of the primary model.


Online Planning in POMDPs with State-Requests

http://arxiv.org/abs/2407.18812v1

Compressor summary: AEMS-SR is a new online planning algorithm for POMDPs with state requests that works efficiently under partial observability and high cost of state information, outperforming existing methods.


Robust Learning in Bayesian Parallel Branching Graph Neural Networks: The Narrow Width Limit

http://arxiv.org/abs/2407.18807v1

Compressor summary: The paper studies how the performance and learning behavior of BPB-GNNs change with varying width, showing that they can achieve better or comparable results in narrow width regimes compared to wide width ones.


Benchmarking Dependence Measures to Prevent Shortcut Learning in Medical Imaging

http://arxiv.org/abs/2407.18792v1

Compressor summary: The paper investigates how to reduce spurious correlations in deep learning models for medical imaging by using various dependence measures between task-related and non-task-related variables.


Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation

http://arxiv.org/abs/2407.18789v1

Compressor summary: The paper explores how applying differential privacy to neural machine translation at different levels (sentence or document) affects privacy/utility trade-off and risk of leaking personal data.


The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs

http://arxiv.org/abs/2407.18786v1

Compressor summary: The paper investigates gender bias in machine translation using large language models, finds it pervasive, and proposes a prompt structure that reduces bias by up to 12%.


Understanding XAI Through the Philosopher's Lens: A Historical Perspective

http://arxiv.org/abs/2407.18782v1

Compressor summary: The paper explores how the concept of explanation in AI has evolved similarly to the philosophy of science, and suggests this could help understand the foundations of explainable AI.


Learning production functions for supply chains with graph neural networks

http://arxiv.org/abs/2407.18772v1

Compressor summary: The paper introduces a new model that combines temporal graph neural networks (GNNs) with an inventory module to infer production functions and forecast transactions in supply chains, improving over existing methods.


Any four real numbers are on all fours with analogy

http://arxiv.org/abs/2407.18770v1

Compressor summary: The paper introduces a formalism for analogies between numbers based on generalized means, which can help with various applications in artificial intelligence and machine learning.
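The generalized (power) means behind this formalism are standard mathematics; the sketch below demonstrates them, and the analogy test `is_analogy` is my illustrative reading (equal means of the outer and inner pairs), not necessarily the paper's exact definition.

```python
import math

def power_mean(a, b, p):
    """Generalized (power) mean M_p(a, b) for positive a, b.

    p = 1  -> arithmetic mean,
    p -> 0 -> geometric mean (taken as the limit),
    p = -1 -> harmonic mean.
    """
    if p == 0:
        return math.sqrt(a * b)  # limit as p -> 0
    return ((a**p + b**p) / 2) ** (1.0 / p)

def is_analogy(a, b, c, d, p, tol=1e-9):
    """Illustrative reading (an assumption, not the paper's definition):
    a : b :: c : d holds w.r.t. M_p when the mean of the outer pair
    (a, d) equals the mean of the inner pair (b, c)."""
    return abs(power_mean(a, d, p) - power_mean(b, c, p)) < tol

# p = 1 recovers the arithmetic analogy a - b = c - d, e.g. 2:4 :: 6:8.
# p = 0 recovers the geometric analogy a / b = c / d, e.g. 2:4 :: 8:16.
```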


Unsupervised Reservoir Computing for Multivariate Denoising of Severely Contaminated Signals

http://arxiv.org/abs/2407.18759v1

Compressor summary: The text describes a machine learning method for denoising multivariate signals that captures both signal and noise dependencies and performs better than other methods.


Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery

http://arxiv.org/abs/2407.18752v1

Compressor summary: The paper presents a new method called KG Structure as Prompt that uses small language models and knowledge graphs to discover causal relationships from observational data more effectively than existing methods.


Multi-Robot System Architecture design in SysML and BPMN

http://arxiv.org/abs/2407.18749v1

Compressor summary: The article proposes a modular modeling and simulation technique for multi-robot systems that uses formal systems engineering methods, the SysML and BPMN architecture description languages (ADLs), and JADE middleware to reduce design complexity and evaluate performance.


FairAIED: Navigating Fairness, Bias, and Ethics in Educational AI Applications

http://arxiv.org/abs/2407.18745v1

Compressor summary: The text explores how AI bias can harm educational equity and suggests techniques to mitigate it while stressing the importance of ethics and diversity in AI-driven education.


Towards Effective and Efficient Continual Pre-training of Large Language Models

http://arxiv.org/abs/2407.18743v1

Compressor summary: The paper presents a method to improve Chinese language models' abilities using synthetic scientific question-answer pairs and continual pre-training, enhancing both general and scientific reasoning skills without compromising the original capacities.


Towards Generalized Offensive Language Identification

http://arxiv.org/abs/2407.18738v1

Compressor summary: This paper studies how well offensive language detection models and datasets work in different situations, using a new benchmark.


AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning

http://arxiv.org/abs/2407.18735v1

Compressor summary: AutoRDF2GML is a tool that converts RDF data into formats suitable for graph machine learning tasks, enabling users to create features based on both content and topology, and providing new datasets for evaluating graph machine learning methods.


Creating an Aligned Corpus of Sound and Text: The Multimodal Corpus of Shakespeare and Milton

http://arxiv.org/abs/2407.18730v1

Compressor summary: The authors create an enriched corpus of Shakespeare and Milton's poems with public domain readings, aligning them at various levels and providing scansion and visualization.


LLASP: Fine-tuning Large Language Models for Answer Set Programming

http://arxiv.org/abs/2407.18723v1

Compressor summary: The paper evaluates several LLMs on generating code for Answer Set Programming (ASP), a declarative formalism where models that handle imperative languages well fall short, and introduces LLASP, a lightweight model fine-tuned on ASP patterns that generates substantially higher-quality ASP programs than non-fine-tuned LLMs and other eager models.


Neurosymbolic AI for Enhancing Instructability in Generative AI

http://arxiv.org/abs/2407.18722v1

Compressor summary: Neurosymbolic AI can improve generative AI's ability to understand and execute complex instructions by using a symbolic task planner, a neural semantic parser, and a neuro-symbolic executor.


ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema

http://arxiv.org/abs/2407.18716v1

Compressor summary: ChatSchema is a method that uses LMMs and OCR to extract and structure information from medical paper reports based on a schema, achieving high precision, recall, and F1-scores in key and value extraction.


BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation

http://arxiv.org/abs/2407.18715v1

Compressor summary: Bidirectional Conditioning Transformer (BCTR) is a novel scene graph generation model that improves predictions by enabling efficient interaction between entities and predicates through bidirectional conditioning factorization.


Cluster-norm for Unsupervised Probing of Knowledge

http://arxiv.org/abs/2407.18712v1

Compressor summary: The text introduces a cluster normalization method to improve unsupervised probing techniques in language models by reducing the impact of salient but unrelated features.


Finite Neural Networks as Mixtures of Gaussian Processes: From Provable Error Bounds to Prior Selection

http://arxiv.org/abs/2407.18707v1

Compressor summary: The paper presents a method to approximate finite neural networks with mixture of Gaussian processes, providing error bounds and applications in Bayesian inference and uncertainty quantification.


Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation

http://arxiv.org/abs/2407.18698v1

Compressor summary: Adaptive contrastive search is a new decoding strategy for language models that improves creativity, diversity, coherence, and quality of generated text by using an adaptive degeneration penalty based on model uncertainty.
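A toy sketch of one decoding step in this spirit: standard contrastive search scores each top-k candidate by probability minus a degeneration penalty (maximum similarity to previous hidden states), and here the trade-off weight is tied to the model's normalized entropy. The entropy-based schedule is my illustrative assumption, not necessarily the paper's exact rule.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def adaptive_contrastive_step(logits, cand_hidden, ctx_hidden, k=4):
    """One decoding step. Candidate v is scored as
    (1 - alpha) * p(v) - alpha * max cosine-sim(h_v, context states),
    with alpha tied to the model's normalized entropy (an illustrative
    uncertainty rule: flatter distribution -> stronger penalty)."""
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum()
    alpha = entropy / np.log(len(p))           # normalized to [0, 1]
    top = np.argsort(p)[::-1][:k]              # top-k candidate tokens

    def max_sim(h):
        # Degeneration penalty: similarity to previous hidden states.
        sims = ctx_hidden @ h / (
            np.linalg.norm(ctx_hidden, axis=1) * np.linalg.norm(h) + 1e-12)
        return sims.max()

    scores = [(1 - alpha) * p[v] - alpha * max_sim(cand_hidden[v]) for v in top]
    return int(top[int(np.argmax(scores))])
```

With a near-flat distribution (high uncertainty), the penalty dominates and a slightly less probable but non-repetitive candidate wins over the most probable one whose hidden state duplicates the context.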


PIV3CAMS: a multi-camera dataset for multiple computer vision problems and its application to novel view-point synthesis

http://arxiv.org/abs/2407.18695v1

Compressor summary: The thesis introduces a new multicamera image and video dataset called PIV3CAMS for various computer vision tasks, and studies the importance of depth information in view synthesis.


Deep learning for predicting the occurrence of tipping points

http://arxiv.org/abs/2407.18693v1

Compressor summary: The paper presents a deep learning algorithm that predicts tipping points in complex systems from irregularly-sampled time series data, improving on traditional methods and enabling risk mitigation and system restoration.


Graph Neural Networks for Virtual Sensing in Complex Systems: Addressing Heterogeneous Temporal Dynamics

http://arxiv.org/abs/2407.18691v1

Compressor summary: The text proposes a new framework called HTGNN that uses graph neural networks to model diverse sensor signals and operating conditions for accurate virtual sensing in complex systems.


Collaborative Evolving Strategy for Automatic Data-Centric Development

http://arxiv.org/abs/2407.18690v1

Compressor summary: The paper introduces a novel AI system called Co-STEER that uses large language models to autonomously develop data for machine learning tasks, addressing challenges in scheduling and implementation through a collaborative evolution process.


The BIAS Detection Framework: Bias Detection in Word Embeddings and Language Models for European Languages

http://arxiv.org/abs/2407.18689v1

Compressor summary: The BIAS project develops new methods to detect societal bias in language models and word embeddings in European languages, considering linguistic and geographic differences, and provides a framework with code updates.


Rapid Object Annotation

http://arxiv.org/abs/2407.18682v1

Compressor summary: The report presents a method to quickly label a new object in a video using a user-friendly interface and a simple workflow.


Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

http://arxiv.org/abs/2407.18676v1

Compressor summary: NS-DPO is a method to optimize language models with preferences that change over time by using a Dynamic Bradley-Terry model and discounting more recent data.


A dual ensemble classifier used to recognise contaminated multi-channel EMG and MMG signals in the control of upper limb bioprosthesis

http://arxiv.org/abs/2407.18675v1

Compressor summary: The paper proposes a method to improve recognition of user intent using two cooperating multiclassifier systems that deal with contaminated biosignal channels and class of movement.


A Labeled Ophthalmic Ultrasound Dataset with Medical Report Generation Based on Cross-modal Deep Learning

http://arxiv.org/abs/2407.18667v1

Compressor summary: The labeled ophthalmic dataset contains ultrasound images, blood flow information, and examination reports from 2,417 patients to help diagnose and treat eye diseases using a cross-modal deep learning model.


Auto DragGAN: Editing the Generative Image Manifold in an Autoregressive Manner

http://arxiv.org/abs/2407.18656v1

Compressor summary: This paper presents a regression-based network that learns StyleGAN latent code patterns for pixel-level fine-grained image editing with high inference speed and quality.


Aspects of importance sampling in parameter selection for neural networks using ridgelet transform

http://arxiv.org/abs/2407.18655v1

Compressor summary: The text discusses how using an oracle distribution from the ridgelet transform can help obtain optimal initial parameters for neural networks, simplifying their learning process and emphasizing the importance of weight parameters over intercept parameters.


Contrastive Learning of Asset Embeddings from Financial Time Series

http://arxiv.org/abs/2407.18645v1

Compressor summary: The paper proposes a new method using contrastive learning to generate asset embeddings from financial time series data, which improves industry classification and portfolio optimization tasks.


Multi-Agent Deep Reinforcement Learning for Energy Efficient Multi-Hop STAR-RIS-Assisted Transmissions

http://arxiv.org/abs/2407.18627v1

Compressor summary: The paper proposes a novel multi-hop STAR-RIS architecture for wireless communication coverage expansion, and uses a MAGAR algorithm to optimize energy efficiency in active and passive beamforming.


Every Part Matters: Integrity Verification of Scientific Figures Based on Multimodal Large Language Models

http://arxiv.org/abs/2407.18626v1

Compressor summary: The paper introduces a new task (Figure Integrity Verification) to evaluate how well technology can align text with visual elements in scientific figures, and proposes a method (Every Part Matters) that uses large language models to improve this alignment and reasoning.


Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning

http://arxiv.org/abs/2407.18624v1

Compressor summary: The paper proposes a dual-perspective method to generate high-quality pseudo-labels for semi-supervised multi-label learning, improving both model predictions and class-wise thresholds.


MOoSE: Multi-Orientation Sharing Experts for Open-set Scene Text Recognition

http://arxiv.org/abs/2407.18616v1

Compressor summary: MOoSE is a multi-orientation text recognition framework that handles novel characters and different writing directions using a mixture-of-experts scheme to address data scarcity and domain gaps.


LookupForensics: A Large-Scale Multi-Task Dataset for Multi-Phase Image-Based Fact Verification

http://arxiv.org/abs/2407.18614v1

Compressor summary: The paper introduces a new AI-based task to identify and retrieve original images from deepfake content using a two-phase framework and a large-scale dataset with diverse manipulations.


Dilated Strip Attention Network for Image Restoration

http://arxiv.org/abs/2407.18613v1

Compressor summary: DSAN is a novel attention network for image restoration that uses dilated strip attention to capture contextual information from wider regions and multi-scale receptive fields for improved representation learning.


IOVS4NeRF:Incremental Optimal View Selection for Large-Scale NeRFs

http://arxiv.org/abs/2407.18611v1

Compressor summary: The paper proposes a new NeRF framework that uses image content and pose data to plan the next best view, improves rendering quality over time, and boosts efficiency with a Voronoi diagram and threshold sampling.


Denoising Lévy Probabilistic Models

http://arxiv.org/abs/2407.18609v1

Compressor summary: DLPM is a simplified model that extends DDPM with heavy-tailed noise, improving performance on challenging data distributions.


Using GPT-4 to guide causal machine learning

http://arxiv.org/abs/2407.18607v1

Compressor summary: This paper evaluates GPT-4's ability to identify causal relationships without context and compares its performance to causal ML and expert-generated knowledge graphs, finding that GPT-4 can enhance causal representation and discovery.


A data balancing approach designing of an expert system for Heart Disease Prediction

http://arxiv.org/abs/2407.18606v1

Compressor summary: Machine learning methods, especially ensemble approaches like random forests, can accurately predict heart disease using various factors and offer better risk assessment than conventional techniques.


Climbing the Complexity Ladder with Expressive Attention

http://arxiv.org/abs/2407.18601v1

Compressor summary: Expressive attention (EA) improves over dot-product attention (DPA) in various autoregressive prediction tasks by enhancing parallel or antiparallel query-key relationships and suppressing orthogonal ones.
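One simple way to realize the behavior this summary describes, boosting both parallel and antiparallel query-key pairs while suppressing orthogonal ones, is to score with the squared dot product; whether this matches the paper's exact formulation is an assumption on my part.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(Q, K, expressive=False):
    """Scaled dot-product attention weights, or an 'expressive' variant
    that squares the scores so parallel AND antiparallel q/k pairs both
    attend strongly while orthogonal pairs are suppressed."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    if expressive:
        scores = scores ** 2
    return softmax(scores, axis=-1)

q = np.array([[1.0, 0.0]])
keys = np.array([[1.0, 0.0],    # parallel to the query
                 [-1.0, 0.0],   # antiparallel
                 [0.0, 1.0]])   # orthogonal
w_dpa = attention_weights(q, keys)                   # DPA: parallel > orthogonal > antiparallel
w_ea = attention_weights(q, keys, expressive=True)   # EA: parallel == antiparallel > orthogonal
```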


Reinforcement Learning for Sustainable Energy: A Survey

http://arxiv.org/abs/2407.18597v1

Compressor summary: This paper surveys how reinforcement learning can help address challenges in the transition to sustainable energy by learning behavior from data and connecting the energy and machine learning research communities.


LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement

http://arxiv.org/abs/2407.18595v1

Compressor summary: The study presents LinguaLinker, a diffusion-based approach to create realistic facial animations that sync with multilingual audio inputs.


Content-driven Magnitude-Derivative Spectrum Complementary Learning for Hyperspectral Image Classification

http://arxiv.org/abs/2407.18593v1

Compressor summary: The paper proposes a new method to classify hyperspectral images using both spectral magnitude and derivative features, which improves accuracy by leveraging their complementary information.


HICEScore: A Hierarchical Metric for Image Captioning Evaluation

http://arxiv.org/abs/2407.18589v1

Compressor summary: HICE-S is a reference-free metric for image captioning evaluation that uses an interpretable hierarchical scoring mechanism to assess detailed captions and outperforms existing metrics on several benchmarks.


Dynamic Language Group-Based MoE: Enhancing Efficiency and Flexibility for Code-Switching Speech Recognition

http://arxiv.org/abs/2407.18581v1

Compressor summary: The DLG-MoE model tackles multilingual and code-switching challenges using a dynamic language group layer with a shared router for language modeling and independent routers for other attributes, achieving state-of-the-art results without pre-training.


Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging

http://arxiv.org/abs/2407.18574v1

Compressor summary: The paper proposes a phasor-based enhancement network to improve NLOS imaging by predicting clean measurements from noisy partial observations, reducing the required number of samples and the scanning area.


Unveiling Privacy Vulnerabilities: Investigating the Role of Structure in Graph Data

http://arxiv.org/abs/2407.18564v1

Compressor summary: The study examines privacy risks from network structure exposure, introduces a measure to quantify it, develops an attack model, and proposes a graph data publishing method to protect user data.


Learning Robust Named Entity Recognizers From Noisy Data With Retrieval Augmentation

http://arxiv.org/abs/2407.18562v1

Compressor summary: The paper proposes a method to improve named entity recognition (NER) for noisy inputs by retrieving and using relevant text from a knowledge corpus.


Look Globally and Reason: Two-stage Path Reasoning over Sparse Knowledge Graphs

http://arxiv.org/abs/2407.18556v1

Compressor summary: LoGRe is a two-stage path reasoning model that uses global analysis of training data to fill in missing facts and aggregate paths for answering queries over sparse knowledge graphs.


Skin Cancer Detection utilizing Deep Learning: Classification of Skin Lesion Images using a Vision Transformer

http://arxiv.org/abs/2407.18554v1

Compressor summary: The study uses Vision Transformers to improve skin cancer detection, achieving high accuracy and better melanoma recall than previous methods.


Utilising Explainable Techniques for Quality Prediction in a Complex Textiles Manufacturing Use Case

http://arxiv.org/abs/2407.18544v1

Compressor summary: The paper presents an approach to classify product failure instances in a textiles manufacturing dataset using tree-based algorithms and feature selection methods, achieving good results with Random Forest and Boruta, and providing interpretable rules for humans.


A Universal Prompting Strategy for Extracting Process Model Information from Natural Language Text using Large Language Models

http://arxiv.org/abs/2407.18540v1

Compressor summary: The study investigates the potential of large language models (LLMs) for extracting process elements from textual descriptions and shows they can outperform existing machine learning methods with a novel prompting strategy.


Towards a Multidimensional Evaluation Framework for Empathetic Conversational Systems

http://arxiv.org/abs/2407.18538v1

Compressor summary: The paper introduces a new framework for evaluating empathetic conversational systems that considers structural, behavioral, and overall aspects of empathy using three novel methods.


Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers

http://arxiv.org/abs/2407.18534v1

Compressor summary: The text proposes a new method, Relational Priors Distillation (RPD), that uses transformer models and self-supervised learning to improve 3D point cloud classification across different domains.


Constructing Enhanced Mutual Information for Online Class-Incremental Learning

http://arxiv.org/abs/2407.18526v1

Compressor summary: EMI is a new method for online class-incremental continual learning that improves knowledge alignment and prevents catastrophic forgetting by using diversity, representativeness, and separability in mutual information relationships.


Is larger always better? Evaluating and prompting large language models for non-generative medical tasks

http://arxiv.org/abs/2407.18525v1

Compressor summary: The study compares various language models' performance on structured and unstructured medical tasks, finding that large language models excel at zero-shot learning on structured data but finetuned BERT models perform better on unstructured texts.


DTFormer: A Transformer-Based Method for Discrete-Time Dynamic Graph Representation Learning

http://arxiv.org/abs/2407.18523v1

Compressor summary: DTFormer is a novel Transformer-based representation learning method for discrete-time dynamic graphs that addresses limitations of GNN+RNN architectures and captures intersection relationships among nodes.


Text-Region Matching for Multi-Label Image Recognition with Missing Labels

http://arxiv.org/abs/2407.18520v1

Compressor summary: The paper proposes TRM-ML, a method that uses text-region matching and multimodal contrastive learning to improve multi-label image recognition with missing labels.


TCGPN: Temporal-Correlation Graph Pre-trained Network for Stock Forecasting

http://arxiv.org/abs/2407.18519v1

Compressor summary: TCGPN is a novel approach for time-series prediction without periodicity that uses a temporal-correlation fusion encoder and pre-training methods to overcome the limitations of spatio-temporal graph neural networks (STGNNs), achieving better results on real-world stock market datasets.


WorkR: Occupation Inference for Intelligent Task Assistance

http://arxiv.org/abs/2407.18518v1

Compressor summary: WorkR is a framework that uses passive sensing to capture pervasive signals from various tasks and infers occupations with over 91% accuracy.


The formation of perceptual space in early phonetic acquisition: a cross-linguistic modeling approach

http://arxiv.org/abs/2407.18501v1

Compressor summary: The study shows how unsupervised learning of context-free acoustic information leads to similar learned representations of perceptual space in native and non-native speakers, mimicking early language learning in infants.


Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations

http://arxiv.org/abs/2407.18500v1

Compressor summary: EvINR is a self-supervised learning method that reconstructs intensity frames from event data using an implicit neural representation of the event generation equation, achieving superior performance and interpretability compared to previous methods.
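The event generation model referenced here is the standard event-camera model: a pixel emits an event of polarity +/-1 whenever its log intensity has moved by a contrast threshold C since the last event. A minimal single-pixel simulator (whether EvINR uses exactly this discretization is not stated in the summary; this is the textbook model):

```python
import numpy as np

def generate_events(log_intensity, times, C=0.2):
    """Standard event-camera generation model for one pixel: emit an
    event (t, +1) or (t, -1) each time log intensity has moved by the
    contrast threshold C relative to the last event's reference level."""
    events = []
    ref = log_intensity[0]
    for t, L in zip(times[1:], log_intensity[1:]):
        while L - ref >= C:       # brightening crossed a threshold: ON event
            ref += C
            events.append((t, +1))
        while ref - L >= C:       # dimming crossed a threshold: OFF event
            ref -= C
            events.append((t, -1))
    return events

# A signal that brightens then dims produces ON events, then OFF events.
t = np.linspace(0, 1, 101)
L = np.where(t < 0.5, t, 1.0 - t)   # ramp up to t=0.5, then back down
evs = generate_events(L, t, C=0.1)
```

EvINR's self-supervision inverts this relationship: an implicit neural representation of log intensity is trained so that its temporal differences reproduce the observed event stream.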


A Reliable Common-Sense Reasoning Socialbot Built Using LLMs and Goal-Directed ASP

http://arxiv.org/abs/2407.18498v1

Compressor summary: AutoCompanion is a socialbot that uses an LLM for natural language translation and ASP-based commonsense reasoning to hold coherent and goal-directed conversations with humans about movies and books.


Answerability Fields: Answerable Location Estimation via Diffusion Models

http://arxiv.org/abs/2407.18497v1

Compressor summary: Answerability Fields is a new method to predict if machines can answer questions about indoor scenes, using a 3D dataset and a diffusion model.


Towards More Accurate Prediction of Human Empathy and Emotion in Text and Multi-turn Conversations by Combining Advanced NLP, Transformers-based Networks, and Linguistic Methodologies

http://arxiv.org/abs/2407.18496v1

Compressor summary: The project aims to predict empathy and emotions using neural networks and different embedding models, with revisions focusing on model architecture, data balancing, and lexical resources, and an ensemble of models for the final system.


Neural Modulation Alteration to Positive and Negative Emotions in Depressed Patients: Insights from fMRI Using Positive/Negative Emotion Atlas

http://arxiv.org/abs/2407.18492v1

Compressor summary: The study used fMRI to create atlases of positive and negative emotions in healthy individuals and found that depressed patients showed significant differences in brain activity associated with both emotions.


Conversational Dueling Bandits in Generalized Linear Models

http://arxiv.org/abs/2407.18488v1

Compressor summary: ConDuel is a novel conversational dueling bandit algorithm that uses relative feedback in generalized linear models to improve recommendation systems by learning user preferences more effectively and addressing existing limitations in conversational bandit methods.


A Role-specific Guided Large Language Model for Ophthalmic Consultation Based on Stylistic Differentiation

http://arxiv.org/abs/2407.18483v1

Compressor summary: EyeDoctor is an ophthalmic large language model that improves question-answering accuracy in eye consultations by using doctor-patient role perception and an augmented knowledge base.


Practical Attribution Guidance for Rashomon Sets

http://arxiv.org/abs/2407.18482v1

Compressor summary: The paper discusses the Rashomon effect in Explainable AI, introduces two axioms for practical sampling methods, and proposes an $\epsilon$-subgradient-based sampling method that satisfies these axioms.


Scalable Graph Compressed Convolutions

http://arxiv.org/abs/2407.18480v1

Compressor summary: The authors propose CoCN, a novel GNN that generalizes Euclidean convolution to graphs using differentiable permutations, and achieves better performance than existing methods on node-level and graph-level benchmarks.
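The idea of convolving over an ordered node sequence can be illustrated with a toy stand-in: order nodes by degree (in place of CoCN's differentiable permutations, which this sketch does not implement) and slide a 1D kernel over the ordered features.

```python
def graph_conv_via_ordering(adj, features, kernel):
    # Order nodes by degree (a fixed stand-in for a learned permutation),
    # then slide a 1D convolution kernel over the ordered features.
    degree = [sum(row) for row in adj]
    order = sorted(range(len(adj)), key=lambda i: (degree[i], i))
    seq = [features[i] for i in order]
    k = len(kernel)
    out = [sum(kernel[j] * seq[i + j] for j in range(k))
           for i in range(len(seq) - k + 1)]
    return order, out

adj = [[0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0]]
features = [1.0, 2.0, 3.0, 4.0]
order, out = graph_conv_via_ordering(adj, features, [0.5, 0.5])
print(order, out)  # → [1, 2, 3, 0] [2.5, 3.5, 2.5]
```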


Multi-turn Response Selection with Commonsense-enhanced Language Models

http://arxiv.org/abs/2407.18479v1

Compressor summary: The authors propose a Siamese network called SinLG that combines pre-trained language models and graph neural networks to incorporate external commonsense knowledge for multi-turn response selection in dialogue systems, improving performance and efficiency.


Constructing the CORD-19 Vaccine Dataset

http://arxiv.org/abs/2407.18471v1

Compressor summary: The paper introduces 'CORD-19-Vaccination', a new dataset for COVID-19 vaccine research enriched with language, author demography, keyword, and topic metadata, and evaluates it on question-answering and sequential sentence classification tasks.


Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints

http://arxiv.org/abs/2407.18468v1

Compressor summary: The paper proposes a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative models, improving pixel-level and semantic metrics.


Machine Unlearning using a Multi-GAN based Model

http://arxiv.org/abs/2407.18467v1

Compressor summary: The article proposes a machine unlearning method using GANs, which generates synthetic data with inverted labels and fine-tunes a pre-trained model to improve performance against membership inference attacks.
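The label-inversion step can be sketched independently of the GAN component: fine-tune a trained classifier on the forget set with flipped labels so its predictions there degrade. Below is a toy logistic-regression version; the synthetic-data generation via GANs is omitted and all names are illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train(w, batch, lr=0.5, epochs=200):
    # Plain per-sample logistic-regression SGD.
    for _ in range(epochs):
        for x, y in batch:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

random.seed(1)
# Two Gaussian clusters; the last feature is a constant bias term.
data = [([random.gauss(1, 0.3), random.gauss(1, 0.3), 1.0], 1) for _ in range(40)]
data += [([random.gauss(-1, 0.3), random.gauss(-1, 0.3), 1.0], 0) for _ in range(40)]
w = train([0.0, 0.0, 0.0], data)

forget = data[:10]                        # class-1 samples to unlearn
inverted = [(x, 1 - y) for x, y in forget]
w_unlearned = train(w, inverted)

avg_before = sum(predict(w, x) for x, _ in forget) / len(forget)
avg_after = sum(predict(w_unlearned, x) for x, _ in forget) / len(forget)
print(round(avg_before, 3), round(avg_after, 3))
```

After fine-tuning on inverted labels the model no longer recognizes the forgotten samples as class 1, which is what weakens membership inference.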


A Progressive Single-Modality to Multi-Modality Classification Framework for Alzheimer's Disease Sub-type Diagnosis

http://arxiv.org/abs/2407.18466v1

Compressor summary: The paper proposes a novel progressive Alzheimer's sub-type diagnosis framework that uses inter-correlation among multiple modalities to provide accurate diagnosis results at low cost and follows clinical guidelines.
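A progressive (cascade) classifier is a common way to trade cost for accuracy: consult cheap modalities first and escalate only when confidence is low. The sketch below shows that generic pattern with hypothetical stage names; it is not the paper's model.

```python
def progressive_diagnosis(stages, sample, confidence_threshold=0.8):
    # Cascade: consult cheap modalities first; escalate while confidence is low.
    used = []
    for name, classifier in stages:
        used.append(name)
        label, confidence = classifier(sample)
        if confidence >= confidence_threshold:
            break
    return label, used

def stage(weight):
    # Toy classifier: confidence grows with distance of the score from 0.5.
    def classify(sample):
        score = min(1.0, max(0.0, weight * sample["marker"]))
        return int(score > 0.5), abs(score - 0.5) * 2
    return classify

stages = [("MRI", stage(0.6)), ("MRI+PET", stage(1.1)), ("MRI+PET+CSF", stage(1.5))]
label, used = progressive_diagnosis(stages, {"marker": 0.9})
print(label, used)  # → 1 ['MRI', 'MRI+PET']
```

Here the cascade stops after two modalities because confidence is already high, so the costliest modality is never consulted.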


MistralBSM: Leveraging Mistral-7B for Vehicular Networks Misbehavior Detection

http://arxiv.org/abs/2407.18462v1

Compressor summary: The paper proposes a pretrained Large Language Model (LLM)-based Misbehavior Detection System (MDS) for vehicular networks threatened by misbehaving vehicles, showing that the state-of-the-art Mistral-7B achieves superior accuracy and efficiency in an edge-cloud detection framework.


Fairness Definitions in Language Models Explained

http://arxiv.org/abs/2407.18454v1

Compressor summary: This paper surveys fairness definitions in large language models, introducing a novel taxonomy and illustrating each with experiments.
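One definition such a survey typically covers is demographic parity: equal positive-prediction rates across groups. A minimal illustrative implementation of that single metric (not tied to the paper's taxonomy):

```python
def demographic_parity_gap(predictions, groups):
    # Largest difference in positive-prediction rates across groups.
    rates = {}
    for g in set(groups):
        members = [p for p, gg in zip(predictions, groups) if gg == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())

preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)
print(gap)  # → 0.5
```

A gap of 0 would mean both groups receive positive predictions at the same rate; 0.5 here signals a large disparity.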


Textile Anomaly Detection: Evaluation of the State-of-the-Art for Automated Quality Inspection of Carpet

http://arxiv.org/abs/2407.18450v1

Compressor summary: The study tested unsupervised anomaly detection models on wool carpets using a custom dataset and found student-teacher networks with multi-class training to have the best performance in accuracy, low false detections, and real-time speed.
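Student-teacher anomaly detection scores a sample by the discrepancy between a fixed teacher representation and a student trained to imitate it on normal data only. A one-dimensional toy sketch (tanh teacher, linear student; purely illustrative, not any of the benchmarked models):

```python
import math
import random

def teacher(x):
    # Fixed nonlinear "teacher" feature the student must imitate.
    return math.tanh(x)

def fit_student(normal_xs):
    # Closed-form least-squares linear student s(x) = a * x, fit on normal data.
    num = sum(x * teacher(x) for x in normal_xs)
    den = sum(x * x for x in normal_xs)
    return num / den

def anomaly_score(a, x):
    # Teacher-student discrepancy: small on normal data, large off-manifold.
    return (teacher(x) - a * x) ** 2

rng = random.Random(0)
normal = [rng.uniform(-0.3, 0.3) for _ in range(200)]
a = fit_student(normal)
print(anomaly_score(a, 0.1), anomaly_score(a, 3.0))
```

The student matches the teacher closely on the normal range but diverges far from it, so the discrepancy itself serves as the anomaly score.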


HybridDepth: Robust Depth Fusion for Mobile AR by Leveraging Depth from Focus and Single-Image Priors

http://arxiv.org/abs/2407.18443v1

Compressor summary: HybridDepth is a depth estimation pipeline for mobile AR that combines focal planes and single-image depth priors to achieve high accuracy, generalization, and structural detail.
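A standard ingredient of such depth-fusion pipelines is aligning a relative (up-to-scale) depth prior to sparse metric measurements via a least-squares scale and shift. A minimal sketch of that alignment step with made-up numbers, not HybridDepth itself:

```python
def align_scale_shift(relative, anchors):
    # Least squares for s, t in: s * relative[i] + t ≈ metric_i at sparse anchors.
    n = len(anchors)
    sx = sum(relative[i] for i, _ in anchors)
    sy = sum(m for _, m in anchors)
    sxx = sum(relative[i] ** 2 for i, _ in anchors)
    sxy = sum(relative[i] * m for i, m in anchors)
    s = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    t = (sy - s * sx) / n
    return s, t

relative = [0.1, 0.2, 0.4, 0.8, 1.0]      # relative depth, arbitrary scale
anchors = [(0, 1.2), (2, 1.8), (4, 3.0)]  # (pixel index, metres)
s, t = align_scale_shift(relative, anchors)
metric_depth = [s * r + t for r in relative]
print(s, t, metric_depth)
```

The fitted scale and shift turn the whole relative map into metric depth, including pixels with no direct measurement.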


Guidance-Based Prompt Data Augmentation in Specialized Domains for Named Entity Recognition

http://arxiv.org/abs/2407.18442v1

Compressor summary: The study presents a new technique to generate varied sentences with contextual information for better data augmentation and named entity recognition.
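One common way to generate varied training sentences for NER is entity substitution: swap each labeled entity for another surface form of the same type while keeping the tags aligned. An illustrative single-token-entity sketch with made-up entity pools, not the paper's guidance-based prompting:

```python
import random

def augment_sentence(tokens, tags, entity_pool, rng):
    # Swap each single-token entity (B- tag) for another surface form of the
    # same type; the label sequence is unchanged, so alignment is preserved.
    out_tokens = []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            tok = rng.choice(entity_pool[tag[2:]])
        out_tokens.append(tok)
    return out_tokens, list(tags)

pool = {"DISEASE": ["asthma", "diabetes"], "DRUG": ["aspirin", "ibuprofen"]}
tokens = ["The", "patient", "with", "flu", "took", "paracetamol"]
tags = ["O", "O", "O", "B-DISEASE", "O", "B-DRUG"]
new_tokens, new_tags = augment_sentence(tokens, tags, pool, random.Random(0))
print(new_tokens, new_tags)
```

Multi-token entities (I- tags) would need span-aware substitution, which this sketch deliberately skips.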


Impact of Recurrent Neural Networks and Deep Learning Frameworks on Real-time Lightweight Time Series Anomaly Detection

http://arxiv.org/abs/2407.18439v1

Compressor summary: The paper evaluates how different types of recurrent neural networks (RNNs) in various deep learning frameworks affect real-time lightweight time series anomaly detection performance.
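A lightweight detector of this kind flags points whose deviation from a recurrent predictor's state exceeds a threshold. The sketch below uses an exponentially weighted state as a minimal stand-in for the LSTM/GRU cells the paper evaluates; the numbers are illustrative.

```python
def detect_anomalies(series, alpha=0.3, threshold=1.5):
    # Flag points whose error against a simple recurrent (exponentially
    # weighted) state exceeds a threshold; a stand-in for an LSTM predictor.
    state = series[0]
    flags = []
    for x in series[1:]:
        flags.append(abs(x - state) > threshold)
        state = alpha * x + (1 - alpha) * state  # recurrent state update
    return flags

series = [1.0, 1.1, 0.9, 1.0, 5.0, 1.0, 1.1]
flags = detect_anomalies(series)
print(flags)  # → [False, False, False, True, False, False]
```

Only the spike at 5.0 is flagged; the state then decays back toward the normal level, so subsequent points pass.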


Mixed Non-linear Quantization for Vision Transformers

http://arxiv.org/abs/2407.18437v1

Compressor summary: Mixed non-linear quantization assigns the best quantization method for each non-linear operation in Vision Transformers, improving performance and reducing training time.
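The selection idea can be illustrated generically: assign each operation whichever quantization grid best preserves its output distribution, e.g. a log-spaced grid for long-tailed softmax outputs versus a uniform grid for roughly uniform ranges. A toy per-operation selector (illustrative only, not the paper's method or candidate set):

```python
import math

def uniform_quantize(xs, bits=3):
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((x - lo) / step) * step for x in xs]

def log_quantize(xs, bits=3):
    # Log-spaced grid: suits long-tailed values such as softmax outputs.
    eps = 1e-8
    logs = [math.log(abs(x) + eps) for x in xs]
    return [math.copysign(math.exp(q) - eps, x)
            for q, x in zip(uniform_quantize(logs, bits), xs)]

def pick_scheme(xs, bits=3):
    # Assign whichever quantizer reconstructs this op's outputs with lower MSE.
    def err(q):
        return sum((a - b) ** 2 for a, b in zip(xs, q)) / len(xs)
    e_uni, e_log = err(uniform_quantize(xs, bits)), err(log_quantize(xs, bits))
    return ("uniform", e_uni) if e_uni <= e_log else ("log", e_log)

softmax_like = [10.0 ** (-i) for i in range(1, 9)]  # long-tailed
linear_like = [i / 10 for i in range(1, 9)]         # roughly uniform range
print(pick_scheme(softmax_like)[0], pick_scheme(linear_like)[0])
```

The selector picks the log grid for the long-tailed values and the uniform grid for the evenly spread ones, mirroring the per-operation assignment idea.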


A Model for Combinatorial Dictionary Learning and Inference

http://arxiv.org/abs/2407.18436v1

Compressor summary: The text proposes a combinatorial model for decomposing data into simple components and studies properties of well-structuredness and component explanations for images.
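Decomposing a sample into simple components can be illustrated with a greedy set-cover-style routine: repeatedly pick the dictionary component that explains the most still-unexplained elements. A toy sketch of that inference idea, not the paper's model:

```python
def greedy_decompose(sample, dictionary):
    # Repeatedly pick the component covering the most unexplained elements.
    remaining = set(sample)
    chosen = []
    while remaining:
        best = max(dictionary, key=lambda c: len(remaining & c))
        gain = remaining & best
        if not gain:
            break
        chosen.append(best)
        remaining -= gain
    return chosen, remaining

dictionary = [frozenset({1, 2, 3}), frozenset({3, 4}),
              frozenset({5, 6}), frozenset({2, 5})]
chosen, leftover = greedy_decompose({1, 2, 3, 4, 5, 6}, dictionary)
print(chosen, leftover)
```

Greedy cover is a heuristic; well-structured dictionaries, in the sense the paper studies, are those for which such decompositions are unambiguous and recoverable.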


Investigating the Privacy Risk of Using Robot Vacuum Cleaners in Smart Environments

http://arxiv.org/abs/2407.18433v1

Compressor summary: The paper explores the risk of private information exposure from network header metadata in robot vacuum cleaner smartphone applications.