This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2024-07-26, generated by the compressor, my personal LLM-based project.
http://arxiv.org/abs/2407.18251v1
Compressor summary: The paper studies how robust multimodal models are against different types of adversarial attacks, finding that unimodal DNNs are more resilient and that models with a ViT-based image encoder are more vulnerable.
http://arxiv.org/abs/2407.18248v1
Compressor summary: The paper proposes self-training with Direct Preference Optimization (DPO) to enhance mathematical reasoning in small-scale language models, offering a cheaper and more stable alternative to using large proprietary LMs.
http://arxiv.org/abs/2407.18247v1
Compressor summary: RegionDrag is a faster and more accurate region-based copy-and-paste dragging method for image editing that overcomes the limitations of point-drag-based approaches.
http://arxiv.org/abs/2407.18245v1
Compressor summary: VGGHeads is a large synthetic dataset with over 1 million images of human heads annotated with 3D meshes, landmarks, and boxes, used to train models that detect and reconstruct human heads in real images.
http://arxiv.org/abs/2407.18242v1
Compressor summary: LoRA-Pro improves LoRA by using an equivalent gradient to approximate the full fine-tuning optimization process, closing the performance gap on NLP tasks.
http://arxiv.org/abs/2407.18241v1
Compressor summary: The text discusses the importance of using numerical literals in link prediction over knowledge graphs, proposing a methodology to evaluate their effectiveness with a new synthetic dataset and ablation strategies on existing benchmarks.
http://arxiv.org/abs/2407.18232v1
Compressor summary: LION is a window-based framework that uses linear group RNN to improve 3D object detection in sparse point clouds by enhancing spatial features and voxel generation.
http://arxiv.org/abs/2407.18227v1
Compressor summary: The paper introduces AutoPrognosis-M, a multimodal machine learning framework that combines structured clinical data and medical imaging for diagnosis and prognosis, using various models and fusion strategies.
http://arxiv.org/abs/2407.18219v1
Compressor summary: RISE is an approach to fine-tune LLMs to introspect and correct their mistakes sequentially on hard problems using iterative multi-turn strategies inspired by online imitation learning and reinforcement learning.
http://arxiv.org/abs/2407.18213v1
Compressor summary: Larger language models are more resistant to adversarial prompts after adversarial training, but model size alone does not improve robustness.
http://arxiv.org/abs/2407.18207v1
Compressor summary: Omnidirectional FID (OmniFID) and Discontinuity Score (DS) are new metrics for measuring the geometric accuracy of spherical images, which account for field-of-view and seam alignment constraints not considered by traditional FID.
http://arxiv.org/abs/2407.18184v1
Compressor summary: The paper introduces a large dataset for antibody-specific epitope prediction and proposes a new method, WALLE, that combines language models and graph neural networks to achieve significant performance improvement.
http://arxiv.org/abs/2407.18181v1
Compressor summary: The paper presents a novel joint graph learning approach that combines single-cell language models and gene regulatory networks to infer gene regulatory networks from scRNA-seq data, achieving superior performance over existing methods.
http://arxiv.org/abs/2407.18178v1
Compressor summary: PianoMime is a framework that uses internet piano demonstrations, such as YouTube videos, to train a generalist piano-playing agent capable of playing any arbitrary song.
http://arxiv.org/abs/2407.18175v1
Compressor summary: Quasar-ViT is a framework that designs efficient ViT models for hardware implementation, using quantization-aware architecture search techniques and model-adaptive designs on an FPGA platform.
http://arxiv.org/abs/2407.18147v1
Compressor summary: The FIGNEWS shared task is a multilingual effort to develop annotation guidelines for bias and propaganda in news posts about the Israel War on Gaza, with 17 teams participating and producing over 129,000 data points.
http://arxiv.org/abs/2407.18143v1
Compressor summary: The paper proposes a method to implement maximum entropy reinforcement learning (MaxEnt RL) in on-policy settings by separating the entropy objective from the main objective and shows that it improves policy optimisation performance and generalisation in various tasks.
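For context, a minimal sketch (in PyTorch) of the entropy-regularized policy-gradient loss that on-policy MaxEnt methods start from; the paper's specific way of separating the entropy objective from the main objective is not reproduced here:

```python
import torch

def maxent_pg_loss(dist, actions, advantages, beta=0.01):
    """Policy-gradient loss with the entropy bonus computed as its own term
    and traded off against the main objective with weight `beta`."""
    pg_loss = -(dist.log_prob(actions) * advantages.detach()).mean()
    entropy = dist.entropy().mean()
    return pg_loss - beta * entropy

# Toy usage with a random categorical policy over 4 actions.
logits = torch.randn(32, 4, requires_grad=True)
dist = torch.distributions.Categorical(logits=logits)
actions = dist.sample()
advantages = torch.randn(32)
maxent_pg_loss(dist, actions, advantages).backward()
```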
http://arxiv.org/abs/2407.18137v1
Compressor summary: The XS-VID dataset provides diverse aerial scenes with small objects for evaluating and improving small video object detection methods, especially for very small objects.
http://arxiv.org/abs/2407.18134v1
Compressor summary: The authors propose a new contrastive loss that encodes how samples relate to others and show that it improves vision model performance across various tasks and data regimes.
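For reference, a PyTorch sketch of the standard InfoNCE baseline that such losses typically extend; the paper's proposed loss additionally encodes how each sample relates to the others in the batch, which is not shown here:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """Standard InfoNCE contrastive loss over two augmented views; positives
    sit on the diagonal of the similarity matrix, and all other samples in
    the batch act as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# e.g. loss = info_nce(encoder(aug1(x)), encoder(aug2(x)))  # hypothetical encoder/augmentations
```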
http://arxiv.org/abs/2407.18129v1
Compressor summary: Dallah is a state-of-the-art Arabic multimodal assistant that leverages an advanced LLaMA-2 language model for text and image interactions in multiple dialects, including MSA and dialect-specific tasks.
http://arxiv.org/abs/2407.18128v1
Compressor summary: The paper proposes using metric learning to estimate earthquake magnitudes from satellite images, improving accuracy over existing methods.
http://arxiv.org/abs/2407.18121v1
Compressor summary: Elastic Cache is a novel approach for improving the efficiency of large multimodal instruction-following models by applying different acceleration methods and an importance-driven cache merging strategy.
http://arxiv.org/abs/2407.18119v1
Compressor summary: The study investigates how linguistic information like grammatical number or semantic role is reflected and localized in sentence embeddings of transformer-based models.
http://arxiv.org/abs/2407.18112v1
Compressor summary: Keypoint Promptable ReID (KPR) is a new method for identifying occluded individuals by adding keypoints to bounding boxes and introducing a new dataset, Occluded-PoseTrack ReID, with keypoint labels.
http://arxiv.org/abs/2407.18108v1
Compressor summary: The paper proposes a machine learning method to simplify complex socioeconomic systems and predict their behavior, using Baltimore as a case study.
http://arxiv.org/abs/2407.18078v1
Compressor summary: The text introduces PEFT-U, a new dataset for building and evaluating NLP models that personalize large language models like ChatGPT to individual users' preferences.
http://arxiv.org/abs/2407.18067v1
Compressor summary: Human-like Video Models (HVM-1) are large-scale video models trained on human-like videos and outperform image-based models in few-shot recognition tasks, thanks to their ability to capture temporal regularities.
http://arxiv.org/abs/2407.18066v1
Compressor summary: The paper proposes a resilience management technique for future radio networks using multi-agent deep reinforcement learning to dynamically adjust antennas and power, improving service availability and coverage.
http://arxiv.org/abs/2407.18061v1
Compressor summary: The authors propose using generative language models for difficulty estimation and text simplification in foreign languages, achieving high accuracy and meaningful simplifications with minimal fine-tuning.
http://arxiv.org/abs/2407.18060v1
Compressor summary: The study compares SVM and RF models using radiomic features from different MRI libraries for prostate cancer detection, finding multimodal feature integration can improve robustness and generalizability.
http://arxiv.org/abs/2407.18046v1
Compressor summary: GaussianSR is a novel super-resolution method that uses 2D Gaussian Splatting to represent pixels as continuous Gaussian fields, improving representation ability and performance over traditional discrete latent codes in the encoder.
http://arxiv.org/abs/2407.18044v1
Compressor summary: QB-RAG is a novel approach that uses pre-computed queries to improve the accuracy of healthcare question answering by LLMs.
http://arxiv.org/abs/2407.18042v1
Compressor summary: Key points:
- Neural networks for lifelong graph summarization of web graphs
- Comparison of GNNs Graph-MLP and GraphSAINT with MLP baseline
- Impact of reusing parameters from previous snapshots
- 1-hop and 2-hop summaries
- Heterogeneity of web graphs affects summary accuracy
Summary: The paper explores neural networks for summarizing web graphs over time, using different GNN architectures and comparing 1-hop and 2-hop summaries. It also studies the impact of reusing parameters and the effect of web graph heterogeneity on summary accuracy.
http://arxiv.org/abs/2407.18041v1
Compressor summary: The paper shows that training a teacher model for knowledge distillation with mean squared error (MSE) loss improves the student's accuracy by making the teacher's output closer to the true Bayes conditional probability density (BCPD).
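A minimal PyTorch sketch of the two stages the summary describes (temperature scaling and loss weighting are omitted):

```python
import torch
import torch.nn.functional as F

def teacher_mse_loss(teacher_logits, labels, num_classes):
    """Stage 1: train the teacher with MSE against one-hot labels, so its
    softmax output regresses toward the Bayes conditional probability."""
    probs = torch.softmax(teacher_logits, dim=-1)
    one_hot = F.one_hot(labels, num_classes).float()
    return F.mse_loss(probs, one_hot)

def distill_loss(student_logits, teacher_logits):
    """Stage 2: push the student's distribution toward the frozen teacher's
    output probabilities via KL divergence."""
    target = torch.softmax(teacher_logits, dim=-1).detach()
    return F.kl_div(torch.log_softmax(student_logits, dim=-1),
                    target, reduction="batchmean")
```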
http://arxiv.org/abs/2407.18035v1
Compressor summary: RestoreAgent is an intelligent image restoration system that uses multimodal large language models to autonomously assess and restore images with multiple degradations, outperforming human experts.
http://arxiv.org/abs/2407.18034v1
Compressor summary: AttentionHand is a novel method for generating controllable hand images from text and various modalities to overcome challenges in 3D hand reconstruction in the wild.
http://arxiv.org/abs/2407.18033v1
Compressor summary: The authors propose DANet, a novel attention-based deep learning model for arrhythmia detection from short ECG recordings, which provides interpretable waveform regions for diagnosis guidance.
http://arxiv.org/abs/2407.18013v1
Compressor summary: SimpDM is a new diffusion model that improves tabular data imputation by regularizing noise alignment and enhancing robustness with state-dependent data augmentation.
http://arxiv.org/abs/2407.18011v1
Compressor summary: HANNA is a novel neural network that predicts thermodynamic activity coefficients while strictly adhering to physical laws, outperforming the current state-of-the-art model UNIFAC.
http://arxiv.org/abs/2407.18003v1
Compressor summary: Key points:
- Large Language Models (LLMs) are powerful but struggle with long texts due to Transformer architecture.
- KV-Cache is a solution that improves efficiency at the cost of increased GPU memory overhead.
- Various KV-Cache compression methods have been proposed and reviewed, covering different phases and aspects of LLM optimization.
Summary: The text reviews how KV-Cache compression methods optimize Large Language Models for long texts, discussing their advantages, disadvantages, and applications in different phases of LLM development.
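To make the memory trade-off concrete, here is a toy cache with the simplest lossy compression policy, sliding-window eviction (a PyTorch sketch; the methods such surveys review are far more selective about what they keep):

```python
import torch

class SlidingWindowKVCache:
    """Toy KV cache that keeps only the most recent `window` positions,
    bounding GPU memory at the cost of discarding older context."""
    def __init__(self, window: int):
        self.window = window
        self.k = self.v = None

    def update(self, k_new, v_new):
        # k_new, v_new: [batch, heads, new_seq, head_dim]
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        self.k = self.k[:, :, -self.window:]
        self.v = self.v[:, :, -self.window:]
        return self.k, self.v

# Toy usage: append 16 new positions per step, capped at 1024 kept positions.
cache = SlidingWindowKVCache(window=1024)
k, v = cache.update(torch.randn(1, 8, 16, 64), torch.randn(1, 8, 16, 64))
```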
http://arxiv.org/abs/2407.18002v1
Compressor summary: The paper proposes a network inversion technique that uses a conditioned generator to reconstruct inputs likely to produce specific outputs, making neural networks more interpretable and trustworthy.
http://arxiv.org/abs/2407.18000v1
Compressor summary: The paper presents a new plant pest identification framework that uses ROI detection and CNN-based identification, achieving high accuracy and fast speed on a large dataset of images from various plants and pests.
http://arxiv.org/abs/2407.17997v1
Compressor summary: The paper evaluates using synthetic speech for training ASR systems and compares different architectures' sensitivity to synthetic data generation.
http://arxiv.org/abs/2407.17996v1
Compressor summary: The proposed framework combines RGB and spectral images to improve image quality and enhance dynamic range, color mapping, and material semantics in mobile photography using a joint decomposition and prior-guided enhancement model.
http://arxiv.org/abs/2407.17992v1
Compressor summary: The paper proposes a fast and efficient active learning method using neural networks and Gaussian processes for function learning.
http://arxiv.org/abs/2407.17980v1
Compressor summary: The paper proposes a novel graph neural network and deep reinforcement learning framework that customizes routes for autonomous vehicles based on individual driver preferences, outperforming conventional route planners in terms of travel time, congestion level, and satisfaction.
http://arxiv.org/abs/2407.17974v1
Compressor summary: The study explores how multimodal AI models represent visio-linguistic associations and whether they share the human cross-modal preference for the bouba-kiki effect, finding that results depend on model features.
http://arxiv.org/abs/2407.17963v1
Compressor summary: The text explains that different arithmetic tasks affect how well large language models (LLMs) perform, depending on the task properties and positional encoding used, and proposes a unified theoretical framework to understand these behaviors.
http://arxiv.org/abs/2407.17960v1
Compressor summary: The paper explores how representational alignment affects the emergence of linguistic properties in simulated communication and suggests that it may explain mixed results compared to human language experiments.
http://arxiv.org/abs/2407.17957v1
Compressor summary: Neural network material discretizations can improve acoustic topology optimization by finding better local optima when combined with the Adam optimizer, but their advantages are limited compared to constrained and higher-order techniques.
http://arxiv.org/abs/2407.17956v1
Compressor summary: SaccadeDet is a new object detection method for gigapixel images that mimics human eye movement to quickly and efficiently find objects of interest.
http://arxiv.org/abs/2407.17954v1
Compressor summary: The paper proposes a storage scaling law for computer vision tasks that balances the trade-off between storage space and model quality when using lossy data compression.
http://arxiv.org/abs/2407.17952v1
Compressor summary: BetterDepth is a conditional diffusion-based refiner that uses pre-trained MDE predictions as depth conditioning to achieve geometrically correct and detailed monocular depth estimation while being efficient and easy to use with other models.
http://arxiv.org/abs/2407.17951v1
Compressor summary: The paper introduces Tseitin artifacts as irrelevant subcircuits in d-DNNF compilation and proposes methods to detect and remove them for better probabilistic inference.
http://arxiv.org/abs/2407.17950v1
Compressor summary: The paper investigates the performance of YOLO-v9, a real-time American Sign Language detection model that is new to this domain.
http://arxiv.org/abs/2407.17940v1
Compressor summary: The paper proposes a multi-strategy optimization framework to improve positive reframing using pre-trained language models, which involves designing rewards, decoding approaches, and a re-ranking method for generating fluent and diverse texts with preserved meaning.
http://arxiv.org/abs/2407.17930v1
Compressor summary: The study explores how varying the length of sequences used to predict cryptocurrency returns using ANNs affects accuracy and suggests optimizing sequence configurations for better financial forecasting.
http://arxiv.org/abs/2407.17927v1
Compressor summary: The authors evaluate image quality metrics by testing their invariance to natural transformations like rotation and illumination changes, and find that none of the current state-of-the-art models match human vision.
http://arxiv.org/abs/2407.17914v1
Compressor summary: The study investigates whether multimodal vision-and-language DNN models better represent human meaning and brain activity than unimodal ones, finding mixed results.
http://arxiv.org/abs/2407.17909v1
Compressor summary: The technical report proposes improving KD-based methods for detecting logical defects in industrial settings by using a margin-based constraint to prevent false negatives, increasing AUROC by 1.3%.
http://arxiv.org/abs/2407.17907v1
Compressor summary: The authors present a variational inference method that uses a conditional flow model trained from a pre-trained diffusion model to sample efficiently from the posterior distribution for inverse problems in Euclidean and manifold spaces.
http://arxiv.org/abs/2407.17906v1
Compressor summary: HODRF is a two-stage system that combines object detection and classification for plant disease diagnosis, improving accuracy and reducing labeling costs.
http://arxiv.org/abs/2407.17904v1
Compressor summary: The study finds that using diverse unlabeled surgical data in self-supervised learning improves performance for surgical computer vision applications, and provides a public dataset and model.
http://arxiv.org/abs/2407.17900v1
Compressor summary: This paper proposes an ensemble method that combines large language models with machine learning to improve the accuracy of predicting lymph node metastasis in lung cancer patients using patient data and medical knowledge.
http://arxiv.org/abs/2407.17892v1
Compressor summary: The paper proposes an iterative process for topic modelling that improves its quality and produces a sense of completeness, using the BERTopic package and clustering comparison measures on a COVIDSenti-A dataset subset.
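A sketch of what such an iterative loop could look like using the real BERTopic API; the paper's actual convergence criterion isn't given in the summary, so the adjusted Rand index stands in here as one clustering comparison measure:

```python
from bertopic import BERTopic
from sklearn.metrics import adjusted_rand_score

def iterate_topics(docs, max_iters=5, threshold=0.95):
    """Refit BERTopic until topic assignments stabilize between consecutive
    runs, which gives the 'sense of completeness' the summary mentions."""
    prev = None
    for _ in range(max_iters):
        topics, _ = BERTopic().fit_transform(docs)
        if prev is not None and adjusted_rand_score(prev, topics) >= threshold:
            break  # consecutive clusterings agree closely enough to stop
        prev = topics
    return topics
```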
http://arxiv.org/abs/2407.17880v1
Compressor summary: The DAM is a neural model that uses randomly sampled histories to forecast non-fixed horizons, outperforming existing models in universal forecasting across multiple domains.
http://arxiv.org/abs/2407.17876v1
Compressor summary: The study analyzes how changes in text corpora, hyperparameters, and randomness affect the stability of map-like metaphors visualizing semantic similarity between documents.
http://arxiv.org/abs/2407.17874v1
Compressor summary: The paper proposes a method to improve speech recognition for domain-specific words by using Whisper and various training techniques, including generating descriptions with a large language model.
http://arxiv.org/abs/2407.17869v1
Compressor summary: The authors propose a deep learning framework that uses residual connections, self-attention, and a reconstruction loss to solve the inverse problem of ellipsometry faster and more accurately than traditional machine learning methods.
http://arxiv.org/abs/2407.17863v1
Compressor summary: Factgenie is a tool that helps analyze and visualize word spans in text outputs, using both human and machine annotations.
http://arxiv.org/abs/2407.17862v1
Compressor summary: The paper proposes methods for intent classification using text embeddings without labelled data and evaluates their performance and limitations on four datasets.
http://arxiv.org/abs/2407.17857v1
Compressor summary: Mew is a novel framework for multiplex immunofluorescence images that addresses cellular heterogeneity and scalability issues using a multiplex network with two layers, a Voronoi network and a Cell-type network, and an interpretable attention module.
http://arxiv.org/abs/2407.17856v1
Compressor summary: The authors present a multimodal dataset for benchmarking decision support algorithms in emergency care, showing improved performance using raw waveform data.
http://arxiv.org/abs/2407.17854v1
Compressor summary: The paper introduces a new Image-Context-Text interaction paradigm and proposes Shap-CA, a contrastive learning method to align context-text and context-image pairs for Multimodal Information Extraction, achieving state-of-the-art results.
http://arxiv.org/abs/2407.17852v1
Compressor summary: The paper introduces MMS Zero-shot, a method that improves zero-shot automatic speech recognition by using romanization and an acoustic model trained on more languages than previous work.
http://arxiv.org/abs/2407.17850v1
Compressor summary: FlexiEdit is a new method that improves image editing by reducing high-frequency components in specific areas to better preserve the original layout and features during non-rigid edits.
http://arxiv.org/abs/2407.17847v1
Compressor summary: Our proposed tuning-free method enables simultaneous object editing and background preservation in videos using two branches: inversion and editing, with self-attention for consistent image editing.
http://arxiv.org/abs/2407.17843v1
Compressor summary: The study investigates how text and image embeddings interact during point-based image editing, proposing DragText to optimize text embedding while preserving content integrity.
http://arxiv.org/abs/2407.17842v1
Compressor summary: Key points:
- Current AI applications in atmospheric science use classic deep learning, but have limitations.
- Multimodal foundation models, like GPT-4o, can process diverse data and execute complex tasks.
- The report evaluates GPT-4o's performance on four main classes of atmospheric scientific tasks.
Summary: The report explores how GPT-4o, a multimodal foundation model, performs various atmospheric scientific tasks that classic deep learning cannot handle well.
http://arxiv.org/abs/2407.17839v1
Compressor summary: The text proposes a dynamic Markov Decision Process model for ride-hailing that balances efficiency and fairness by predicting future requests and using a customised scalarisation function.
http://arxiv.org/abs/2407.17838v1
Compressor summary: UMono is a new framework for estimating depth from a single underwater image by considering light, medium, and feature fusion.
http://arxiv.org/abs/2407.17835v1
Compressor summary: IsUMap is a new method for better representing complex data using UMAP, Isomap, and Vietoris-Rips filtrations.
http://arxiv.org/abs/2407.17834v1
Compressor summary: Normalization techniques can reduce spectral bias in coordinate networks, improving their performance in various scientific computing tasks.
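A generic PyTorch sketch of a coordinate network with a normalization layer after each hidden linear layer; the paper's exact normalization scheme may differ:

```python
import torch
import torch.nn as nn

class NormalizedCoordMLP(nn.Module):
    """Coordinate network mapping (x, y) -> value, with LayerNorm inserted
    after each hidden layer to counteract spectral bias."""
    def __init__(self, hidden=256, depth=4):
        super().__init__()
        layers, dim = [], 2
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.LayerNorm(hidden), nn.ReLU()]
            dim = hidden
        layers.append(nn.Linear(hidden, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, coords):  # coords: [N, 2], e.g. pixel positions in [-1, 1]
        return self.net(coords)
```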
http://arxiv.org/abs/2407.17827v1
Compressor summary: LexVLA is a framework that learns interpretable lexical representations for both visual and language modalities without complex design, achieving better cross-modal retrieval performance than baselines.
http://arxiv.org/abs/2407.17822v1
Compressor summary: This study proposes deep-reinforcement-learning methods for flow control in energy systems, using group-invariant networks and positional encoding to improve learning speed and quality.
http://arxiv.org/abs/2407.17817v1
Compressor summary: The study finds that large language models memorize sequences by encoding high-level features and using general language modeling abilities, making it hard to remove without affecting the model's performance.
http://arxiv.org/abs/2407.17816v1
Compressor summary: SWORD is a novel self-training framework that clusters unlabeled data for node classification, preventing forgetting of old categories and improving performance on new ones.
http://arxiv.org/abs/2407.17813v1
Compressor summary: The Bottleneck Adapter is a novel approach to improve multimodal functionalities of large language models and vision-language tasks by using lightweight adapters for joint optimization, achieving 90.12% accuracy.
http://arxiv.org/abs/2407.17801v1
Compressor summary: EEG-SSM is a novel model that combines temporal and spectral components to effectively classify dementia using EEG data, achieving high accuracy and outperforming existing models.
http://arxiv.org/abs/2407.17797v1
Compressor summary: The authors propose Feature Guidance Attack (FGA), which uses text representations to perturb clean images, and its enhanced version FGA-T that leverages text attack, data augmentation, and momentum, showing superior black-box transferability against VLP models.
http://arxiv.org/abs/2407.17795v1
Compressor summary: The paper proposes a method to improve feature selection in genetic algorithms by initializing the population with diverse individuals and re-initializing it with new random individuals in each generation.
http://arxiv.org/abs/2407.17791v1
Compressor summary: This study shows that artificial neural networks can perform non-trivial visual reasoning tasks without prior training, similar to humans' learning-independent reasoning.
http://arxiv.org/abs/2407.17790v1
Compressor summary: Key points:
- KANs are a novel neural network type proposed to replace MLPs with higher accuracy and interpretability
- Assessment of KANs is limited, and no study has explored their implementation in hardware design
- The paper tests KANs on classification problems using four datasets and implements them in hardware using the Vitis HLS tool
- The results show that KANs are not better than MLPs on highly complex datasets and require more resources
Summary: The paper compares KANs and MLPs for classification tasks, finding that MLPs are more efficient and accurate than KANs, especially on highly complex datasets.
http://arxiv.org/abs/2407.17783v1
Compressor summary: The paper proposes a simple vision transformer using MoE, feedforward networks, and grouped query attention to reduce complexity and improve performance.
http://arxiv.org/abs/2407.17781v1
Compressor summary: The study shows that combining an AI-based weather prediction model with a data assimilation method can improve forecasts, especially in areas with limited observations.
http://arxiv.org/abs/2407.17779v1
Compressor summary: Key points:
- Cross-modal retrieval with 2D and 3D data has challenges due to noisy annotations
- DAC is a framework that uses adaptive division and alignment strategies to handle noisy labels
- DAC achieves state-of-the-art results on both traditional and newly proposed benchmarks
Summary: The paper presents DAC, a framework that improves cross-modal retrieval with 2D and 3D data by adapting to noisy annotations using dynamic division and alignment techniques. It shows superior performance on various benchmarks.
http://arxiv.org/abs/2407.17773v1
Compressor summary: The paper introduces a new benchmark for testing visual analogical reasoning in large multimodal models (LMMs) and compares their performance to human adults and children, finding LMMs struggle with complex tasks requiring 3D understanding.
http://arxiv.org/abs/2407.17771v1
Compressor summary: Banyan is an improved model that learns semantic representations by resolving multiple constituent structures into a shared one, leading to better performance and memory efficiency than prior approaches.
http://arxiv.org/abs/2407.17770v1
Compressor summary: BotEval is an open-source toolkit that allows human evaluators to interact with NLP models in complex tasks like negotiations and conversation moderation, providing templates and compatibility with crowdsourcing platforms.
http://arxiv.org/abs/2407.17762v1
Compressor summary: The study presents a novel approach using synthetic data to train a computer vision model that accurately detects Mpox lesions on various body parts and skin tones, achieving high accuracy, precision, recall, and F1-Score metrics.
http://arxiv.org/abs/2407.17755v1
Compressor summary: The paper proposes an ensemble learning technique using machine learning and deep learning models for improved diagnosis of diabetic retinopathy with high accuracy.
http://arxiv.org/abs/2407.17745v1
Compressor summary: The paper proposes a new model, EREM, for knowledge graph alignment that decomposes the task into entity alignment and relation alignment, achieving better results than existing models.
http://arxiv.org/abs/2407.17744v1
Compressor summary: The paper proposes a dual network with delayed activation to balance complementarity and consistency in incomplete multi-view clustering, improving performance over 12 baselines.
http://arxiv.org/abs/2407.17738v1
Compressor summary: The paper introduces Orthogonal Mapping, a method to improve fine-grained object detection by reducing semantic confusion using orthogonal vectors and improving classification accuracy.
http://arxiv.org/abs/2407.17734v1
Compressor summary: CLOVER is a cost-effective instruction learning framework for conversational pathology using GPT-3.5 and template-based instructions that outperforms strong baselines in answering questions.
http://arxiv.org/abs/2407.17730v1
Compressor summary: The authors evaluate the feasibility of using large language models for cognitive behavioral therapy by testing their emotional tendency, structured dialogue pattern, and proactive questioning ability on a CBT corpus from online videos.
http://arxiv.org/abs/2407.17726v1
Compressor summary: The text describes a new framework that handles incomplete data from different sources and censored survival labels, using advanced foundation models to improve survival analysis accuracy.
http://arxiv.org/abs/2407.17723v1
Compressor summary: This paper analyzes the theoretical relationship between graph recommender (GR) and graph contrastive learning (GCL), showing their equivalence in terms of encoders and loss functions, and suggesting cross-field research directions.
http://arxiv.org/abs/2407.17721v1
Compressor summary: The text proposes a hybrid learning framework that combines CNNs and PINNs to solve the full inverse EIT problem using supervised and unsupervised learning.
http://arxiv.org/abs/2407.17710v1
Compressor summary: The text introduces a novel framework for machine unlearning that uses dimensional alignment as a regularizer loss to remove information from specific data while preserving knowledge from the rest, and also criticizes existing evaluation metrics for machine unlearning.
http://arxiv.org/abs/2407.17705v1
Compressor summary: The paper proposes a novel anomaly localization method that combines Mamba with feature reconstruction and refinement, using artificially simulated anomalies for better training and performance.
http://arxiv.org/abs/2407.17703v1
Compressor summary: The text proposes a novel framework that uses context-aware knowledge graphs and neural networks to improve traffic speed forecasting by considering spatial and temporal urban contexts.
http://arxiv.org/abs/2407.17697v1
Compressor summary: The study presents new scoring rules (Penalized Brier Score and Penalized Logarithmic Loss) that reward correct predictions more and help improve probabilistic classification models.
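The summary doesn't give the exact penalty, so the following NumPy sketch pairs the standard Brier score with one illustrative, hypothetical penalty (an extra cost on misclassified examples), purely to show the shape of such a rule:

```python
import numpy as np

def brier_score(probs, y_onehot):
    """Standard multiclass Brier score (lower is better)."""
    return np.mean(np.sum((probs - y_onehot) ** 2, axis=1))

def penalized_brier(probs, y_onehot, lam=1.0):
    """Hypothetical penalized variant: adds cost lam whenever the argmax
    prediction disagrees with the label, rewarding correct predictions more.
    The paper's actual penalty term may take a different form."""
    wrong = probs.argmax(axis=1) != y_onehot.argmax(axis=1)
    return brier_score(probs, y_onehot) + lam * wrong.mean()
```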
http://arxiv.org/abs/2407.17695v1
Compressor summary: DiVE is a framework that helps large language models learn and improve their understanding of world dynamics from limited demonstrations, enabling them to make better decisions like human players.
http://arxiv.org/abs/2407.17688v1
Compressor summary: The study examines political biases in large language models and how they affect stance classification tasks, finding significant differences across datasets but not models or prompting schemes.
http://arxiv.org/abs/2407.17686v1
Compressor summary: Transformers can model generative processes for higher order Markov sources by learning the in-context conditional empirical distribution, as shown by both theoretical and empirical results.
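The in-context conditional empirical distribution itself is simple to state; a small Python sketch for a k-th order Markov source:

```python
from collections import Counter, defaultdict

def conditional_empirical(seq, k):
    """Estimate P(next symbol | previous k symbols) from a single sequence,
    i.e. the conditional empirical distribution of a k-th order Markov source."""
    counts = defaultdict(Counter)
    for i in range(k, len(seq)):
        counts[tuple(seq[i - k:i])][seq[i]] += 1
    return {ctx: {s: c / sum(ctr.values()) for s, c in ctr.items()}
            for ctx, ctr in counts.items()}

# e.g. conditional_empirical("0110101101", 2) estimates P(x_t | x_{t-2}, x_{t-1})
```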
http://arxiv.org/abs/2407.17678v1
Compressor summary: Sparsely-Sharded (S2) Attention is a new attention algorithm that divides context into partitions for different heads, improving efficiency and memory reduction without sacrificing model quality.