arxiv compressed, 2023-12-21

This page contains one-sentence summaries of cs.AI/ML/CV/CL papers announced on 2023-12-21 generated by the compressor, my personal LLM-based project.


Generative Multimodal Models are In-Context Learners

Quan Sun,Yufeng Cui,Xiaosong Zhang,Fan Zhang,Qiying Yu,Zhengxiong Luo,Yueze Wang,Yongming Rao,Jingjing Liu,Tiejun Huang,Xinlong Wang

http://arxiv.org/abs/2312.13286v1

Compressor summary: Emu2 is a 37-billion-parameter generative multimodal model that excels at solving diverse multimodal tasks from minimal input through in-context learning and reasoning.


UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections

Fangjinhua Wang,Marie-Julie Rakotosaona,Michael Niemeyer,Richard Szeliski,Marc Pollefeys,Federico Tombari

http://arxiv.org/abs/2312.13285v1

Compressor summary: UniSDF is a 3D reconstruction method that accurately models complex scenes with reflections using blended representation techniques and a multi-resolution grid backbone.


Deep Learning on 3D Neural Fields

Pierluigi Zama Ramirez,Luca De Luigi,Daniele Sirocchi,Adriano Cardace,Riccardo Spezialetti,Francesco Ballerini,Samuele Salti,Luigi Di Stefano

http://arxiv.org/abs/2312.13277v1

Compressor summary: The paper introduces nf2vec, a framework that converts Neural Fields representing 3D data into compact embeddings for use in deep learning tasks.


Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting

Junwu Zhang,Zhenyu Tang,Yatian Pang,Xinhua Cheng,Peng Jin,Yida Wei,Wangbo Yu,Munan Ning,Li Yuan

http://arxiv.org/abs/2312.13271v1

Compressor summary: Repaint123 improves 3D image generation by combining a 2D diffusion model, repainting strategy, and visibility-aware adaptive strength for consistent multi-view images with fine textures and fast speed.


ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement

Zixiang Wei,Yiting Wang,Lichao Sun,Athanasios V. Vasilakos,Lin Wang

http://arxiv.org/abs/2312.13265v1

Compressor summary: ClassLIE is a novel framework that combines CNNs and transformers to enhance low-light images by classifying and adaptively learning structural and illumination information, achieving state-of-the-art performance.


dIR -- Discrete Information Retrieval: Conversational Search over Unstructured (and Structured) Data with Large Language Models

Pablo M. Rodriguez Bertorello,Jean Rodmond Junior Laguerre

http://arxiv.org/abs/2312.13264v1

Compressor summary: dIR is a novel approach that allows querying structured and unstructured data in natural language by converting queries into SQL for efficient retrieval.


Conditional Image Generation with Pretrained Generative Model

Rajesh Shrestha,Bowen Xie

http://arxiv.org/abs/2312.13253v1

Compressor summary: This paper proposes methods to speed up conditional image generation using pre-trained diffusion models and reduce the need for training and computational resources.


Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model

Saurabh Saxena,Junhwa Hur,Charles Herrmann,Deqing Sun,David J. Fleet

http://arxiv.org/abs/2312.13252v1

Compressor summary: The paper proposes DMD, a diffusion model that uses log-scale depth parameterization, FOV conditioning, and synthetic data augmentation to achieve significant improvements in zero-shot metric depth estimation for indoor and outdoor scenes.
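A log-scale depth parameterization in general maps metric depth into a normalized range so near depths get finer resolution than far ones. A minimal, generic sketch of the idea (illustrative only, not DMD's exact formulation; `d_min`/`d_max` are assumed range bounds):

```python
import numpy as np

def depth_to_log_param(depth, d_min=0.5, d_max=80.0):
    """Map metric depth to [0, 1] on a log scale (generic sketch)."""
    depth = np.clip(depth, d_min, d_max)
    return np.log(depth / d_min) / np.log(d_max / d_min)

def log_param_to_depth(s, d_min=0.5, d_max=80.0):
    """Inverse mapping: normalized log-scale value back to metric depth."""
    return d_min * (d_max / d_min) ** s
```

The model predicts the normalized value; depths are recovered with the inverse mapping at inference time.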


Efficient Verification-Based Face Identification

Amit Rozner,Barak Battash,Ofir Lindenbaum,Lior Wolf

http://arxiv.org/abs/2312.13240v1

Compressor summary: The paper proposes a novel face verification method that uses a hypernetwork to generate efficient neural models for personalized face identification, achieving state-of-the-art results with fewer parameters and computational cost.


Diffusion Models With Learned Adaptive Noise

Subham Sekhar Sahoo,Aaron Gokaslan,Chris De Sa,Volodymyr Kuleshov

http://arxiv.org/abs/2312.13236v1

Compressor summary: The paper introduces MuLAN, a learned diffusion process for image synthesis that adapts noise levels across an image, improving performance on density estimation tasks.


Position Paper: Bridging the Gap Between Machine Learning and Sensitivity Analysis

Christian A. Scholbeck,Julia Moosbauer,Giuseppe Casalicchio,Hoshin Gupta,Bernd Bischl,Christian Heumann

http://arxiv.org/abs/2312.13234v1

Compressor summary: The authors propose that sensitivity analysis, a method used to explain complex systems in various fields, can also be used to interpret machine learning models and highlight the benefits of this unified view for both researchers and practitioners.


StableKD: Breaking Inter-block Optimization Entanglement for Stable Knowledge Distillation

Shiu-hong Kao,Jierun Chen,S. H. Gary Chan

http://arxiv.org/abs/2312.13223v1

Compressor summary: StableKD is a novel knowledge distillation framework that achieves more stable optimization and boosts model accuracy by breaking the Inter-Block Optimization Entanglement phenomenon using Decomposition and Recomposition operations.


FiFAR: A Fraud Detection Dataset for Learning to Defer

Jean V. Alves,Diogo Leitão,Sérgio Jesus,Marco O. P. Sampaio,Pedro Saleiro,Mário A. T. Figueiredo,Pedro Bizarro

http://arxiv.org/abs/2312.13218v1

Compressor summary: The paper introduces FiFAR, a synthetic dataset for learning to defer algorithms in financial fraud detection, which considers human work capacity constraints and allows for benchmarking of hybrid human-AI decision systems.


Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Octave Mariotti,Oisin Mac Aodha,Hakan Bilen

http://arxiv.org/abs/2312.13216v1

Compressor summary: The authors propose a new self-supervised method for semantic correspondence estimation that uses weak 3D understanding with a spherical prior, improving performance on challenging image characteristics like symmetries and repeated parts.


DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization

Rahul Chand,Yashoteja Prabhu,Pratyush Kumar

http://arxiv.org/abs/2312.13211v1

Compressor summary: DSFormer is a novel weight factorization method for compressing transformer models in natural language understanding that improves efficiency-accuracy trade-off by using a semi-structured sparse matrix and a task-aware learning algorithm.


LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces

Yingji Zhang,Danilo S. Carvalho,Ian Pratt-Hartmann,André Freitas

http://arxiv.org/abs/2312.13208v1

Compressor summary: LlaMaVAE combines a sentence encoder (sentenceT5) with a language model (LlaMA) and a VAE to improve text generation control and performance on various tasks compared to previous models.


HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model for online comments

Neeraj Kumar Singh,Koyel Ghosh,Joy Mahapatra,Utpal Garain,Apurbalal Senapati

http://arxiv.org/abs/2312.13193v1

Compressor summary: The paper proposes HCDIR, an end-to-end model for detecting hateful comments and reducing their intensity in social media posts, focusing on low-resource languages like Indian languages.


Contextual Code Switching for Machine Translation using Language Models

Arshad Kaji,Manan Shah

http://arxiv.org/abs/2312.13179v1

Compressor summary: Large language models perform well in many tasks but struggle with code switching in machine translation due to their training methods.


Learning Fair Policies for Multi-stage Selection Problems from Observational Data

Zhuangzhuang Jia,Grani A. Hanasusanto,Phebe Vayanos,Weijun Xie

http://arxiv.org/abs/2312.13173v1

Compressor summary: The paper proposes a framework, based on causal inference and optimization, for learning fair and interpretable policies for multi-stage selection problems from observational data; it handles various fairness constraints and linear selection rules, improving precision and reducing unfairness over existing policies.


Gappy local conformal auto-encoders for heterogeneous data fusion: in praise of rigidity

Erez Peterfreund,Iryna Burak,Ofir Lindenbaum,Jim Gimlett,Felix Dietrich,Ronald R. Coifman,Ioannis G. Kevrekidis

http://arxiv.org/abs/2312.13155v1

Compressor summary: The paper proposes a neural network pipeline that fuses partial and heterogeneous measurements from different sensors by using multiple slightly perturbed instances to estimate local distortion and create a consistent latent space.


Neural Stochastic Differential Equations with Change Points: A Generative Adversarial Approach

Zhongchang Sun,Yousef El-Laham,Svitlana Vyetrenko

http://arxiv.org/abs/2312.13152v1

Compressor summary: The paper presents a change point detection algorithm for time series modeled as neural SDEs that jointly learns the change points and the SDE parameters using GANs, outperforming classical benchmarks, standard GAN-based neural SDEs, and other deep generative models.


Splatter Image: Ultra-Fast Single-View 3D Reconstruction

Stanislaw Szymanowicz,Christian Rupprecht,Andrea Vedaldi

http://arxiv.org/abs/2312.13150v1

Compressor summary: Splatter Image is a fast and accurate monocular 3D object reconstruction method based on Gaussian Splatting and neural networks.


Augment on Manifold: Mixup Regularization with UMAP

Yousef El-Laham,Elizabeth Fons,Dillon Daudert,Svitlana Vyetrenko

http://arxiv.org/abs/2312.13141v1

Compressor summary: UMAP Mixup is a new data augmentation technique for deep learning models that uses uniform manifold approximation and projection to create synthetic samples on the data manifold, improving generalization performance in various regression tasks.
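For contrast with the paper's manifold-based variant, here is the classic input-space mixup it builds on: a convex combination of two samples and their targets with a Beta-distributed coefficient. This is the standard baseline, not the UMAP Mixup method itself:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Classic input-space mixup: interpolate two samples and their
    targets with a Beta(alpha, alpha) mixing coefficient. UMAP Mixup
    instead interpolates on a learned data manifold."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```

UMAP Mixup's contribution is to avoid off-manifold interpolants that this naive version can produce.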


Scaling Compute Is Not All You Need for Adversarial Robustness

Edoardo Debenedetti,Zishen Wan,Maksym Andriushchenko,Vikash Sehwag,Kshitij Bhardwaj,Bhavya Kailkhura

http://arxiv.org/abs/2312.13131v1

Compressor summary: The paper derives scaling laws for adversarial robustness and analyzes the impact of computing power, model size, and training techniques on performance improvements.


VSR-Net: Vessel-like Structure Rehabilitation Network with Graph Clustering

Haili Ye,Xiaoqing Zhang,Yan Hu,Huazhu Fu,Jiang Liu

http://arxiv.org/abs/2312.13116v1

Compressor summary: The paper proposes a novel network (VSR-Net) to improve segmentation of vessel-like structures in medical images by rehabilitating subsection ruptures and calibrating model predictions.


Investigating Color Illusions from the Perspective of Computational Color Constancy

Oguzhan Ulucan,Diclehan Ulucan,Marc Ebner

http://arxiv.org/abs/2312.13114v1

Compressor summary: The text discusses how analyzing color illusions can help improve computational color constancy methods, enabling them to estimate light sources in scenes with multiple illuminants.


Pre-training of Molecular GNNs as Conditional Boltzmann Generator

Daiki Koge,Naoaki Ono,Shigehiko Kanaya

http://arxiv.org/abs/2312.13110v1

Compressor summary: The text introduces Boltzmann GNN, a pre-training method for molecular GNNs that generates latent vectors for multiple conformations from 2D molecular graphs, outperforming existing methods.


ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Difei Gao,Lei Ji,Zechen Bai,Mingyu Ouyang,Peiran Li,Dongxing Mao,Qinchen Wu,Weichen Zhang,Peiyi Wang,Xiangwu Guo,Hengxu Wang,Luowei Zhou,Mike Zheng Shou

http://arxiv.org/abs/2312.13108v1

Compressor summary: The paper introduces AssistGUI, a new benchmark to test AI agents' abilities to automate complex tasks on Windows using mouse and keyboard input, and proposes an improved framework that performs better than existing methods but still has room for improvement.


Optimizing Ego Vehicle Trajectory Prediction: The Graph Enhancement Approach

Sushil Sharma,Aryan Singh,Ganesh Sistu,Mark Halton,Ciarán Eising

http://arxiv.org/abs/2312.13104v1

Compressor summary: The authors propose using Graph Neural Networks and Bird's Eye View perspectives for future trajectory prediction in autonomous driving systems, improving on traditional DNN-based methods.


Exploring Multimodal Large Language Models for Radiology Report Error-checking

Jinge Wu,Yunsoo Kim,Eva C. Keller,Jamie Chow,Adam P. Levine,Nikolas Pontikos,Zina Ibrahim,Paul Taylor,Michelle C. Williams,Honghan Wu

http://arxiv.org/abs/2312.13103v1

Compressor summary: The paper presents an LLM-based assistant for radiologists that helps them find errors in their reports and outperforms other models and clinicians on a SIMPLE error-checking task.


SpecNeRF: Gaussian Directional Encoding for Specular Reflections

Li Ma,Vasu Agrawal,Haithem Turki,Changil Kim,Chen Gao,Pedro Sander,Michael Zollhöfer,Christian Richardt

http://arxiv.org/abs/2312.13102v1

Compressor summary: The paper proposes a learnable Gaussian directional encoding to better model glossy surfaces' appearance under near-field lighting and introduces a data-driven geometry prior to improve specular reflection modeling in neural radiance fields.


SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning

William Heyden,Habib Ullah,M. Salman Siddiqui,Fadi Al Machot

http://arxiv.org/abs/2312.13100v1

Compressor summary: The paper introduces a dual strategy for GZSL that enhances semantic information with an innovative encoder and refines generative capabilities with a novel loss function, improving generalization and performance in diverse settings.


In Generative AI we Trust: Can Chatbots Effectively Verify Political Information?

Elizaveta Kuznetsova,Mykola Makhortykh,Victoria Vziatysheva,Martha Stolze,Ani Baghumyan,Aleksandra Urman

http://arxiv.org/abs/2312.13096v1

Compressor summary: The article compares two AI-based chatbots' ability to detect true and false statements about political topics in different languages and finds that ChatGPT performs better than Bing Chat.


MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using Differentiable Shading

Abdallah Dib,Luiz Gustavo Hafemann,Emeline Got,Trevor Anderson,Amin Fadaeinejad,Rafael M. O. Cruz,Marc-Andre Carbonneau

http://arxiv.org/abs/2312.13091v1

Compressor summary: MoSAR is a method to generate 3D avatars from monocular images using semi-supervised learning and differentiable shading, producing realistic and relightable results.


Perception Test 2023: A Summary of the First Challenge And Outcome

Joseph Heyward,João Carreira,Dima Damen,Andrew Zisserman,Viorica Pătrăucean

http://arxiv.org/abs/2312.13090v1

Compressor summary: The First Perception Test challenge, held at ICCV 2023, assessed various video models on seven tasks spanning different modalities.


Pyreal: A Framework for Interpretable ML Explanations

Alexandra Zytek,Wei-En Wang,Dongyu Liu,Laure Berti-Equille,Kalyan Veeramachaneni

http://arxiv.org/abs/2312.13084v1

Compressor summary: Pyreal is a system that helps users create understandable explanations for machine learning predictions using Python.


BEVSeg2TP: Surround View Camera Bird's-Eye-View Based Joint Vehicle Segmentation and Ego Vehicle Trajectory Prediction

Sushil Sharma,Arindam Das,Ganesh Sistu,Mark Halton,Ciarán Eising

http://arxiv.org/abs/2312.13081v1

Compressor summary: The paper proposes BEVSeg2TP, a system that predicts the ego vehicle's future trajectory using semantic segmentation of objects in surround-view camera images and a spatiotemporal probabilistic network.


Point Deformable Network with Enhanced Normal Embedding for Point Cloud Analysis

Xingyilang Yin,Xi Yang,Liangchen Liu,Nannan Wang,Xinbo Gao

http://arxiv.org/abs/2312.13071v1

Compressor summary: PDNet is a new MLP-based network that uses a Point Deformable Aggregation Module (PDAM) to capture long-range dependencies in point cloud analysis by aggregating information from adaptive deformable reference points, improving representation capability and performance.


Continuous-time Graph Representation with Sequential Survival Process

Abdulkadir Celikkanat,Nikolaos Nakis,Morten Mørup

http://arxiv.org/abs/2312.13068v1

Compressor summary: The text proposes GraSSP, a novel stochastic process using survival functions to model intermittent edge-persistent networks, improving representation learning for evolving networks.


PPEA-Depth: Progressive Parameter-Efficient Adaptation for Self-Supervised Monocular Depth Estimation

Yue-Jiang Dong,Yuan-Chen Guo,Ying-Tian Liu,Fang-Lue Zhang,Song-Hai Zhang

http://arxiv.org/abs/2312.13066v1

Compressor summary: PPEA-Depth is a method to improve self-supervised depth estimation in dynamic scenes by transferring knowledge from pre-trained image models using compact encoder and decoder adapters.


Quantifying Bias in Text-to-Image Generative Models

Jordan Vice,Naveed Akhtar,Richard Hartley,Ajmal Mian

http://arxiv.org/abs/2312.13053v1

Compressor summary: The paper proposes a method to evaluate biases in text-to-image models without preconceived notions, using three metrics and testing on various scenarios.


Retrieval-augmented Multilingual Knowledge Editing

Weixuan Wang,Barry Haddow,Alexandra Birch

http://arxiv.org/abs/2312.13040v1

Compressor summary: ReMaKE is a method to update LLMs' knowledge in multilingual settings using retrieved information from a multilingual database.


AutoXPCR: Automated Multi-Objective Model Selection for Time Series Forecasting

Raphael Fischer,Amal Saadallah

http://arxiv.org/abs/2312.13038v1

Compressor summary: AutoXPCR is a novel AutoML method that selects and explains DNNs for time series forecasting based on predictive quality, complexity, and resource consumption.


NodeMixup: Tackling Under-Reaching for Graph Neural Networks

Weigang Lu,Ziyu Guan,Wei Zhao,Long Jin

http://arxiv.org/abs/2312.13032v1

Compressor summary: The paper introduces NodeMixup, a method to improve graph neural networks' performance by addressing the under-reaching issue caused by uneven labeled node distribution in graphs.


A self-attention-based differentially private tabular GAN with high data utility

Zijian Li,Zhihui Wang

http://arxiv.org/abs/2312.13031v1

Compressor summary: DP-SACTGAN is a new framework for creating private and accurate tabular data using GANs.


Doubly Perturbed Task-Free Continual Learning

Byung Hyun Lee,Min-hwan Oh,Se Young Chun

http://arxiv.org/abs/2312.13027v1

Compressor summary: The paper proposes DPCL, a novel framework for task-free continual learning that uses input and decision-making perturbations to prevent catastrophic forgetting and improve plasticity.


DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis

Yuming Gu,Hongyi Xu,You Xie,Guoxian Song,Yichun Shi,Di Chang,Jing Yang,Lingjie Luo

http://arxiv.org/abs/2312.13016v1

Compressor summary: DiffPortrait3D is a conditional diffusion model that synthesizes 3D-consistent, photo-realistic novel views from a single portrait, using the generative prior of 2D diffusion models, disentangled attentive control of appearance and camera pose, and a cross-view attention module; it achieves state-of-the-art results on in-the-wild and multi-view benchmarks.


AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

Dong Huang,Qingwen Bu,Jie M. Zhang,Michael Luck,Heming Cui

http://arxiv.org/abs/2312.13010v1

Compressor summary: AgentCoder is a novel multi-agent framework that improves code generation by collaboratively generating test cases, executing them, and providing feedback to the programmer agent.


No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

Ishan Rajendrakumar Dave,Simon Jenni,Mubarak Shah

http://arxiv.org/abs/2312.13008v1

Compressor summary: The authors propose a new frame-level temporal self-supervision method for videos that improves feature learning and generalization performance on various tasks.


Machine Mindset: An MBTI Exploration of Large Language Models

Jiaxi Cui,Liuzhenghao Lv,Jing Wen,Jing Tang,YongHong Tian,Li Yuan

http://arxiv.org/abs/2312.12999v1

Compressor summary: The paper introduces Machine Mindset, a method to integrate MBTI personality traits into large language models for personalized AI applications.


Aggregating Multiple Bio-Inspired Image Region Classifiers For Effective And Lightweight Visual Place Recognition

Bruno Arcanjo,Bruno Ferrarini,Maria Fasli,Michael Milford,Klaus D. McDonald-Maier,Shoaib Ehsan

http://arxiv.org/abs/2312.12995v1

Compressor summary: RegionDrosoNet is a novel multi-DrosoNet system that achieves improved visual place recognition performance while maintaining low computational requirements by specializing DrosoNets on different partitions of the image and using a voting module to combine outputs.


Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest

Emily Groves,Minhong Wang,Yusuf Abdulle,Holger Kunz,Jason Hoelscher-Obermaier,Ronin Wu,Honghan Wu

http://arxiv.org/abs/2312.12989v1

Compressor summary: The study compares NLP paradigms for biomedical ontology curation and shows that in-context learning (ICL) with GPT-4 excels in tasks requiring less data, while fine-tuning (FT) and supervised learning (ML) perform better with more data.


From Past to Future: Rethinking Eligibility Traces

Dhawal Gupta,Scott M. Jordan,Shreyas Chaudhari,Bo Liu,Philip S. Thomas,Bruno Castro da Silva

http://arxiv.org/abs/2312.12972v1

Compressor summary: The paper proposes a bidirectional value function that considers both past and future rewards to improve credit assignment and policy evaluation in reinforcement learning.
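The classic, past-looking mechanism the paper rethinks is the backward-view TD(λ) update with accumulating eligibility traces. A minimal tabular sketch of that standard baseline (not the paper's bidirectional method):

```python
import numpy as np

def td_lambda_update(v, trace, state, reward, next_state,
                     gamma=0.99, lam=0.9, lr=0.1):
    """One backward-view TD(lambda) step with accumulating traces:
    credit for the TD error flows to recently visited states."""
    delta = reward + gamma * v[next_state] - v[state]
    trace *= gamma * lam          # decay all traces
    trace[state] += 1.0           # accumulate for the visited state
    v += lr * delta * trace       # update all traced states
    return v, trace
```

The paper's proposal extends this one-directional credit assignment to also account for future rewards.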


D3Former: Jointly Learning Repeatable Dense Detectors and Feature-enhanced Descriptors via Saliency-guided Transformer

Junjie Gao,Pengfei Wang,Qiujie Dong,Qiong Zeng,Shiqing Xin,Caiming Zhang

http://arxiv.org/abs/2312.12970v1

Compressor summary: D3Former is a new point cloud matching method that jointly learns repeatable keypoint detectors and feature-enhanced descriptors, improving accuracy on indoor and outdoor benchmarks.


Radar Fields: An Extension of Radiance Fields to SAR

Thibaud Ehret,Roger Marí,Dawa Derksen,Nicolas Gasnier,Gabriele Facciolo

http://arxiv.org/abs/2312.12961v1

Compressor summary: The paper introduces "radar fields", an extension of radiance fields to radar images, enabling surface modeling from radar image collections and hybrid methods with optical images.


TADAP: Trajectory-Aided Drivable area Auto-labeling with Pre-trained self-supervised features in winter driving conditions

Eerik Alamikkotervo,Risto Ojala,Alvari Seppänen,Kari Tammi

http://arxiv.org/abs/2312.12954v1

Compressor summary: TADAP is a method for automatically labeling drivable areas in winter conditions using the vehicle's satellite-positioned trajectory together with pre-trained visual features, improving self-supervised drivable-area detection by 9.6%.


Class Conditional Time Series Generation with Structured Noise Space GAN

Hamidreza Gholamrezaei,Alireza Koochali,Andreas Dengel,Sheraz Ahmed

http://arxiv.org/abs/2312.12946v1

Compressor summary: SNS-GAN is a new generative model that embeds class labels in the noise space and performs well in both image and time series data generation.


Misclassification excess risk bounds for 1-bit matrix completion

The Tien Mai

http://arxiv.org/abs/2312.12945v1

Compressor summary: The study analyzes the prediction error of two logistic regression methods for 1-bit matrix completion and shows that nuclear-norm penalization achieves the optimal rate.


Robust Loss Functions for Training Decision Trees with Noisy Labels

Jonathan Wilton,Nan Ye

http://arxiv.org/abs/2312.12937v1

Compressor summary: The paper studies how to train decision trees with noisy labels using robust loss functions, proposes a framework for constructing them, and introduces a new loss called negative exponential loss.


Concept-based Explainable Artificial Intelligence: A Survey

Eleonora Poeta,Gabriele Ciravegna,Eliana Pastor,Tania Cerquitelli,Elena Baralis

http://arxiv.org/abs/2312.12936v1

Compressor summary: The paper reviews concept-based explainable artificial intelligence (C-XAI) approaches, defines and categorizes them, and suggests evaluation strategies to help advance the field.


Stability of Graph Convolutional Neural Networks through the lens of small perturbation analysis

Lucia Testa,Claudio Battiloro,Stefania Sardellitti,Sergio Barbarossa

http://arxiv.org/abs/2312.12934v1

Compressor summary: This paper investigates how changing a small number of edges in a graph affects the stability of Graph Convolutional Neural Networks (GCNs) and provides a way to measure and analyze this effect.


Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors

Yi-Fan Zhang,Zhang Zhang,Liang Wang,Rong Jin

http://arxiv.org/abs/2312.12918v1

Compressor summary: The text explores advanced language models for detecting AI-generated texts across different topics without needing labeled data, addressing challenges in real-world scenarios.


Sign Language Production with Latent Motion Transformer

Pan Xie,Taiyi Peng,Yao Du,Qipeng Zhang

http://arxiv.org/abs/2312.12917v1

Compressor summary: The research presents a new method to generate high-quality sign videos from sign glosses using improved 3D VQ-GAN and sequence-to-sequence attention, achieving better results than previous approaches on two datasets.


Produce Once, Utilize Twice for Anomaly Detection

Shuyuan Wang,Qi Li,Huiyuan Luo,Chengkan Lv,Zhengtao Zhang

http://arxiv.org/abs/2312.12913v1

Compressor summary: POUTA is a novel method for visual anomaly detection that improves accuracy and efficiency by reusing and refining features from a reconstruction-based network.


The Common Optical Music Recognition Evaluation Framework

Pau Torras,Sanket Biswas,Alicia Fornés

http://arxiv.org/abs/2312.12908v1

Compressor summary: The paper proposes a new music representation language, Music Tree Notation (MTN), to enable standardized comparison of Optical Music Recognition systems using a specific set of metrics.


PGN: A perturbation generation network against deep reinforcement learning

Xiangjuan Li,Feifan Li,Yang Li,Quan Pan

http://arxiv.org/abs/2312.12904v1

Compressor summary: The paper proposes a generative model to create adversarial examples for deep reinforcement learning, measuring stealthiness by action consistency ratio, and showing fast and effective attacks compared to other methods.


MinePlanner: A Benchmark for Long-Horizon Planning in Large Minecraft Worlds

William Hill,Ireton Liu,Anita De Mello Koch,Damion Harvey,George Konidaris,Steven James

http://arxiv.org/abs/2312.12891v1

Compressor summary: The authors create a new Minecraft planning benchmark that tests state-of-the-art planners on various challenges and provide a framework for creating new tasks.


BSL: Understanding and Improving Softmax Loss for Recommendation

Junkang Wu,Jiawei Chen,Jiancan Wu,Wentao Shi,Jizhi Zhang,Xiang Wang

http://arxiv.org/abs/2312.12882v1

Compressor summary: The paper investigates why Softmax loss performs well in recommendation models and proposes a new loss function, Bilateral SoftMax Loss, that improves robustness and fairness on both positive and negative examples.
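The Softmax (InfoNCE-style) ranking loss the paper analyzes scores one positive item against sampled negatives. A generic sketch of that baseline loss (the starting point BSL modifies, not BSL itself; `temp` is an assumed temperature parameter):

```python
import numpy as np

def softmax_loss(pos_score, neg_scores, temp=1.0):
    """Softmax ranking loss over one positive and sampled negatives:
    -log of the positive's softmax probability."""
    scores = np.concatenate(([pos_score], neg_scores)) / temp
    scores -= scores.max()  # subtract max for numerical stability
    return -np.log(np.exp(scores[0]) / np.exp(scores).sum())
```

BSL's insight is to apply such normalization bilaterally, to negatives as well as positives.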


Rule-Extraction Methods From Feedforward Neural Networks: A Systematic Literature Review

Sara El Mekkaoui,Loubna Benabbou,Abdelaziz Berrado

http://arxiv.org/abs/2312.12878v1

Compressor summary: The paper reviews different approaches for extracting rules from feedforward neural networks to improve interpretability in AI systems.


Relightable and Animatable Neural Avatars from Videos

Wenbin Lin,Chengwei Zheng,Jun-Hai Yong,Feng Xu

http://arxiv.org/abs/2312.12877v1

Compressor summary: The text proposes a method to create realistic 3D digital avatars that can adapt to different lighting and poses, using novel techniques for modeling geometry and shadow changes.


Integration and Performance Analysis of Artificial Intelligence and Computer Vision Based on Deep Learning Algorithms

Bo Liu,Liqiang Yu,Chang Che,Qunwei Lin,Hao Hu,Xinyu Zhao

http://arxiv.org/abs/2312.12872v1

Compressor summary: The paper analyzes how deep learning and computer vision technologies are integrated for better image classification and object detection, while discussing their limitations and future directions.


Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches

Yu Liu,Runzhe Wan,James McQueen,Doug Hains,Jinxiang Gu,Rui Song

http://arxiv.org/abs/2312.12871v1

Compressor summary: The paper proposes two data-driven methods for selecting the assumed effect size in online experiments, which can improve accuracy and efficiency compared to traditional domain knowledge-based methods.


The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

Wenqi Jia,Miao Liu,Hao Jiang,Ishwarya Ananthabhotla,James M. Rehg,Vamsi Krishna Ithapu,Ruohan Gao

http://arxiv.org/abs/2312.12870v1

Compressor summary: The paper introduces Av-CONV, a multi-modal, multi-task framework that predicts exocentric conversational interactions from egocentric videos, outperforming baselines in experiments.


Parameterized Projected Bellman Operator

Théo Vincent,Alberto Maria Metelli,Boris Belousov,Jan Peters,Marcello Restelli,Carlo D'Eramo

http://arxiv.org/abs/2312.12869v1

Compressor summary: The paper proposes a new reinforcement learning approach, the projected Bellman operator (PBO), which learns an approximate version of the Bellman operator to improve generalization and avoid explicit projection steps.


Towards Machines that Trust: AI Agents Learn to Trust in the Trust Game

Ardavan S. Nobandegani,Irina Rish,Thomas R. Shultz

http://arxiv.org/abs/2312.12868v1

Compressor summary: The text studies how trust emerges in human social interactions using reinforcement learning and simulations to analyze the trust game.


RadEdit: stress-testing biomedical vision models via diffusion image editing

Fernando Pérez-García,Sam Bond-Taylor,Pedro P. Sanchez,Boris van Breugel,Daniel C. Castro,Harshita Sharma,Valentina Salvatelli,Maria T. A. Wetscherek,Hannah Richardson,Matthew P. Lungren,Aditya Nori,Javier Alvarez-Valle,Ozan Oktay,Maximilian Ilse

http://arxiv.org/abs/2312.12865v1

Compressor summary: The text proposes using generative image editing with a text-to-image diffusion model to simulate dataset shifts and diagnose failure modes of biomedical vision models, improving their performance and robustness without additional data collection.


SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

Zhecheng Wang,Rajanie Prabha,Tianyuan Huang,Jiajun Wu,Ram Rajagopal

http://arxiv.org/abs/2312.12856v1

Compressor summary: The authors create SkyScript, a large vision-language dataset for remote sensing images, by linking them to OpenStreetMap data using geo-coordinates, and use it to train a versatile vision language model that improves zero-shot scene classification and other tasks.


CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

Dan Shi,Chaobin You,Jiantao Huang,Taihao Li,Deyi Xiong

http://arxiv.org/abs/2312.12853v1

Compressor summary: The paper introduces CORECODE, a dataset for evaluating Chinese language models' commonsense reasoning and conflict detection skills in dialogues, by annotating 19,700 conversations with 76,787 pieces of commonsense knowledge.


Language Resources for Dutch Large Language Modelling

Bram Vanroy

http://arxiv.org/abs/2312.12852v1

Compressor summary: The authors introduce two fine-tuned Dutch language models and provide data, benchmarks, and a leaderboard to improve the state of the art in Dutch natural language processing.


A Stochastic Analysis of the Linguistic Provenance of English Place Names

Michael Dalvean

http://arxiv.org/abs/2312.12850v1

Compressor summary: The paper uses stochastic methods to rank English place names by their similarity to other languages' place names, helping to determine their origin.
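The summary doesn't specify the paper's stochastic method, but the general idea of scoring a name against per-language reference lists can be illustrated with a toy character-bigram Jaccard scorer. Everything below (the reference names, the scoring rule) is illustrative, not the paper's model:

```python
def ngrams(name, n=2):
    name = f"^{name.lower()}$"            # mark word boundaries
    return {name[i:i + n] for i in range(len(name) - n + 1)}

def provenance_scores(name, reference_sets, n=2):
    """Score a place name against per-language reference name lists by
    average character-bigram Jaccard similarity. A toy stand-in for the
    paper's stochastic ranking, with made-up reference names."""
    g = ngrams(name, n)
    scores = {}
    for lang, names in reference_sets.items():
        sims = [len(g & ngrams(ref, n)) / len(g | ngrams(ref, n))
                for ref in names]
        scores[lang] = sum(sims) / len(sims)
    return scores

refs = {
    "norse": ["grimsby", "whitby", "derby"],
    "saxon": ["birmingham", "nottingham", "buckingham"],
}
s = provenance_scores("rugby", refs)
print(max(s, key=s.get))   # → norse (shares the '-by' suffix bigrams)
```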


Causal Discovery under Identifiable Heteroscedastic Noise Model

Naiyu Yin,Tian Gao,Yue Yu,Qiang Ji

http://arxiv.org/abs/2312.12844v1

Compressor summary: The paper proposes a novel method for learning causal DAGs that accounts for heteroscedastic noise, which improves accuracy and efficiency over existing methods.


Comparing Machine Learning Algorithms by Union-Free Generic Depth

Hannah Blocher,Georg Schollmeyer,Malte Nalenz,Christoph Jansen

http://arxiv.org/abs/2312.12839v1

Compressor summary: The paper proposes a depth function for partial orders and uses it to compare machine learning algorithms on standard data sets, offering a novel analysis approach.


Near-Optimal Resilient Aggregation Rules for Distributed Learning Using 1-Center and 1-Mean Clustering with Outliers

Yuhao Yi,Ronghui You,Hong Liu,Changxin Liu,Yuan Wang,Jiancheng Lv

http://arxiv.org/abs/2312.12835v1

Compressor summary: The paper proposes near-optimal resilient aggregation rules for Byzantine machine learning using outlier-robust clustering, and discusses attacks and a two-phase framework to improve security.


Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Yiwei Li,Peiwen Yuan,Shaoxiong Feng,Boyuan Pan,Bin Sun,Xinglin Wang,Heda Wang,Kan Li

http://arxiv.org/abs/2312.12832v1

Compressor summary: The authors propose a model specialization framework that uses both positive and negative samples to distill reasoning ability from large language models for arithmetic reasoning tasks.


TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training

Yuqi Lin,Minghao Chen,Kaipeng Zhang,Hengjia Li,Mingming Li,Zheng Yang,Dongqin Lv,Binbin Lin,Haifeng Liu,Deng Cai

http://arxiv.org/abs/2312.12828v1

Compressor summary: The paper proposes a local-to-global framework to improve CLIP's multi-label classification performance by preserving patch-wise spatial information and applying it to weakly supervised semantic segmentation.


ReCo-Diff: Explore Retinex-Based Condition Strategy in Diffusion Model for Low-Light Image Enhancement

Yuhui Wu,Guoqing Wang,Zhiwen Wang,Yang Yang,Tianyu Li,Peng Wang,Chongyi Li,Heng Tao Shen

http://arxiv.org/abs/2312.12826v1

Compressor summary: ReCo-Diff is a novel method that uses Retinex theory as a pre-processing condition to improve low-light image enhancement by guiding a conditional diffusion model with feature- and image-level information.


Object-aware Adaptive-Positivity Learning for Audio-Visual Question Answering

Zhangbin Li,Dan Guo,Jinxing Zhou,Jing Zhang,Meng Wang

http://arxiv.org/abs/2312.12816v1

Compressor summary: The paper proposes a model that uses fine-grained visual objects and multi-modal relations to answer questions from untrimmed audible videos, improving both feature interaction and model optimization.


OCTOPUS: Open-vocabulary Content Tracking and Object Placement Using Semantic Understanding in Mixed Reality

Luke Yoffe,Aditya Sharma,Tobias Höllerer

http://arxiv.org/abs/2312.12815v1

Compressor summary: The paper presents an open-vocabulary method for placing virtual objects in augmented reality using recent advances in segmentation, vision-language, and LLMs, and shows its performance compared to human experts.


Enhancing Consistency in Multimodal Dialogue System Using LLM with Dialogue Scenario

Hiroki Onozeki,Zhiyang Qi,Kazuma Akiyama,Ryutaro Asahara,Takumasa Kaneko,Michimasa Inaba

http://arxiv.org/abs/2312.12808v1

Compressor summary: The paper presents a dialogue system for a travel agency that helps users choose sightseeing plans in Kyoto, using flexible and stable dialogue flow control and motion-speech cues.


All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models

Seunghoo Hong,Juhun Lee,Simon S. Woo

http://arxiv.org/abs/2312.12807v1

Compressor summary: The paper proposes a new approach to remove unwanted content from image generation models while maintaining their synthesis quality and user control.


MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models

Yan Cai,Linlin Wang,Ye Wang,Gerard de Melo,Ya Zhang,Yanfeng Wang,Liang He

http://arxiv.org/abs/2312.12806v1

Compressor summary: MedBench is a benchmark for Chinese medical language models that assesses their knowledge and reasoning abilities across various domains, and its findings reveal the models' strengths and weaknesses.


Multi-stages attention Breast cancer classification based on nonlinear spiking neural P neurons with autapses

Bo Yang,Hong Peng,Xiaohui Luo,Jun Wang,Xianzhong Long

http://arxiv.org/abs/2312.12804v1

Compressor summary: The paper proposes a multi-stage attention architecture using NSNP neurons with autapses for breast cancer classification, which improves accuracy and preserves valuable data features.


Bandit Sequential Posted Pricing via Half-Concavity

Sahil Singla,Yifan Wang

http://arxiv.org/abs/2312.12794v1

Compressor summary: The paper studies sequential posted pricing auctions in the bandit learning model, obtaining tight regret bounds for various buyer distributions and showing a new half-concavity property of the revenue function.
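The paper's strategy exploits half-concavity of the revenue function; as a plain-bandit point of comparison for the same setting, a generic UCB1 learner over a discretized price grid (a sketch only, not the paper's algorithm) could look like this:

```python
import math
import random

def ucb_posted_pricing(prices, buyer_values, rounds, seed=0):
    """Generic UCB1 over a discrete price grid: post one price per round,
    observe only sale / no-sale, and track revenue. A plain-bandit
    baseline, not the half-concavity-based strategy of the paper."""
    rng = random.Random(seed)
    n = [0] * len(prices)          # posts per price
    rev = [0.0] * len(prices)      # cumulative revenue per price
    total = 0.0
    for t in range(1, rounds + 1):
        if t <= len(prices):       # post each price once to initialize
            i = t - 1
        else:                      # then pick by upper confidence bound
            i = max(range(len(prices)),
                    key=lambda j: rev[j] / n[j]
                    + math.sqrt(2 * math.log(t) / n[j]))
        sale = buyer_values(rng) >= prices[i]   # fresh buyer valuation
        n[i] += 1
        rev[i] += prices[i] if sale else 0.0
        total += prices[i] if sale else 0.0
    return total / rounds

# Buyers with Uniform[0,1] values: revenue p*(1-p) is maximized at p=0.5.
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
avg = ucb_posted_pricing(grid, lambda rng: rng.random(), rounds=20000)
print(round(avg, 2))
```

With enough rounds the average per-round revenue approaches the 0.25 achievable at the best grid price, minus the cost of exploration.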


Fast Cell Library Characterization for Design Technology Co-Optimization Based on Graph Neural Networks

Tianliang Ma,Zhihui Deng,Xuguang Sun,Leilai Shao

http://arxiv.org/abs/2312.12784v1

Compressor summary: The text proposes a machine learning model using graph neural networks for fast and accurate cell library characterization in semiconductor process development, achieving high prediction accuracy and significant speed-up compared to traditional methods.


DynaLay: An Introspective Approach to Dynamic Layer Selection for Deep Networks

Mrinal Mathur,Sergey Plis

http://arxiv.org/abs/2312.12781v1

Compressor summary: DynaLay is an adaptive deep learning model that adjusts its computational effort based on input complexity, improving efficiency without sacrificing accuracy.


Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images

Carol Anderson,Phil Crone

http://arxiv.org/abs/2312.12773v1

Compressor summary: The paper proposes a novel deep learning model that beats the state of the art at segmenting marriage announcements from historical newspapers, which challenge existing text segmentation methods because the text is unstructured, the OCR output is noisy, and adjacent segments are topically similar.


AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion

Beibei Jing,Youjia Zhang,Zikai Song,Junqing Yu,Wei Yang

http://arxiv.org/abs/2312.12763v1

Compressor summary: The Anatomical Motion Diffusion (AMD) model uses a Large Language Model to parse text descriptions of complex human motions into anatomical scripts and then synthesizes realistic motion sequences from them.


Spectral Prompt Tuning: Unveiling Unseen Classes for Zero-Shot Semantic Segmentation

Wenhao Xu,Rongtao Xu,Changwei Wang,Shibiao Xu,Li Guo,Man Zhang,Xiaopeng Zhang

http://arxiv.org/abs/2312.12754v1

Compressor summary: The paper proposes SPT-SEG, a one-stage approach that uses Spectral Prompt Tuning and Spectral Guided Decoder to improve CLIP's zero-shot pixel-level segmentation performance on unseen classes.


ALMANACS: A Simulatability Benchmark for Language Model Explainability

Edmund Mills,Shiye Su,Stuart Russell,Scott Emmons

http://arxiv.org/abs/2312.12747v1

Compressor summary: ALMANACS is a benchmark to measure how well language model explainability methods improve prediction of new inputs on safety-relevant topics, and it finds no existing method outperforms the explanation-free control.


ChatFDA: Medical Records Risk Assessment

M Tran,C Sun

http://arxiv.org/abs/2312.12746v1

Compressor summary: The study presents an application that uses openFDA data to help caregivers in low-resource settings reduce medical errors and improve patient safety by analyzing prescriptions.


PointeNet: A Lightweight Framework for Effective and Efficient Point Cloud Analysis

Lipeng Gu,Xuefeng Yan,Liangliang Nan,Dingkun Zhu,Honghua Chen,Weiming Wang,Mingqiang Wei

http://arxiv.org/abs/2312.12743v1

Compressor summary: PointeNet is a lightweight network for efficient point cloud analysis that captures 3D geometries and enhances semantic perception, outperforming existing methods on object-level and scene-level tasks.


Cached Transformers: Improving Transformers with Differentiable Memory Cache

Zhaoyang Zhang,Wenqi Shao,Yixiao Ge,Xiaogang Wang,Jinwei Gu,Ping Luo

http://arxiv.org/abs/2312.12742v1

Compressor summary: The Cached Transformer is a new model that improves self-attention with a memory cache, enabling it to handle longer dependencies and perform better on various language and vision tasks.
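The core mechanism — letting queries attend over the current keys/values concatenated with cached ones — can be sketched in a few lines. This is a generic sketch only: the paper learns the cache update differentiably, whereas here the cache contents are simply given:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cached_attention(q, k, v, cache_k, cache_v):
    """Scaled dot-product attention where the current window's keys and
    values are concatenated with cached ones, so each query can attend
    to tokens beyond the window."""
    K = np.concatenate([cache_k, k], axis=0)
    V = np.concatenate([cache_v, v], axis=0)
    scores = q @ K.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
d = 8
q = rng.standard_normal((4, d))      # 4 current queries
k = rng.standard_normal((4, d))      # 4 current keys/values
v = rng.standard_normal((4, d))
ck = rng.standard_normal((16, d))    # 16 cached past tokens
cv = rng.standard_normal((16, d))
out = cached_attention(q, k, v, ck, cv)
print(out.shape)                     # → (4, 8)
```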


Locally Optimal Fixed-Budget Best Arm Identification in Two-Armed Gaussian Bandits with Unknown Variances

Masahiro Kato

http://arxiv.org/abs/2312.12741v1

Compressor summary: The paper proposes a new strategy to identify the best arm in two-armed Gaussian bandits with unknown variances, and shows that it is asymptotically optimal under a small-gap regime.
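For two Gaussian arms with unknown variances, the textbook fixed-budget baseline is a Neyman-style allocation: sample each arm in proportion to its estimated standard deviation, then report the higher empirical mean. The following is a sketch of that baseline under hypothetical arm distributions, not the paper's strategy or its optimality analysis:

```python
import random
import statistics

def neyman_best_arm(arm0, arm1, budget, warmup=20, seed=0):
    """Fixed-budget best-arm identification sketch for two Gaussian
    arms: after a warm-up, draw the arm whose share of pulls lags its
    estimated standard-deviation share (Neyman allocation), then pick
    the arm with the higher empirical mean."""
    rng = random.Random(seed)
    obs = [[arm0(rng) for _ in range(warmup)],
           [arm1(rng) for _ in range(warmup)]]
    for _ in range(budget - 2 * warmup):
        s0 = statistics.stdev(obs[0])
        s1 = statistics.stdev(obs[1])
        target0 = s0 / (s0 + s1)        # ideal share of pulls for arm 0
        share0 = len(obs[0]) / (len(obs[0]) + len(obs[1]))
        i = 0 if share0 < target0 else 1
        obs[i].append((arm0 if i == 0 else arm1)(rng))
    means = [statistics.fmean(o) for o in obs]
    return max(range(2), key=lambda i: means[i])

# Arm 1 has the higher mean (0.5 vs 0.0) but four times the noise, so
# the allocation gives it the larger share of the budget.
arm0 = lambda rng: rng.gauss(0.0, 0.5)
arm1 = lambda rng: rng.gauss(0.5, 2.0)
print(neyman_best_arm(arm0, arm1, budget=4000))   # → 1
```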


Fine-tuning Large Language Models for Adaptive Machine Translation

Yasmin Moslem,Rejwanul Haque,Andy Way

http://arxiv.org/abs/2312.12740v1

Compressor summary: The paper shows how fine-tuning the Mistral 7B language model improves its adaptive machine translation capabilities, outperforming or matching other models in zero-shot and one-shot translation scenarios within the medical domain.


FSscore: A Machine Learning-based Synthetic Feasibility Score Leveraging Human Expertise

Rebecca M. Neeser,Bruno Correia,Philippe Schwaller

http://arxiv.org/abs/2312.12737v1

Compressor summary: The Focused Synthesizability score (FSscore) is a scoring approach that learns to rank molecules based on binary preferences using a graph attention network and human expert feedback, improving synthetic feasibility assessment in chemistry and drug discovery.


Learning and Forgetting Unsafe Examples in Large Language Models

Jiachen Zhao,Zhun Deng,David Madras,James Zou,Mengye Ren

http://arxiv.org/abs/2312.12736v1

Compressor summary: The authors propose a forgetting-based filtering algorithm called ForgetFilter to ensure safe large language models finetuning by removing unsafe data based on how easily the model forgets it.


MetaSegNet: Metadata-collaborative Vision-Language Representation Learning for Semantic Segmentation of Remote Sensing Images

Libo Wang,Sijun Dong,Ying Chen,Xiaoliang Meng,Shenghui Fang

http://arxiv.org/abs/2312.12735v1

Compressor summary: The authors propose MetaSegNet, a metadata-collaborative multimodal segmentation network that uses vision-language representation learning for semantic segmentation of remote sensing images, improving generalization and accuracy compared to existing methods.


Robustly Improving Bandit Algorithms with Confounded and Selection Biased Offline Data: A Causal Approach

Wen Huang,Xintao Wu

http://arxiv.org/abs/2312.12731v1

Compressor summary: The paper proposes a causal approach to deal with biases in bandit problems using data from offline observations, leading to better decision policies and reduced regret.


A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models

Julio Silva-Rodriguez,Sina Hajimiri,Ismail Ben Ayed,Jose Dolz

http://arxiv.org/abs/2312.12730v1

Compressor summary: CLAP is a new approach for efficient transfer learning that adapts to different classes and tasks without relying on large labeled samples or case-specific hyperparameter tuning.


Segment Anything Model Meets Image Harmonization

Haoxing Chen,Yaohui Li,Zhangxuan Gu,Zhuoer Xu,Jun Lan,Huaxiong Li

http://arxiv.org/abs/2312.12729v1

Compressor summary: SRIN is a new technique that uses semantic segmentation maps to improve image harmonization by matching foreground and background features.


Reducing Shape-Radiance Ambiguity in Radiance Fields with a Closed-Form Color Estimation Method

Qihang Fang,Yafei Song,Keqiang Li,Liefeng Bo

http://arxiv.org/abs/2312.12726v1

Compressor summary: The paper proposes a more adaptive method to reduce shape-radiance ambiguity in neural radiance fields by estimating the color field based on the density field and posed images, and then applying it to regularize NeRF's density field.


Multi-Clue Reasoning with Memory Augmentation for Knowledge-based Visual Question Answering

Chengxiang Yin,Zhengping Che,Kun Wu,Zhiyuan Xu,Jian Tang

http://arxiv.org/abs/2312.12723v1

Compressor summary: The paper proposes a novel framework for Knowledge-based Visual Question Answering that uses Multiple Clues for Reasoning with Memory Neural Networks to exploit external knowledge and answer more general questions.


Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning

Jiang-Tian Zhai,Xialei Liu,Lu Yu,Ming-Ming Cheng

http://arxiv.org/abs/2312.12722v1

Compressor summary: Our method learns new tasks without accessing past data by selectively distilling patches for plasticity and stability, and restoring old task knowledge with realistic prototypes.


Cross-Modal Reasoning with Event Correlation for Video Question Answering

Chengxiang Yin,Zhengping Che,Kun Wu,Zhiyuan Xu,Qinru Qiu,Jian Tang

http://arxiv.org/abs/2312.12721v1

Compressor summary: The paper introduces EC-GNNs, a model that uses dense captioning to distill event-correlated information for cross-modal reasoning in VideoQA tasks.


AdvST: Revisiting Data Augmentations for Single Domain Generalization

Guangtao Zheng,Mengdi Huai,Aidong Zhang

http://arxiv.org/abs/2312.12720v1

Compressor summary: The paper proposes AdvST, a simple but effective method for single domain generalization that uses standard data augmentations with learnable parameters to manipulate sample semantics and learns a robust model with the augmented data.


BloomVQA: Assessing Hierarchical Multi-modal Comprehension

Yunye Gong,Robik Shrestha,Jared Claypoole,Michael Cogswell,Arijit Ray,Christopher Kanan,Ajay Divakaran

http://arxiv.org/abs/2312.12716v1

Compressor summary: The proposed BloomVQA dataset evaluates vision-language models on different levels of comprehension using picture stories based on Bloom's Taxonomy, revealing weaknesses and inconsistencies in current models.


Response Enhanced Semi-Supervised Dialogue Query Generation

Jianheng Huang,Ante Wang,Linfeng Gao,Linfeng Song,Jinsong Su

http://arxiv.org/abs/2312.12713v1

Compressor summary: The paper proposes a semi-supervised learning framework, SemiDQG, to improve dialogue query generation by using response-augmented queries and pseudo instances for training.


DGCLUSTER: A Neural Framework for Attributed Graph Clustering via Modularity Maximization

Aritra Bhowmick,Mert Kosan,Zexi Huang,Ambuj Singh,Sourav Medya

http://arxiv.org/abs/2312.12697v1

Compressor summary: DGCluster is a new method for graph clustering that uses graph neural networks, automatically determines the number of clusters, and performs well across various metrics and datasets.
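The objective DGCluster maximizes is standard Newman modularity, which rewards partitions with more intra-community edges than a degree-preserving random graph would produce. A minimal pure-Python evaluation of that objective (not the paper's GNN optimizer) looks like this:

```python
def modularity(edges, communities):
    """Newman modularity Q of a node partition of an undirected graph.

    edges: list of (u, v) pairs; communities: dict node -> community id.
    Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j)
      = (intra-edge fraction) - sum_c (community degree / 2m)^2
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Observed fraction of edges falling inside a community...
    q = sum(1.0 / m for u, v in edges if communities[u] == communities[v])
    # ...minus the fraction expected under random degree-preserving wiring.
    comm_degree = {}
    for node, d in degree.items():
        c = communities[node]
        comm_degree[c] = comm_degree.get(c, 0) + d
    for d in comm_degree.values():
        q -= (d / (2.0 * m)) ** 2
    return q

# Two triangles joined by one bridge edge: the natural split scores high.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
parts = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(round(modularity(edges, parts), 3))   # → 0.357
```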


How Good Are Deep Generative Models for Solving Inverse Problems?

Shichong Peng,Alireza Moazeni,Ke Li

http://arxiv.org/abs/2312.12691v1

Compressor summary: This study compares different deep generative models on three inverse problems and finds that CHIMLE produces the best valid solutions and reliable uncertainty estimates.


Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?

Tannon Kew,Florian Schottmann,Rico Sennrich

http://arxiv.org/abs/2312.12683v1

Compressor summary: The text discusses the need for cross-lingual transfer in large language models and shows that multilingual instruction tuning with only three languages can improve performance on generative tasks but is less important for structured tasks.


Mini-GPTs: Efficient Large Language Models through Contextual Pruning

Tim Valicenti,Justice Vidal,Ritik Patnaik

http://arxiv.org/abs/2312.12682v1

Compressor summary: The paper introduces Mini-GPTs, smaller and efficient language models created by contextual pruning of traditional LLMs, and demonstrates their effectiveness on diverse domains.
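For contrast with the paper's contextual pruning (which scores weights by their activity on domain-specific data), the simplest pruning baseline just zeroes the smallest-magnitude weights. A sketch of that baseline, not the paper's method:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of a weight matrix.
    A generic pruning baseline; contextual pruning would instead rank
    weights by their contribution on in-domain inputs."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest |w|
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
P = magnitude_prune(W, sparsity=0.75)
print((P == 0).mean())   # fraction of zeroed weights → 0.75
```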


Imitation of Life: A Search Engine for Biologically Inspired Design

Hen Emuna,Nadav Borenstein,Xin Qian,Hyeonsu Kang,Joel Chan,Aniket Kittur,Dafna Shahaf

http://arxiv.org/abs/2312.12681v1

Compressor summary: BARcode is a search engine for finding biological solutions to engineering problems by mining inspirations from the web at scale, overcoming the limitations of existing hand-curated datasets.


Trajectory Approximation of Video Based on Phase Correlation for Forward Facing Camera

Abdulkadhem A. Abdulkadhem

http://arxiv.org/abs/2312.12680v1

Compressor summary: The paper presents an innovative method to extract camera trajectories from video footage without GPS in noisy environments, using phase correlation and dynamic chain code techniques.
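Phase correlation itself is a classical FFT technique: the normalized cross-power spectrum of two frames has an inverse transform that peaks at their relative shift. A minimal sketch of that building block (not the paper's full trajectory pipeline):

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the (dy, dx) circular shift taking frame b to frame a
    from the peak of the normalized cross-power spectrum."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    cross /= np.abs(cross) + 1e-12          # keep phase, drop magnitude
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, corr.shape))

rng = np.random.default_rng(0)
frame = rng.random((64, 64))
moved = np.roll(frame, shift=(3, 5), axis=(0, 1))
print(phase_correlation(moved, frame))      # → (3, 5)
```

Repeating this between consecutive frames yields per-frame displacement vectors that can be chained into a trajectory estimate.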


Towards Efficient Verification of Quantized Neural Networks

Pei Huang,Haoze Wu,Yuting Yang,Ieva Daukantas,Min Wu,Yedi Zhang,Clark Barrett

http://arxiv.org/abs/2312.12679v1

Compressor summary: The authors propose a framework for verifying properties of quantized neural networks using integer linear programming, heuristic search methods, and bound-propagation techniques, which improves scalability and efficiency compared to existing approaches.
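Among the techniques the summary mentions is bound propagation; in its simplest interval-arithmetic form, each weight's sign decides which input bound feeds which output bound. A sketch of that idea for one linear-plus-ReLU layer with integer-quantized weights (illustrative only, not the paper's ILP encoding):

```python
def interval_linear_relu(W, b, lo, hi):
    """Propagate element-wise input intervals [lo, hi] through
    y = relu(W @ x + b) using interval arithmetic."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        acc_lo = acc_hi = float(bias)
        for w, l, h in zip(row, lo, hi):
            if w >= 0:               # positive weight: low in -> low out
                acc_lo += w * l
                acc_hi += w * h
            else:                    # negative weight: bounds swap
                acc_lo += w * h
                acc_hi += w * l
        out_lo.append(max(0.0, acc_lo))   # relu clips both bounds at 0
        out_hi.append(max(0.0, acc_hi))
    return out_lo, out_hi

# Integer-quantized weights, each input known to lie in [0, 1].
W = [[2, -1], [-3, 4]]
b = [0, 1]
lo, hi = interval_linear_relu(W, b, [0.0, 0.0], [1.0, 1.0])
print(lo, hi)   # → [0.0, 0.0] [2.0, 5.0]
```

A verifier would chain such bounds through every layer and then check the property (e.g. a robustness margin) against the final output intervals, falling back to exact search only where the bounds are inconclusive.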


Combinatorial Gaussian Process Bandits in Bayesian Settings: Theory and Application for Energy-Efficient Navigation

Jack Sandberg,Niklas Åkerblom,Morteza Haghir Chehreghani

http://arxiv.org/abs/2312.12676v1

Compressor summary: The paper studies a combinatorial bandit problem with time-varying arm availability and provides novel regret bounds for three GP-based algorithms; it also applies these methods to an energy-efficient navigation problem on real roads.