Agent Research Papers

Updated on 2025.10.30

Current Search Keywords: Agent, Multi-Agent, Tool Learning, Agent RL, Autonomous Agent, LLM Agent

If you have any other keywords, please feel free to let us know :)

Web Page (Scrape Code)

Table of Contents
  1. <a href=#agent>Agent</a>
  2. <a href=#large-language-models>Large Language Models</a>
  3. <a href=#reinforcement-learning>Reinforcement Learning</a>

Agent

(<a href=#updated-on-20251030>back to top</a>)

Large Language Models

  • 2025-10-22, olmOCR 2: Unit Test Rewards for Document OCR, Jake Poznanski et.al., Paper: http://arxiv.org/abs/2510.19817
  • 2025-10-23, Zhyper: Factorized Hypernetworks for Conditioned LLM Fine-Tuning, M. H. I. Abdalla et.al., Paper: http://arxiv.org/abs/2510.19733
  • 2025-10-21, Zero-Shot Vehicle Model Recognition via Text-Based Retrieval-Augmented Generation, Wei-Chia Chang et.al., Paper: http://arxiv.org/abs/2510.18502
  • 2025-10-28, Zero-Shot Cross-Lingual Transfer using Prefix-Based Adaptation, Snegha A et.al., Paper: http://arxiv.org/abs/2510.24619
  • 2025-10-24, Wisdom and Delusion of LLM Ensembles for Code Generation and Repair, Fernando Vallecillos Ruiz et.al., Paper: http://arxiv.org/abs/2510.21513
  • 2025-10-21, Why Policy Gradient Algorithms Work for Undiscounted Total-Reward MDPs, Jongmin Lee et.al., Paper: http://arxiv.org/abs/2510.18340
  • 2025-10-28, WebLeaper: Empowering Efficiency and Efficacy in WebAgent via Enabling Info-Rich Seeking, Zhengwei Tao et.al., Paper: http://arxiv.org/abs/2510.24697
  • 2025-10-23, Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers, Dean L Slack et.al., Paper: http://arxiv.org/abs/2510.20807
  • 2025-10-21, Verifiable Accuracy and Abstention Rewards in Curriculum RL to Alleviate Lost-in-Conversation, Ming Li et.al., Paper: http://arxiv.org/abs/2510.18731
  • 2025-10-27, VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation, Walid Bousselham et.al., Paper: http://arxiv.org/abs/2510.23497
  • 2025-10-21, VAR: Visual Attention Reasoning via Structured Search and Backtracking, Wei Cai et.al., Paper: http://arxiv.org/abs/2510.18619
  • 2025-10-23, User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios, Xiaoyuan Wu et.al., Paper: http://arxiv.org/abs/2510.20721
  • 2025-10-21, Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework, Yujie Xing et.al., Paper: http://arxiv.org/abs/2510.18825
  • 2025-10-21, UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation, Yibin Wang et.al., Paper: http://arxiv.org/abs/2510.18701
  • 2025-10-21, UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding, Da Zhang et.al., Paper: http://arxiv.org/abs/2510.18262
  • 2025-10-21, Training Diverse Graph Experts for Ensembles: A Systematic Empirical Study, Gangda Deng et.al., Paper: http://arxiv.org/abs/2510.18370
  • 2025-10-21, Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning, Chenghao Zhu et.al., Paper: http://arxiv.org/abs/2510.18849
  • 2025-10-21, Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting, Taha Binhuraib et.al., Paper: http://arxiv.org/abs/2510.18745
  • 2025-10-22, Top-P Masking for Cross Language Information Retrieval, Joseph Casale et.al., Paper: http://arxiv.org/abs/2510.19758
  • 2025-10-22, ToolDreamer: Instilling LLM Reasoning Into Tool Retrievers, Saptarshi Sengupta et.al., Paper: http://arxiv.org/abs/2510.19791
  • 2025-10-28, Tongyi DeepResearch Technical Report, Tongyi DeepResearch Team et.al., Paper: http://arxiv.org/abs/2510.24701
  • 2025-10-21, Tokencake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications, Zhuohang Bian et.al., Paper: http://arxiv.org/abs/2510.18586
  • 2025-10-21, Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views, Zhangquan Chen et.al., Paper: http://arxiv.org/abs/2510.18632
  • 2025-10-27, Think Twice: Branch-and-Rethink Reasoning Reward Model, Yizhu Jiao et.al., Paper: http://arxiv.org/abs/2510.23596
  • 2025-10-24, The Universal Landscape of Human Reasoning, Qiguang Chen et.al., Paper: http://arxiv.org/abs/2510.21623
  • 2025-10-21, The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability, Zijie Xu et.al., Paper: http://arxiv.org/abs/2510.18563
  • 2025-10-22, The Tail Tells All: Estimating Model-Level Membership Inference Vulnerability Without Reference Models, Euodia Dodd et.al., Paper: http://arxiv.org/abs/2510.19773
  • 2025-10-21, The Impact of Image Resolution on Biomedical Multimodal Large Language Models, Liangyu Chen et.al., Paper: http://arxiv.org/abs/2510.18304
  • 2025-10-22, The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico, Sandra Malagon et.al., Paper: http://arxiv.org/abs/2510.19801
  • 2025-10-21, The Attribution Story of WhisperGate: An Academic Perspective, Oleksandr Adamov et.al., Paper: http://arxiv.org/abs/2510.18484
  • 2025-10-22, The Art of Asking: Multilingual Prompt Optimization for Synthetic Data, David Mora et.al., Paper: http://arxiv.org/abs/2510.19806
  • 2025-10-21, Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs, Yanhong Li et.al., Paper: http://arxiv.org/abs/2510.18279
  • 2025-10-23, Structure-Conditional Minimum Bayes Risk Decoding, Bryan Eikema et.al., Paper: http://arxiv.org/abs/2510.20700
  • 2025-10-21, Streamlining Acceptance Test Generation for Mobile Applications Through Large Language Models: An Industrial Case Study, Pedro Luís Fonseca et.al., Paper: http://arxiv.org/abs/2510.18861
  • 2025-10-21, StreamingTOM: Streaming Token Compression for Efficient Video Understanding, Xueyi Chen et.al., Paper: http://arxiv.org/abs/2510.18269
  • 2025-10-21, StarBench: A Turn-Based RPG Benchmark for Agentic Multimodal Decision-Making and Information Seeking, Haoran Zhang et.al., Paper: http://arxiv.org/abs/2510.18483
  • 2025-10-21, Socialized Learning and Emergent Behaviors in Multi-Agent Systems based on Multimodal Large Language Models, Sureyya Akin et.al., Paper: http://arxiv.org/abs/2510.18515
  • 2025-10-22, SmartSwitch: Advancing LLM Reasoning by Overcoming Underthinking via Promoting Deeper Thought Exploration, Xichen Zhang et.al., Paper: http://arxiv.org/abs/2510.19767
  • 2025-10-23, Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation, Yuhan Liu et.al., Paper: http://arxiv.org/abs/2510.20812
  • 2025-10-21, Simple and Efficient Heterogeneous Temporal Graph Neural Network, Yili Wang et.al., Paper: http://arxiv.org/abs/2510.18467
  • 2025-10-23, Simple Context Compression: Mean-Pooling and Multi-Ratio Training, Yair Feldman et.al., Paper: http://arxiv.org/abs/2510.20797
  • 2025-10-21, ShaRE your Data! Characterizing Datasets for LLM-based Requirements Engineering, Quim Motger et.al., Paper: http://arxiv.org/abs/2510.18787
  • 2025-10-21, SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish, Josh McGiff et.al., Paper: http://arxiv.org/abs/2510.18725
  • 2025-10-22, Semantic World Models, Jacob Berg et.al., Paper: http://arxiv.org/abs/2510.19818
  • 2025-10-21, SegTune: Structured and Fine-Grained Control for Song Generation, Pengfei Cai et.al., Paper: http://arxiv.org/abs/2510.18416
  • 2025-10-21, Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation, Patterson Hsieh et.al., Paper: http://arxiv.org/abs/2510.18751
  • 2025-10-21, See the Text: From Tokenization to Visual Reading, Ling Xing et.al., Paper: http://arxiv.org/abs/2510.18840
  • 2025-10-22, Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning, Xichen Zhang et.al., Paper: http://arxiv.org/abs/2510.19807
  • 2025-10-28, STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence, Zihan Liu et.al., Paper: http://arxiv.org/abs/2510.24693
  • 2025-10-21, SSD: Spatial-Semantic Head Decoupling for Efficient Autoregressive Image Generation, Siyong Jian et.al., Paper: http://arxiv.org/abs/2510.18716
  • 2025-10-21, SLICE: SLO-Driven Scheduling for LLM Inference on Edge Computing Devices, Pan Zhou et.al., Paper: http://arxiv.org/abs/2510.18544
  • 2025-10-24, SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots, Adetayo Adebimpe et.al., Paper: http://arxiv.org/abs/2510.21459
  • 2025-10-28, Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance, Yujie Wei et.al., Paper: http://arxiv.org/abs/2510.24711
  • 2025-10-27, RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation, Yash Jangir et.al., Paper: http://arxiv.org/abs/2510.23571
  • 2025-10-24, Risk Management for Mitigating Benchmark Failure Modes: BenchRisk, Sean McGregor et.al., Paper: http://arxiv.org/abs/2510.21460
  • 2025-10-22, Review of Tools for Zero-Code LLM Based Application Development, Priyaranjan Pattnayak et.al., Paper: http://arxiv.org/abs/2510.19747
  • 2025-10-21, Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting, Howard Chen et.al., Paper: http://arxiv.org/abs/2510.18874
  • 2025-10-28, ReplicationBench: Can AI Agents Replicate Astrophysics Research Papers?, Christine Ye et.al., Paper: http://arxiv.org/abs/2510.24591
  • 2025-10-28, Relative Scaling Laws for LLMs, William Held et.al., Paper: http://arxiv.org/abs/2510.24626
  • 2025-10-21, Reasoning Language Model Inference Serving Unveiled: An Empirical Study, Qi Li et.al., Paper: http://arxiv.org/abs/2510.18672
  • 2025-10-28, ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization, Guoxin Chen et.al., Paper: http://arxiv.org/abs/2510.24592
  • 2025-10-27, ReCode: Unify Plan and Action for Universal Granularity Control, Zhaoyang Yu et.al., Paper: http://arxiv.org/abs/2510.23564
  • 2025-10-22, RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models, Yang Yang et.al., Paper: http://arxiv.org/abs/2510.19698
  • 2025-10-24, RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models, Xueyuan Lin et.al., Paper: http://arxiv.org/abs/2510.21604
  • 2025-10-24, REMONI: An Autonomous System Integrating Wearables and Multimodal Large Language Models for Enhanced Remote Health Monitoring, Thanh Cong Ho et.al., Paper: http://arxiv.org/abs/2510.21445
  • 2025-10-23, RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines, Austin Jia et.al., Paper: http://arxiv.org/abs/2510.20768
  • 2025-10-21, Prompting the Priorities: A First Look at Evaluating LLMs for Vulnerability Triage and Prioritization, Osama Al Haddad et.al., Paper: http://arxiv.org/abs/2510.18508
  • 2025-10-21, Probabilistic Modeling of Intentions in Socially Intelligent LLM Agents, Feifan Xia et.al., Paper: http://arxiv.org/abs/2510.18476
  • 2025-10-21, Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models, Lehan Wang et.al., Paper: http://arxiv.org/abs/2510.18303
  • 2025-10-21, Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options, Joongkyu Lee et.al., Paper: http://arxiv.org/abs/2510.18713
  • 2025-10-21, Position: LLM Watermarking Should Align Stakeholders’ Incentives for Practical Adoption, Yepeng Liu et.al., Paper: http://arxiv.org/abs/2510.18333
  • 2025-10-27, Point Convergence of Nesterov’s Accelerated Gradient Method: An AI-Assisted Proof, Uijeong Jang et.al., Paper: http://arxiv.org/abs/2510.23513
  • 2025-10-21, PlanU: Large Language Model Decision Making through Planning under Uncertainty, Ziwei Deng et.al., Paper: http://arxiv.org/abs/2510.18442
  • 2025-10-23, Plan Then Retrieve: Reinforcement Learning-Guided Complex Reasoning over Knowledge Graphs, Yanlin Song et.al., Paper: http://arxiv.org/abs/2510.20691
  • 2025-10-27, PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity, Yuqian Yuan et.al., Paper: http://arxiv.org/abs/2510.23603
  • 2025-10-21, ParaStyleTTS: Toward Efficient and Robust Paralinguistic Style Control for Expressive Text-to-Speech Generation, Haowei Lou et.al., Paper: http://arxiv.org/abs/2510.18308
  • 2025-10-24, ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models, Federico Danieli et.al., Paper: http://arxiv.org/abs/2510.21450
  • 2025-10-28, PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection, Yusu Qian et.al., Paper: http://arxiv.org/abs/2510.23594
  • 2025-10-28, Optimizing Retrieval for RAG via Reinforced Contrastive Learning, Jiawei Zhou et.al., Paper: http://arxiv.org/abs/2510.24652
  • 2025-10-29, OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning, Ziyou Hu et.al., Paper: http://arxiv.org/abs/2510.24636
  • 2025-10-28, Open Korean Historical Corpus: A Millennia-Scale Diachronic Collection of Public Domain Texts, Seyoung Song et.al., Paper: http://arxiv.org/abs/2510.24541
  • 2025-10-21, One Size Fits All? A Modular Adaptive Sanitization Kit (MASK) for Customizable Privacy-Preserving Phone Scam Detection, Kangzhong Wang et.al., Paper: http://arxiv.org/abs/2510.18493
  • 2025-10-23, On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?, Mingmeng Geng et.al., Paper: http://arxiv.org/abs/2510.20810
  • 2025-10-22, On Controlled Change: Generative AI’s Impact on Professional Authority in Journalism, Tomás Dodds et.al., Paper: http://arxiv.org/abs/2510.19792
  • 2025-10-21, Noise-Conditioned Mixture-of-Experts Framework for Robust Speaker Verification, Bin Gu et.al., Paper: http://arxiv.org/abs/2510.18533
  • 2025-10-23, Neural Diversity Regularizes Hallucinations in Small Models, Kushal Chakrabarti et.al., Paper: http://arxiv.org/abs/2510.20690
  • 2025-10-28, Multi-Agent Evolve: LLM Self-Improve through Co-evolution, Yixing Chen et.al., Paper: http://arxiv.org/abs/2510.23595
  • 2025-10-24, MoniTor: Exploiting Large Language Models with Instruction for Online Video Anomaly Detection, Shengtian Yang et.al., Paper: http://arxiv.org/abs/2510.21449
  • 2025-10-24, Modest-Align: Data-Efficient Alignment for Vision-Language Models, Jiaxiang Liu et.al., Paper: http://arxiv.org/abs/2510.21606
  • 2025-10-23, Mixing Importance with Diversity: Joint Optimization for KV Cache Compression in Large Vision-Language Models, Xuyang Liu et.al., Paper: http://arxiv.org/abs/2510.20707
  • 2025-10-27, Minimizing Human Intervention in Online Classification, William Réveillard et.al., Paper: http://arxiv.org/abs/2510.23557
  • 2025-10-21, Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents, Guangfu Guo et.al., Paper: http://arxiv.org/abs/2510.18424
  • 2025-10-21, MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training, Wenxuan Li et.al., Paper: http://arxiv.org/abs/2510.18830
  • 2025-10-24, MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization, Chenglong Wang et.al., Paper: http://arxiv.org/abs/2510.21473
  • 2025-10-21, MLMA: Towards Multilingual with Mamba Based Architectures, Mohamed Nabih Ali et.al., Paper: http://arxiv.org/abs/2510.18684
  • 2025-10-21, MENTOR: A Reinforcement Learning Framework for Model Enhancement via Teacher-Optimized Rewards in Small Models, ChangSu Choi et.al., Paper: http://arxiv.org/abs/2510.18383
  • 2025-10-27, Lightweight Robust Direct Preference Optimization, Cheol Woo Kim et.al., Paper: http://arxiv.org/abs/2510.23590
  • 2025-10-21, LightMem: Lightweight and Efficient Memory-Augmented Generation, Jizhan Fang et.al., Paper: http://arxiv.org/abs/2510.18866
  • 2025-10-23, Learning to Triage Taint Flows Reported by Dynamic Program Analysis in Node.js Packages, Ronghao Ni et.al., Paper: http://arxiv.org/abs/2510.20739
  • 2025-10-27, Learning to Reason Efficiently with Discounted Reinforcement Learning, Alex Ayoub et.al., Paper: http://arxiv.org/abs/2510.23486
  • 2025-10-27, Learning the PTM Code through a Coarse-to-Fine, Mechanism-Aware Framework, Jingjie Zhang et.al., Paper: http://arxiv.org/abs/2510.23492
  • 2025-10-21, Large language models for folktale type automation based on motifs: Cinderella case study, Tjaša Arčon et.al., Paper: http://arxiv.org/abs/2510.18561
  • 2025-10-21, Large Language Models in Thematic Analysis: Prompt Engineering, Evaluation, and Guidelines for Qualitative Software Engineering Research, Cristina Martinez Montes et.al., Paper: http://arxiv.org/abs/2510.18456
  • 2025-10-21, LLMs as Sparse Retrievers: A Framework for First-Stage Product Search, Hongru Song et.al., Paper: http://arxiv.org/abs/2510.18527
  • 2025-10-21, LAFA: Agentic LLM-Driven Federated Analytics over Decentralized Data Sources, Haichao Ji et.al., Paper: http://arxiv.org/abs/2510.18477
  • 2025-10-21, KrishokBondhu: A Retrieval-Augmented Voice-Based Agricultural Advisory Call Center for Bengali Farmers, Mohd Ruhul Ameen et.al., Paper: http://arxiv.org/abs/2510.18355
  • 2025-10-21, KoSimpleQA: A Korean Factuality Benchmark with an Analysis of Reasoning LLMs, Donghyeon Ko et.al., Paper: http://arxiv.org/abs/2510.18368
  • 2025-10-23, KL-Regularized Reinforcement Learning is Designed to Mode Collapse, Anthony GX-Chen et.al., Paper: http://arxiv.org/abs/2510.20817
  • 2025-10-21, KAT-Coder Technical Report, Zizheng Zhan et.al., Paper: http://arxiv.org/abs/2510.18779
  • 2025-10-21, JAUNT: Joint Alignment of User Intent and Network State for QoE-centric LLM Tool Routing, Enhan Li et.al., Paper: http://arxiv.org/abs/2510.18550
  • 2025-10-22, Integrating Transparent Models, LLMs, and Practitioner-in-the-Loop: A Case of Nonprofit Program Evaluation, Ji Ma et.al., Paper: http://arxiv.org/abs/2510.19799
  • 2025-10-21, Integrating Large Language Models and Evaluating Student Outcomes in an Introductory Computer Science Course, Annapurna Vadaparty et.al., Paper: http://arxiv.org/abs/2510.18806
  • 2025-10-21, InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration, Yunkun Wang et.al., Paper: http://arxiv.org/abs/2510.18327
  • 2025-10-21, ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization, Yuanhe Guo et.al., Paper: http://arxiv.org/abs/2510.18433
  • 2025-10-21, Illusions of reflection: open-ended task reveals systematic failures in Large Language Models’ reflective reasoning, Sion Weatherhead et.al., Paper: http://arxiv.org/abs/2510.18254
  • 2025-10-21, Identity-Aware Large Language Models require Cultural Reasoning, Alistair Plum et.al., Paper: http://arxiv.org/abs/2510.18510
  • 2025-10-27, ISA-Bench: Benchmarking Instruction Sensitivity for Large Audio Language Models, Bohan Li et.al., Paper: http://arxiv.org/abs/2510.23558
  • 2025-10-27, IPQA: A Benchmark for Core Intent Identification in Personalized Question Answering, Jieyong Kim et.al., Paper: http://arxiv.org/abs/2510.23536
  • 2025-10-21, IMB: An Italian Medical Benchmark for Question Answering, Antonio Romano et.al., Paper: http://arxiv.org/abs/2510.18468
  • 2025-10-21, IF-VidCap: Can Video Caption Models Follow Instructions?, Shihao Li et.al., Paper: http://arxiv.org/abs/2510.18726
  • 2025-10-24, Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine, Wenyi Wang et.al., Paper: http://arxiv.org/abs/2510.21614
  • 2025-10-22, Hubble: a Model Suite to Advance the Study of LLM Memorization, Johnny Tian-Zheng Wei et.al., Paper: http://arxiv.org/abs/2510.19811
  • 2025-10-21, How Efficient Are Diffusion Language Models? A Critical Examination of Efficiency Evaluation Practices, Han Peng et.al., Paper: http://arxiv.org/abs/2510.18480
  • 2025-10-21, How Do LLMs Use Their Depth?, Akshat Gupta et.al., Paper: http://arxiv.org/abs/2510.18871
  • 2025-10-24, Head Pursuit: Probing Attention Specialization in Multimodal Transformers, Lorenzo Basile et.al., Paper: http://arxiv.org/abs/2510.21518
  • 2025-10-21, HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models, Sidhant Narula et.al., Paper: http://arxiv.org/abs/2510.18728
  • 2025-10-21, Hardness of Learning Regular Languages in the Next Symbol Prediction Setting, Satwik Bhattamishra et.al., Paper: http://arxiv.org/abs/2510.18634
  • 2025-10-21, Grounding or Guessing? Visual Signals for Detecting Hallucinations in Sign Language Translation, Yasser Hamidullah et.al., Paper: http://arxiv.org/abs/2510.18439
  • 2025-10-28, Greedy Sampling Is Provably Efficient for RLHF, Di Wu et.al., Paper: http://arxiv.org/abs/2510.24700
  • 2025-10-22, Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs, Haochen Wang et.al., Paper: http://arxiv.org/abs/2510.18876
  • 2025-10-21, Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming, Zheng Zhang et.al., Paper: http://arxiv.org/abs/2510.18314
  • 2025-10-23, Generative Reasoning Recommendation via LLMs, Minjie Hong et.al., Paper: http://arxiv.org/abs/2510.20815
  • 2025-10-28, Generative AI for Healthcare: Fundamentals, Challenges, and Perspectives, Gang Chen et.al., Paper: http://arxiv.org/abs/2510.24551
  • 2025-10-21, GPTFace: Generative Pre-training of Facial-Linguistic Transformer by Span Masking and Weakly Correlated Text-image Data, Yudong Li et.al., Paper: http://arxiv.org/abs/2510.18345
  • 2025-10-28, FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling, Zengzhuang Xu et.al., Paper: http://arxiv.org/abs/2510.24645
  • 2025-10-21, From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering, Lei Li et.al., Paper: http://arxiv.org/abs/2510.18297
  • 2025-10-21, From Quarter to All: Accelerating Speculative LLM Decoding via Floating-Point Exponent Remapping and Parameter Sharing, Yushu Zhao et.al., Paper: http://arxiv.org/abs/2510.18525
  • 2025-10-24, From Polyester Girlfriends to Blind Mice: Creating the First Pragmatics Understanding Benchmarks for Slovene, Mojca Brglez et.al., Paper: http://arxiv.org/abs/2510.21575
  • 2025-10-22, Forbidden Sidon subsets of perfect difference sets, featuring a human-assisted proof, Boris Alexeev et.al., Paper: http://arxiv.org/abs/2510.19804
  • 2025-10-21, Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring, Shuxin Lin et.al., Paper: http://arxiv.org/abs/2510.18817
  • 2025-10-21, FedDEAP: Adaptive Dual-Prompt Tuning for Multi-Domain Federated Learning, Yubin Zheng et.al., Paper: http://arxiv.org/abs/2510.18837
  • 2025-10-21, FeClustRE: Hierarchical Clustering and Semantic Tagging of App Features from User Reviews, Max Tiessler et.al., Paper: http://arxiv.org/abs/2510.18799
  • 2025-10-23, Fast Inference via Hierarchical Speculative Decoding, Clara Mohri et.al., Paper: http://arxiv.org/abs/2510.19705
  • 2025-10-27, FARMER: Flow AutoRegressive Transformer over Pixels, Guangting Zheng et.al., Paper: http://arxiv.org/abs/2510.23588
  • 2025-10-21, Exploring a Unified Vision-Centric Contrastive Alternatives on Multi-Modal Web Documents, Yiqi Lin et.al., Paper: http://arxiv.org/abs/2510.18703
  • 2025-10-21, Exploring Membership Inference Vulnerabilities in Clinical Large Language Models, Alexander Nemecek et.al., Paper: http://arxiv.org/abs/2510.18674
  • 2025-10-23, Exploring Large Language Models for Access Control Policy Synthesis and Summarization, Adarsh Vatsa et.al., Paper: http://arxiv.org/abs/2510.20692
  • 2025-10-28, Evolving Diagnostic Agents in a Virtual Clinical Environment, Pengcheng Qiu et.al., Paper: http://arxiv.org/abs/2510.24654
  • 2025-10-21, Evaluating Large Language Models in detecting Secrets in Android Apps, Marco Alecci et.al., Paper: http://arxiv.org/abs/2510.18601
  • 2025-10-21, Evaluating LLM-Based Mobile App Recommendations: An Empirical Study, Quim Motger et.al., Paper: http://arxiv.org/abs/2510.18364
  • 2025-10-21, Enhancing Hotel Recommendations with AI: LLM-Based Review Summarization and Query-Driven Insights, Nikolaos Belibasakis et.al., Paper: http://arxiv.org/abs/2510.18277
  • 2025-10-21, Engagement Undermines Safety: How Stereotypes and Toxicity Shape Humor in Language Models, Atharvan Dogra et.al., Paper: http://arxiv.org/abs/2510.18454
  • 2025-10-23, Empathic Prompting: Non-Verbal Context Integration for Multimodal LLM Conversations, Lorenzo Stacchio et.al., Paper: http://arxiv.org/abs/2510.20743
  • 2025-10-27, Emotion-Coherent Reasoning for Multimodal LLMs via Emotional Rationale Verifier, Hyeongseop Rha et.al., Paper: http://arxiv.org/abs/2510.23506
  • 2025-10-27, EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT, Baoqi Pei et.al., Paper: http://arxiv.org/abs/2510.23569
  • 2025-10-21, EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval, Zebin Yang et.al., Paper: http://arxiv.org/abs/2510.18546
  • 2025-10-21, EffiReasonTrans: RL-Optimized Reasoning for Code Translation, Yanlin Wang et.al., Paper: http://arxiv.org/abs/2510.18863
  • 2025-10-24, EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law, Ilija Lichkovski et.al., Paper: http://arxiv.org/abs/2510.21524
  • 2025-10-21, ECG-LLM – training and evaluation of domain-specific large language models for electrocardiography, Lara Ahrens et.al., Paper: http://arxiv.org/abs/2510.18339
  • 2025-10-28, Dissecting Role Cognition in Medical LLMs via Neuronal Ablation, Xun Liang et.al., Paper: http://arxiv.org/abs/2510.24677
  • 2025-10-28, Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way, Yicun Yang et.al., Paper: http://arxiv.org/abs/2510.24605
  • 2025-10-23, Diagnosing Visual Reasoning: Challenges, Insights, and a Path Forward, Jing Bi et.al., Paper: http://arxiv.org/abs/2510.20696
  • 2025-10-21, DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization, Tao Tao et.al., Paper: http://arxiv.org/abs/2510.18257
  • 2025-10-21, DeepTx: Real-Time Transaction Risk Analysis via Multi-Modal Features and LLM Reasoning, Yixuan Liu et.al., Paper: http://arxiv.org/abs/2510.18438
  • 2025-10-27, Deductive Chain-of-Thought Augmented Socially-aware Robot Navigation World Model, Weizheng Wang et.al., Paper: http://arxiv.org/abs/2510.23509
  • 2025-10-21, DSI-Bench: A Benchmark for Dynamic Spatial Intelligence, Ziang Zhang et.al., Paper: http://arxiv.org/abs/2510.18873
  • 2025-10-21, DART: A Structured Dataset of Regulatory Drug Documents in Italian for Clinical NLP, Mariano Barone et.al., Paper: http://arxiv.org/abs/2510.18475
  • 2025-10-21, CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder, Yongmin Lee et.al., Paper: http://arxiv.org/abs/2510.18583
  • 2025-10-21, Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models, Hanze Guo et.al., Paper: http://arxiv.org/abs/2510.18526
  • 2025-10-28, ComboBench: Can LLMs Manipulate Physical Devices to Play Virtual Reality Games?, Shuqing Li et.al., Paper: http://arxiv.org/abs/2510.24706
  • 2025-10-21, Combining Distantly Supervised Models with In Context Learning for Monolingual and Cross-Lingual Relation Extraction, Vipul Rathore et.al., Paper: http://arxiv.org/abs/2510.18344
  • 2025-10-24, ColorEcosystem: Powering Personalized, Standardized, and Trustworthy Agentic Service in massive-agent Ecosystem, Fangwen Wu et.al., Paper: http://arxiv.org/abs/2510.21566
  • 2025-10-21, CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment, Xue Jiang et.al., Paper: http://arxiv.org/abs/2510.18471
  • 2025-10-22, Class-Aware Prototype Learning with Negative Contrast for Test-Time Adaptation of Vision-Language Models, Xiaozhen Qiao et.al., Paper: http://arxiv.org/abs/2510.19802
  • 2025-10-21, CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs, Shaobo Wang et.al., Paper: http://arxiv.org/abs/2510.18470
  • 2025-10-21, Chain-of-Conceptual-Thought: Eliciting the Agent to Deeply Think within the Response, Qingqing Gu et.al., Paper: http://arxiv.org/abs/2510.18434
  • 2025-10-21, CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent, Haojia Lin et.al., Paper: http://arxiv.org/abs/2510.18596
  • 2025-10-21, CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection, Fouad Trad et.al., Paper: http://arxiv.org/abs/2510.18585
  • 2025-10-21, CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning, Masato Kikuchi et.al., Paper: http://arxiv.org/abs/2510.18466
  • 2025-10-21, Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency, Svetlana Maslenkova et.al., Paper: http://arxiv.org/abs/2510.18556
  • 2025-10-24, Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models, Omer Moussa et.al., Paper: http://arxiv.org/abs/2510.21520
  • 2025-10-21, BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks, Tianyuan Huang et.al., Paper: http://arxiv.org/abs/2510.18288
  • 2025-10-22, Blackbox Model Provenance via Palimpsestic Membership Inference, Rohith Kuditipudi et.al., Paper: http://arxiv.org/abs/2510.19796
  • 2025-10-21, Beyond Single Models: Mitigating Multimodal Hallucinations via Adaptive Token Ensemble Decoding, Jinlin Li et.al., Paper: http://arxiv.org/abs/2510.18321
  • 2025-10-23, Bayesian Jammer Localization with a Hybrid CNN and Path-Loss Mixture of Experts, Mariona Jaramillo-Civill et.al., Paper: http://arxiv.org/abs/2510.20666
  • 2025-10-21, Automated urban waterlogging assessment and early warning through a mixture of foundation models, Chenxu Zhang et.al., Paper: http://arxiv.org/abs/2510.18425
  • 2025-10-23, Automated Extraction of Fluoropyrimidine Treatment and Treatment-Related Toxicities from Clinical Notes Using Natural Language Processing, Xizhi Wu et.al., Paper: http://arxiv.org/abs/2510.20727
  • 2025-10-24, Are the LLMs Capable of Maintaining at Least the Language Genus?, Sandra Mitrović et.al., Paper: http://arxiv.org/abs/2510.21561
  • 2025-10-21, An Encoder-Decoder Foundation Chemical Language Model for Generative Polymer Design, Harikrishna Sahu et.al., Paper: http://arxiv.org/abs/2510.18860
  • 2025-10-27, Alita-G: Self-Evolving Generative Agent for Agent Generation, Jiahao Qiu et.al., Paper: http://arxiv.org/abs/2510.23601
  • 2025-10-28, AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis, Xuanzhong Chen et.al., Paper: http://arxiv.org/abs/2510.24695
  • 2025-10-28, Advancing site-specific disease and pest management in precision agriculture: From reasoning-driven foundation models to adaptive, feedback-based learning, Nitin Rai et.al., Paper: http://arxiv.org/abs/2510.24650
  • 2025-10-21, Adamas: Hadamard Sparse Attention for Efficient Long-Context Inference, Siyuan Yan et.al., Paper: http://arxiv.org/abs/2510.18413
  • 2025-10-22, AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders, Yuezhou Hu et.al., Paper: http://arxiv.org/abs/2510.19779
  • 2025-10-24, Actionable Cybersecurity Notifications for Smart Homes: A User Study on the Role of Length and Complexity, Victor Jüttner et.al., Paper: http://arxiv.org/abs/2510.21508
  • 2025-10-23, ARGenSeg: Image Segmentation with Autoregressive Image Generation Model, Xiaolong Wang et.al., Paper: http://arxiv.org/abs/2510.20803
  • 2025-10-23, A Use-Case Specific Dataset for Measuring Dimensions of Responsible Performance in LLM-generated Text, Alicia Sagae et.al., Paper: http://arxiv.org/abs/2510.20782
  • 2025-10-27, A Survey of Data Agents: Emerging Paradigm or Overstated Hype?, Yizhang Zhu et.al., Paper: http://arxiv.org/abs/2510.23587
  • 2025-10-24, A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection, Gaku Morio et.al., Paper: http://arxiv.org/abs/2510.21679
  • 2025-10-24, A Data-Centric Approach to Multilingual E-Commerce Product Search: Case Study on Query-Category and Query-Item Relevance, Yabo Yin et.al., Paper: http://arxiv.org/abs/2510.21671

(<a href=#updated-on-20251030>back to top</a>)

Reinforcement Learning

  • 2025-10-22, olmOCR 2: Unit Test Rewards for Document OCR, Jake Poznanski et.al., Paper: http://arxiv.org/abs/2510.19817
  • 2025-10-21, Why Policy Gradient Algorithms Work for Undiscounted Total-Reward MDPs, Jongmin Lee et.al., Paper: http://arxiv.org/abs/2510.18340
  • 2025-10-20, When 5G NTN Meets GNSS: Tracking GNSS Signals under Overlaid 5G Waveforms, Idir Edjekouane et.al., Paper: http://arxiv.org/abs/2510.17324
  • 2025-10-21, WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection, Guanzhong He et.al., Paper: http://arxiv.org/abs/2510.18798
  • 2025-10-27, VideoTG-R1: Boosting Video Temporal Grounding via Curriculum Reinforcement Learning on Reflected Boundary Annotations, Lu Dong et.al., Paper: http://arxiv.org/abs/2510.23397
  • 2025-10-27, Video-Thinker: Sparking “Thinking with Videos” via Reinforcement Learning, Shijian Wang et.al., Paper: http://arxiv.org/abs/2510.23473
  • 2025-10-21, Verifiable Accuracy and Abstention Rewards in Curriculum RL to Alleviate Lost-in-Conversation, Ming Li et.al., Paper: http://arxiv.org/abs/2510.18731
  • 2025-10-27, Variational Thermal State Preparation on Digital Quantum Processors Assisted by Matrix Product States, Rui-Hao Li et.al., Paper: http://arxiv.org/abs/2510.23546
  • 2025-10-28, VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation, Walid Bousselham et.al., Paper: http://arxiv.org/abs/2510.23497
  • 2025-10-22, Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning, Kevin Huang et.al., Paper: http://arxiv.org/abs/2510.19495
  • 2025-10-22, Universal Quantitative Abstraction: Categorical Duality and Logical Completeness for Probabilistic Systems, Nivar Anwer et.al., Paper: http://arxiv.org/abs/2510.19444
  • 2025-10-24, Unified token representations for sequential decision models, Zhuojing Tian et.al., Paper: http://arxiv.org/abs/2510.21448
  • 2025-10-20, UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts, Fu-Yun Wang et.al., Paper: http://arxiv.org/abs/2510.17937
  • 2025-10-21, Uncovering critical temperature dependence in Heusler magnets via explicit machine learning, Jean-Baptiste Morée et.al., Paper: http://arxiv.org/abs/2510.18469
  • 2025-10-20, UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action, Yuhao Yang et.al., Paper: http://arxiv.org/abs/2510.17790
  • 2025-10-21, Two-loop QCD corrections for real and off-shell diphoton and triphoton production via quark loops, Dario Kermanschah et.al., Paper: http://arxiv.org/abs/2510.18801
  • 2025-10-20, Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations, Tong Chen et.al., Paper: http://arxiv.org/abs/2510.17733
  • 2025-10-20, Trading with the Devil: Risk and Return in Foundation Model Strategies, Jinrui Zhang et.al., Paper: http://arxiv.org/abs/2510.17165
  • 2025-10-27, Towards Stochastic (N-1)-Secure Redispatch, Oleksii Molodchyk et.al., Paper: http://arxiv.org/abs/2510.23551
  • 2025-10-28, Towards Quadrupedal Jumping and Walking for Dynamic Locomotion using Reinforcement Learning, Jørgen Anker Olsen et.al., Paper: http://arxiv.org/abs/2510.24584
  • 2025-10-20, Towards Optimal Control and Algorithmic Structure of Decompression Schedules, Benjamin Marsh et.al., Paper: http://arxiv.org/abs/2510.17551
  • 2025-10-21, Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning, Chenghao Zhu et.al., Paper: http://arxiv.org/abs/2510.18849
  • 2025-10-20, Toward Autonomous Neural VMC: An Energy-Variance Convergence Criterion for Quantum Systems, Huan-Chen Shi et.al., Paper: http://arxiv.org/abs/2510.17490
  • 2025-10-24, Three-nucleon lepton-number-violating potentials in chiral EFT and their matrix elements in light nuclei, Graham Chambers-Wall et.al., Paper: http://arxiv.org/abs/2510.21564
  • 2025-10-27, Think Twice: Branch-and-Rethink Reasoning Reward Model, Yizhu Jiao et.al., Paper: http://arxiv.org/abs/2510.23596
  • 2025-10-24, The population of Galactic young massive star clusters in the TeV range, Rowan Batzofin et.al., Paper: http://arxiv.org/abs/2510.21480
  • 2025-10-21, The implications of inflation for the last ACT, Zhi-Chong Qiu et.al., Paper: http://arxiv.org/abs/2510.18320
  • 2025-10-23, The Shape of Reasoning: Topological Analysis of Reasoning Traces in Large Language Models, Xue Wen Tan et.al., Paper: http://arxiv.org/abs/2510.20665
  • 2025-10-21, The Picard-Lagrange Framework for Higher-Order Langevin Monte Carlo, Jaideep Mahajan et.al., Paper: http://arxiv.org/abs/2510.18242
  • 2025-10-20, The Marked Edge Walk: A Novel MCMC Algorithm for Sampling of Graph Partitions, Atticus McWhorter et.al., Paper: http://arxiv.org/abs/2510.17714
  • 2025-10-22, The Confusing Instance Principle for Online Linear Quadratic Control, Waris Radji et.al., Paper: http://arxiv.org/abs/2510.19531
  • 2025-10-27, The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation, Farid Bagirov et.al., Paper: http://arxiv.org/abs/2510.23393
  • 2025-10-20, TabR1: Taming GRPO for tabular reasoning LLMs, Pengxiang Cai et.al., Paper: http://arxiv.org/abs/2510.17385
  • 2025-10-24, System-Theoretic Analysis of Dynamic Generalized Nash Equilibrium Problems – Turnpikes and Dissipativity, Sophie Hall et.al., Paper: http://arxiv.org/abs/2510.21556
  • 2025-10-24, Surrogate-based quantification of policy uncertainty in generative flow networks, Ramón Nartallo-Kaluarachchi et.al., Paper: http://arxiv.org/abs/2510.21523
  • 2025-10-20, SoftMimic: Learning Compliant Whole-body Control from Examples, Gabriel B. Margolis et.al., Paper: http://arxiv.org/abs/2510.17792
  • 2025-10-21, Socialized Learning and Emergent Behaviors in Multi-Agent Systems based on Multimodal Large Language Models, Sureyya Akin et.al., Paper: http://arxiv.org/abs/2510.18515
  • 2025-10-22, SmartSwitch: Advancing LLM Reasoning by Overcoming Underthinking via Promoting Deeper Thought Exploration, Xichen Zhang et.al., Paper: http://arxiv.org/abs/2510.19767
  • 2025-10-21, Sherlock Your Queries: Learning to Ask the Right Questions for Dialogue-Based Retrieval, Dong Yun et.al., Paper: http://arxiv.org/abs/2510.18659
  • 2025-10-27, Sequential Multi-Agent Dynamic Algorithm Configuration, Chen Lu et.al., Paper: http://arxiv.org/abs/2510.23535
  • 2025-10-22, Semi-Implicit Approaches for Large-Scale Bayesian Spatial Interpolation, Sébastien Garneau et.al., Paper: http://arxiv.org/abs/2510.19722
  • 2025-10-21, Search Self-play: Pushing the Frontier of Agent Capability without Supervision, Hongliang Lu et.al., Paper: http://arxiv.org/abs/2510.18821
  • 2025-10-22, Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning, Xichen Zhang et.al., Paper: http://arxiv.org/abs/2510.19807
  • 2025-10-28, Sample-efficient and Scalable Exploration in Continuous-Time RL, Klemens Iten et.al., Paper: http://arxiv.org/abs/2510.24482
  • 2025-10-21, Safe But Not Sorry: Reducing Over-Conservatism in Safety Critics via Uncertainty-Aware Modulation, Daniel Bethell et.al., Paper: http://arxiv.org/abs/2510.18478
  • 2025-10-28, SPICE: Self-Play In Corpus Environments Improves Reasoning, Bo Liu et.al., Paper: http://arxiv.org/abs/2510.24684
  • 2025-10-28, SPARTA: Evaluating Reasoning Segmentation Robustness through Black-Box Adversarial Paraphrasing in Text Autoencoder Latent Space, Viktoriia Zinkovich et.al., Paper: http://arxiv.org/abs/2510.24446
  • 2025-10-20, SPACeR: Self-Play Anchoring with Centralized Reference Models, Wei-Jer Chang et.al., Paper: http://arxiv.org/abs/2510.18060
  • 2025-10-28, SGFusion: Stochastic Geographic Gradient Fusion in Federated Learning, Khoa Nguyen et.al., Paper: http://arxiv.org/abs/2510.23455
  • 2025-10-22, SEA: Semantic Map Prediction for Active Exploration of Uncertain Areas, Hongyu Ding et.al., Paper: http://arxiv.org/abs/2510.19766
  • 2025-10-20, Rewarding the Journey, Not Just the Destination: A Composite Path and Answer Self-Scoring Reward Mechanism for Test-Time Reinforcement Learning, Chenwei Tang et.al., Paper: http://arxiv.org/abs/2510.17923
  • 2025-10-20, Rethinking On-policy Optimization for Query Augmentation, Zhichao Xu et.al., Paper: http://arxiv.org/abs/2510.17139
  • 2025-10-21, Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting, Howard Chen et.al., Paper: http://arxiv.org/abs/2510.18874
  • 2025-10-21, Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach, Chenbei Lu et.al., Paper: http://arxiv.org/abs/2510.18687
  • 2025-10-23, Reinforcement Learning and Consumption-Savings Behavior, Brandon Kaplowitz et.al., Paper: http://arxiv.org/abs/2510.20748
  • 2025-10-24, Reduced Floating-Point Precision Implicit Monte Carlo, Simon Butson et.al., Paper: http://arxiv.org/abs/2510.21683
  • 2025-10-22, Reasoning Like Experts: Leveraging Multimodal Large Language Models for Drawing-based Psychoanalysis, Xueqi Ma et.al., Paper: http://arxiv.org/abs/2510.19451
  • 2025-10-24, Real-Time Gait Adaptation for Quadrupeds using Model Predictive Control and Reinforcement Learning, Prakrut Kotecha et.al., Paper: http://arxiv.org/abs/2510.20706
  • 2025-10-21, Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback, Yi-Lun Wu et.al., Paper: http://arxiv.org/abs/2510.18353
  • 2025-10-20, RL-Driven Security-Aware Resource Allocation Framework for UAV-Assisted O-RAN, Zaineh Abughazzah et.al., Paper: http://arxiv.org/abs/2510.18084
  • 2025-10-24, RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models, Xueyuan Lin et.al., Paper: http://arxiv.org/abs/2510.21604
  • 2025-10-20, RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation, Yuquan Xue et.al., Paper: http://arxiv.org/abs/2510.17640
  • 2025-10-20, R2L: Reliable Reinforcement Learning: Guaranteed Return & Reliable Policies in Reinforcement Learning, Nadir Farhi et.al., Paper: http://arxiv.org/abs/2510.18074
  • 2025-10-20, R2BC: Multi-Agent Imitation Learning from Single-Agent Demonstrations, Connor Mattson et.al., Paper: http://arxiv.org/abs/2510.18085
  • 2025-10-20, QueST: Incentivizing LLMs to Generate Difficult Problems, Hanxu Hu et.al., Paper: http://arxiv.org/abs/2510.17715
  • 2025-10-22, Quantum Monte Carlo study of low-dimensional Fermi fluids of dipolar atoms, Clio Johnson et.al., Paper: http://arxiv.org/abs/2510.19533
  • 2025-10-22, Quantum Machine Learning methods for Fourier-based distribution estimation with application in option pricing, Fernando Alonso et.al., Paper: http://arxiv.org/abs/2510.19494
  • 2025-10-20, Provably Optimal Reinforcement Learning under Safety Filtering, Donggeon David Oh et.al., Paper: http://arxiv.org/abs/2510.18082
  • 2025-10-29, Prospects for a 95 GeV Higgs Boson at Future Higgs Factories with Transformer Networks, Yabo Dong et.al., Paper: http://arxiv.org/abs/2510.24662
  • 2025-10-21, Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models, Lehan Wang et.al., Paper: http://arxiv.org/abs/2510.18303
  • 2025-10-21, Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options, Joongkyu Lee et.al., Paper: http://arxiv.org/abs/2510.18713
  • 2025-10-24, Predicted observational effects of rapid rotation for Be stars, Rina G. Rast et.al., Paper: http://arxiv.org/abs/2510.21640
  • 2025-10-22, Practical algorithm for simulating thermal pure quantum states, Wei-Bo He et.al., Paper: http://arxiv.org/abs/2510.19504
  • 2025-10-20, Plasma Shape Control via Zero-shot Generative Reinforcement Learning, Niannian Wu et.al., Paper: http://arxiv.org/abs/2510.17531
  • 2025-10-21, PlanU: Large Language Model Decision Making through Planning under Uncertainty, Ziwei Deng et.al., Paper: http://arxiv.org/abs/2510.18442
  • 2025-10-23, Plan Then Retrieve: Reinforcement Learning-Guided Complex Reasoning over Knowledge Graphs, Yanlin Song et.al., Paper: http://arxiv.org/abs/2510.20691
  • 2025-10-22, Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing, Yusu Qian et.al., Paper: http://arxiv.org/abs/2510.19808
  • 2025-10-28, Pair Approximation Meets Reality: Diffusion of Innovation in Organizational Networks within the biased-independence q-Voter Model, Angelika Abramiuk-Szurlej et.al., Paper: http://arxiv.org/abs/2510.24447
  • 2025-10-21, PGTT: Phase-Guided Terrain Traversal for Perceptive Legged Locomotion, Alexandros Ntagkas et.al., Paper: http://arxiv.org/abs/2510.18348
  • 2025-10-21, PCMS: Parallel Coupler For Multimodel Simulations, Jacob S. Merson et.al., Paper: http://arxiv.org/abs/2510.18838
  • 2025-10-20, Oxidation State Dynamics and Emerging Patterns in Magnetite, Emre Gürsoy et.al., Paper: http://arxiv.org/abs/2510.18061
  • 2025-10-22, Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning, Ruiyao Miao et.al., Paper: http://arxiv.org/abs/2510.19530
  • 2025-10-20, Optimizing Energy Management of Smart Grid using Reinforcement Learning aided by Surrogate models built using Physics-informed Neural Networks, Julen Cestero et.al., Paper: http://arxiv.org/abs/2510.17380
  • 2025-10-29, OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning, Ziyou Hu et.al., Paper: http://arxiv.org/abs/2510.24636
  • 2025-10-23, Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence, Jiahao Meng et.al., Paper: http://arxiv.org/abs/2510.20579
  • 2025-10-21, Online SFT for LLM Reasoning: Surprising Effectiveness of Self-Tuning without Rewards, Mengqi Li et.al., Paper: http://arxiv.org/abs/2510.18814
  • 2025-10-20, OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction, Raghu Vamshi Hemadri et.al., Paper: http://arxiv.org/abs/2510.17532
  • 2025-10-23, On Multiple Robustness of Proximal Dynamic Treatment Regimes, Yuanshan Gao et.al., Paper: http://arxiv.org/abs/2510.20451
  • 2025-10-21, On AI Verification in Open RAN, Rahul Soundrarajan et.al., Paper: http://arxiv.org/abs/2510.18417
  • 2025-10-27, Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences, Zhuoran Jin et.al., Paper: http://arxiv.org/abs/2510.23451
  • 2025-10-20, OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning, Zhenyu Bi et.al., Paper: http://arxiv.org/abs/2510.18032
  • 2025-10-23, No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes, Jasmine Bayrooti et.al., Paper: http://arxiv.org/abs/2510.20725
  • 2025-10-21, Nash Policy Gradient: A Policy Gradient Method with Iteratively Refined Regularization for Finding Nash Equilibria, Eason Yu et.al., Paper: http://arxiv.org/abs/2510.18183
  • 2025-10-21, NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective, Xiaohan Qin et.al., Paper: http://arxiv.org/abs/2510.18258
  • 2025-10-20, Multimodal Safety Is Asymmetric: Cross-Modal Exploits Unlock Black-Box MLLMs Jailbreaks, Xinkai Wang et.al., Paper: http://arxiv.org/abs/2510.17277
  • 2025-10-24, Multilevel Picard scheme for solving high-dimensional drift control problems with state constraints, Yuan Zhong et.al., Paper: http://arxiv.org/abs/2510.21607
  • 2025-10-28, Multi-Agent Evolve: LLM Self-Improve through Co-evolution, Yixing Chen et.al., Paper: http://arxiv.org/abs/2510.23595
  • 2025-10-22, Monte Carlo study of the $O(2)$-invariant $φ^4$ theory with a cubic perturbation in three dimensions, Martin Hasenbusch et.al., Paper: http://arxiv.org/abs/2510.19473
  • 2025-10-23, Monte Carlo Sampling for Wave Functions Requiring (Anti)Symmetrization, Koyena Bose et.al., Paper: http://arxiv.org/abs/2510.20577
  • 2025-10-21, MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile Manipulation, Chengshu Li et.al., Paper: http://arxiv.org/abs/2510.18316
  • 2025-10-28, MiniOneRec: An Open-Source Framework for Scaling Generative Recommendation, Xiaoyu Kong et.al., Paper: http://arxiv.org/abs/2510.24431
  • 2025-10-27, MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding, Xin Jin et.al., Paper: http://arxiv.org/abs/2510.23479
  • 2025-10-22, Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning, Gunshi Gupta et.al., Paper: http://arxiv.org/abs/2510.19732
  • 2025-10-22, MedReason-R1: Learning to Reason for CT Diagnosis with Reinforcement Learning and Local Zoom, Yifan Li et.al., Paper: http://arxiv.org/abs/2510.19626
  • 2025-10-21, Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents, Guangfu Guo et.al., Paper: http://arxiv.org/abs/2510.18424
  • 2025-10-24, Mechanistic Interpretability for Neural TSP Solvers, Reuben Narad et.al., Paper: http://arxiv.org/abs/2510.21693
  • 2025-10-23, Measuring cosmic dipole with the GRB luminosity-time relation, Jessica Santiago et.al., Paper: http://arxiv.org/abs/2510.20705
  • 2025-10-20, Measuring Reasoning in LLMs: a New Dialectical Angle, Soheil Abbasloo et.al., Paper: http://arxiv.org/abs/2510.18134
  • 2025-10-24, MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization, Chenglong Wang et.al., Paper: http://arxiv.org/abs/2510.21473
  • 2025-10-21, MENTOR: A Reinforcement Learning Framework for Model Enhancement via Teacher-Optimized Rewards in Small Models, ChangSu Choi et.al., Paper: http://arxiv.org/abs/2510.18383
  • 2025-10-21, MADR: MPC-guided Adversarial DeepReach, Ryan Teoh et.al., Paper: http://arxiv.org/abs/2510.18845
  • 2025-10-21, Lyapunov-Aware Quantum-Inspired Reinforcement Learning for Continuous-Time Vehicle Control: A Feasibility Study, Nutkritta Kraipatthanapong et.al., Paper: http://arxiv.org/abs/2510.18852
  • 2025-10-28, Low-lying baryon resonances from lattice QCD, Colin Morningstar et.al., Paper: http://arxiv.org/abs/2510.24596
  • 2025-10-20, Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains, Soumya Rani Samineni et.al., Paper: http://arxiv.org/abs/2510.18176
  • 2025-10-20, Leveraging Group Relative Policy Optimization to Advance Large Language Models in Traditional Chinese Medicine, Jiacheng Xie et.al., Paper: http://arxiv.org/abs/2510.17402
  • 2025-10-27, Learning to Reason Efficiently with Discounted Reinforcement Learning, Alex Ayoub et.al., Paper: http://arxiv.org/abs/2510.23486
  • 2025-10-21, Learning to Navigate Under Imperfect Perception: Conformalised Segmentation for Safe Reinforcement Learning, Daniel Bethell et.al., Paper: http://arxiv.org/abs/2510.18485
  • 2025-10-28, Learning to Drive Safely with Hybrid Options, Bram De Cooman et.al., Paper: http://arxiv.org/abs/2510.24674
  • 2025-10-20, Learning to Design Soft Hands using Reward Models, Xueqian Bai et.al., Paper: http://arxiv.org/abs/2510.17086
  • 2025-10-22, Learning Upper Lower Value Envelopes to Shape Online RL: A Principled Approach, Sebastian Reboul et.al., Paper: http://arxiv.org/abs/2510.19528
  • 2025-10-20, LLMs Encode How Difficult Problems Are, William Lugoloobi et.al., Paper: http://arxiv.org/abs/2510.18147
  • 2025-10-23, KL-Regularized Reinforcement Learning is Designed to Mode Collapse, Anthony GX-Chen et.al., Paper: http://arxiv.org/abs/2510.20817
  • 2025-10-20, Inference of Deterministic Finite Automata via Q-Learning, Elaheh Hosseinkhani et.al., Paper: http://arxiv.org/abs/2510.17386
  • 2025-10-21, Improved thermonuclear rate of $^{42}$Ti($p$,$γ$)$^{43}$V and its astrophysical implication in rp-process, S. Q. Hou et.al., Paper: http://arxiv.org/abs/2510.18531
  • 2025-10-20, Humanoid Goalkeeper: Learning from Position Conditioned Task-Motion Constraints, Junli Ren et.al., Paper: http://arxiv.org/abs/2510.18002
  • 2025-10-28, How Flat is a Plateau? Evolution of Late-Time TDE Disks, Yael Alush et.al., Paper: http://arxiv.org/abs/2510.24696
  • 2025-10-21, Higher Embedding Dimension Creates a Stronger World Model for a Simple Sorting Task, Brady Bhalla et.al., Paper: http://arxiv.org/abs/2510.18315
  • 2025-10-27, Ground-state phase diagram of S = 1/2 Heisenberg model on 2D square-hexagon-octagon lattice, Yumeng Luo et.al., Paper: http://arxiv.org/abs/2510.23376
  • 2025-10-28, Greedy Sampling Is Provably Efficient for RLHF, Di Wu et.al., Paper: http://arxiv.org/abs/2510.24700
  • 2025-10-24, Goal-based portfolio selection with fixed transaction costs, Erhan Bayraktar et.al., Paper: http://arxiv.org/abs/2510.21650
  • 2025-10-23, GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning, Jinchang Luo et.al., Paper: http://arxiv.org/abs/2510.20548
  • 2025-10-23, GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation, Guangqi Jiang et.al., Paper: http://arxiv.org/abs/2510.20813
  • 2025-10-20, GACO-CAD: Geometry-Augmented and Conciseness-Optimized CAD Model Generation from Single Image, Yinghui Wang et.al., Paper: http://arxiv.org/abs/2510.17157
  • 2025-10-20, Functional Distribution Networks (FDN), Omer Haq et.al., Paper: http://arxiv.org/abs/2510.17794
  • 2025-10-20, From Preferences to Prejudice: The Role of Alignment Tuning in Shaping Social Bias in Video Diffusion Models, Zefan Cai et.al., Paper: http://arxiv.org/abs/2510.17247
  • 2025-10-21, From Competition to Synergy: Unlocking Reinforcement Learning for Subject-Driven Image Generation, Ziwei Huang et.al., Paper: http://arxiv.org/abs/2510.18263
  • 2025-10-20, Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains, Austin Xu et.al., Paper: http://arxiv.org/abs/2510.17793
  • 2025-10-21, Food4All: A Multi-Agent Framework for Real-time Free Food Discovery with Integrated Nutritional Metadata, Zhengqing Yuan et.al., Paper: http://arxiv.org/abs/2510.18289
  • 2025-10-20, Finite-Time Bounds for Average-Reward Fitted Q-Iteration, Jongmin Lee et.al., Paper: http://arxiv.org/abs/2510.17391
  • 2025-10-21, Fingerprints of cluster-based Haldane and bound-magnon states in a spin-1 Heisenberg diamond chain, Azam Zoshki et.al., Paper: http://arxiv.org/abs/2510.18447
  • 2025-10-20, Fine-tuning Flow Matching Generative Models with Intermediate Feedback, Jiajun Fan et.al., Paper: http://arxiv.org/abs/2510.18072
  • 2025-10-28, Fill in the Blanks: Accelerating Q-Learning with a Handful of Demonstrations in Sparse Reward Settings, Seyed Mahdi Basiri Azad et.al., Paper: http://arxiv.org/abs/2510.24432
  • 2025-10-28, Fast Bayesian Multilevel Quasi-Monte Carlo, Aleksei G. Sorokin et.al., Paper: http://arxiv.org/abs/2510.24604
  • 2025-10-28, Fare: Failure Resilience in Learned Visual Navigation Control, Zishuo Wang et.al., Paper: http://arxiv.org/abs/2510.24680
  • 2025-10-28, Evolving Diagnostic Agents in a Virtual Clinical Environment, Pengcheng Qiu et.al., Paper: http://arxiv.org/abs/2510.24654
  • 2025-10-20, EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning, He Du et.al., Paper: http://arxiv.org/abs/2510.17928
  • 2025-10-21, Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model, Ling Team et.al., Paper: http://arxiv.org/abs/2510.18855
  • 2025-10-20, Estimating Orbital Parameters of Direct Imaging Exoplanet Using Neural Network, Bo Liang et.al., Paper: http://arxiv.org/abs/2510.17459
  • 2025-10-24, Enhancing Tactile-based Reinforcement Learning for Robotic Control, Elle Miller et.al., Paper: http://arxiv.org/abs/2510.21609
  • 2025-10-23, EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence, Ding Zou et.al., Paper: http://arxiv.org/abs/2510.20578
  • 2025-10-24, Electroweak corrections to $gg\rightarrow γγ$, Gabriele Fiore et.al., Paper: http://arxiv.org/abs/2510.21643
  • 2025-10-21, Efficient Model-Based Reinforcement Learning for Robot Control via Online Learning, Fang Nan et.al., Paper: http://arxiv.org/abs/2510.18518
  • 2025-10-20, Efficient Algorithms for Mitigating Uncertainty and Risk in Reinforcement Learning, Xihong Su et.al., Paper: http://arxiv.org/abs/2510.17690
  • 2025-10-21, EffiReasonTrans: RL-Optimized Reasoning for Code Translation, Yanlin Wang et.al., Paper: http://arxiv.org/abs/2510.18863
  • 2025-10-28, Dual-Mind World Models: A General Framework for Learning in Dynamic Wireless Networks, Lingyi Wang et.al., Paper: http://arxiv.org/abs/2510.24546
  • 2025-10-23, Downsizing Diffusion Models for Cardinality Estimation, Xinhe Mu et.al., Paper: http://arxiv.org/abs/2510.20681
  • 2025-10-23, Detection of ultra-high-energy cosmic rays in the southern hemisphere with FAST: data acquisition and preliminary results, Jakub Kmec et.al., Paper: http://arxiv.org/abs/2510.20522
  • 2025-10-22, Demonstrating Real Advantage of Machine-Learning-Enhanced Monte Carlo for Combinatorial Optimization, Luca Maria Del Bono et.al., Paper: http://arxiv.org/abs/2510.19544
  • 2025-10-24, DeepAgent: A General Reasoning Agent with Scalable Toolsets, Xiaoxi Li et.al., Paper: http://arxiv.org/abs/2510.21618
  • 2025-10-21, Deep Q-Learning Assisted Bandwidth Reservation for Multi-Operator Time-Sensitive Vehicular Networking, Abdullah Al-Khatib et.al., Paper: http://arxiv.org/abs/2510.18553
  • 2025-10-20, Deep Neural Network extraction of Unpolarized Transverse Momentum Distributions, I. P. Fernando et.al., Paper: http://arxiv.org/abs/2510.17243
  • 2025-10-20, Decentralized Real-Time Planning for Multi-UAV Cooperative Manipulation via Imitation Learning, Shantnav Agarwal et.al., Paper: http://arxiv.org/abs/2510.17143
  • 2025-10-21, DeLoad: Demand-Driven Short-Video Preloading with Scalable Watch-Time Estimation, Tong Liu et.al., Paper: http://arxiv.org/abs/2510.18459
  • 2025-10-24, DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection, Tala Aljaafari et.al., Paper: http://arxiv.org/abs/2510.21638
  • 2025-10-23, DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning, Runpeng Xie et.al., Paper: http://arxiv.org/abs/2510.19562
  • 2025-10-20, D2C-HRHR: Discrete Actions with Double Distributional Critics for High-Risk-High-Return Tasks, Jundong Zhang et.al., Paper: http://arxiv.org/abs/2510.17212
  • 2025-10-20, CrossGuard: Safeguarding MLLMs against Joint-Modal Implicit Malicious Attacks, Xu Zhang et.al., Paper: http://arxiv.org/abs/2510.17687
  • 2025-10-24, Cost Minimization for Space-Air-Ground Integrated Multi-Access Edge Computing Systems, Weihong Qin et.al., Paper: http://arxiv.org/abs/2510.21541
  • 2025-10-27, Cosmic magnification on multi-catalogue Herschel submillimetre galaxies, R. Fernandez-Fernandez et.al., Paper: http://arxiv.org/abs/2510.23582
  • 2025-10-20, Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control, Chengxiu Hua et.al., Paper: http://arxiv.org/abs/2510.17122
  • 2025-10-23, Consumption-Investment Problem in Rank-Based Models, David Itkin et.al., Paper: http://arxiv.org/abs/2510.20763
  • 2025-10-24, Constraints on ultra-heavy dark matter from the CDEX-10 experiment at the China Jinping Underground Laboratory, Y. F. Wang et.al., Paper: http://arxiv.org/abs/2510.21458
  • 2025-10-20, Consistent Zero-Shot Imitation with Contrastive Goal Inference, Kathryn Wantlin et.al., Paper: http://arxiv.org/abs/2510.17059
  • 2025-10-23, Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence, Kun Ouyang et.al., Paper: http://arxiv.org/abs/2510.20470
  • 2025-10-21, Computational Foundations for Strategic Coopetition: Formalizing Interdependence and Complementarity, Vik Pant et.al., Paper: http://arxiv.org/abs/2510.18802
  • 2025-10-20, Colour coherence in small collision systems, Isobel Kolbé et.al., Paper: http://arxiv.org/abs/2510.17570
  • 2025-10-20, Collider Searches for Near-Continuum Dark Matter, Steven Ferrante et.al., Paper: http://arxiv.org/abs/2510.17989
  • 2025-10-20, Coinvisor: An RL-Enhanced Chatbot Agent for Interactive Cryptocurrency Investment Analysis, Chong Chen et.al., Paper: http://arxiv.org/abs/2510.17235
  • 2025-10-21, CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment, Xue Jiang et.al., Paper: http://arxiv.org/abs/2510.18471
  • 2025-10-28, Cluster Dose Prediction in Carbon Ion Therapy: Using Transfer Learning from a Pretrained Dose Prediction U-Net, Miriam Schwarze et.al., Paper: http://arxiv.org/abs/2510.24703
  • 2025-10-21, Chemistry, Climate, and Transmission Spectra of TRAPPIST-1 e Explored with a Multimodel Sparse Sampled Ensemble, Eric T. Wolf et.al., Paper: http://arxiv.org/abs/2510.18704
  • 2025-10-20, Characterizing expansivity through $C^*$-algebras, S. Bautista et.al., Paper: http://arxiv.org/abs/2510.17255
  • 2025-10-20, Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs, Paula Cordero-Encinar et.al., Paper: http://arxiv.org/abs/2510.17472
  • 2025-10-24, Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems, Hao Liang et.al., Paper: http://arxiv.org/abs/2510.21427
  • 2025-10-27, Causal Deep Q Network, Elouanes Khelifi et.al., Paper: http://arxiv.org/abs/2510.23424
  • 2025-10-21, CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent, Haojia Lin et.al., Paper: http://arxiv.org/abs/2510.18596
  • 2025-10-20, CLAWS: Creativity detection for LLM-generated solutions using Attention Window of Sections, Keuntae Kim et.al., Paper: http://arxiv.org/abs/2510.17921
  • 2025-10-21, Beware of the running $n_s$ when producing heavy primordial black holes, Sasha Allegrini et.al., Paper: http://arxiv.org/abs/2510.18791
  • 2025-10-20, B-Meson Anomalies: Effective Field Theory Meets Machine Learning, Alejandro Mir et.al., Paper: http://arxiv.org/abs/2510.17742
  • 2025-10-20, Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling, Lipeng Xie et.al., Paper: http://arxiv.org/abs/2510.17314
  • 2025-10-27, Approximately optimal distributed controls for high-dimensional stochastic systems with pairwise interaction through controls, Elise Devey et.al., Paper: http://arxiv.org/abs/2510.23537
  • 2025-10-21, Analysis note: measurement of thrust and track energy-energy correlator in $e^+e^-$ collisions at 91.2 GeV with DELPHI open data, Jingyu Zhang et.al., Paper: http://arxiv.org/abs/2510.18762
  • 2025-10-21, An integrated neural wavefunction solver for spinful Fermi systems, Alexander Avdoshkin et.al., Paper: http://arxiv.org/abs/2510.18621
  • 2025-10-27, An Information-Theoretic Analysis of Out-of-Distribution Generalization in Meta-Learning with Applications to Meta-RL, Xingtu Liu et.al., Paper: http://arxiv.org/abs/2510.23448
  • 2025-10-20, An Exact Quantile-Energy Equality for Terminal Halfspaces in Linear-Gaussian Control with a Discrete-Time Companion, KL/Schrodinger Links, and High-Precision Validation, Sandro Andric et.al., Paper: http://arxiv.org/abs/2510.17945
  • 2025-10-20, An Empirical Study of Lagrangian Methods in Safe Reinforcement Learning, Lindsay Spoor et.al., Paper: http://arxiv.org/abs/2510.17564
  • 2025-10-20, Agentic Reinforcement Learning for Search is Unsafe, Yushi Yang et.al., Paper: http://arxiv.org/abs/2510.17431
  • 2025-10-28, Advancing site-specific disease and pest management in precision agriculture: From reasoning-driven foundation models to adaptive, feedback-based learning, Nitin Rai et.al., Paper: http://arxiv.org/abs/2510.24650
  • 2025-10-28, Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural Networks, Korneel Van den Berghe et.al., Paper: http://arxiv.org/abs/2510.24461
  • 2025-10-27, Adaptive Multilevel Splitting: First Application to Rare-Event Derivative Pricing, Riccardo Gozzo et.al., Paper: http://arxiv.org/abs/2510.23461
  • 2025-10-20, Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models, Jiajun Fan et.al., Paper: http://arxiv.org/abs/2510.18053
  • 2025-10-23, AdaDoS: Adaptive DoS Attack via Deep Adversarial Reinforcement Learning in SDN, Wei Shao et.al., Paper: http://arxiv.org/abs/2510.20566
  • 2025-10-21, Actor-Free Continuous Control via Structurally Maximizable Q-Functions, Yigit Korkmaz et.al., Paper: http://arxiv.org/abs/2510.18828
  • 2025-10-20, Accelerating Bayesian Inference via Multi-Fidelity Transport Map Coupling, Sanjan C. Muchandimath et.al., Paper: http://arxiv.org/abs/2510.17946
  • 2025-10-20, ALPINE: A Lightweight and Adaptive Privacy-Decision Agent Framework for Dynamic Edge Crowdsensing, Guanjie Cheng et.al., Paper: http://arxiv.org/abs/2510.17162
  • 2025-10-24, A Unified Model for Multi-Task Drone Routing in Post-Disaster Road Assessment, Huatian Gong et.al., Paper: http://arxiv.org/abs/2510.21525
  • 2025-10-23, A Unified Framework for Zero-Shot Reinforcement Learning, Jacopo Di Ventura et.al., Paper: http://arxiv.org/abs/2510.20542
  • 2025-10-27, A Sequential Planning Framework for the Operational Reality of Interacting Air Traffic Flow Regulations and Traffic Flow Programs, Thinh Hoang et.al., Paper: http://arxiv.org/abs/2510.23402
  • 2025-10-20, A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning, Anjie Liu et.al., Paper: http://arxiv.org/abs/2510.17697
  • 2025-10-23, A Microphysical Probe of Neutron Star Interiors: Constraining the Equation of State with Glitch Dynamics, Zhonghao Tu et.al., Paper: http://arxiv.org/abs/2510.20791

(<a href=#updated-on-20251030>back to top</a>)

Notes:

  • We have changed the sorting rule for the lists above: papers are now ordered by the time of their latest update rather than their initial publication date, so a recently revised paper appears earlier in the list.

Features added:

  • Support for a more reliable text parser. Link

  • Support for rich markdown format (better at parsing experimental tables). Link