Agent Research Papers
Automatically Updated on 2025.10.30
Current Search Keywords: Agent, Multi-Agent, Tool Learning, Agent RL, Autonomous Agent, LLM Agent
If you would like other keywords to be tracked, please feel free to let us know :)
Agent
| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-10-28 | Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents | Yueqi Song et.al. | 2510.24702 | null |
| 2025-10-28 | AgentFold: Long-Horizon Web Agents with Proactive Context Management | Rui Ye et.al. | 2510.24699 | null |
| 2025-10-28 | AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis | Xuanzhong Chen et.al. | 2510.24695 | null |
| 2025-10-28 | Repurposing Synthetic Data for Fine-grained Search Agent Supervision | Yida Zhao et.al. | 2510.24694 | null |
| 2025-10-28 | OrchDAG: Complex Tool Orchestration in Multi-Turn Interactions with Plan DAGs | Yifu Lu et.al. | 2510.24663 | null |
| 2025-10-28 | FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling | Zengzhuang Xu et.al. | 2510.24645 | null |
| 2025-10-28 | ReplicationBench: Can AI Agents Replicate Astrophysics Research Papers? | Christine Ye et.al. | 2510.24591 | null |
| 2025-10-28 | Affordance Representation and Recognition for Autonomous Agents | Habtom Kahsay Gidey et.al. | 2510.24459 | null |
| 2025-10-28 | Law in Silico: Simulating Legal Society with LLM-Based Agents | Yiding Wang et.al. | 2510.24442 | null |
| 2025-10-28 | Can LLMs Write Faithfully? An Agent-Based Evaluation of LLM-generated Islamic Content | Abdullah Mushtaq et.al. | 2510.24438 | null |
| 2025-10-28 | Policy Cards: Machine-Readable Runtime Governance for Autonomous AI Agents | Juraj Mavračić et.al. | 2510.24383 | null |
| 2025-10-28 | Automatically Benchmarking LLM Code Agents through Agent-Driven Annotation and Evaluation | Lingyue Fu et.al. | 2510.24358 | null |
| 2025-10-28 | Cybersecurity AI Benchmark (CAIBench): A Meta-Benchmark for Evaluating Cybersecurity AI Agents | María Sanz-Gómez et.al. | 2510.24317 | null |
| 2025-10-28 | Retrieval and Argumentation Enhanced Multi-Agent LLMs for Judgmental Forecasting | Deniz Gorur et.al. | 2510.24303 | null |
| 2025-10-28 | MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools | Wenhao Wang et.al. | 2510.24284 | null |
| 2025-10-28 | Investigating Software Aging in LLM-Generated Software Systems | César Santos et.al. | 2510.24188 | null |
| 2025-10-28 | BLM$_1$: A Boundless Large Model for Cross-Space, Cross-Task, and Cross-Embodiment Learning | Wentao Tan et.al. | 2510.24161 | null |
| 2025-10-28 | From Observability Data to Diagnosis: An Evolving Multi-agent System for Incident Management in Cloud Systems | Yu Luo et.al. | 2510.24145 | null |
| 2025-10-28 | Reinforcement Learning for Long-Horizon Multi-Turn Search Agents | Vivek Kalyan et.al. | 2510.24126 | null |
| 2025-10-28 | PFEA: An LLM-based High-Level Natural Language Planning and Feedback Embodied Agent for Human-Centered AI | Wenbin Ding et.al. | 2510.24109 | null |
| 2025-10-28 | BrowseConf: Confidence-Guided Test-Time Scaling for Web Agents | Litu Ou et.al. | 2510.23458 | null |
| 2025-10-28 | Look and Tell: A Dataset for Multimodal Grounding Across Egocentric and Exocentric Views | Anna Deichler et.al. | 2510.22672 | null |
| 2025-10-27 | Are Agents Just Automata? On the Formal Equivalence Between Agentic AI and the Chomsky Hierarchy | Roham Koohestani et.al. | 2510.23487 | null |
| 2025-10-27 | Model Proficiency in Centralized Multi-Agent Systems: A Performance Study | Anna Guerra et.al. | 2510.23447 | null |
| 2025-10-27 | AutoStreamPipe: LLM Assisted Automatic Generation of Data Stream Processing Pipelines | Abolfazl Younesi et.al. | 2510.23408 | null |
| 2025-10-27 | Multi-Stakeholder Alignment in LLM-Powered Collaborative AI Systems: A Multi-Agent Framework for Intelligent Tutoring | Alexandre P Uchoa et.al. | 2510.23245 | null |
| 2025-10-27 | Evaluation of Vision-LLMs in Surveillance Video | Pascal Benschop et.al. | 2510.23190 | null |
| 2025-10-27 | SI-Bench: Benchmarking Social Intelligence of Large Language Models in Human-to-Human Conversations | Shuai Huang et.al. | 2510.23182 | null |
| 2025-10-27 | Adapting Interleaved Encoders with PPO for Language-Guided Reinforcement Learning in BabyAI | Aryan Mathur et.al. | 2510.23148 | null |
| 2025-10-27 | Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs | Kai Zhuang et.al. | 2510.23127 | null |
| 2025-10-27 | Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning | Ran Xu et.al. | 2510.23038 | null |
| 2025-10-27 | P1GPT: a multi-agent LLM workflow module for multi-modal financial information analysis | Chen-Che Lu et.al. | 2510.23032 | null |
| 2025-10-27 | TALM: Dynamic Tree-Structured Multi-Agent Framework with Long-Term Memory for Scalable Code Generation | Ming-Tung Shen et.al. | 2510.23010 | null |
| 2025-10-27 | CodeAD: Synthesize Code of Rules for Log-based Anomaly Detection with LLMs | Junjie Huang et.al. | 2510.22986 | null |
| 2025-10-27 | Language Server CLI Empowers Language Agents with Process Rewards | Yifan Zhang et.al. | 2510.22907 | null |
| 2025-10-27 | On Generalization in Agentic Tool Calling: CoreThink Agentic Reasoner and MAVEN Dataset | Vishvesh Bhat et.al. | 2510.22898 | null |
| 2025-10-26 | Distributed Multi-Agent Bandits Over Erdős-Rényi Random Networks | Jingyuan Liu et.al. | 2510.22811 | null |
| 2025-10-26 | Collaborative LLM Agents for C4 Software Architecture Design Automation | Kamil Szczepanik et.al. | 2510.22787 | null |
| 2025-10-26 | How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations | Zora Zhiruo Wang et.al. | 2510.22780 | null |
| 2025-10-26 | ATLAS: Actor-Critic Task-Completion with Look-ahead Action Simulation | Jiali Cheng et.al. | 2510.22732 | null |
| 2025-10-24 | A Knowledge-Graph Translation Layer for Mission-Aware Multi-Agent Path Planning in Spatiotemporal Dynamics | Edward Holmberg et.al. | 2510.21695 | null |
| 2025-10-24 | AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite | Jonathan Bragg et.al. | 2510.21652 | null |
| 2025-10-24 | Five-loop beta function for gauge theories: computations, results and consequences | F. Herzog et.al. | 2510.21624 | null |
| 2025-10-24 | DeepAgent: A General Reasoning Agent with Scalable Toolsets | Xiaoxi Li et.al. | 2510.21618 | null |
| 2025-10-24 | Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine | Wenyi Wang et.al. | 2510.21614 | null |
| 2025-10-24 | Doc-Researcher: A Unified System for Multimodal Document Parsing and Deep Research | Kuicai Dong et.al. | 2510.21603 | null |
| 2025-10-24 | EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law | Ilija Lichkovski et.al. | 2510.21524 | null |
| 2025-10-24 | OpenHype: Hyperbolic Embeddings for Hierarchical Open-Vocabulary Radiance Fields | Lisa Weijler et.al. | 2510.21441 | null |
| 2025-10-24 | Context Engineering for AI Agents in Open-Source Software | Seyedmoein Mohsenimofidi et.al. | 2510.21413 | null |
| 2025-10-24 | HIKMA: Human-Inspired Knowledge by Machine Agents through a Multi-Agent Framework for Semi-Autonomous Scientific Conferences | Zain Ul Abideen Tariq et.al. | 2510.21370 | null |
| 2025-10-24 | Magellan: Guided MCTS for Latent Space Exploration and Novelty Generation | Lufan Chang et.al. | 2510.21341 | null |
| 2025-10-24 | Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning | Sanghyun Ahn et.al. | 2510.21302 | null |
| 2025-10-24 | Securing AI Agent Execution | Christoph Bühler et.al. | 2510.21236 | null |
| 2025-10-24 | DispatchMAS: Fusing taxonomy and artificial intelligence agents for emergency medical services | Xiang Li et.al. | 2510.21228 | null |
| 2025-10-24 | DAO-AI: Evaluating Collective Decision-Making through Agentic AI in Decentralized Governance | Chunghyun Han et.al. | 2510.21117 | null |
| 2025-10-24 | Soft Instruction De-escalation Defense | Nils Philipp Walter et.al. | 2510.21057 | null |
| 2025-10-24 | Mixture-of-Minds: Multi-Agent Reinforcement Learning for Table Understanding | Yuhang Zhou et.al. | 2510.20176 | null |
| 2025-10-23 | From Questions to Queries: An AI-powered Multi-Agent Framework for Spatial Text-to-SQL | Ali Khosravi Kazazi et.al. | 2510.21045 | null |
| 2025-10-23 | AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents | Qinghua Lu et.al. | 2510.21031 | null |
| 2025-10-23 | Co-Designing Quantum Codes with Transversal Diagonal Gates via Multi-Agent Systems | Xi He et.al. | 2510.20728 | null |
| 2025-10-23 | C-NAV: Towards Self-Evolving Continual Object Navigation in Open World | Ming-Ming Yu et.al. | 2510.20685 | null |
| 2025-10-23 | Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence | Jiahao Meng et.al. | 2510.20579 | null |
| 2025-10-23 | EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence | Ding Zou et.al. | 2510.20578 | null |
| 2025-10-23 | Designing Intent Communication for Agent-Human Collaboration | Yi Li et.al. | 2510.20409 | null |
| 2025-10-23 | Balancing Specialization and Centralization: A Multi-Agent Reinforcement Learning Benchmark for Sequential Industrial Control | Tom Maus et.al. | 2510.20408 | null |
| 2025-10-23 | GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments? | Chiyu Chen et.al. | 2510.20333 | null |
| 2025-10-23 | From Generation to Attribution: Music AI Agent Architectures for the Post-Streaming Era | Wonil Kim et.al. | 2510.20276 | null |
| 2025-10-23 | ImpossibleBench: Measuring LLMs’ Propensity of Exploiting Test Cases | Ziqian Zhong et.al. | 2510.20270 | null |
| 2025-10-23 | Towards AI Agents for Course Instruction in Higher Education: Early Experiences from the Field | Yogesh Simmhan et.al. | 2510.20255 | null |
| 2025-10-23 | Automated Cloud Infrastructure-as-Code Reconciliation with AI Agents | Zhenning Yang et.al. | 2510.20211 | null |
| 2025-10-23 | Merge and Conquer: Evolutionarily Optimizing AI for 2048 | Maggie Bai et.al. | 2510.20205 | null |
| 2025-10-23 | Human-Centered LLM-Agent System for Detecting Anomalous Digital Asset Transactions | Gyuyeon Na et.al. | 2510.20102 | null |
| 2025-10-22 | ToolScope: Enhancing LLM Agent Tool Use through Tool Merging and Context-Aware Filtering | Marianne Menglin Liu et.al. | 2510.20036 | null |
| 2025-10-22 | Communication to Completion: Modeling Collaborative Workflows with Intelligent Multi-Agent Communication | Yiming Lu et.al. | 2510.19995 | null |
| 2025-10-22 | A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks | Hatim Chergui et.al. | 2510.19973 | null |
| 2025-10-22 | Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets | Jiashi Feng et.al. | 2510.19944 | null |
| 2025-10-22 | Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation | Jackson Hassell et.al. | 2510.19897 | null |
| 2025-10-22 | Large Language Model enabled Mathematical Modeling | Guoyun Zhang et.al. | 2510.19895 | null |
| 2025-10-22 | Beyond Reactivity: Measuring Proactive Problem Solving in LLM Agents | Gil Pasternak et.al. | 2510.19771 | null |
| 2025-10-22 | Review of Tools for Zero-Code LLM Based Application Development | Priyaranjan Pattnayak et.al. | 2510.19747 | null |
| 2025-10-22 | Misalignment Bounty: Crowdsourcing AI Agent Misbehavior | Rustem Turtayev et.al. | 2510.19738 | null |
| 2025-10-22 | Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning | Gunshi Gupta et.al. | 2510.19732 | null |
| 2025-10-22 | Are Large Language Models Sensitive to the Motives Behind Communication? | Addison J. Wu et.al. | 2510.19687 | null |
| 2025-10-22 | Pragmatic Heterogeneous Collaborative Perception via Generative Communication Mechanism | Junfei Zhou et.al. | 2510.19618 | null |
| 2025-10-22 | Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 | Qianli Ma et.al. | 2510.19600 | null |
| 2025-10-22 | gem5 Co-Pilot: AI Assistant Agent for Architectural Design Space Exploration | Zuoming Fu et.al. | 2510.19577 | null |
| 2025-10-22 | AegisMCP: Online Graph Intrusion Detection for Tool-Augmented LLMs on Edge Devices | Zhonghao Zhan et.al. | 2510.19462 | null |
| 2025-10-22 | MSC-Bench: A Rigorous Benchmark for Multi-Server Tool Orchestration | Jia-Kai Dong et.al. | 2510.19423 | null |
| 2025-10-22 | ColorAgent: Building A Robust, Personalized, and Interactive OS Agent | Ning Li et.al. | 2510.19386 | null |
| 2025-10-22 | Nonmonotone subgradient methods based on a local descent lemma | Francisco J. Aragón-Artacho et.al. | 2510.19341 | null |
| 2025-10-22 | Learning to Make Friends: Coaching LLM Agents toward Emergent Social Ties | Philipp J. Schneider et.al. | 2510.19299 | null |
| 2025-10-22 | Trace: Securing Smart Contract Repository Against Access Control Vulnerability | Chong Chen et.al. | 2510.19254 | null |
| 2025-10-22 | SheetBrain: A Neuro-Symbolic Agent for Accurate Reasoning over Complex and Large Spreadsheets | Ziwei Wang et.al. | 2510.19247 | null |
| 2025-10-22 | DiSRouter: Distributed Self-Routing for LLM Selections | Hang Zheng et.al. | 2510.19208 | null |
| 2025-10-22 | Defending Against Prompt Injection with DataFilter | Yizhu Wang et.al. | 2510.19207 | null |
| 2025-10-22 | WebGraphEval: Multi-Turn Trajectory Evaluation for Web Agents using Graph Representation | Yaoyao Qian et.al. | 2510.19205 | null |
| 2025-10-21 | When Your AI Agent Succumbs to Peer-Pressure: Studying Opinion-Change Dynamics of LLMs | Aliakbar Mehdizadeh et.al. | 2510.19107 | null |
| 2025-10-21 | Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces | Joydeep Chandra et.al. | 2510.19008 | null |
| 2025-10-21 | Search Self-play: Pushing the Frontier of Agent Capability without Supervision | Hongliang Lu et.al. | 2510.18821 | null |
| 2025-10-21 | WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection | Guanzhong He et.al. | 2510.18798 | null |
| 2025-10-21 | KAT-Coder Technical Report | Zizheng Zhan et.al. | 2510.18779 | null |
| 2025-10-21 | Fetch.ai: An Architecture for Modern Multi-Agent Systems | Michael J. Wooldridge et.al. | 2510.18699 | null |
| 2025-10-21 | Tokencake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications | Zhuohang Bian et.al. | 2510.18586 | null |
| 2025-10-21 | WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality | Chunyang Li et.al. | 2510.18560 | null |
| 2025-10-21 | SOCIA-Nabla: Textual Gradient Meets Multi-Agent Orchestration for Automated Simulator Generation | Yuncheng Hua et.al. | 2510.18551 | null |
| 2025-10-21 | JAUNT: Joint Alignment of User Intent and Network State for QoE-centric LLM Tool Routing | Enhan Li et.al. | 2510.18550 | null |
| 2025-10-21 | EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval | Zebin Yang et.al. | 2510.18546 | null |
| 2025-10-21 | Socialized Learning and Emergent Behaviors in Multi-Agent Systems based on Multimodal Large Language Models | Sureyya Akin et.al. | 2510.18515 | null |
| 2025-10-21 | Crucible: Quantifying the Potential of Control Algorithms through LLM Agents | Lianchen Jia et.al. | 2510.18491 | null |
| 2025-10-21 | LAFA: Agentic LLM-Driven Federated Analytics over Decentralized Data Sources | Haichao Ji et.al. | 2510.18477 | null |
| 2025-10-21 | Probabilistic Modeling of Intentions in Socially Intelligent LLM Agents | Feifan Xia et.al. | 2510.18476 | null |
| 2025-10-21 | Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents | Guangfu Guo et.al. | 2510.18424 | null |
| 2025-10-21 | Memory-Augmented State Machine Prompting: A Novel LLM Agent Framework for Real-Time Strategy Games | Runnan Qi et.al. | 2510.18395 | null |
| 2025-10-21 | MENTOR: A Reinforcement Learning Framework for Model Enhancement via Teacher-Optimized Rewards in Small Models | ChangSu Choi et.al. | 2510.18383 | null |
| 2025-10-21 | InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration | Yunkun Wang et.al. | 2510.18327 | null |
| 2025-10-21 | Earth AI: Unlocking Geospatial Insights with Foundation Models and Cross-Modal Reasoning | Aaron Bell et.al. | 2510.18318 | null |
| 2025-10-21 | Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming | Zheng Zhang et.al. | 2510.18314 | null |
| 2025-10-21 | Food4All: A Multi-Agent Framework for Real-time Free Food Discovery with Integrated Nutritional Metadata | Zhengqing Yuan et.al. | 2510.18289 | null |
| 2025-10-21 | Optimal allocations with distortion risk measures and mixed risk attitudes | Mario Ghossoub et.al. | 2510.18236 | null |
| 2025-10-21 | Applying voxel-based analysis to oropharyngeal cancer proton therapy patients: a correlation study on radiation-induced acute dysphagia | Qianxia Wang et.al. | 2510.18210 | null |
| 2025-10-21 | Adaptive Coopetition: Leveraging Coarse Verifier Signals for Resilient Multi-Agent LLM Reasoning | Rui Jerry Huang et.al. | 2510.18179 | null |
| 2025-10-21 | NEBULA: Do We Evaluate Vision-Language-Action Agents Correctly? | Jierui Peng et.al. | 2510.16263 | null |
| 2025-10-21 | SentinelNet: Safeguarding Multi-Agent Collaboration Through Credit-Based Dynamic Threat Detection | Yang Feng et.al. | 2510.16219 | null |
| 2025-10-21 | PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold | Yi Wan et.al. | 2510.15862 | null |
| 2025-10-21 | FinAI Data Assistant: LLM-based Financial Database Query Processing with the OpenAI Function Calling API | Juhyeong Kim et.al. | 2510.14162 | null |
| 2025-10-21 | A$^2$FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning | Qianben Chen et.al. | 2510.12838 | null |
| 2025-10-20 | AgentChangeBench: A Multi-Dimensional Evaluation Framework for Goal-Shift Robustness in Conversational AI | Manik Rana et.al. | 2510.18170 | null |
| 2025-10-20 | World-in-World: World Models in a Closed-Loop World | Jiahan Zhang et.al. | 2510.18135 | null |
| 2025-10-20 | SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving | Xiangbo Gao et.al. | 2510.18123 | null |
| 2025-10-20 | Investigating the Impact of Dark Patterns on LLM-Based Web Agents | Devin Ersoy et.al. | 2510.18113 | null |
| 2025-10-20 | Does Reasoning Help LLM Agents Play Dungeons and Dragons? A Prompt Engineering Experiment | Patricia Delafuente et.al. | 2510.18112 | null |
| 2025-10-20 | CompactPrompt: A Unified Pipeline for Prompt Data Compression in LLM Workflows | Joong Ho Choi et.al. | 2510.18043 | null |
| 2025-10-20 | OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning | Zhenyu Bi et.al. | 2510.18032 | null |
| 2025-10-20 | FABRIC: Framework for Agent-Based Realistic Intelligence Creation | Abhigya Verma et.al. | 2510.17995 | null |
| 2025-10-20 | PLAGUE: Plug-and-play framework for Lifelong Adaptive Generation of Multi-turn Exploits | Neeladri Bhuiya et.al. | 2510.17947 | null |
| 2025-10-20 | Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics | Akshara Prabhakar et.al. | 2510.17797 | null |
| 2025-10-20 | Executable Knowledge Graphs for Replicating AI Research | Yujie Luo et.al. | 2510.17795 | null |
| 2025-10-20 | A Mimamsa Inspired Framework For Instruction Sequencing In AI Agents | Bama Srinivasan et.al. | 2510.17691 | null |
| 2025-10-20 | ShapeCraft: LLM Agents for Structured, Textured and Interactive 3D Modeling | Shuyuan Zhang et.al. | 2510.17603 | null |
| 2025-10-20 | MIRAGE: Agentic Framework for Multimodal Misinformation Detection with Web-Grounded Reasoning | Mir Nafis Sharear Shopnil et.al. | 2510.17590 | null |
| 2025-10-20 | Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs | Francesco Balassone et.al. | 2510.17521 | null |
| 2025-10-20 | Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents | Yihong Tang et.al. | 2510.17491 | null |
| 2025-10-20 | Agentic Reinforcement Learning for Search is Unsafe | Yushi Yang et.al. | 2510.17431 | null |
| 2025-10-20 | Diverse Planning with Simulators via Linear Temporal Logic | Mustafa F. Abdelwahed et.al. | 2510.17418 | null |
| 2025-10-20 | Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems | Rishi Jha et.al. | 2510.17276 | null |
| 2025-10-20 | Coinvisor: An RL-Enhanced Chatbot Agent for Interactive Cryptocurrency Investment Analysis | Chong Chen et.al. | 2510.17235 | null |
| 2025-10-20 | ALPINE: A Lightweight and Adaptive Privacy-Decision Agent Framework for Dynamic Edge Crowdsensing | Guanjie Cheng et.al. | 2510.17162 | null |
| 2025-10-20 | Decentralized Real-Time Planning for Multi-UAV Cooperative Manipulation via Imitation Learning | Shantnav Agarwal et.al. | 2510.17143 | null |
| 2025-10-20 | Do LLMs Recognize Your Latent Preferences? A Benchmark for Latent Information Discovery in Personalized Interaction | Ioannis Tsaknakis et.al. | 2510.17132 | null |
| 2025-10-20 | Semantic Intelligence: A Bio-Inspired Cognitive Framework for Embodied Agents | Wenbing Tang et.al. | 2510.17129 | null |
| 2025-10-20 | Verification-Aware Planning for Multi-Agent Systems | Tianyang Xu et.al. | 2510.17109 | null |
| 2025-10-20 | Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models | Elias Hossain et.al. | 2510.17098 | null |
| 2025-10-20 | A Brain Cell Type Resource Created by Large Language Models and a Multi-Agent AI System for Collaborative Community Annotation | Rongbin Li et.al. | 2510.17064 | null |
| 2025-10-20 | Consistent Zero-Shot Imitation with Contrastive Goal Inference | Kathryn Wantlin et.al. | 2510.17059 | null |
| 2025-10-20 | Echoes of Human Malice in Agents: Benchmarking LLMs for Multi-Turn Online Harassment Attacks | Trilok Padhi et.al. | 2510.14207 | null |
| 2025-10-19 | ToolCritic: Detecting and Correcting Tool-Use Errors in Dialogue Systems | Hassan Hamad et.al. | 2510.17052 | null |
| 2025-10-19 | ReclAIm: A multi-agent framework for degradation-aware performance tuning of medical imaging AI | Eleftherios Tzanis et.al. | 2510.17004 | null |
| 2025-10-19 | EEschematic: Multimodal-LLM Based AI Agent for Schematic Generation of Analog Circuit | Chang Liu et.al. | 2510.17002 | null |
| 2025-10-19 | STARK: Strategic Team of Agents for Refining Kernels | Juncheng Dong et.al. | 2510.16996 | null |
| 2025-10-19 | Towards Interpretable and Trustworthy Time Series Reasoning: A BlueSky Vision | Kanghui Ning et.al. | 2510.16980 | null |
| 2025-10-19 | Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents | Dheeraj Chintapalli et.al. | 2510.16978 | null |
| 2025-10-19 | Learning Ecology with VERA Using Conceptual Models and Simulations | Spencer Rugaber et.al. | 2510.16944 | null |
| 2025-10-19 | VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents | Kangrui Wang et.al. | 2510.16907 | null |
| 2025-10-19 | Agentic Inequality | Matthew Sharp et.al. | 2510.16853 | null |
| 2025-10-19 | FinSight: Towards Real-World Financial Deep Research | Jiajie Jin et.al. | 2510.16844 | null |
| 2025-10-19 | More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents | Pengfei Gao et.al. | 2510.16786 | null |
| 2025-10-19 | Beyond Pipelines: A Survey of the Paradigm Shift toward Model-Native Agentic AI | Jitao Sang et.al. | 2510.16720 | null |
| 2025-10-19 | An Agentic Framework with LLMs for Solving Complex Vehicle Routing Problems | Ni Zhang et.al. | 2510.16701 | null |
| 2025-10-19 | Pursuing Minimal Sufficiency in Spatial Reasoning | Yejie Guo et.al. | 2510.16688 | null |
| 2025-10-19 | Agentic Design of Compositional Machines | Wenqian Zhang et.al. | 2510.14980 | null |
| 2025-10-18 | Unleashing Diverse Thinking Modes in LLMs through Multi-Agent Collaboration | Zhixuan He et.al. | 2510.16645 | null |
| 2025-10-18 | Prompt Optimization via Retrieved Reasoning Assets and Multi-Agent Analysis | Wonduk Seo et.al. | 2510.16635 | null |
| 2025-10-18 | Prior Makes It Possible: From Sublinear Graph Algorithms to LLM Test-Time Methods | Avrim Blum et.al. | 2510.16609 | null |
| 2025-10-18 | Ripple Effect Protocol: Coordinating Agent Populations | Ayush Chopra et.al. | 2510.16572 | null |
| 2025-10-18 | BuildArena: A Physics-Aligned Interactive Benchmark of LLMs for Engineering Construction | Tian Xia et.al. | 2510.16559 | null |
| 2025-10-18 | Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety | Vamshi Krishna Bonagiri et.al. | 2510.16492 | null |
| 2025-10-18 | REALM: An MLLM-Agent Framework for Open World 3D Reasoning Segmentation and Editing on Gaussian Splatting | Changyue Shi et.al. | 2510.16410 | null |
| 2025-10-18 | ATA: A Neuro-Symbolic Approach to Implement Autonomous and Trustworthy Agents | David Peer et.al. | 2510.16381 | null |
| 2025-10-18 | Synergizing chemical and AI communities for advancing laboratories of the future | Saejin Oh et.al. | 2510.16293 | null |
| 2025-10-17 | Outraged AI: Large language models prioritise emotion over cost in fairness enforcement | Hao Liu et.al. | 2510.17880 | null |
| 2025-10-17 | WEBSERV: A Browser-Server Environment for Efficient Training of Reinforcement Learning-based Web Agents at Scale | Yuxuan Lu et.al. | 2510.16252 | null |
| 2025-10-17 | Towards Automatic Evaluation and Selection of PHI De-identification Models via Multi-Agent Collaboration | Guanchen Wu et.al. | 2510.16194 | null |
| 2025-10-17 | Agentic AI for Ultra-Modern Networks: Multi-Agent Framework for RAN Autonomy and Assurance | Sukhdeep Singh et.al. | 2510.16144 | null |
| 2025-10-17 | Narrowing Action Choices with AI Improves Human Sequential Decisions | Eleni Straitouri et.al. | 2510.16097 | null |
| 2025-10-17 | TriAgent: Automated Biomarker Discovery with Deep Research Grounding for Triage in Acute Care by LLM-Based Multi-Agent Collaboration | Kerem Delikoyun et.al. | 2510.16080 | null |
| 2025-10-17 | EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle | Rong Wu et.al. | 2510.16079 | null |
| 2025-10-17 | SIADAFIX: issue description response for adaptive program repair | Xin Cao et.al. | 2510.16059 | null |
| 2025-10-17 | PolySkill: Learning Generalizable Skills Through Polymorphic Abstraction | Simon Yu et.al. | 2510.15863 | null |
| 2025-10-17 | Self-evolving expertise in complex non-verifiable subject domains: dialogue as implicit meta-RL | Richard M. Bailey et.al. | 2510.15772 | null |
| 2025-10-17 | AURA: An Agent Autonomy Risk Assessment Framework | Lorenzo Satta Chiris et.al. | 2510.15739 | null |
| 2025-10-17 | Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation | Ed Li et.al. | 2510.15624 | null |
| 2025-10-17 | The Spark Effect: On Engineering Creative Diversity in Multi-Agent AI Systems | Alexander Doudkin et.al. | 2510.15568 | null |
| 2025-10-17 | MARS: Reinforcing Multi-Agent Reasoning of LLMs through Self-Play in Strategic Games | Huining Yuan et.al. | 2510.15414 | null |
| 2025-10-17 | SHARE: Scene-Human Aligned Reconstruction | Joshua Li et.al. | 2510.15342 | null |
| 2025-10-17 | VERA-MH Concept Paper | Luca Belli et.al. | 2510.15297 | null |
| 2025-10-17 | Exemplar-Guided Planing: Enhanced LLM Agent for KGQA | Jingao Xu et.al. | 2510.15283 | null |
| 2025-10-17 | Experience-Driven Exploration for Efficient API-Free AI Agents | Chenwei Tang et.al. | 2510.15259 | null |
| 2025-10-17 | Multi-dimensional Data Analysis and Applications Basing on LLM Agents and Knowledge Graph Interactions | Xi Wang et.al. | 2510.15258 | null |
| 2025-10-17 | Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding | Sensen Gao et.al. | 2510.15253 | null |
| 2025-10-17 | Where to Search: Measure the Prior-Structured Search Space of LLM Agents | Zhuo-Yang Song et.al. | 2510.14846 | null |
| 2025-10-16 | GUIrilla: A Scalable Framework for Automated Desktop UI Exploration | Sofiya Garkot et.al. | 2510.16051 | null |
| 2025-10-16 | MAGPIE: A benchmark for Multi-AGent contextual PrIvacy Evaluation | Gurusha Juneja et.al. | 2510.15186 | null |
| 2025-10-16 | Internalizing World Models via Self-Play Finetuning for Agentic RL | Shiqi Chen et.al. | 2510.15047 | null |
| 2025-10-16 | Generalized Dynamics Generation towards Scannable Physical World Model | Yichen Li et.al. | 2510.15041 | null |
| 2025-10-16 | UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos | Mingxuan Liu et.al. | 2510.15018 | null |
| 2025-10-16 | Data-driven Calibration Sample Selection and Forecast Combination in Electricity Price Forecasting: An Application of the ARHNN Method | Tomasz Serafin et.al. | 2510.15011 | null |
| 2025-10-16 | Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents | Guoqing Wang et.al. | 2510.14967 | null |
| 2025-10-16 | VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation | Han Zhao et.al. | 2510.14902 | null |
| 2025-10-16 | The Gatekeeper Knows Enough | Fikresilase Wondmeneh Abebayew et.al. | 2510.14881 | null |
| 2025-10-16 | LabOS: The AI-XR Co-Scientist That Sees and Works With Humans | Le Cong et.al. | 2510.14861 | null |
| 2025-10-16 | RoboGPT-R1: Enhancing Robot Planning with Reinforcement Learning | Jinrui Liu et.al. | 2510.14828 | null |
| 2025-10-16 | To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models | Eran Malach et.al. | 2510.14826 | null |
| 2025-10-16 | ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling | Jianghao Lin et.al. | 2510.14703 | null |
| 2025-10-16 | LLM Agents for Automated Web Vulnerability Reproduction: Are We There Yet? | Bin Liu et.al. | 2510.14700 | null |
| 2025-10-16 | LLM Agents Beyond Utility: An Open-Ended Perspective | Asen Nachkov et.al. | 2510.14548 | null |
| 2025-10-16 | Agentic Entropy-Balanced Policy Optimization | Guanting Dong et.al. | 2510.14545 | null |
| 2025-10-16 | Helmsman: Autonomous Synthesis of Federated Learning Systems via Multi-Agent Collaboration | Haoyuan Li et.al. | 2510.14512 | null |
| 2025-10-16 | LiRA: Linguistic Robust Anchoring for Cross-lingual Large Language Models | Haolin Li et.al. | 2510.14466 | null |
| 2025-10-16 | Towards Automated Governance: A DSL for Human-Agent Collaboration in Software Projects | Adem Ait et.al. | 2510.14465 | null |
| 2025-10-16 | Why Instant-Runoff Voting Is So Resilient to Coalitional Manipulation: Phase Transitions in the Perturbed Culture | François Durand et.al. | 2510.14450 | null |
| 2025-10-16 | Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents | Rui Wang et.al. | 2510.14438 | null |
| 2025-10-16 | Bounds and asymptotic expansions for the radii of convexity and uniform convexity of normalized Bessel functions | Árpád Baricz et.al. | 2510.14323 | null |
| 2025-10-16 | Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies | Mason Nakamura et.al. | 2510.14312 | null |
| 2025-10-16 | ReUseIt: Synthesizing Reusable AI Agent Workflows for Web Automation | Yimeng Liu et.al. | 2510.14308 | null |
| 2025-10-16 | AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading | Zheye Deng et.al. | 2510.14264 | null |
| 2025-10-16 | MAFA: A Multi-Agent Framework for Enterprise-Scale Annotation with Configurable Task Adaptation | Mahmood Hegazy et.al. | 2510.14184 | null |
| 2025-10-16 | Training LLM Agents to Empower Humans | Evan Ellis et.al. | 2510.13709 | null |
| 2025-10-16 | OpenDerisk: An Industrial Framework for AI-Driven SRE, with Design, Implementation, and Case Studies | Peng Di et.al. | 2510.13561 | null |
| 2025-10-16 | SVAG-Bench: A Large-Scale Benchmark for Multi-Instance Spatio-temporal Video Action Grounding | Tanveer Hannan et.al. | 2510.13016 | null |
| 2025-10-16 | Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics | Marco Del Tredici et.al. | 2510.12787 | null |
| 2025-10-16 | Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning | Xingang Guo et.al. | 2510.12712 | null |
| 2025-10-16 | MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites | Zhenxin Lei et.al. | 2510.12126 | null |
| 2025-10-15 | When “Correct” Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents? | Yibo Peng et.al. | 2510.17862 | null |
| 2025-10-15 | CiteGuard: Faithful Citation Attribution for LLMs via Retrieval-Augmented Validation | Yee Man Choi et.al. | 2510.17853 | null |
| 2025-10-15 | CodeEvolve: An open source evolutionary coding agent for algorithm discovery and optimization | Henrique Assumpção et.al. | 2510.14150 | null |
| 2025-10-15 | Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems | Edoardo Allegrini et.al. | 2510.14133 | null |
| 2025-10-15 | Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving | Nikos Pagonas et.al. | 2510.14126 | null |
| 2025-10-15 | STEMS: Spatial-Temporal Enhanced Safe Multi-Agent Coordination for Building Energy Management | Huiliang Zhang et.al. | 2510.14112 | null |
| 2025-10-15 | Three-Dimensional Simulation of the University of Hawai`i FEL Oscillator: Superradiant Emission and Cavity Desynchronization | Amir Weinberg et.al. | 2510.14061 | null |
| 2025-10-15 | Sequential Quantum Measurements and the Instrumental Group Algebra | Christopher S. Jackson et.al. | 2510.13980 | null |
| 2025-10-15 | An LLM-Powered AI Agent Framework for Holistic IoT Traffic Interpretation | Daniel Adu Worae et.al. | 2510.13925 | null |
| 2025-10-15 | FACTS: Table Summarization via Offline Template Generation with Agentic Workflows | Ye Yuan et.al. | 2510.13920 | null |
| 2025-10-15 | Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms | Shrey Pandit et.al. | 2510.13913 | null |
| 2025-10-15 | RECODE: Reasoning Through Code Generation for Visual Question Answering | Junhong Shen et.al. | 2510.13756 | null |
| 2025-10-15 | From Refusal to Recovery: A Control-Theoretic Approach to Generative AI Guardrails | Ravi Pandya et.al. | 2510.13727 | null |
| 2025-10-15 | Steer-MoE: Efficient Audio-Language Alignment with a Mixture-of-Experts Steering Module | Ruitao Feng et.al. | 2510.13558 | null |
| 2025-10-15 | Tandem Training for Language Models | Robert West et.al. | 2510.13551 | null |
| 2025-10-15 | In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers | Avihay Cohen et.al. | 2510.13543 | null |
| 2025-10-15 | MADREC: A Multi-Aspect Driven LLM Agent for Explainable and Adaptive Recommendation | Jiin Park et.al. | 2510.13371 | null |
| 2025-10-15 | Higher Satisfaction, Lower Cost: A Technical Report on How LLMs Revolutionize Meituan’s Intelligent Interaction Systems | Xuxin Cheng et.al. | 2510.13291 | null |
| 2025-10-15 | Automated Network Protocol Testing with LLM Agents | Yunze Wei et.al. | 2510.13248 | null |
| 2025-10-15 | EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems | Yufei He et.al. | 2510.13220 | null |
| 2025-10-15 | Addressing the alignment problem in transportation policy making: an LLM approach | Xiaoyu Yan et.al. | 2510.13139 | null |
| 2025-10-14 | Using Kolmogorov-Smirnov Distance for Measuring Distribution Shift in Machine Learning | Ozan K. Tonguz et.al. | 2510.15996 | null |
| 2025-10-14 | MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents | Dongsen Zhang et.al. | 2510.15994 | null |
| 2025-10-14 | Benefits and Limitations of Communication in Multi-Agent Reasoning | Michael Rizvi-Martel et.al. | 2510.13903 | null |
| 2025-10-14 | GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents | Xi Yu et.al. | 2510.13896 | null |
| 2025-10-14 | MultiFoodhat: A potential new paradigm for intelligent food quality inspection | Yue Hu et.al. | 2510.13889 | null |
| 2025-10-14 | Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments | Crystal Qian et.al. | 2510.13011 | null |
| 2025-10-14 | SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents | Simon Sinong Zhan et.al. | 2510.12985 | null |
| 2025-10-14 | From Literal to Liberal: A Meta-Prompting Framework for Eliciting Human-Aligned Exception Handling in Large Language Models | Imran Khan et.al. | 2510.12864 | null |
| 2025-10-14 | Three Lenses on the AI Revolution: Risk, Transformation, Continuity | Masoud Makrehchi et.al. | 2510.12859 | null |
| 2025-10-14 | VQArt-Bench: A semantically rich VQA Benchmark for Art and Cultural Heritage | A. Alfarano et.al. | 2510.12750 | null |
| 2025-10-14 | SPORTS: Simultaneous Panoptic Odometry, Rendering, Tracking and Segmentation for Urban Scenes Understanding | Zhiliu Yang et.al. | 2510.12749 | null |
| 2025-10-14 | Multi-Agent Debate for LLM Judges with Adaptive Stability Detection | Tianyu Hu et.al. | 2510.12697 | null |
| 2025-10-14 | ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning | Hanyang Chen et.al. | 2510.12693 | null |
| 2025-10-14 | Designing Tools with Control Confidence | Ajith Anil Meera et.al. | 2510.12630 | null |
| 2025-10-14 | A Survey of Vibe Coding with Large Language Models | Yuyao Ge et.al. | 2510.12399 | null |
| 2025-10-14 | GOAT: A Training Framework for Goal-Oriented Agent with Tools | Hyunji Min et.al. | 2510.12218 | null |
| 2025-10-14 | Agent-Based Simulation of a Financial Market with Large Language Models | Ryuji Hashimoto et.al. | 2510.12189 | null |
| 2025-10-14 | IL3D: A Large-Scale Indoor Layout Dataset for LLM-Driven 3D Scene Generation | Wenxu Zhou et.al. | 2510.12095 | null |
| 2025-10-14 | ToPolyAgent: AI Agents for Coarse-Grained Topological Polymer Simulations | Lijie Ding et.al. | 2510.12091 | null |
| 2025-10-14 | Evaluating the Quality of Randomness and Entropy in Tasks Supported by Large Language Models | Rabimba Karanjai et.al. | 2510.12080 | null |
| 2025-10-14 | EmboMatrix: A Scalable Training-Ground for Embodied Decision-Making | Zixing Lei et.al. | 2510.12072 | null |
| 2025-10-14 | AI Agents as Universal Task Solvers | Alessandro Achille et.al. | 2510.12066 | null |
| 2025-10-14 | Empowering LLM Agents with Geospatial Awareness: Toward Grounded Reasoning for Wildfire Response | Yiheng Chen et.al. | 2510.12061 | null |
| 2025-10-14 | On the Number of Small Points for Rational Maps | Jit Wu Yap et.al. | 2510.12039 | null |
| 2025-10-14 | ManiAgent: An Agentic Framework for General Robotic Manipulation | Yi Yang et.al. | 2510.11660 | null |
| 2025-10-14 | Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs | Yujie Zhao et.al. | 2510.11062 | null |
| 2025-10-13 | Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation | Sayash Kapoor et.al. | 2510.11977 | null |
| 2025-10-13 | Scaling Long-Horizon LLM Agent via Context-Folding | Weiwei Sun et.al. | 2510.11967 | null |
| 2025-10-13 | DMAS-Forge: A Framework for Transparent Deployment of AI Applications as Distributed Systems | Alessandro Cornacchia et.al. | 2510.11872 | null |
| 2025-10-13 | Demystifying Reinforcement Learning in Agentic Reasoning | Zhaochen Yu et.al. | 2510.11701 | null |
| 2025-10-13 | When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents | Lingfei Qian et.al. | 2510.11695 | null |
| 2025-10-13 | Chronologically Consistent Generative AI | Songrun He et.al. | 2510.11677 | null |
| 2025-10-13 | FinVet: A Collaborative Framework of RAG and External Fact-Checking Agents for Financial Misinformation Detection | Daniel Berhane Araya et.al. | 2510.11654 | null |
| 2025-10-13 | Analyzing and Internalizing Complex Policy Documents for LLM Agents | Jiateng Liu et.al. | 2510.11588 | null |
| 2025-10-13 | Uncertainty-Aware, Risk-Adaptive Access Control for Agentic Systems using an LLM-Judged TBAC Model | Charles Fleming et.al. | 2510.11414 | null |
| 2025-10-13 | DocReward: A Document Reward Model for Structuring and Stylizing | Junpeng Liu et.al. | 2510.11391 | null |
| 2025-10-13 | Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics | Sheng Jin et.al. | 2510.11290 | null |
| 2025-10-13 | PADME: Procedure Aware DynaMic Execution | Deepeka Garg et.al. | 2510.11281 | null |
| 2025-10-13 | A Large-Language-Model Assisted Automated Scale Bar Detection and Extraction Framework for Scanning Electron Microscopic Images | Yuxuan Chen et.al. | 2510.11260 | null |
| 2025-10-13 | Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems | Pengyu Zhu et.al. | 2510.11246 | null |
| 2025-10-13 | Attacks by Content: Automated Fact-checking is an AI Security Issue | Michael Schlichtkrull et.al. | 2510.11238 | null |
| 2025-10-13 | WebRouter: Query-specific Router via Variational Information Bottleneck for Cost-sensitive Web Agent | Tao Li et.al. | 2510.11221 | null |
| 2025-10-13 | Can Tool-Integrated Reinforcement Learning Generalize Across Diverse Domains? | Zhengyu Chen et.al. | 2510.11184 | null |
| 2025-10-13 | $How^{2}$: How to learn from procedural How-to questions | Gautier Dagan et.al. | 2510.11144 | null |
| 2025-10-13 | video-SALMONN S: Streaming Audio-Visual LLMs Beyond Length Limits via Memory | Guangzhi Sun et.al. | 2510.11129 | null |
| 2025-10-13 | SusBench: An Online Benchmark for Evaluating Dark Pattern Susceptibility of Computer-Use Agents | Longjie Guo et.al. | 2510.11035 | null |
| 2025-10-13 | A Survey on Agentic Multimodal Large Language Models | Huanjin Yao et.al. | 2510.10991 | null |
| 2025-10-13 | Rethinking Reward Miscalibration of GRPO in Agentic RL | Jingyu Liu et.al. | 2509.23870 | null |
| 2025-10-13 | EvoEmo: Towards Evolved Emotional Policies for Adversarial LLM Agents in Multi-Turn Price Negotiation | Yunbo Long et.al. | 2509.04310 | null |
| 2025-10-12 | Zero-Shot Large Language Model Agents for Fully Automated Radiotherapy Treatment Planning | Dongrong Yang et.al. | 2510.11754 | null |
| 2025-10-12 | GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search | Heng Zhang et.al. | 2510.10581 | null |
| 2025-10-12 | MedCoAct: Confidence-Aware Multi-Agent Collaboration for Complete Clinical Decision | Hongjie Zheng et.al. | 2510.10461 | null |
| 2025-10-12 | Retro*: Optimizing LLMs for Reasoning-Intensive Document Retrieval | Junwei Lan et.al. | 2509.24869 | null |
| 2025-10-12 | Talk Less, Call Right: Enhancing Role-Play LLM Agents with Automatic Prompt Optimization and Role Prompting | Saksorn Ruangtanusak et.al. | 2509.00482 | null |
| 2025-10-11 | KG-MAS: Knowledge Graph-Enhanced Multi-Agent Infrastructure for coupling physical and digital robotic environments | Walid Abdela et.al. | 2510.10325 | null |
| 2025-10-11 | Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models | Christopher Chiu et.al. | 2510.10278 | null |
| 2025-10-11 | Don’t Just Fine-tune the Agent, Tune the Environment | Siyuan Lu et.al. | 2510.10197 | null |
| 2025-10-11 | ALLOY: Generating Reusable Agent Workflows from User Demonstration | Jiawen Li et.al. | 2510.10049 | null |
| 2025-10-11 | SwarmSys: Decentralized Swarm-Inspired Agents for Scalable and Adaptive Reasoning | Ruohao Li et.al. | 2510.10047 | null |
| 2025-10-11 | Leveraging Large Language Models for Cybersecurity Risk Assessment – A Case from Forestry Cyber-Physical Systems | Fikret Mert Gultekin et.al. | 2510.06343 | null |
| 2025-10-11 | Tree Search for LLM Agent Reinforcement Learning | Yuxiang Ji et.al. | 2509.21240 | null |
| 2025-10-11 | ASTREA: Introducing Agentic Intelligence for Orbital Thermal Autonomy | Alejandro D. Mousist et.al. | 2509.13380 | null |
| 2025-10-10 | Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics | Lianhao Zhou et.al. | 2510.09901 | null |
| 2025-10-10 | How can we assess human-agent interactions? Case studies in software agent design | Valerie Chen et.al. | 2510.09801 | null |
| 2025-10-10 | Building a Foundational Guardrail for General Agentic Systems via Synthetic Data | Yue Huang et.al. | 2510.09781 | null |
| 2025-10-10 | Preference-Aware Memory Update for Long-Term LLM Agents | Haoran Sun et.al. | 2510.09720 | null |
| 2025-10-10 | StreamingVLM: Real-Time Understanding for Infinite Video Streams | Ruyi Xu et.al. | 2510.09608 | null |
| 2025-10-10 | Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols | Mikhail Terekhov et.al. | 2510.09462 | null |
| 2025-10-10 | Safety Game: Balancing Safe and Informative Conversations with Blackbox Agentic AI using LP Solvers | Tuan Nguyen et.al. | 2510.09330 | null |
| 2025-10-10 | Fundamentals of Building Autonomous LLM Agents | Victor de Lamo Castrillo et.al. | 2510.09244 | null |
| 2025-10-10 | Leading the Follower: Learning Persuasive Agents in Social Deduction Games | Zhang Zheng et.al. | 2510.09087 | null |
| 2025-10-10 | When LLM Agents Meet Graph Optimization: An Automated Data Quality Improvement Approach | Zhihan Zhang et.al. | 2510.08952 | null |
| 2025-10-10 | Reimagining Agent-based Modeling with Large Language Model Agents via Shachi | So Kuroki et.al. | 2509.21862 | null |
| 2025-10-09 | CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization | Debeshee Das et.al. | 2510.08829 | null |
| 2025-10-09 | COMPASS: Enhancing Agent Long-Horizon Reasoning with Evolving Context | Guangya Wan et.al. | 2510.08790 | null |
| 2025-10-09 | Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools | Ha Min Son et.al. | 2510.08640 | null |
| 2025-10-09 | CaRT: Teaching LLM Agents to Know When They Know Enough | Grace Liu et.al. | 2510.08517 | null |
| 2025-10-09 | Opponent Shaping in LLM Agents | Marta Emili Garcia Segura et.al. | 2510.08255 | null |
| 2025-10-09 | Simulating Teams with LLM Agents: Interactive 2D Environments for Studying Human-AI Dynamics | Mohammed Almutairi et.al. | 2510.08242 | null |
| 2025-10-09 | Training-Free Group Relative Policy Optimization | Yuzheng Cai et.al. | 2510.08191 | null |
| 2025-10-09 | AutoQual: An LLM Agent for Automated Discovery of Interpretable Features for Review Quality Assessment | Xiaochong Lan et.al. | 2510.08081 | null |
| 2025-10-09 | Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks | Cheng Yang et.al. | 2510.08002 | null |
| 2025-10-09 | Team Xiaomi EV-AD VLA: Learning to Navigate Socially Through Proactive Risk Perception – Technical Report for IROS 2025 RoboSense Challenge Social Navigation Track | Erjia Xiao et.al. | 2510.07871 | null |
| 2025-10-09 | Self-Improving LLM Agents at Test-Time | Emre Can Acikgoz et.al. | 2510.07841 | null |
| 2025-10-09 | Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models | Eric Hanchen Jiang et.al. | 2510.07799 | null |
| 2025-10-09 | Neuro-Symbolic Agents with Modal Logic for Autonomous Diagnostics | Antonin Sulc et.al. | 2509.11943 | null |
| 2025-10-08 | PARSE: LLM Driven Schema Optimization for Reliable Entity Extraction | Anubhav Shrimal et.al. | 2510.08623 | null |
| 2025-10-08 | L2M-AID: Autonomous Cyber-Physical Defense by Fusing Semantic Reasoning of Large Language Models with Multi-Agent Reinforcement Learning (Preprint) | Tianxiang Xu et.al. | 2510.07363 | null |
| 2025-10-08 | LAD-RAG: Layout-aware Dynamic RAG for Visually-Rich Document Understanding | Zhivar Sourati et.al. | 2510.07233 | null |
| 2025-10-08 | Customer-R1: Personalized Simulation of Human Behaviors via RL-based LLM Agent in Online Shopping | Ziyi Wang et.al. | 2510.07230 | null |
| 2025-10-08 | Exposing LLM User Privacy via Traffic Fingerprint Analysis: A Study of Privacy Risks in LLM Agent Interactions | Yixiang Zhang et.al. | 2510.07176 | null |
| 2025-10-08 | NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents | Tianshi Zheng et.al. | 2510.07172 | null |
| 2025-10-08 | Prompt Optimization Across Multiple Agents for Representing Diverse Human Populations | Manh Hung Nguyen et.al. | 2510.07064 | null |
| 2025-10-08 | COMPASS: A Multi-Turn Benchmark for Tool-Mediated Planning & Preference Optimization | Tian Qin et.al. | 2510.07043 | null |
| 2025-10-08 | LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling | Zecheng Tang et.al. | 2510.06915 | null |
| 2025-10-08 | When Machines Meet Each Other: Network Effects and the Strategic Role of History in Multi-Agent AI | Yu Liu et.al. | 2510.06903 | null |
| 2025-10-08 | SID: Multi-LLM Debate Driven by Self Signals | Xuhang Chen et.al. | 2510.06843 | null |
| 2025-10-08 | Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management | Miao Lu et.al. | 2510.06727 | null |
| 2025-10-08 | WebDART: Dynamic Decomposition and Re-planning for Complex Web Tasks | Jingbo Yang et.al. | 2510.06587 | null |
| 2025-10-08 | Spiral of Silence in Large Language Model Agents | Mingze Zhong et.al. | 2510.02360 | null |
| 2025-10-08 | Toward Causal-Visual Programming: Enhancing Agentic Reasoning in Low-Code Environments | Jiexi Xu et.al. | 2509.25282 | null |
| 2025-10-07 | A Survey on Agentic Security: Applications, Threats and Defenses | Asif Shahriar et.al. | 2510.06445 | null |
| 2025-10-07 | Stratified GRPO: Handling Structural Heterogeneity in Reinforcement Learning of LLM Search Agents | Mingkang Zhu et.al. | 2510.06214 | null |
| 2025-10-07 | RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback | Chunyu Miao et.al. | 2510.06186 | null |
| 2025-10-07 | LLMs as Policy-Agnostic Teammates: A Case Study in Human Proxy Design for Heterogeneous Agent Teams | Aju Ani Justus et.al. | 2510.06151 | null |
| 2025-10-07 | Constraint-Aware Route Recommendation from Natural Language via Hierarchical LLM Agents | Tao Zhe et.al. | 2510.06078 | null |
| 2025-10-07 | Training-Free Time Series Classification via In-Context Reasoning with LLM Agents | Songyuan Sui et.al. | 2510.05950 | null |
| 2025-10-07 | EARL: Efficient Agentic Reinforcement Learning Systems for Large Language Models | Zheyue Tan et.al. | 2510.05943 | null |
| 2025-10-07 | LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection | Mohamed Bal-Ghaoui et.al. | 2510.05935 | null |
| 2025-10-07 | Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches | Hachem Madmoun et.al. | 2510.05748 | null |
| 2025-10-07 | AutoPentester: An LLM Agent-based Framework for Automated Pentesting | Yasod Ginige et.al. | 2510.05605 | null |
| 2025-10-07 | AgentDR Dynamic Recommendation with Implicit Item-Item Relations via LLM-based Agents | Mingdai Yang et.al. | 2510.05598 | null |
| 2025-10-07 | From Agentification to Self-Evolving Agentic AI for Wireless Networks: Concepts, Approaches, and Future Research Directions | Changyuan Zhao et.al. | 2510.05596 | null |
| 2025-10-07 | BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks | Sagnik Anupam et.al. | 2510.02418 | null |
| 2025-10-06 | Adversarial Reinforcement Learning for Large Language Model Agent Safety | Zizhao Wang et.al. | 2510.05442 | null |
| 2025-10-06 | A Lightweight Large Language Model-Based Multi-Agent System for 2D Frame Structural Analysis | Ziheng Geng et.al. | 2510.05414 | null |
| 2025-10-06 | Plug-and-Play Dramaturge: A Divide-and-Conquer Approach for Iterative Narrative Script Refinement via Collaborative LLM Agents | Wenda Xie et.al. | 2510.05188 | null |
| 2025-10-06 | RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection | Yuxin Wen et.al. | 2510.04885 | null |
| 2025-10-06 | Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails | Siwei Han et.al. | 2510.04860 | null |
| 2025-10-06 | Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents | Yiding Wang et.al. | 2510.04695 | null |
| 2025-10-06 | Multi-Agent Tool-Integrated Policy Optimization | Zhanfeng Mo et.al. | 2510.04678 | null |
| 2025-10-06 | Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents | Zeyi Zhang et.al. | 2510.04637 | null |
| 2025-10-06 | Autonomy Matters: A Study on Personalization-Privacy Dilemma in LLM Agents | Zhiping Zhang et.al. | 2510.04465 | null |
| 2025-10-06 | Beyond Manuals and Tasks: Instance-Level Context Learning for LLM Agents | Kuntai Cai et.al. | 2510.02369 | null |
| 2025-10-05 | Internal World Models as Imagination Networks in Cognitive Agents | Saurabh Ranjan et.al. | 2510.04391 | null |
| 2025-10-05 | Just-in-time Episodic Feedback Hinter: Leveraging Offline Knowledge to Improve LLM Agents Adaptation | Hadi Nekoei et.al. | 2510.04373 | null |
| 2025-10-05 | Closing the Loop: Coordinating Inventory and Recommendation via Deep Reinforcement Learning on Multiple Timescales | Jinyang Jiang et.al. | 2510.04272 | null |
| 2025-10-05 | AgentRL: Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework | Hanchen Zhang et.al. | 2510.04206 | null |
| 2025-10-05 | Constructing coherent spatial memory in LLM agents through graph rectification | Puzhen Zhang et.al. | 2510.04195 | null |
| 2025-10-05 | From Shadow to Light: Toward Safe and Efficient Policy Learning Across MPC, DeePC, RL, and LLM Agents | Amin Vahidi-Moghaddam et.al. | 2510.04076 | null |
| 2025-10-04 | Adversarial Agent Collaboration for C to Rust Translation | Tianyu Li et.al. | 2510.03879 | null |
| 2025-10-04 | InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents | Yaxin Du et.al. | 2510.02271 | null |
| 2025-10-04 | Extracting Conceptual Knowledge to Locate Software Issues | Ying Wang et.al. | 2509.21427 | null |
| 2025-10-03 | VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation | Lesly Miculicich et.al. | 2510.05156 | null |
| 2025-10-03 | LLM Agents for Automated Dependency Upgrades | Vali Tawosi et.al. | 2510.03480 | null |
| 2025-10-03 | ALMAS: an Autonomous LLM-based Multi-Agent Software Engineering Framework | Vali Tawosi et.al. | 2510.03463 | null |
| 2025-10-03 | Improving GUI Grounding with Explicit Position-to-Coordinate Mapping | Suyuchen Wang et.al. | 2510.03230 | null |
| 2025-10-03 | CoDA: Agentic Systems for Collaborative Data Visualization | Zichen Chen et.al. | 2510.03194 | null |
| 2025-10-03 | AudioToolAgent: An Agentic Framework for Audio-Language Models | Gijs Wijngaard et.al. | 2510.02995 | null |
| 2025-10-03 | Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents | Wonjoong Kim et.al. | 2510.02837 | null |
| 2025-10-02 | AgentCaster: Reasoning-Guided Tornado Forecasting | Michael Chen et.al. | 2510.03349 | null |
| 2025-10-02 | Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge | Charlie Masters et.al. | 2510.02557 | null |
| 2025-10-02 | StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? | Yanxu Chen et.al. | 2510.02209 | null |
| 2025-10-02 | TACOS: Task Agnostic COordinator of a multi-drone System | Alessandro Nazzari et.al. | 2510.01869 | null |
| 2025-10-02 | Pre-Hoc Predictions in AutoML: Leveraging LLMs to Enhance Model Selection and Benchmarking for Tabular datasets | Yannis Belkhiter et.al. | 2510.01842 | null |
| 2025-10-02 | GuruAgents: Emulating Wise Investors with Prompt-Guided LLM Agents | Yejin Kim et.al. | 2510.01664 | null |
| 2025-10-02 | SoK: Measuring What Matters for Closed-Loop Security Agents | Mudita Khurana et.al. | 2510.01654 | null |
| 2025-10-02 | Position: Privacy Is Not Just Memorization! | Niloofar Mireshghallah et.al. | 2510.01645 | null |
| 2025-10-02 | GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments | Hanlin Zhu et.al. | 2509.21998 | null |
| 2025-10-02 | Gala: Global LLM Agents for Text-to-Model Translation | Junyang Cai et.al. | 2509.08970 | null |
| 2025-10-01 | Automating Data-Driven Modeling and Analysis for Engineering Applications using Large Language Model Agents | Yang Liu et.al. | 2510.01398 | null |
| 2025-10-01 | Beyond Single LLMs: Enhanced Code Generation via Multi-Stage Performance-Guided LLM Orchestration | Huashan Chen et.al. | 2510.01379 | null |
| 2025-10-01 | Fine-tuning with RAG for Improving LLM Learning of New Skills | Humaid Ibrahim et.al. | 2510.01375 | null |
| 2025-10-01 | Breaking the Code: Security Assessment of AI Code Agents Through Systematic Jailbreaking Attacks | Shoumik Saha et.al. | 2510.01359 | null |
| 2025-10-01 | The Social Laboratory: A Psychometric Framework for Multi-Agent LLM Evaluation | Zarreen Reza et.al. | 2510.01295 | null |
| 2025-10-01 | TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments | Zhangchen Xu et.al. | 2510.01179 | null |
| 2025-10-01 | Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare | Zhengliang Shi et.al. | 2510.01164 | null |
| 2025-10-01 | A Practitioner’s Guide to Multi-turn Agentic Reinforcement Learning | Ruiyi Wang et.al. | 2510.01132 | null |
| 2025-10-01 | QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL | Cong Yu et.al. | 2510.00967 | null |
| 2025-10-01 | ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs | Adi Simhi et.al. | 2510.00857 | null |
| 2025-10-01 | ACON: Optimizing Context Compression for Long-horizon LLM Agents | Minki Kang et.al. | 2510.00615 | null |
| 2025-10-01 | JoyAgent-JDGenie: Technical Report on the GAIA | Jiarun Liu et.al. | 2510.00510 | null |
| 2025-10-01 | Seeing through Uncertainty: Robust Task-Oriented Optimization in Visual Navigation | Yiyuan Pan et.al. | 2510.00441 | null |
| 2025-10-01 | RELATE-Sim: Leveraging Turning Point Theory and LLM Agents to Predict and Understand Long-Term Relationship Dynamics through Interactive Narrative Simulations | Matthew Yue et.al. | 2510.00414 | null |
| 2025-10-01 | Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs | Siyu Zhu et.al. | 2509.25779 | null |
| 2025-10-01 | Automatically Generating Web Applications from Requirements Via Multi-Agent Test-Driven Development | Yuxuan Wan et.al. | 2509.25297 | null |
| 2025-10-01 | Beyond the Strongest LLM: Multi-Turn Multi-Agent Orchestration vs. Single LLMs on Benchmarks | Aaron Xuxiang Tian et.al. | 2509.23537 | null |
| 2025-10-01 | On the Soundness and Consistency of LLM Agents for Executing Test Cases Written in Natural Language | Sébastien Salva et.al. | 2509.19136 | null |
| 2025-10-01 | A Multi-Agent LLM Defense Pipeline Against Prompt Injection Attacks | S M Asif Hossain et.al. | 2509.14285 | null |
| 2025-09-30 | From Trace to Line: LLM Agent for Real-World OSS Vulnerability Localization | Haoran Xi et.al. | 2510.02389 | null |
| 2025-09-30 | CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage | Bowen Wei et.al. | 2510.00311 | null |
| 2025-09-30 | Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents | Zhen Yang et.al. | 2509.26539 | null |
| 2025-09-30 | VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications | Wei He et.al. | 2509.26490 | null |
| 2025-09-30 | ErrorPrism: Reconstructing Error Propagation Paths in Cloud Service Systems | Junsong Pu et.al. | 2509.26463 | null |
| 2025-09-30 | Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents | Shuai Shao et.al. | 2509.26354 | null |
| 2025-09-30 | LLM Agents for Knowledge Discovery in Atomic Layer Processing | Andreas Werbrouck et.al. | 2509.26201 | null |
| 2025-09-30 | RoRecomp: Enhancing Reasoning Efficiency via Rollout Response Recomposition in Reinforcement Learning | Gang Li et.al. | 2509.25958 | null |
| 2025-09-30 | Mem-α: Learning Memory Construction via Reinforcement Learning | Yu Wang et.al. | 2509.25911 | null |
| 2025-09-30 | SafeMind: Benchmarking and Mitigating Safety Risks in Embodied LLM Agents | Ruolin Chen et.al. | 2509.25885 | null |
| 2025-09-30 | Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs | Hankun Dai et.al. | 2509.25873 | null |
| 2025-09-30 | STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents | Jing-Jing Li et.al. | 2509.25624 | null |
| 2025-09-30 | MASLegalBench: Benchmarking Multi-Agent Systems in Deductive Legal Reasoning | Huihao Jing et.al. | 2509.24922 | null |
| 2025-09-30 | TENET: Leveraging Tests Beyond Validation for Code Generation | Yiran Hu et.al. | 2509.24148 | null |
| 2025-09-30 | Dual-Scale World Models for LLM Agents Towards Hard-Exploration Problems | Minsoo Kim et.al. | 2509.24116 | null |
| 2025-09-30 | InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios | Chenglin Yu et.al. | 2509.22502 | null |
| 2025-09-30 | Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents | Davide Paglieri et.al. | 2509.03581 | null |
| 2025-09-30 | Towards Agentic OS: An LLM Agent Framework for Linux Schedulers | Yusheng Zheng et.al. | 2509.01245 | null |
| 2025-09-29 | A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory | Qianshan Wei et.al. | 2510.02373 | null |
| 2025-09-29 | Causal Autoencoder-like Generation of Feedback Fuzzy Cognitive Maps with an LLM Agent | Akash Kumar Panda et.al. | 2509.25593 | null |
| 2025-09-29 | RadOnc-GPT: An Autonomous LLM Agent for Real-Time Patient Outcomes Labeling at Scale | Jason Holmes et.al. | 2509.25540 | null |
| 2025-09-29 | Where LLM Agents Fail and How They can Learn From Failures | Kunlun Zhu et.al. | 2509.25370 | null |
| 2025-09-29 | Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents | Boxuan Zhang et.al. | 2509.25302 | null |
| 2025-09-29 | PanoWorld-X: Generating Explorable Panoramic Worlds via Sphere-Aware Video Diffusion | Yuyang Yin et.al. | 2509.24997 | null |
| 2025-09-29 | When Greedy Wins: Emergent Exploitation Bias in Meta-Bandit LLM Training | Sanxing Chen et.al. | 2509.24923 | null |
| 2025-09-29 | MAS$^2$: Self-Generative, Self-Configuring, Self-Rectifying Multi-Agent Systems | Kun Wang et.al. | 2509.24323 | null |
| 2025-09-29 | SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents | Gyuhyeon Seo et.al. | 2509.24282 | null |
| 2025-09-28 | WAREX: Web Agent Reliability Evaluation on Existing Benchmarks | Su Kara et.al. | 2510.03285 | null |
| 2025-09-28 | Optimism as Risk-Seeking in Multi-Agent Reinforcement Learning | Runyu Zhang et.al. | 2509.24047 | null |
| 2025-09-28 | PartnerMAS: An LLM Hierarchical Multi-Agent Framework for Business Partner Selection on High-Dimensional Features | Lingyao Li et.al. | 2509.24046 | null |
| 2025-09-28 | LLM/Agent-as-Data-Analyst: A Survey | Zirui Tang et.al. | 2509.23988 | null |
| 2025-09-28 | Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation | Pengxiang Li et.al. | 2509.23866 | null |
| 2025-09-28 | AgentGuard: Runtime Verification of AI Agents | Roham Koohestani et.al. | 2509.23864 | null |
| 2025-09-28 | Mix-Ecom: Towards Mixed-Type E-Commerce Dialogues with Complex Domain Rules | Chenyu Zhou et.al. | 2509.23836 | null |
| 2025-09-28 | FedAgentBench: Towards Automating Real-world Federated Medical Image Analysis with Server-Client LLM Agents | Pramit Saha et.al. | 2509.23803 | null |
| 2025-09-28 | GUI-Shepherd: Reliable Process Reward and Verification for Long-Sequence GUI Tasks | Cong Chen et.al. | 2509.23738 | null |
| 2025-09-28 | Improving the Efficiency of LLM Agent Systems through Trajectory Reduction | Yuan-An Xiao et.al. | 2509.23586 | null |
| 2025-09-28 | Agentic Reinforcement Learning with Implicit Step Rewards | Xiaoqian Liu et.al. | 2509.19199 | null |
| 2025-09-27 | Memory Management and Contextual Consistency for Long-Running Low-Code Agents | Jiexi Xu et.al. | 2509.25250 | null |
| 2025-09-27 | BuildBench: Benchmarking LLM Agents on Compiling Real-World Open-Source Software | Zehua Zhang et.al. | 2509.25248 | null |
| 2025-09-27 | Situational Awareness for Safe and Robust Multi-Agent Interactions Under Uncertainty | Benjamin Alcorn et.al. | 2509.23425 | null |
| 2025-09-27 | “Shall We Dig Deeper?”: Designing and Evaluating Strategies for LLM Agents to Advance Knowledge Co-Construction in Asynchronous Online Discussions | Yuanhao Zhang et.al. | 2509.23327 | null |
| 2025-09-27 | Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents | Yaorui Shi et.al. | 2509.23040 | null |
| 2025-09-26 | Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents | Heyang Gao et.al. | 2510.03253 | null |
| 2025-09-26 | AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering | Ziqing Wang et.al. | 2510.02328 | null |
| 2025-09-26 | Infusing Theory of Mind into Socially Intelligent LLM Agents | EunJeong Hwang et.al. | 2509.22887 | null |
| 2025-09-26 | ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents | Hwan Chang et.al. | 2509.22830 | null |
| 2025-09-26 | EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning | Wujiang Xu et.al. | 2509.22576 | null |
| 2025-09-26 | The Emergence of Altruism in Large-Language-Model Agents Society | Haoyang Li et.al. | 2509.22537 | null |
| 2025-09-26 | Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Information-Seeking Agents | Jiaqi Shao et.al. | 2509.22391 | null |
| 2025-09-26 | Impact of Collective Behaviors of Autonomous Vehicles on Urban Traffic Dynamics: A Multi-Agent Reinforcement Learning Approach | Ahmet Onur Akman et.al. | 2509.22216 | null |
| 2025-09-26 | Leveraging LLM Agents for Automated Video Game Testing | Chengjia Wang et.al. | 2509.22170 | null |
| 2025-09-26 | CoBel-World: Harnessing LLM Reasoning to Build a Collaborative Belief World for Optimizing Embodied Multi-Agent Collaboration | Zhimin Wang et.al. | 2509.21981 | null |
| 2025-09-26 | What Makes LLM Agent Simulations Useful for Policy? Insights From an Iterative Design Engagement in Emergency Preparedness | Yuxuan Li et.al. | 2509.21868 | null |
| 2025-09-26 | UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios | Haotian Luo et.al. | 2509.21766 | null |
| 2025-09-26 | JudgeAgent: Knowledge-wise and Dynamic LLM Evaluation with Agent-as-Interviewer | Zhichao Shi et.al. | 2509.02097 | null |
| 2025-09-25 | LLM Agent Meets Agentic AI: Can LLM Agents Simulate Customers to Evaluate Agentic-AI-based Shopping Assistants? | Lu Sun et.al. | 2509.21501 | null |
| 2025-09-25 | What Do LLM Agents Do When Left Alone? Evidence of Spontaneous Meta-Cognitive Patterns | Stefan Szeider et.al. | 2509.21224 | null |
| 2025-09-25 | CORE: Full-Path Evaluation of LLM Agents Beyond Final State | Panagiotis Michelakis et.al. | 2509.20998 | null |
| 2025-09-25 | LIMI: Less is More for Agency | Yang Xiao et.al. | 2509.17567 | null |
| 2025-09-24 | EpidemIQs: Prompt-to-Paper LLM Agents for Epidemic Modeling and Analysis | Mohammad Hossein Samaei et.al. | 2510.00024 | null |
| 2025-09-24 | Blueprint-Bench: Comparing spatial intelligence of LLMs, agents and image models | Lukas Petersson et.al. | 2509.25229 | null |
| 2025-09-24 | LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet? | Rushil Gupta et.al. | 2509.21403 | null |
| 2025-09-24 | Training Task Reasoning LLM Agents for Multi-turn Task Planning via Single-turn Reinforcement Learning | Hanjiang Hu et.al. | 2509.20616 | null |
| 2025-09-24 | SAMULE: Self-Learning Agents Enhanced by Multi-level Reflection | Yubin Ge et.al. | 2509.20562 | null |
| 2025-09-24 | Perspectra: Choosing Your Experts Enhances Critical Thinking in Multi-Agent Research Ideation | Yiren Liu et.al. | 2509.20553 | null |
| 2025-09-24 | Agentic Metacognition: Designing a “Self-Aware” Low-Code Agent for Failure Prediction and Human Handoff | Jiexi Xu et.al. | 2509.19783 | null |
| 2025-09-23 | Structured Cognition for Behavioral Intelligence in Large Language Model Agents: Preliminary Study | Myung Ho Kim et.al. | 2510.05107 | null |
| 2025-09-23 | The Heterogeneous Multi-Agent Challenge | Charles Dansereau et.al. | 2509.19512 | null |
| 2025-09-23 | Simulating Online Social Media Conversations on Controversial Topics Using AI Agents Calibrated on Real-World Data | Elisa Composta et.al. | 2509.18985 | null |
| 2025-09-23 | MemOrb: A Plug-and-Play Verbal-Reinforcement Memory Layer for E-Commerce Customer Service | Yizhe Huang et.al. | 2509.18713 | null |
| 2025-09-23 | LCMF: Lightweight Cross-Modality Mambaformer for Embodied Robotics VQA | Zeyi Kang et.al. | 2509.18576 | null |
| 2025-09-23 | LLMZ+: Contextual Prompt Whitelist Principles for Agentic LLMs | Tom Pawelek et.al. | 2509.18557 | null |
| 2025-09-23 | LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology | Renan Souza et.al. | 2509.13978 | null |
| 2025-09-22 | ARK-V1: An LLM-Agent for Knowledge Graph Question Answering Requiring Commonsense Reasoning | Jan-Felix Klein et.al. | 2509.18063 | null |
| 2025-09-22 | Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent Collaboration | Bingsheng Yao et.al. | 2509.18008 | null |
| 2025-09-22 | MSCoRe: A Benchmark for Multi-Stage Collaborative Reasoning in LLM Agents | Yuzhen Lei et.al. | 2509.17628 | null |
| 2025-09-22 | Human vs. Agent in Task-Oriented Conversations | Zhefan Wang et.al. | 2509.17619 | null |
| 2025-09-22 | Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents | Shouju Wang et.al. | 2509.17488 | null |
| 2025-09-22 | Asteria: Semantic-Aware Cross-Region Caching for Agentic LLM Tool Access | Chaoyi Ruan et.al. | 2509.17360 | null |
| 2025-09-22 | UIPro: Unleashing Superior Interaction Capability For GUI Agents | Hongxin Li et.al. | 2509.17328 | null |
| 2025-09-22 | Generalizable End-to-End Tool-Use RL with Synthetic CodeGym | Weihua Du et.al. | 2509.17325 | null |
| 2025-09-21 | SignalLLM: A General-Purpose LLM Agent Framework for Automated Signal Processing | Junlong Ke et.al. | 2509.17197 | null |
| 2025-09-21 | LLMs as Layout Designers: A Spatial Reasoning Perspective | Sha Li et.al. | 2509.16891 | null |
| 2025-09-20 | Towards Transparent and Incentive-Compatible Collaboration in Decentralized LLM Multi-Agent Systems: A Blockchain-Driven Approach | Minfeng Qi et.al. | 2509.16736 | null |
| 2025-09-20 | OPEN-THEATRE: An Open-Source Toolkit for LLM-based Interactive Drama | Tianyang Xu et.al. | 2509.16713 | null |
| 2025-09-20 | Governed By Agents: A Survey On The Role Of Agentic AI In Future Computing Environments | Nauman Ali Murad et.al. | 2509.16676 | null |
| 2025-09-19 | Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans | Deuksin Kwon et.al. | 2509.16394 | null |
| 2025-09-19 | Overhearing LLM Agents: A Survey, Taxonomy, and Roadmap | Andrew Zhu et.al. | 2509.16325 | null |
| 2025-09-19 | Towards Robust Visual Continual Learning with Multi-Prototype Supervision | Xiwei Liu et.al. | 2509.16011 | null |
| 2025-09-19 | How do Language Models Generate Slang: A Systematic Comparison between Human and Machine-Generated Slang Usages | Siyang Wu et.al. | 2509.15518 | null |
| 2025-09-19 | LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay Scoring | Jinhee Jang et.al. | 2509.14834 | null |
| 2025-09-18 | SecureFixAgent: A Hybrid LLM Agent for Automated Python Static Vulnerability Repair | Jugal Gajjar et.al. | 2509.16275 | null |
| 2025-09-18 | Diagnostics of cognitive failures in multi-agent expert systems using dynamic evaluation protocols and subsequent mutation of the processing context | Andrejs Sorstkins et.al. | 2509.15366 | null |
| 2025-09-18 | A Knowledge-driven Adaptive Collaboration of LLMs for Enhancing Medical Decision-making | Xiao Wu et.al. | 2509.14998 | null |
| 2025-09-18 | ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning | Zihao Feng et.al. | 2509.14718 | null |
| 2025-09-18 | SWE-QA: Can Language Models Answer Repository-level Code Questions? | Weihan Peng et.al. | 2509.14635 | null |
| 2025-09-17 | Ticket-Bench: A Kickoff for Multilingual and Regionalized Agent Evaluation | Thales Sales Almeida et.al. | 2509.14477 | null |
| 2025-09-17 | TopoSizing: An LLM-aided Framework of Topology-based Understanding and Sizing for AMS Circuits | Ziming Wei et.al. | 2509.14169 | null |
| 2025-09-17 | Understanding the Process of Human-AI Value Alignment | Jack McKinlay et.al. | 2509.13854 | null |
| 2025-09-17 | From Legacy Fortran to Portable Kokkos: An Autonomous Agentic AI Workflow | Sparsh Gupta et.al. | 2509.12443 | null |
| 2025-09-17 | Co-Investigator AI: The Rise of Agentic AI for Smarter, Trustworthy AML Compliance Narratives | Prathamesh Vasudeo Naik et.al. | 2509.08380 | null |
| 2025-09-17 | Emergent Social Dynamics of LLM Agents in the El Farol Bar Problem | Ryosuke Takata et.al. | 2509.04537 | null |
| 2025-09-17 | How Does Cognitive Bias Affect Large Language Models? A Case Study on the Anchoring Effect in Price Negotiation Simulations | Yoshiki Takenami et.al. | 2508.21137 | null |
| 2025-09-16 | Agentic JWT: A Secure Delegation Protocol for Autonomous AI Agents | Abhishek Goswami et.al. | 2509.13597 | null |
| 2025-09-16 | AI Agents with Human-Like Collaborative Tools: Adaptive Strategies for Enhanced Problem-Solving | Harper Reed et.al. | 2509.13547 | null |
| 2025-09-16 | An LLM Agentic Approach for Legal-Critical Software: A Case Study for Tax Prep Software | Sina Gogani-Khiabani et.al. | 2509.13471 | null |
| 2025-09-16 | WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning | Kuan Li et.al. | 2509.13305 | null |
| 2025-09-16 | Agentic AI for Financial Crime Compliance | Henrik Axelsen et.al. | 2509.13137 | null |
| 2025-09-16 | Toward PDDL Planning Copilot | Yarin Benyamin et.al. | 2509.12987 | null |
| 2025-09-16 | H$^2$R: Hierarchical Hindsight Reflection for Multi-Task LLM Agents | Shicheng Ye et.al. | 2509.12810 | null |
| 2025-09-16 | Agentic Lybic: Multi-Agent Execution System with Tiered Reasoning and Orchestration | Liangxuan Guo et.al. | 2509.11067 | null |
| 2025-09-16 | PromptSleuth: Detecting Prompt Injection via Semantic Intent Invariance | Mengxiao Wang et.al. | 2508.20890 | null |
| 2025-09-16 | Mining the Long Tail: A Comparative Study of Data-Centric Criticality Metrics for Robust Offline Reinforcement Learning in Autonomous Motion Planning | Antonio Guillen-Perez et.al. | 2508.18397 | null |
| 2025-09-16 | Enhancing LLM-Based Social Bot via an Adversarial Learning Framework | Fanqi Kong et.al. | 2508.17711 | null |
| 2025-09-15 | Emotions are Recognized Patterns of Cognitive Activities | Yue Jin et.al. | 2509.16232 | null |
| 2025-09-15 | Redefining Website Fingerprinting Attacks With Multiagent LLMs | Chuxu Song et.al. | 2509.12462 | null |
| 2025-09-15 | Survival at Any Cost? LLMs and the Choice Between Self-Preservation and Human Harm | Alireza Mohamadi et.al. | 2509.12190 | null |
| 2025-09-15 | VisDocSketcher: Towards Scalable Visual Documentation with Agentic Systems | Luís F. Gomes et.al. | 2509.11942 | null |
| 2025-09-15 | $ε$-Optimal Multi-Agent Patrol using Recurrent Strategy | Deepak Mallya et.al. | 2509.11640 | null |
| 2025-09-15 | Automated Creation and Enrichment Framework for Improved Invocation of Enterprise APIs as Tools | Prerna Agarwal et.al. | 2509.11626 | null |
| 2025-09-15 | MedicalOS: An LLM Agent based Operating System for Digital Healthcare | Jared Zhu et.al. | 2509.11507 | null |
| 2025-09-14 | Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning | Anis Koubaa et.al. | 2509.13352 | null |
| 2025-09-14 | Prompts to Proxies: Emulating Human Preferences via a Compact LLM Ensemble | Bingchen Wang et.al. | 2509.11311 | null |
| 2025-09-14 | Free-MAD: Consensus-Free Multi-Agent Debate | Yu Cui et.al. | 2509.11035 | null |
| 2025-09-12 | FHIR-AgentBench: Benchmarking LLM Agents for Realistic Interoperable EHR Question Answering | Gyubok Lee et.al. | 2509.19319 | null |
| 2025-09-12 | V-Math: An Agentic Approach to the Vietnamese National High School Graduation Mathematics Exams | Duong Q. Nguyen et.al. | 2509.12251 | null |
| 2025-09-12 | Dark Patterns Meet GUI Agents: LLM Agent Susceptibility to Manipulative Interfaces and the Role of Human Oversight | Jingyu Tang et.al. | 2509.10723 | null |
| 2025-09-12 | Self-Supervised Goal-Reaching Results in Multi-Agent Cooperation and Exploration | Chirayu Nimonkar et.al. | 2509.10656 | null |
| 2025-09-12 | SciML Agents: Write the Solver, Not the Solution | Saarth Gaonkar et.al. | 2509.09936 | null |
| 2025-09-12 | Tackling One Health Risks: How Large Language Models are leveraged for Risk Negotiation and Consensus-building | Alexandra Fetsch et.al. | 2509.09906 | null |
| 2025-09-12 | Strategic Tradeoffs Between Humans and AI in Multi-Agent Bargaining | Crystal Qian et.al. | 2509.09071 | null |
| 2025-09-11 | TrEnv: Transparently Share Serverless Execution Environments Across Different Functions and Nodes | Jialiang Huang et.al. | 2509.09525 | null |
| 2025-09-11 | Curriculum-Based Multi-Tier Semantic Exploration via Deep Reinforcement Learning | Abdel Hakim Drid et.al. | 2509.09356 | null |
| 2025-09-11 | Flip Co-op: Cooperative Takeovers in Shared Autonomy | Sandeep Banik et.al. | 2509.09281 | null |
| 2025-09-11 | Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents | Jiawei Wang et.al. | 2509.09265 | null |
| 2025-09-11 | Enabling Regulatory Multi-Agent Collaboration: Architecture, Challenges, and Solutions | Qinnan Hu et.al. | 2509.09215 | null |
| 2025-09-10 | HypoGeneAgent: A Hypothesis Language Agent for Gene-Set Cluster Resolution Selection Using Perturb-seq Datasets | Ying Yuan et.al. | 2509.09740 | null |
| 2025-09-10 | AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning | Zhiheng Xi et.al. | 2509.08755 | null |
| 2025-09-10 | Architecting Resilient LLM Agents: A Guide to Secure Plan-then-Execute Implementations | Ron F. Del Rosario et.al. | 2509.08646 | null |
| 2025-09-10 | AutoODD: Agentic Audits via Bayesian Red Teaming in Black-Box Models | Rebecca Martin et.al. | 2509.08638 | null |
| 2025-09-09 | Multi Robot Coordination in Highly Dynamic Environments: Tackling Asymmetric Obstacles and Limited Communication | Vincenzo Suriani et.al. | 2509.08859 | null |
| 2025-09-09 | EnvX: Agentize Everything with Agentic AI | Linyao Chen et.al. | 2509.08088 | null |
| 2025-09-09 | Guided Reasoning in LLM-Driven Penetration Testing Using Structured Attack Trees | Katsuaki Nakano et.al. | 2509.07939 | null |
| 2025-09-09 | Getting In Contract with Large Language Models – An Agency Theory Perspective On Large Language Model Alignment | Sascha Kaltenpoth et.al. | 2509.07642 | null |
| 2025-09-09 | Astra: A Multi-Agent System for GPU Kernel Performance Optimization | Anjiang Wei et.al. | 2509.07506 | null |
| 2025-09-09 | Talking with Oompa Loompas: A novel framework for evaluating linguistic acquisition of LLM agents | Sankalp Tattwadarshi Swain et.al. | 2509.07389 | null |
| 2025-09-09 | Autonomous Code Evolution Meets NP-Completeness | Cunxi Yu et.al. | 2509.07367 | null |
| 2025-09-09 | CancerGUIDE: Cancer Guideline Understanding via Internal Disagreement Estimation | Alyssa Unell et.al. | 2509.07325 | null |
| 2025-09-08 | AxelSMOTE: An Agent-Based Oversampling Algorithm for Imbalanced Classification | Sukumar Kishanthan et.al. | 2509.06875 | null |
| 2025-09-08 | RAFFLES: Reasoning-based Attribution of Faults for LLM Systems | Chenyang Zhu et.al. | 2509.06822 | null |
| 2025-09-08 | Reinforcement Learning Foundations for Deep Research Systems: A Survey | Wenjun Li et.al. | 2509.06733 | null |
| 2025-09-08 | REMI: A Novel Causal Schema Memory Architecture for Personalized Lifestyle Recommendation Agents | Vishal Raman et.al. | 2509.06269 | null |
| 2025-09-08 | TalkToAgent: A Human-centric Explanation of Reinforcement Learning Agents with Large Language Models | Haechang Kim et.al. | 2509.04809 | null |
| 2025-09-08 | Meta-Policy Reflexion: Reusable Reflective Memory and Rule Admissibility for Resource-Efficient LLM Agent | Chunlong Wu et.al. | 2509.03990 | null |
| 2025-09-07 | From Digital Distrust to Codified Honesty: Experimental Evidence on Generative AI in Credence Goods Markets | Alexander Erlei et.al. | 2509.06069 | null |
| 2025-09-07 | Let’s Roleplay: Examining LLM Alignment in Collaborative Dialogues | Abhijnan Nath et.al. | 2509.05882 | null |
| 2025-09-06 | DRF: LLM-AGENT Dynamic Reputation Filtering Framework | Yuwei Lou et.al. | 2509.05764 | null |
| 2025-09-05 | Internet 3.0: Architecture for a Web-of-Agents with it’s Algorithm for Ranking Agents | Rajesh Tembarai Krishnamachari et.al. | 2509.04979 | null |
| 2025-09-05 | OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration | Jusheng Zhang et.al. | 2509.04876 | null |
| 2025-09-05 | UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning | Haoming Wang et.al. | 2509.02544 | null |
| 2025-09-04 | Maestro: Joint Graph & Config Optimization for Reliable AI Agents | Wenxiao Wang et.al. | 2509.04642 | null |
| 2025-09-04 | Psychologically Enhanced AI Agents | Maciej Besta et.al. | 2509.04343 | null |
| 2025-09-04 | Are LLM Agents the New RPA? A Comparative Study with RPA Across Enterprise Workflows | Petr Průcha et.al. | 2509.04198 | null |
| 2025-09-04 | MAGneT: Coordinated Multi-Agent Generation of Synthetic Multi-Turn Mental Health Counseling Sessions | Aishik Mandal et.al. | 2509.04183 | null |
| 2025-09-04 | Real-time adaptive quantum error correction by model-free multi-agent learning | Manuel Guatto et.al. | 2509.03974 | null |
| 2025-09-04 | FaMA: LLM-Empowered Agentic Assistant for Consumer-to-Consumer Marketplace | Yineng Yan et.al. | 2509.03890 | null |
| 2025-09-04 | Leveraging LLM-Based Agents for Intelligent Supply Chain Planning | Yongzhi Qi et.al. | 2509.03811 | null |
| 2025-09-04 | AgenTracer: Who Is Inducing Failure in the LLM Agentic Systems? | Guibin Zhang et.al. | 2509.03312 | null |
| 2025-09-03 | Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation | James Mooney et.al. | 2509.03736 | null |
| 2025-09-02 | DeepTRACE: Auditing Deep Research AI Systems for Tracking Reliability Across Citations and Evidence | Pranav Narayanan Venkit et.al. | 2509.04499 | null |
| 2025-09-02 | Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics | Matthew Russo et.al. | 2509.02751 | null |
| 2025-09-02 | The Landscape of Agentic Reinforcement Learning for LLMs: A Survey | Guibin Zhang et.al. | 2509.02547 | null |
| 2025-09-02 | Towards Agents That Know When They Don’t Know: Uncertainty as a Control Signal for Structured Reasoning | Josefa Lia Stoisser et.al. | 2509.02401 | null |
| 2025-09-02 | When Agents go Astray: Course-Correcting SWE Agents with PRMs | Shubham Gandhi et.al. | 2509.02360 | null |
| 2025-09-01 | The Need for Verification in AI-Driven Scientific Discovery | Cristina Cornelio et.al. | 2509.01398 | null |
| 2025-09-01 | Multi-Agent Reinforcement Learning for Task Offloading in Wireless Edge Networks | Andrea Fox et.al. | 2509.01257 | null |
| 2025-09-01 | ORCA: ORchestrating Causal Agent | Joanie Hayoun Chung et.al. | 2508.21304 | null |
| 2025-09-01 | How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on $τ$-bench | Venkatesh Mishra et.al. | 2508.20931 | null |
| 2025-09-01 | Instructional Agents: LLM Agents on Automated Course Material Generation for Teaching Faculties | Huaiyuan Yao et.al. | 2508.19611 | null |
| 2025-08-31 | Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First | Shu Liu et.al. | 2509.00997 | null |
| 2025-08-30 | Inducing State Anxiety in LLM Agents Reproduces Human-Like Biases in Consumer Decision-Making | Ziv Ben-Zion et.al. | 2510.06222 | null |
| 2025-08-30 | Exploring Decision-Making Capabilities of LLM Agents: An Experimental Study on Jump-Jump Game | Juwu Li et.al. | 2509.00483 | null |
| 2025-08-29 | COCORELI: Cooperative, Compositional Reconstitution & Execution of Language Instructions | Swarnadeep Bhar et.al. | 2509.04470 | null |
| 2025-08-29 | ReLATE: Learning Efficient Sparse Encoding for High-Performance Tensor Decomposition | Ahmed E. Helal et.al. | 2509.00280 | null |
| 2025-08-29 | HiVA: Self-organized Hierarchical Variable Agent via Goal-driven Semantic-Topological Evolution | Jinzhou Tang et.al. | 2509.00189 | null |
| 2025-08-28 | A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers | Ming Hu et.al. | 2508.21148 | null |
| 2025-08-28 | Provable Benefits of In-Tool Learning for Large Language Models | Sam Houliston et.al. | 2508.20755 | null |
| 2025-08-28 | rStar2-Agent: Agentic Reasoning Technical Report | Ning Shang et.al. | 2508.20722 | null |
| 2025-08-28 | CyberSleuth: Autonomous Blue-Team LLM Agent for Web Attack Forensics | Stefano Fumero et.al. | 2508.20643 | null |
| 2025-08-28 | MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers | Zhenting Wang et.al. | 2508.20453 | null |
| 2025-08-28 | MindGuard: Tracking, Detecting, and Attributing MCP Tool Poisoning Attack via Decision Dependence Graph | Zhiqiang Wang et.al. | 2508.20412 | null |
| 2025-08-27 | CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning | Zeyi Sun et.al. | 2508.20096 | null |
| 2025-08-27 | AgentCoMa: A Compositional Benchmark Mixing Commonsense and Mathematical Reasoning in Real-World Scenarios | Lisa Alazraki et.al. | 2508.19988 | null |
| 2025-08-27 | Evaluating Language Model Reasoning about Confidential Information | Dylan Sam et.al. | 2508.19980 | null |
| 2025-08-27 | Secure Multi-LLM Agentic AI and Agentification for Edge General Intelligence by Zero-Trust: A Survey | Yinqiu Liu et.al. | 2508.19870 | null |
| 2025-08-27 | Survey of Specialized Large Language Model | Chenghan Yang et.al. | 2508.19667 | null |
| 2025-08-27 | CompLex: Music Theory Lexicon Constructed by Autonomous Agents for Automatic Music Generation | Zhejing Hu et.al. | 2508.19603 | null |
| 2025-08-27 | Encouraging Good Processes Without the Need for Good Answers: Reinforcement Learning for LLM Agent Planning | Zhiwei Li et.al. | 2508.19598 | null |
| 2025-08-27 | Aegis: Taxonomy and Optimizations for Overcoming Agent-Environment Failures in LLM Agents | Kevin Song et.al. | 2508.19504 | null |
| 2025-08-27 | Interactive Graph Visualization and Teaming Recommendation in an Interdisciplinary Project’s Talent Knowledge Graph | Jiawei Xu et.al. | 2508.19489 | null |
| 2025-08-26 | Reliable Weak-to-Strong Monitoring of LLM Agents | Neil Kale et.al. | 2508.19461 | null |
| 2025-08-26 | Real-Time Model Checking for Closed-Loop Robot Reactive Planning | Christopher Chandler et.al. | 2508.19186 | null |
| 2025-08-26 | MATRIX: Multi-Agent simulaTion fRamework for safe Interactions and conteXtual clinical conversational evaluation | Ernest Lim et.al. | 2508.19163 | null |
| 2025-08-26 | A Concurrent Modular Agent: Framework for Autonomous LLM Agents | Norihiro Maruyama et.al. | 2508.19042 | null |
| 2025-08-26 | CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks | Qi Chai et.al. | 2508.18797 | null |
| 2025-08-26 | Toward Edge General Intelligence with Agentic AI and Agentification: Concepts, Technologies, and Future Directions | Ruichen Zhang et.al. | 2508.18725 | null |
| 2025-08-26 | FALCON: Autonomous Cyber Threat Intelligence Mining with LLMs for IDS Rule Generation | Shaswata Mitra et.al. | 2508.18684 | null |
| 2025-08-26 | Utilizing Training Data to Improve LLM Reasoning for Tabular Understanding | Chufan Gao et.al. | 2508.18676 | null |
| 2025-08-26 | Bias-Adjusted LLM Agents for Human-Like Decision-Making via Behavioral Economics | Ayato Kitadai et.al. | 2508.18600 | null |
| 2025-08-26 | Generative Artificial Intelligence and Agents in Research and Teaching | Jussi S. Jauhiainen et.al. | 2508.16701 | null |
| 2025-08-25 | Toward Generalized Autonomous Agents: A Neuro-Symbolic AI Framework for Integrating Social and Technical Support in Education | Ryan Hare et.al. | 2508.18406 | null |
| 2025-08-25 | The AI Data Scientist | Farkhad Akimov et.al. | 2508.18113 | null |
| 2025-08-25 | Memento: Fine-tuning LLM Agents without Fine-tuning LLMs | Huichi Zhou et.al. | 2508.16153 | null |
| 2025-08-24 | FLAIRR-TS – Forecasting LLM-Agents with Iterative Refinement and Retrieval for Time Series | Gunjan Jalori et.al. | 2508.19279 | null |
| 2025-08-24 | Agent-Testing Agent: A Meta-Agent for Automated Testing and Evaluation of Conversational AI Agents | Sameer Komoravolu et.al. | 2508.17393 | null |
| 2025-08-24 | From Language to Action: A Review of Large Language Models as Autonomous Agents and Tool Users | Sadia Sultana Chowa et.al. | 2508.17281 | null |
| 2025-08-22 | AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications | Dawei Gao et.al. | 2508.16279 | null |
| 2025-08-22 | IR-Agent: Expert-Inspired LLM Agents for Structure Elucidation from Infrared Spectra | Heewoong Noh et.al. | 2508.16112 | null |
| 2025-08-21 | Noise, Adaptation, and Strategy: Assessing LLM Fidelity in Decision-Making | Yuanjun Feng et.al. | 2508.15926 | null |
| 2025-08-21 | End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning | Qiaoyu Zheng et.al. | 2508.15746 | null |
Large Language Models
| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-10-29 | OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning | Ziyou Hu et.al. | 2510.24636 | null |
| 2025-10-28 | Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance | Yujie Wei et.al. | 2510.24711 | null |
| 2025-10-28 | ComboBench: Can LLMs Manipulate Physical Devices to Play Virtual Reality Games? | Shuqing Li et.al. | 2510.24706 | null |
| 2025-10-28 | Tongyi DeepResearch Technical Report | Tongyi DeepResearch Team et.al. | 2510.24701 | null |
| 2025-10-28 | Greedy Sampling Is Provably Efficient for RLHF | Di Wu et.al. | 2510.24700 | null |
| 2025-10-28 | WebLeaper: Empowering Efficiency and Efficacy in WebAgent via Enabling Info-Rich Seeking | Zhengwei Tao et.al. | 2510.24697 | null |
| 2025-10-28 | AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis | Xuanzhong Chen et.al. | 2510.24695 | null |
| 2025-10-28 | STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence | Zihan Liu et.al. | 2510.24693 | null |
| 2025-10-28 | Dissecting Role Cognition in Medical LLMs via Neuronal Ablation | Xun Liang et.al. | 2510.24677 | null |
| 2025-10-28 | Evolving Diagnostic Agents in a Virtual Clinical Environment | Pengcheng Qiu et.al. | 2510.24654 | null |
| 2025-10-28 | Optimizing Retrieval for RAG via Reinforced Contrastive Learning | Jiawei Zhou et.al. | 2510.24652 | null |
| 2025-10-28 | Advancing site-specific disease and pest management in precision agriculture: From reasoning-driven foundation models to adaptive, feedback-based learning | Nitin Rai et.al. | 2510.24650 | null |
| 2025-10-28 | FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling | Zengzhuang Xu et.al. | 2510.24645 | null |
| 2025-10-28 | Relative Scaling Laws for LLMs | William Held et.al. | 2510.24626 | null |
| 2025-10-28 | Zero-Shot Cross-Lingual Transfer using Prefix-Based Adaptation | Snegha A et.al. | 2510.24619 | null |
| 2025-10-28 | Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way | Yicun Yang et.al. | 2510.24605 | null |
| 2025-10-28 | ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization | Guoxin Chen et.al. | 2510.24592 | null |
| 2025-10-28 | ReplicationBench: Can AI Agents Replicate Astrophysics Research Papers? | Christine Ye et.al. | 2510.24591 | null |
| 2025-10-28 | Generative AI for Healthcare: Fundamentals, Challenges, and Perspectives | Gang Chen et.al. | 2510.24551 | null |
| 2025-10-28 | Open Korean Historical Corpus: A Millennia-Scale Diachronic Collection of Public Domain Texts | Seyoung Song et.al. | 2510.24541 | null |
| 2025-10-28 | Multi-Agent Evolve: LLM Self-Improve through Co-evolution | Yixing Chen et.al. | 2510.23595 | null |
| 2025-10-28 | PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection | Yusu Qian et.al. | 2510.23594 | null |
| 2025-10-27 | PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity | Yuqian Yuan et.al. | 2510.23603 | null |
| 2025-10-27 | Alita-G: Self-Evolving Generative Agent for Agent Generation | Jiahao Qiu et.al. | 2510.23601 | null |
| 2025-10-27 | Think Twice: Branch-and-Rethink Reasoning Reward Model | Yizhu Jiao et.al. | 2510.23596 | null |
| 2025-10-27 | Lightweight Robust Direct Preference Optimization | Cheol Woo Kim et.al. | 2510.23590 | null |
| 2025-10-27 | FARMER: Flow AutoRegressive Transformer over Pixels | Guangting Zheng et.al. | 2510.23588 | null |
| 2025-10-27 | A Survey of Data Agents: Emerging Paradigm or Overstated Hype? | Yizhang Zhu et.al. | 2510.23587 | null |
| 2025-10-27 | RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation | Yash Jangir et.al. | 2510.23571 | null |
| 2025-10-27 | EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT | Baoqi Pei et.al. | 2510.23569 | null |
| 2025-10-27 | ReCode: Unify Plan and Action for Universal Granularity Control | Zhaoyang Yu et.al. | 2510.23564 | null |
| 2025-10-27 | ISA-Bench: Benchmarking Instruction Sensitivity for Large Audio Language Models | Bohan Li et.al. | 2510.23558 | null |
| 2025-10-27 | Minimizing Human Intervention in Online Classification | William Réveillard et.al. | 2510.23557 | null |
| 2025-10-27 | IPQA: A Benchmark for Core Intent Identification in Personalized Question Answering | Jieyong Kim et.al. | 2510.23536 | null |
| 2025-10-27 | Point Convergence of Nesterov’s Accelerated Gradient Method: An AI-Assisted Proof | Uijeong Jang et.al. | 2510.23513 | null |
| 2025-10-27 | Deductive Chain-of-Thought Augmented Socially-aware Robot Navigation World Model | Weizheng Wang et.al. | 2510.23509 | null |
| 2025-10-27 | Emotion-Coherent Reasoning for Multimodal LLMs via Emotional Rationale Verifier | Hyeongseop Rha et.al. | 2510.23506 | null |
| 2025-10-27 | VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation | Walid Bousselham et.al. | 2510.23497 | null |
| 2025-10-27 | Learning the PTM Code through a Coarse-to-Fine, Mechanism-Aware Framework | Jingjie Zhang et.al. | 2510.23492 | null |
| 2025-10-27 | Learning to Reason Efficiently with Discounted Reinforcement Learning | Alex Ayoub et.al. | 2510.23486 | null |
| 2025-10-24 | A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection | Gaku Morio et.al. | 2510.21679 | null |
| 2025-10-24 | A Data-Centric Approach to Multilingual E-Commerce Product Search: Case Study on Query-Category and Query-Item Relevance | Yabo Yin et.al. | 2510.21671 | null |
| 2025-10-24 | The Universal Landscape of Human Reasoning | Qiguang Chen et.al. | 2510.21623 | null |
| 2025-10-24 | Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine | Wenyi Wang et.al. | 2510.21614 | null |
| 2025-10-24 | Modest-Align: Data-Efficient Alignment for Vision-Language Models | Jiaxiang Liu et.al. | 2510.21606 | null |
| 2025-10-24 | RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models | Xueyuan Lin et.al. | 2510.21604 | null |
| 2025-10-24 | From Polyester Girlfriends to Blind Mice: Creating the First Pragmatics Understanding Benchmarks for Slovene | Mojca Brglez et.al. | 2510.21575 | null |
| 2025-10-24 | ColorEcosystem: Powering Personalized, Standardized, and Trustworthy Agentic Service in massive-agent Ecosystem | Fangwen Wu et.al. | 2510.21566 | null |
| 2025-10-24 | Are the LLMs Capable of Maintaining at Least the Language Genus? | Sandra Mitrović et.al. | 2510.21561 | null |
| 2025-10-24 | EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law | Ilija Lichkovski et.al. | 2510.21524 | null |
| 2025-10-24 | Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models | Omer Moussa et.al. | 2510.21520 | null |
| 2025-10-24 | Head Pursuit: Probing Attention Specialization in Multimodal Transformers | Lorenzo Basile et.al. | 2510.21518 | null |
| 2025-10-24 | Wisdom and Delusion of LLM Ensembles for Code Generation and Repair | Fernando Vallecillos Ruiz et.al. | 2510.21513 | null |
| 2025-10-24 | Actionable Cybersecurity Notifications for Smart Homes: A User Study on the Role of Length and Complexity | Victor Jüttner et.al. | 2510.21508 | null |
| 2025-10-24 | MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization | Chenglong Wang et.al. | 2510.21473 | null |
| 2025-10-24 | Risk Management for Mitigating Benchmark Failure Modes: BenchRisk | Sean McGregor et.al. | 2510.21460 | null |
| 2025-10-24 | SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots | Adetayo Adebimpe et.al. | 2510.21459 | null |
| 2025-10-24 | ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models | Federico Danieli et.al. | 2510.21450 | null |
| 2025-10-24 | MoniTor: Exploiting Large Language Models with Instruction for Online Video Anomaly Detection | Shengtian Yang et.al. | 2510.21449 | null |
| 2025-10-24 | REMONI: An Autonomous System Integrating Wearables and Multimodal Large Language Models for Enhanced Remote Health Monitoring | Thanh Cong Ho et.al. | 2510.21445 | null |
| 2025-10-23 | KL-Regularized Reinforcement Learning is Designed to Mode Collapse | Anthony GX-Chen et.al. | 2510.20817 | null |
| 2025-10-23 | Generative Reasoning Recommendation via LLMs | Minjie Hong et.al. | 2510.20815 | null |
| 2025-10-23 | Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation | Yuhan Liu et.al. | 2510.20812 | null |
| 2025-10-23 | On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text? | Mingmeng Geng et.al. | 2510.20810 | null |
| 2025-10-23 | Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers | Dean L Slack et.al. | 2510.20807 | null |
| 2025-10-23 | ARGenSeg: Image Segmentation with Autoregressive Image Generation Model | Xiaolong Wang et.al. | 2510.20803 | null |
| 2025-10-23 | Simple Context Compression: Mean-Pooling and Multi-Ratio Training | Yair Feldman et.al. | 2510.20797 | null |
| 2025-10-23 | A Use-Case Specific Dataset for Measuring Dimensions of Responsible Performance in LLM-generated Text | Alicia Sagae et.al. | 2510.20782 | null |
| 2025-10-23 | RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines | Austin Jia et.al. | 2510.20768 | null |
| 2025-10-23 | Empathic Prompting: Non-Verbal Context Integration for Multimodal LLM Conversations | Lorenzo Stacchio et.al. | 2510.20743 | null |
| 2025-10-23 | Learning to Triage Taint Flows Reported by Dynamic Program Analysis in Node.js Packages | Ronghao Ni et.al. | 2510.20739 | null |
| 2025-10-23 | Automated Extraction of Fluoropyrimidine Treatment and Treatment-Related Toxicities from Clinical Notes Using Natural Language Processing | Xizhi Wu et.al. | 2510.20727 | null |
| 2025-10-23 | User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios | Xiaoyuan Wu et.al. | 2510.20721 | null |
| 2025-10-23 | Mixing Importance with Diversity: Joint Optimization for KV Cache Compression in Large Vision-Language Models | Xuyang Liu et.al. | 2510.20707 | null |
| 2025-10-23 | Structure-Conditional Minimum Bayes Risk Decoding | Bryan Eikema et.al. | 2510.20700 | null |
| 2025-10-23 | Diagnosing Visual Reasoning: Challenges, Insights, and a Path Forward | Jing Bi et.al. | 2510.20696 | null |
| 2025-10-23 | Exploring Large Language Models for Access Control Policy Synthesis and Summarization | Adarsh Vatsa et.al. | 2510.20692 | null |
| 2025-10-23 | Plan Then Retrieve: Reinforcement Learning-Guided Complex Reasoning over Knowledge Graphs | Yanlin Song et.al. | 2510.20691 | null |
| 2025-10-23 | Neural Diversity Regularizes Hallucinations in Small Models | Kushal Chakrabarti et.al. | 2510.20690 | null |
| 2025-10-23 | Bayesian Jammer Localization with a Hybrid CNN and Path-Loss Mixture of Experts | Mariona Jaramillo-Civill et.al. | 2510.20666 | null |
| 2025-10-23 | Zhyper: Factorized Hypernetworks for Conditioned LLM Fine-Tuning | M. H. I. Abdalla et.al. | 2510.19733 | null |
| 2025-10-23 | Fast Inference via Hierarchical Speculative Decoding | Clara Mohri et.al. | 2510.19705 | null |
| 2025-10-22 | Semantic World Models | Jacob Berg et.al. | 2510.19818 | null |
| 2025-10-22 | olmOCR 2: Unit Test Rewards for Document OCR | Jake Poznanski et.al. | 2510.19817 | null |
| 2025-10-22 | Hubble: a Model Suite to Advance the Study of LLM Memorization | Johnny Tian-Zheng Wei et.al. | 2510.19811 | null |
| 2025-10-22 | Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning | Xichen Zhang et.al. | 2510.19807 | null |
| 2025-10-22 | The Art of Asking: Multilingual Prompt Optimization for Synthetic Data | David Mora et.al. | 2510.19806 | null |
| 2025-10-22 | Forbidden Sidon subsets of perfect difference sets, featuring a human-assisted proof | Boris Alexeev et.al. | 2510.19804 | null |
| 2025-10-22 | Class-Aware Prototype Learning with Negative Contrast for Test-Time Adaptation of Vision-Language Models | Xiaozhen Qiao et.al. | 2510.19802 | null |
| 2025-10-22 | The Feasibility of Training Sovereign Language Models in the Global South: A Study of Brazil and Mexico | Sandra Malagon et.al. | 2510.19801 | null |
| 2025-10-22 | Integrating Transparent Models, LLMs, and Practitioner-in-the-Loop: A Case of Nonprofit Program Evaluation | Ji Ma et.al. | 2510.19799 | null |
| 2025-10-22 | Blackbox Model Provenance via Palimpsestic Membership Inference | Rohith Kuditipudi et.al. | 2510.19796 | null |
| 2025-10-22 | On Controlled Change: Generative AI’s Impact on Professional Authority in Journalism | Tomás Dodds et.al. | 2510.19792 | null |
| 2025-10-22 | ToolDreamer: Instilling LLM Reasoning Into Tool Retrievers | Saptarshi Sengupta et.al. | 2510.19791 | null |
| 2025-10-22 | AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders | Yuezhou Hu et.al. | 2510.19779 | null |
| 2025-10-22 | The Tail Tells All: Estimating Model-Level Membership Inference Vulnerability Without Reference Models | Euodia Dodd et.al. | 2510.19773 | null |
| 2025-10-22 | SmartSwitch: Advancing LLM Reasoning by Overcoming Underthinking via Promoting Deeper Thought Exploration | Xichen Zhang et.al. | 2510.19767 | null |
| 2025-10-22 | Top-P Masking for Cross Language Information Retrieval | Joseph Casale et.al. | 2510.19758 | null |
| 2025-10-22 | Review of Tools for Zero-Code LLM Based Application Development | Priyaranjan Pattnayak et.al. | 2510.19747 | null |
| 2025-10-22 | RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models | Yang Yang et.al. | 2510.19698 | null |
| 2025-10-22 | Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs | Haochen Wang et.al. | 2510.18876 | null |
| 2025-10-21 | Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting | Howard Chen et.al. | 2510.18874 | null |
| 2025-10-21 | DSI-Bench: A Benchmark for Dynamic Spatial Intelligence | Ziang Zhang et.al. | 2510.18873 | null |
| 2025-10-21 | How Do LLMs Use Their Depth? | Akshat Gupta et.al. | 2510.18871 | null |
| 2025-10-21 | LightMem: Lightweight and Efficient Memory-Augmented Generation | Jizhan Fang et.al. | 2510.18866 | null |
| 2025-10-21 | EffiReasonTrans: RL-Optimized Reasoning for Code Translation | Yanlin Wang et.al. | 2510.18863 | null |
| 2025-10-21 | Streamlining Acceptance Test Generation for Mobile Applications Through Large Language Models: An Industrial Case Study | Pedro Luís Fonseca et.al. | 2510.18861 | null |
| 2025-10-21 | An Encoder-Decoder Foundation Chemical Language Model for Generative Polymer Design | Harikrishna Sahu et.al. | 2510.18860 | null |
| 2025-10-21 | Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning | Chenghao Zhu et.al. | 2510.18849 | null |
| 2025-10-21 | See the Text: From Tokenization to Visual Reading | Ling Xing et.al. | 2510.18840 | null |
| 2025-10-21 | FedDEAP: Adaptive Dual-Prompt Tuning for Multi-Domain Federated Learning | Yubin Zheng et.al. | 2510.18837 | null |
| 2025-10-21 | MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training | Wenxuan Li et.al. | 2510.18830 | null |
| 2025-10-21 | Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework | Yujie Xing et.al. | 2510.18825 | null |
| 2025-10-21 | Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring | Shuxin Lin et.al. | 2510.18817 | null |
| 2025-10-21 | Integrating Large Language Models and Evaluating Student Outcomes in an Introductory Computer Science Course | Annapurna Vadaparty et.al. | 2510.18806 | null |
| 2025-10-21 | FeClustRE: Hierarchical Clustering and Semantic Tagging of App Features from User Reviews | Max Tiessler et.al. | 2510.18799 | null |
| 2025-10-21 | ShaRE your Data! Characterizing Datasets for LLM-based Requirements Engineering | Quim Motger et.al. | 2510.18787 | null |
| 2025-10-21 | KAT-Coder Technical Report | Zizheng Zhan et.al. | 2510.18779 | null |
| 2025-10-21 | Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation | Patterson Hsieh et.al. | 2510.18751 | null |
| 2025-10-21 | Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting | Taha Binhuraib et.al. | 2510.18745 | null |
| 2025-10-21 | Verifiable Accuracy and Abstention Rewards in Curriculum RL to Alleviate Lost-in-Conversation | Ming Li et.al. | 2510.18731 | null |
| 2025-10-21 | HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models | Sidhant Narula et.al. | 2510.18728 | null |
| 2025-10-21 | IF-VidCap: Can Video Caption Models Follow Instructions? | Shihao Li et.al. | 2510.18726 | null |
| 2025-10-21 | SemiAdapt and SemiLoRA: Efficient Domain Adaptation for Transformer-based Low-Resource Language Translation with a Case Study on Irish | Josh McGiff et.al. | 2510.18725 | null |
| 2025-10-21 | SSD: Spatial-Semantic Head Decoupling for Efficient Autoregressive Image Generation | Siyong Jian et.al. | 2510.18716 | null |
| 2025-10-21 | Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options | Joongkyu Lee et.al. | 2510.18713 | null |
| 2025-10-21 | Exploring a Unified Vision-Centric Contrastive Alternatives on Multi-Modal Web Documents | Yiqi Lin et.al. | 2510.18703 | null |
| 2025-10-21 | UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation | Yibin Wang et.al. | 2510.18701 | null |
| 2025-10-21 | MLMA: Towards Multilingual with Mamba Based Architectures | Mohamed Nabih Ali et.al. | 2510.18684 | null |
| 2025-10-21 | Exploring Membership Inference Vulnerabilities in Clinical Large Language Models | Alexander Nemecek et.al. | 2510.18674 | null |
| 2025-10-21 | Reasoning Language Model Inference Serving Unveiled: An Empirical Study | Qi Li et.al. | 2510.18672 | null |
| 2025-10-21 | Hardness of Learning Regular Languages in the Next Symbol Prediction Setting | Satwik Bhattamishra et.al. | 2510.18634 | null |
| 2025-10-21 | Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views | Zhangquan Chen et.al. | 2510.18632 | null |
| 2025-10-21 | VAR: Visual Attention Reasoning via Structured Search and Backtracking | Wei Cai et.al. | 2510.18619 | null |
| 2025-10-21 | Evaluating Large Language Models in detecting Secrets in Android Apps | Marco Alecci et.al. | 2510.18601 | null |
| 2025-10-21 | CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent | Haojia Lin et.al. | 2510.18596 | null |
| 2025-10-21 | Tokencake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications | Zhuohang Bian et.al. | 2510.18586 | null |
| 2025-10-21 | CLASP: Cost-Optimized LLM-based Agentic System for Phishing Detection | Fouad Trad et.al. | 2510.18585 | null |
| 2025-10-21 | CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder | Yongmin Lee et.al. | 2510.18583 | null |
| 2025-10-21 | The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability | Zijie Xu et.al. | 2510.18563 | null |
| 2025-10-21 | Large language models for folktale type automation based on motifs: Cinderella case study | Tjaša Arčon et.al. | 2510.18561 | null |
| 2025-10-21 | Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency | Svetlana Maslenkova et.al. | 2510.18556 | null |
| 2025-10-21 | JAUNT: Joint Alignment of User Intent and Network State for QoE-centric LLM Tool Routing | Enhan Li et.al. | 2510.18550 | null |
| 2025-10-21 | EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval | Zebin Yang et.al. | 2510.18546 | null |
| 2025-10-21 | SLICE: SLO-Driven Scheduling for LLM Inference on Edge Computing Devices | Pan Zhou et.al. | 2510.18544 | null |
| 2025-10-21 | Noise-Conditioned Mixture-of-Experts Framework for Robust Speaker Verification | Bin Gu et.al. | 2510.18533 | null |
| 2025-10-21 | LLMs as Sparse Retrievers: A Framework for First-Stage Product Search | Hongru Song et.al. | 2510.18527 | null |
| 2025-10-21 | Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models | Hanze Guo et.al. | 2510.18526 | null |
| 2025-10-21 | From Quarter to All: Accelerating Speculative LLM Decoding via Floating-Point Exponent Remapping and Parameter Sharing | Yushu Zhao et.al. | 2510.18525 | null |
| 2025-10-21 | Socialized Learning and Emergent Behaviors in Multi-Agent Systems based on Multimodal Large Language Models | Sureyya Akin et.al. | 2510.18515 | null |
| 2025-10-21 | Identity-Aware Large Language Models require Cultural Reasoning | Alistair Plum et.al. | 2510.18510 | null |
| 2025-10-21 | Prompting the Priorities: A First Look at Evaluating LLMs for Vulnerability Triage and Prioritization | Osama Al Haddad et.al. | 2510.18508 | null |
| 2025-10-21 | Zero-Shot Vehicle Model Recognition via Text-Based Retrieval-Augmented Generation | Wei-Chia Chang et.al. | 2510.18502 | null |
| 2025-10-21 | One Size Fits All? A Modular Adaptive Sanitization Kit (MASK) for Customizable Privacy-Preserving Phone Scam Detection | Kangzhong Wang et.al. | 2510.18493 | null |
| 2025-10-21 | The Attribution Story of WhisperGate: An Academic Perspective | Oleksandr Adamov et.al. | 2510.18484 | null |
| 2025-10-21 | StarBench: A Turn-Based RPG Benchmark for Agentic Multimodal Decision-Making and Information Seeking | Haoran Zhang et.al. | 2510.18483 | null |
| 2025-10-21 | How Efficient Are Diffusion Language Models? A Critical Examination of Efficiency Evaluation Practices | Han Peng et.al. | 2510.18480 | null |
| 2025-10-21 | LAFA: Agentic LLM-Driven Federated Analytics over Decentralized Data Sources | Haichao Ji et.al. | 2510.18477 | null |
| 2025-10-21 | Probabilistic Modeling of Intentions in Socially Intelligent LLM Agents | Feifan Xia et.al. | 2510.18476 | null |
| 2025-10-21 | DART: A Structured Dataset of Regulatory Drug Documents in Italian for Clinical NLP | Mariano Barone et.al. | 2510.18475 | null |
| 2025-10-21 | CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment | Xue Jiang et.al. | 2510.18471 | null |
| 2025-10-21 | CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs | Shaobo Wang et.al. | 2510.18470 | null |
| 2025-10-21 | IMB: An Italian Medical Benchmark for Question Answering | Antonio Romano et.al. | 2510.18468 | null |
| 2025-10-21 | Simple and Efficient Heterogeneous Temporal Graph Neural Network | Yili Wang et.al. | 2510.18467 | null |
| 2025-10-21 | CEFR-Annotated WordNet: LLM-Based Proficiency-Guided Semantic Database for Language Learning | Masato Kikuchi et.al. | 2510.18466 | null |
| 2025-10-21 | Large Language Models in Thematic Analysis: Prompt Engineering, Evaluation, and Guidelines for Qualitative Software Engineering Research | Cristina Martinez Montes et.al. | 2510.18456 | null |
| 2025-10-21 | Engagement Undermines Safety: How Stereotypes and Toxicity Shape Humor in Language Models | Atharvan Dogra et.al. | 2510.18454 | null |
| 2025-10-21 | PlanU: Large Language Model Decision Making through Planning under Uncertainty | Ziwei Deng et.al. | 2510.18442 | null |
| 2025-10-21 | Grounding or Guessing? Visual Signals for Detecting Hallucinations in Sign Language Translation | Yasser Hamidullah et.al. | 2510.18439 | null |
| 2025-10-21 | DeepTx: Real-Time Transaction Risk Analysis via Multi-Modal Features and LLM Reasoning | Yixuan Liu et.al. | 2510.18438 | null |
| 2025-10-21 | Chain-of-Conceptual-Thought: Eliciting the Agent to Deeply Think within the Response | Qingqing Gu et.al. | 2510.18434 | null |
| 2025-10-21 | ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization | Yuanhe Guo et.al. | 2510.18433 | null |
| 2025-10-21 | Automated urban waterlogging assessment and early warning through a mixture of foundation models | Chenxu Zhang et.al. | 2510.18425 | null |
| 2025-10-21 | Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents | Guangfu Guo et.al. | 2510.18424 | null |
| 2025-10-21 | SegTune: Structured and Fine-Grained Control for Song Generation | Pengfei Cai et.al. | 2510.18416 | null |
| 2025-10-21 | Adamas: Hadamard Sparse Attention for Efficient Long-Context Inference | Siyuan Yan et.al. | 2510.18413 | null |
| 2025-10-21 | MENTOR: A Reinforcement Learning Framework for Model Enhancement via Teacher-Optimized Rewards in Small Models | ChangSu Choi et.al. | 2510.18383 | null |
| 2025-10-21 | Training Diverse Graph Experts for Ensembles: A Systematic Empirical Study | Gangda Deng et.al. | 2510.18370 | null |
| 2025-10-21 | KoSimpleQA: A Korean Factuality Benchmark with an Analysis of Reasoning LLMs | Donghyeon Ko et.al. | 2510.18368 | null |
| 2025-10-21 | Evaluating LLM-Based Mobile App Recommendations: An Empirical Study | Quim Motger et.al. | 2510.18364 | null |
| 2025-10-21 | KrishokBondhu: A Retrieval-Augmented Voice-Based Agricultural Advisory Call Center for Bengali Farmers | Mohd Ruhul Ameen et.al. | 2510.18355 | null |
| 2025-10-21 | GPTFace: Generative Pre-training of Facial-Linguistic Transformer by Span Masking and Weakly Correlated Text-image Data | Yudong Li et.al. | 2510.18345 | null |
| 2025-10-21 | Combining Distantly Supervised Models with In Context Learning for Monolingual and Cross-Lingual Relation Extraction | Vipul Rathore et.al. | 2510.18344 | null |
| 2025-10-21 | Why Policy Gradient Algorithms Work for Undiscounted Total-Reward MDPs | Jongmin Lee et.al. | 2510.18340 | null |
| 2025-10-21 | ECG-LLM – training and evaluation of domain-specific large language models for electrocardiography | Lara Ahrens et.al. | 2510.18339 | null |
| 2025-10-21 | Position: LLM Watermarking Should Align Stakeholders’ Incentives for Practical Adoption | Yepeng Liu et.al. | 2510.18333 | null |
| 2025-10-21 | InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration | Yunkun Wang et.al. | 2510.18327 | null |
| 2025-10-21 | Beyond Single Models: Mitigating Multimodal Hallucinations via Adaptive Token Ensemble Decoding | Jinlin Li et.al. | 2510.18321 | null |
| 2025-10-21 | Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming | Zheng Zhang et.al. | 2510.18314 | null |
| 2025-10-21 | ParaStyleTTS: Toward Efficient and Robust Paralinguistic Style Control for Expressive Text-to-Speech Generation | Haowei Lou et.al. | 2510.18308 | null |
| 2025-10-21 | The Impact of Image Resolution on Biomedical Multimodal Large Language Models | Liangyu Chen et.al. | 2510.18304 | null |
| 2025-10-21 | Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models | Lehan Wang et.al. | 2510.18303 | null |
| 2025-10-21 | From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering | Lei Li et.al. | 2510.18297 | null |
| 2025-10-21 | BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks | Tianyuan Huang et.al. | 2510.18288 | null |
| 2025-10-21 | Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs | Yanhong Li et.al. | 2510.18279 | null |
| 2025-10-21 | Enhancing Hotel Recommendations with AI: LLM-Based Review Summarization and Query-Driven Insights | Nikolaos Belibasakis et.al. | 2510.18277 | null |
| 2025-10-21 | StreamingTOM: Streaming Token Compression for Efficient Video Understanding | Xueyi Chen et.al. | 2510.18269 | null |
| 2025-10-21 | UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding | Da Zhang et.al. | 2510.18262 | null |
| 2025-10-21 | DelvePO: Direction-Guided Self-Evolving Framework for Flexible Prompt Optimization | Tao Tao et.al. | 2510.18257 | null |
| 2025-10-21 | Illusions of reflection: open-ended task reveals systematic failures in Large Language Models’ reflective reasoning | Sion Weatherhead et.al. | 2510.18254 | null |
Reinforcement Learning
| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-10-29 | Prospects for a 95 GeV Higgs Boson at Future Higgs Factories with Transformer Networks | Yabo Dong et.al. | 2510.24662 | null |
| 2025-10-29 | OpenReward: Learning to Reward Long-form Agentic Tasks via Reinforcement Learning | Ziyou Hu et.al. | 2510.24636 | null |
| 2025-10-28 | Cluster Dose Prediction in Carbon Ion Therapy: Using Transfer Learning from a Pretrained Dose Prediction U-Net | Miriam Schwarze et.al. | 2510.24703 | null |
| 2025-10-28 | Greedy Sampling Is Provably Efficient for RLHF | Di Wu et.al. | 2510.24700 | null |
| 2025-10-28 | How Flat is a Plateau? Evolution of Late-Time TDE Disks | Yael Alush et.al. | 2510.24696 | null |
| 2025-10-28 | SPICE: Self-Play In Corpus Environments Improves Reasoning | Bo Liu et.al. | 2510.24684 | null |
| 2025-10-28 | Fare: Failure Resilience in Learned Visual Navigation Control | Zishuo Wang et.al. | 2510.24680 | null |
| 2025-10-28 | Learning to Drive Safely with Hybrid Options | Bram De Cooman et.al. | 2510.24674 | null |
| 2025-10-28 | Evolving Diagnostic Agents in a Virtual Clinical Environment | Pengcheng Qiu et.al. | 2510.24654 | null |
| 2025-10-28 | Advancing site-specific disease and pest management in precision agriculture: From reasoning-driven foundation models to adaptive, feedback-based learning | Nitin Rai et.al. | 2510.24650 | null |
| 2025-10-28 | Fast Bayesian Multilevel Quasi-Monte Carlo | Aleksei G. Sorokin et.al. | 2510.24604 | null |
| 2025-10-28 | Low-lying baryon resonances from lattice QCD | Colin Morningstar et.al. | 2510.24596 | null |
| 2025-10-28 | Towards Quadrupedal Jumping and Walking for Dynamic Locomotion using Reinforcement Learning | Jørgen Anker Olsen et.al. | 2510.24584 | null |
| 2025-10-28 | Dual-Mind World Models: A General Framework for Learning in Dynamic Wireless Networks | Lingyi Wang et.al. | 2510.24546 | null |
| 2025-10-28 | Sample-efficient and Scalable Exploration in Continuous-Time RL | Klemens Iten et.al. | 2510.24482 | null |
| 2025-10-28 | Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural Networks | Korneel Van den Berghe et.al. | 2510.24461 | null |
| 2025-10-28 | Pair Approximation Meets Reality: Diffusion of Innovation in Organizational Networks within the biased-independence q-Voter Model | Angelika Abramiuk-Szurlej et.al. | 2510.24447 | null |
| 2025-10-28 | SPARTA: Evaluating Reasoning Segmentation Robustness through Black-Box Adversarial Paraphrasing in Text Autoencoder Latent Space | Viktoriia Zinkovich et.al. | 2510.24446 | null |
| 2025-10-28 | Fill in the Blanks: Accelerating Q-Learning with a Handful of Demonstrations in Sparse Reward Settings | Seyed Mahdi Basiri Azad et.al. | 2510.24432 | null |
| 2025-10-28 | MiniOneRec: An Open-Source Framework for Scaling Generative Recommendation | Xiaoyu Kong et.al. | 2510.24431 | null |
| 2025-10-28 | Multi-Agent Evolve: LLM Self-Improve through Co-evolution | Yixing Chen et.al. | 2510.23595 | null |
| 2025-10-28 | VOLD: Reasoning Transfer from LLMs to Vision-Language Models via On-Policy Distillation | Walid Bousselham et.al. | 2510.23497 | null |
| 2025-10-28 | SGFusion: Stochastic Geographic Gradient Fusion in Federated Learning | Khoa Nguyen et.al. | 2510.23455 | null |
| 2025-10-27 | Think Twice: Branch-and-Rethink Reasoning Reward Model | Yizhu Jiao et.al. | 2510.23596 | null |
| 2025-10-27 | Cosmic magnification on multi-catalogue Herschel submillimetre galaxies | R. Fernandez-Fernandez et.al. | 2510.23582 | null |
| 2025-10-27 | Towards Stochastic (N-1)-Secure Redispatch | Oleksii Molodchyk et.al. | 2510.23551 | null |
| 2025-10-27 | Variational Thermal State Preparation on Digital Quantum Processors Assisted by Matrix Product States | Rui-Hao Li et.al. | 2510.23546 | null |
| 2025-10-27 | Approximately optimal distributed controls for high-dimensional stochastic systems with pairwise interaction through controls | Elise Devey et.al. | 2510.23537 | null |
| 2025-10-27 | Sequential Multi-Agent Dynamic Algorithm Configuration | Chen Lu et.al. | 2510.23535 | null |
| 2025-10-27 | Learning to Reason Efficiently with Discounted Reinforcement Learning | Alex Ayoub et.al. | 2510.23486 | null |
| 2025-10-27 | MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding | Xin Jin et.al. | 2510.23479 | null |
| 2025-10-27 | Video-Thinker: Sparking “Thinking with Videos” via Reinforcement Learning | Shijian Wang et.al. | 2510.23473 | null |
| 2025-10-27 | Adaptive Multilevel Splitting: First Application to Rare-Event Derivative Pricing | Riccardo Gozzo et.al. | 2510.23461 | null |
| 2025-10-27 | Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences | Zhuoran Jin et.al. | 2510.23451 | null |
| 2025-10-27 | An Information-Theoretic Analysis of Out-of-Distribution Generalization in Meta-Learning with Applications to Meta-RL | Xingtu Liu et.al. | 2510.23448 | null |
| 2025-10-27 | Causal Deep Q Network | Elouanes Khelifi et.al. | 2510.23424 | null |
| 2025-10-27 | A Sequential Planning Framework for the Operational Reality of Interacting Air Traffic Flow Regulations and Traffic Flow Programs | Thinh Hoang et.al. | 2510.23402 | null |
| 2025-10-27 | VideoTG-R1: Boosting Video Temporal Grounding via Curriculum Reinforcement Learning on Reflected Boundary Annotations | Lu Dong et.al. | 2510.23397 | null |
| 2025-10-27 | The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation | Farid Bagirov et.al. | 2510.23393 | null |
| 2025-10-27 | Ground-state phase diagram of S = 1/2 Heisenberg model on 2D square-hexagon-octagon lattice | Yumeng Luo et.al. | 2510.23376 | null |
| 2025-10-24 | Mechanistic Interpretability for Neural TSP Solvers | Reuben Narad et.al. | 2510.21693 | null |
| 2025-10-24 | Reduced Floating-Point Precision Implicit Monte Carlo | Simon Butson et.al. | 2510.21683 | null |
| 2025-10-24 | Goal-based portfolio selection with fixed transaction costs | Erhan Bayraktar et.al. | 2510.21650 | null |
| 2025-10-24 | Electroweak corrections to $gg \rightarrow \gamma\gamma$ | Gabriele Fiore et.al. | 2510.21643 | null |
| 2025-10-24 | Predicted observational effects of rapid rotation for Be stars | Rina G. Rast et.al. | 2510.21640 | null |
| 2025-10-24 | DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection | Tala Aljaafari et.al. | 2510.21638 | null |
| 2025-10-24 | DeepAgent: A General Reasoning Agent with Scalable Toolsets | Xiaoxi Li et.al. | 2510.21618 | null |
| 2025-10-24 | Enhancing Tactile-based Reinforcement Learning for Robotic Control | Elle Miller et.al. | 2510.21609 | null |
| 2025-10-24 | Multilevel Picard scheme for solving high-dimensional drift control problems with state constraints | Yuan Zhong et.al. | 2510.21607 | null |
| 2025-10-24 | RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models | Xueyuan Lin et.al. | 2510.21604 | null |
| 2025-10-24 | Three-nucleon lepton-number-violating potentials in chiral EFT and their matrix elements in light nuclei | Graham Chambers-Wall et.al. | 2510.21564 | null |
| 2025-10-24 | System-Theoretic Analysis of Dynamic Generalized Nash Equilibrium Problems – Turnpikes and Dissipativity | Sophie Hall et.al. | 2510.21556 | null |
| 2025-10-24 | Cost Minimization for Space-Air-Ground Integrated Multi-Access Edge Computing Systems | Weihong Qin et.al. | 2510.21541 | null |
| 2025-10-24 | A Unified Model for Multi-Task Drone Routing in Post-Disaster Road Assessment | Huatian Gong et.al. | 2510.21525 | null |
| 2025-10-24 | Surrogate-based quantification of policy uncertainty in generative flow networks | Ramón Nartallo-Kaluarachchi et.al. | 2510.21523 | null |
| 2025-10-24 | The population of Galactic young massive star clusters in the TeV range | Rowan Batzofin et.al. | 2510.21480 | null |
| 2025-10-24 | MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization | Chenglong Wang et.al. | 2510.21473 | null |
| 2025-10-24 | Constraints on ultra-heavy dark matter from the CDEX-10 experiment at the China Jinping Underground Laboratory | Y. F. Wang et.al. | 2510.21458 | null |
| 2025-10-24 | Unified token representations for sequential decision models | Zhuojing Tian et.al. | 2510.21448 | null |
| 2025-10-24 | Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems | Hao Liang et.al. | 2510.21427 | null |
| 2025-10-24 | Real-Time Gait Adaptation for Quadrupeds using Model Predictive Control and Reinforcement Learning | Prakrut Kotecha et.al. | 2510.20706 | null |
| 2025-10-23 | KL-Regularized Reinforcement Learning is Designed to Mode Collapse | Anthony GX-Chen et.al. | 2510.20817 | null |
| 2025-10-23 | GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation | Guangqi Jiang et.al. | 2510.20813 | null |
| 2025-10-23 | A Microphysical Probe of Neutron Star Interiors: Constraining the Equation of State with Glitch Dynamics | Zhonghao Tu et.al. | 2510.20791 | null |
| 2025-10-23 | Consumption-Investment Problem in Rank-Based Models | David Itkin et.al. | 2510.20763 | null |
| 2025-10-23 | Reinforcement Learning and Consumption-Savings Behavior | Brandon Kaplowitz et.al. | 2510.20748 | null |
| 2025-10-23 | No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes | Jasmine Bayrooti et.al. | 2510.20725 | null |
| 2025-10-23 | Measuring cosmic dipole with the GRB luminosity-time relation | Jessica Santiago et.al. | 2510.20705 | null |
| 2025-10-23 | Plan Then Retrieve: Reinforcement Learning-Guided Complex Reasoning over Knowledge Graphs | Yanlin Song et.al. | 2510.20691 | null |
| 2025-10-23 | Downsizing Diffusion Models for Cardinality Estimation | Xinhe Mu et.al. | 2510.20681 | null |
| 2025-10-23 | The Shape of Reasoning: Topological Analysis of Reasoning Traces in Large Language Models | Xue Wen Tan et.al. | 2510.20665 | null |
| 2025-10-23 | Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence | Jiahao Meng et.al. | 2510.20579 | null |
| 2025-10-23 | EmbodiedBrain: Expanding Performance Boundaries of Task Planning for Embodied Intelligence | Ding Zou et.al. | 2510.20578 | null |
| 2025-10-23 | Monte Carlo Sampling for Wave Functions Requiring (Anti)Symmetrization | Koyena Bose et.al. | 2510.20577 | null |
| 2025-10-23 | AdaDoS: Adaptive DoS Attack via Deep Adversarial Reinforcement Learning in SDN | Wei Shao et.al. | 2510.20566 | null |
| 2025-10-23 | GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning | Jinchang Luo et.al. | 2510.20548 | null |
| 2025-10-23 | A Unified Framework for Zero-Shot Reinforcement Learning | Jacopo Di Ventura et.al. | 2510.20542 | null |
| 2025-10-23 | Detection of ultra-high-energy cosmic rays in the southern hemisphere with FAST: data acquisition and preliminary results | Jakub Kmec et.al. | 2510.20522 | null |
| 2025-10-23 | Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence | Kun Ouyang et.al. | 2510.20470 | null |
| 2025-10-23 | On Multiple Robustness of Proximal Dynamic Treatment Regimes | Yuanshan Gao et.al. | 2510.20451 | null |
| 2025-10-23 | DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning | Runpeng Xie et.al. | 2510.19562 | null |
| 2025-10-22 | olmOCR 2: Unit Test Rewards for Document OCR | Jake Poznanski et.al. | 2510.19817 | null |
| 2025-10-22 | Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing | Yusu Qian et.al. | 2510.19808 | null |
| 2025-10-22 | Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning | Xichen Zhang et.al. | 2510.19807 | null |
| 2025-10-22 | SmartSwitch: Advancing LLM Reasoning by Overcoming Underthinking via Promoting Deeper Thought Exploration | Xichen Zhang et.al. | 2510.19767 | null |
| 2025-10-22 | SEA: Semantic Map Prediction for Active Exploration of Uncertain Areas | Hongyu Ding et.al. | 2510.19766 | null |
| 2025-10-22 | Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning | Gunshi Gupta et.al. | 2510.19732 | null |
| 2025-10-22 | Semi-Implicit Approaches for Large-Scale Bayesian Spatial Interpolation | Sébastien Garneau et.al. | 2510.19722 | null |
| 2025-10-22 | MedReason-R1: Learning to Reason for CT Diagnosis with Reinforcement Learning and Local Zoom | Yifan Li et.al. | 2510.19626 | null |
| 2025-10-22 | Demonstrating Real Advantage of Machine-Learning-Enhanced Monte Carlo for Combinatorial Optimization | Luca Maria Del Bono et.al. | 2510.19544 | null |
| 2025-10-22 | Quantum Monte Carlo study of low-dimensional Fermi fluids of dipolar atoms | Clio Johnson et.al. | 2510.19533 | null |
| 2025-10-22 | The Confusing Instance Principle for Online Linear Quadratic Control | Waris Radji et.al. | 2510.19531 | null |
| 2025-10-22 | Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning | Ruiyao Miao et.al. | 2510.19530 | null |
| 2025-10-22 | Learning Upper Lower Value Envelopes to Shape Online RL: A Principled Approach | Sebastian Reboul et.al. | 2510.19528 | null |
| 2025-10-22 | Practical algorithm for simulating thermal pure quantum states | Wei-Bo He et.al. | 2510.19504 | null |
| 2025-10-22 | Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning | Kevin Huang et.al. | 2510.19495 | null |
| 2025-10-22 | Quantum Machine Learning methods for Fourier-based distribution estimation with application in option pricing | Fernando Alonso et.al. | 2510.19494 | null |
| 2025-10-22 | Monte Carlo study of the $O(2)$-invariant $φ^4$ theory with a cubic perturbation in three dimensions | Martin Hasenbusch et.al. | 2510.19473 | null |
| 2025-10-22 | Reasoning Like Experts: Leveraging Multimodal Large Language Models for Drawing-based Psychoanalysis | Xueqi Ma et.al. | 2510.19451 | null |
| 2025-10-22 | Universal Quantitative Abstraction: Categorical Duality and Logical Completeness for Probabilistic Systems | Nivar Anwer et.al. | 2510.19444 | null |
| 2025-10-21 | Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting | Howard Chen et.al. | 2510.18874 | null |
| 2025-10-21 | EffiReasonTrans: RL-Optimized Reasoning for Code Translation | Yanlin Wang et.al. | 2510.18863 | null |
| 2025-10-21 | Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model | Ling Team et.al. | 2510.18855 | null |
| 2025-10-21 | Lyapunov-Aware Quantum-Inspired Reinforcement Learning for Continuous-Time Vehicle Control: A Feasibility Study | Nutkritta Kraipatthanapong et.al. | 2510.18852 | null |
| 2025-10-21 | Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning | Chenghao Zhu et.al. | 2510.18849 | null |
| 2025-10-21 | MADR: MPC-guided Adversarial DeepReach | Ryan Teoh et.al. | 2510.18845 | null |
| 2025-10-21 | PCMS: Parallel Coupler For Multimodel Simulations | Jacob S. Merson et.al. | 2510.18838 | null |
| 2025-10-21 | Actor-Free Continuous Control via Structurally Maximizable Q-Functions | Yigit Korkmaz et.al. | 2510.18828 | null |
| 2025-10-21 | Search Self-play: Pushing the Frontier of Agent Capability without Supervision | Hongliang Lu et.al. | 2510.18821 | null |
| 2025-10-21 | Online SFT for LLM Reasoning: Surprising Effectiveness of Self-Tuning without Rewards | Mengqi Li et.al. | 2510.18814 | null |
| 2025-10-21 | Computational Foundations for Strategic Coopetition: Formalizing Interdependence and Complementarity | Vik Pant et.al. | 2510.18802 | null |
| 2025-10-21 | Two-loop QCD corrections for real and off-shell diphoton and triphoton production via quark loops | Dario Kermanschah et.al. | 2510.18801 | null |
| 2025-10-21 | WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection | Guanzhong He et.al. | 2510.18798 | null |
| 2025-10-21 | Beware of the running $n_s$ when producing heavy primordial black holes | Sasha Allegrini et.al. | 2510.18791 | null |
| 2025-10-21 | Analysis note: measurement of thrust and track energy-energy correlator in $e^+e^-$ collisions at 91.2 GeV with DELPHI open data | Jingyu Zhang et.al. | 2510.18762 | null |
| 2025-10-21 | Verifiable Accuracy and Abstention Rewards in Curriculum RL to Alleviate Lost-in-Conversation | Ming Li et.al. | 2510.18731 | null |
| 2025-10-21 | Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options | Joongkyu Lee et.al. | 2510.18713 | null |
| 2025-10-21 | Chemistry, Climate, and Transmission Spectra of TRAPPIST-1 e Explored with a Multimodel Sparse Sampled Ensemble | Eric T. Wolf et.al. | 2510.18704 | null |
| 2025-10-21 | Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach | Chenbei Lu et.al. | 2510.18687 | null |
| 2025-10-21 | Sherlock Your Queries: Learning to Ask the Right Questions for Dialogue-Based Retrieval | Dong Yun et.al. | 2510.18659 | null |
| 2025-10-21 | An integrated neural wavefunction solver for spinful Fermi systems | Alexander Avdoshkin et.al. | 2510.18621 | null |
| 2025-10-21 | CUARewardBench: A Benchmark for Evaluating Reward Models on Computer-using Agent | Haojia Lin et.al. | 2510.18596 | null |
| 2025-10-21 | Deep Q-Learning Assisted Bandwidth Reservation for Multi-Operator Time-Sensitive Vehicular Networking | Abdullah Al-Khatib et.al. | 2510.18553 | null |
| 2025-10-21 | Improved thermonuclear rate of $^{42}$Ti($p$,$γ$)$^{43}$ V and its astrophysical implication in rp-process | S. Q. Hou et.al. | 2510.18531 | null |
| 2025-10-21 | Efficient Model-Based Reinforcement Learning for Robot Control via Online Learning | Fang Nan et.al. | 2510.18518 | null |
| 2025-10-21 | Socialized Learning and Emergent Behaviors in Multi-Agent Systems based on Multimodal Large Language Models | Sureyya Akin et.al. | 2510.18515 | null |
| 2025-10-21 | Learning to Navigate Under Imperfect Perception: Conformalised Segmentation for Safe Reinforcement Learning | Daniel Bethell et.al. | 2510.18485 | null |
| 2025-10-21 | Safe But Not Sorry: Reducing Over-Conservatism in Safety Critics via Uncertainty-Aware Modulation | Daniel Bethell et.al. | 2510.18478 | null |
| 2025-10-21 | CodeRL+: Improving Code Generation via Reinforcement with Execution Semantics Alignment | Xue Jiang et.al. | 2510.18471 | null |
| 2025-10-21 | Uncovering critical temperature dependence in Heusler magnets via explicit machine learning | Jean-Baptiste Morée et.al. | 2510.18469 | null |
| 2025-10-21 | DeLoad: Demand-Driven Short-Video Preloading with Scalable Watch-Time Estimation | Tong Liu et.al. | 2510.18459 | null |
| 2025-10-21 | Fingerprints of cluster-based Haldane and bound-magnon states in a spin-1 Heisenberg diamond chain | Azam Zoshki et.al. | 2510.18447 | null |
| 2025-10-21 | PlanU: Large Language Model Decision Making through Planning under Uncertainty | Ziwei Deng et.al. | 2510.18442 | null |
| 2025-10-21 | Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents | Guangfu Guo et.al. | 2510.18424 | null |
| 2025-10-21 | On AI Verification in Open RAN | Rahul Soundrarajan et.al. | 2510.18417 | null |
| 2025-10-21 | MENTOR: A Reinforcement Learning Framework for Model Enhancement via Teacher-Optimized Rewards in Small Models | ChangSu Choi et.al. | 2510.18383 | null |
| 2025-10-21 | Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback | Yi-Lun Wu et.al. | 2510.18353 | null |
| 2025-10-21 | PGTT: Phase-Guided Terrain Traversal for Perceptive Legged Locomotion | Alexandros Ntagkas et.al. | 2510.18348 | null |
| 2025-10-21 | Why Policy Gradient Algorithms Work for Undiscounted Total-Reward MDPs | Jongmin Lee et.al. | 2510.18340 | null |
| 2025-10-21 | The implications of inflation for the last ACT | Zhi-Chong Qiu et.al. | 2510.18320 | null |
| 2025-10-21 | MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile Manipulation | Chengshu Li et.al. | 2510.18316 | null |
| 2025-10-21 | Higher Embedding Dimension Creates a Stronger World Model for a Simple Sorting Task | Brady Bhalla et.al. | 2510.18315 | null |
| 2025-10-21 | Proactive Reasoning-with-Retrieval Framework for Medical Multimodal Large Language Models | Lehan Wang et.al. | 2510.18303 | null |
| 2025-10-21 | Food4All: A Multi-Agent Framework for Real-time Free Food Discovery with Integrated Nutritional Metadata | Zhengqing Yuan et.al. | 2510.18289 | null |
| 2025-10-21 | From Competition to Synergy: Unlocking Reinforcement Learning for Subject-Driven Image Generation | Ziwei Huang et.al. | 2510.18263 | null |
| 2025-10-21 | NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective | Xiaohan Qin et.al. | 2510.18258 | null |
| 2025-10-21 | The Picard-Lagrange Framework for Higher-Order Langevin Monte Carlo | Jaideep Mahajan et.al. | 2510.18242 | null |
| 2025-10-21 | Nash Policy Gradient: A Policy Gradient Method with Iteratively Refined Regularization for Finding Nash Equilibria | Eason Yu et.al. | 2510.18183 | null |
| 2025-10-20 | Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains | Soumya Rani Samineni et.al. | 2510.18176 | null |
| 2025-10-20 | LLMs Encode How Difficult Problems Are | William Lugoloobi et.al. | 2510.18147 | null |
| 2025-10-20 | Measuring Reasoning in LLMs: a New Dialectical Angle | Soheil Abbasloo et.al. | 2510.18134 | null |
| 2025-10-20 | R2BC: Multi-Agent Imitation Learning from Single-Agent Demonstrations | Connor Mattson et.al. | 2510.18085 | null |
| 2025-10-20 | RL-Driven Security-Aware Resource Allocation Framework for UAV-Assisted O-RAN | Zaineh Abughazzah et.al. | 2510.18084 | null |
| 2025-10-20 | Provably Optimal Reinforcement Learning under Safety Filtering | Donggeon David Oh et.al. | 2510.18082 | null |
| 2025-10-20 | R2L: Reliable Reinforcement Learning: Guaranteed Return & Reliable Policies in Reinforcement Learning | Nadir Farhi et.al. | 2510.18074 | null |
| 2025-10-20 | Fine-tuning Flow Matching Generative Models with Intermediate Feedback | Jiajun Fan et.al. | 2510.18072 | null |
| 2025-10-20 | Oxidation State Dynamics and Emerging Patterns in Magnetite | Emre Gürsoy et.al. | 2510.18061 | null |
| 2025-10-20 | SPACeR: Self-Play Anchoring with Centralized Reference Models | Wei-Jer Chang et.al. | 2510.18060 | null |
| 2025-10-20 | Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models | Jiajun Fan et.al. | 2510.18053 | null |
| 2025-10-20 | OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning | Zhenyu Bi et.al. | 2510.18032 | null |
| 2025-10-20 | Humanoid Goalkeeper: Learning from Position Conditioned Task-Motion Constraints | Junli Ren et.al. | 2510.18002 | null |
| 2025-10-20 | Collider Searches for Near-Continuum Dark Matter | Steven Ferrante et.al. | 2510.17989 | null |
| 2025-10-20 | Accelerating Bayesian Inference via Multi-Fidelity Transport Map Coupling | Sanjan C. Muchandimath et.al. | 2510.17946 | null |
| 2025-10-20 | An Exact Quantile-Energy Equality for Terminal Halfspaces in Linear-Gaussian Control with a Discrete-Time Companion, KL/Schrodinger Links, and High-Precision Validation | Sandro Andric et.al. | 2510.17945 | null |
| 2025-10-20 | UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts | Fu-Yun Wang et.al. | 2510.17937 | null |
| 2025-10-20 | EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning | He Du et.al. | 2510.17928 | null |
| 2025-10-20 | Rewarding the Journey, Not Just the Destination: A Composite Path and Answer Self-Scoring Reward Mechanism for Test-Time Reinforcement Learning | Chenwei Tang et.al. | 2510.17923 | null |
| 2025-10-20 | CLAWS:Creativity detection for LLM-generated solutions using Attention Window of Sections | Keuntae Kim et.al. | 2510.17921 | null |
| 2025-10-20 | Functional Distribution Networks (FDN) | Omer Haq et.al. | 2510.17794 | null |
| 2025-10-20 | Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains | Austin Xu et.al. | 2510.17793 | null |
| 2025-10-20 | SoftMimic: Learning Compliant Whole-body Control from Examples | Gabriel B. Margolis et.al. | 2510.17792 | null |
| 2025-10-20 | UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action | Yuhao Yang et.al. | 2510.17790 | null |
| 2025-10-20 | B-Meson Anomalies: Effective Field Theory Meets Machine Learning | Alejandro Mir et.al. | 2510.17742 | null |
| 2025-10-20 | Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations | Tong Chen et.al. | 2510.17733 | null |
| 2025-10-20 | QueST: Incentivizing LLMs to Generate Difficult Problems | Hanxu Hu et.al. | 2510.17715 | null |
| 2025-10-20 | The Marked Edge Walk: A Novel MCMC Algorithm for Sampling of Graph Partitions | Atticus McWhorter et.al. | 2510.17714 | null |
| 2025-10-20 | A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning | Anjie Liu et.al. | 2510.17697 | null |
| 2025-10-20 | Efficient Algorithms for Mitigating Uncertainty and Risk in Reinforcement Learning | Xihong Su et.al. | 2510.17690 | null |
| 2025-10-20 | CrossGuard: Safeguarding MLLMs against Joint-Modal Implicit Malicious Attacks | Xu Zhang et.al. | 2510.17687 | null |
| 2025-10-20 | RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation | Yuquan Xue et.al. | 2510.17640 | null |
| 2025-10-20 | Colour coherence in small collision systems | Isobel Kolbé et.al. | 2510.17570 | null |
| 2025-10-20 | An Empirical Study of Lagrangian Methods in Safe Reinforcement Learning | Lindsay Spoor et.al. | 2510.17564 | null |
| 2025-10-20 | Towards Optimal Control and Algorithmic Structure of Decompression Schedules | Benjamin Marsh et.al. | 2510.17551 | null |
| 2025-10-20 | OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction | Raghu Vamshi Hemadri et.al. | 2510.17532 | null |
| 2025-10-20 | Plasma Shape Control via Zero-shot Generative Reinforcement Learning | Niannian Wu et.al. | 2510.17531 | null |
| 2025-10-20 | Toward Autonomous Neural VMC: An Energy-Variance Convergence Criterion for Quantum Systems | Huan-Chen Shi et.al. | 2510.17490 | null |
| 2025-10-20 | Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs | Paula Cordero-Encinar et.al. | 2510.17472 | null |
| 2025-10-20 | Estimating Orbital Parameters of Direct Imaging Exoplanet Using Neural Network | Bo Liang et.al. | 2510.17459 | null |
| 2025-10-20 | Agentic Reinforcement Learning for Search is Unsafe | Yushi Yang et.al. | 2510.17431 | null |
| 2025-10-20 | Leveraging Group Relative Policy Optimization to Advance Large Language Models in Traditional Chinese Medicine | Jiacheng Xie et.al. | 2510.17402 | null |
| 2025-10-20 | Finite-Time Bounds for Average-Reward Fitted Q-Iteration | Jongmin Lee et.al. | 2510.17391 | null |
| 2025-10-20 | Inference of Deterministic Finite Automata via Q-Learning | Elaheh Hosseinkhani et.al. | 2510.17386 | null |
| 2025-10-20 | TabR1: Taming GRPO for tabular reasoning LLMs | Pengxiang Cai et.al. | 2510.17385 | null |
| 2025-10-20 | Optimizing Energy Management of Smart Grid using Reinforcement Learning aided by Surrogate models built using Physics-informed Neural Networks | Julen Cestero et.al. | 2510.17380 | null |
| 2025-10-20 | When 5G NTN Meets GNSS: Tracking GNSS Signals under Overlaid 5G Waveforms | Idir Edjekouane et.al. | 2510.17324 | null |
| 2025-10-20 | Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling | Lipeng Xie et.al. | 2510.17314 | null |
| 2025-10-20 | Multimodal Safety Is Asymmetric: Cross-Modal Exploits Unlock Black-Box MLLMs Jailbreaks | Xinkai Wang et.al. | 2510.17277 | null |
| 2025-10-20 | Characterizing expansivity through $C^*$ -algebras | S. Bautista et.al. | 2510.17255 | null |
| 2025-10-20 | From Preferences to Prejudice: The Role of Alignment Tuning in Shaping Social Bias in Video Diffusion Models | Zefan Cai et.al. | 2510.17247 | null |
| 2025-10-20 | Deep Neural Network extraction of Unpolarized Transverse Momentum Distributions | I. P. Fernando et.al. | 2510.17243 | null |
| 2025-10-20 | Coinvisor: An RL-Enhanced Chatbot Agent for Interactive Cryptocurrency Investment Analysis | Chong Chen et.al. | 2510.17235 | null |
| 2025-10-20 | D2C-HRHR: Discrete Actions with Double Distributional Critics for High-Risk-High-Return Tasks | Jundong Zhang et.al. | 2510.17212 | null |
| 2025-10-20 | Trading with the Devil: Risk and Return in Foundation Model Strategies | Jinrui Zhang et.al. | 2510.17165 | null |
| 2025-10-20 | ALPINE: A Lightweight and Adaptive Privacy-Decision Agent Framework for Dynamic Edge Crowdsensing | Guanjie Cheng et.al. | 2510.17162 | null |
| 2025-10-20 | GACO-CAD: Geometry-Augmented and Conciseness-Optimized CAD Model Generation from Single Image | Yinghui Wang et.al. | 2510.17157 | null |
| 2025-10-20 | Decentralized Real-Time Planning for Multi-UAV Cooperative Manipulation via Imitation Learning | Shantnav Agarwal et.al. | 2510.17143 | null |
| 2025-10-20 | Rethinking On-policy Optimization for Query Augmentation | Zhichao Xu et.al. | 2510.17139 | null |
| 2025-10-20 | Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control | Chengxiu Hua et.al. | 2510.17122 | null |
| 2025-10-20 | Learning to Design Soft Hands using Reward Models | Xueqian Bai et.al. | 2510.17086 | null |
| 2025-10-20 | Consistent Zero-Shot Imitation with Contrastive Goal Inference | Kathryn Wantlin et.al. | 2510.17059 | null |
Notes:
- We have modified the sorting rule of the above table to prioritize papers based on the time of their latest update rather than their initial publication date. If an article has been recently modified, it will appear earlier in the list.
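
The sorting rule described in the note can be sketched roughly as follows. This is a minimal illustration, not the script that generates this list: the record layout, the field names `published` and `updated`, and the placeholder ids are all assumptions made for the example.

```python
from datetime import date

# Hypothetical records; `published` and `updated` are assumed field names,
# and the ids are placeholders rather than real entries from the table.
papers = [
    {"id": "A", "published": date(2025, 10, 20), "updated": date(2025, 10, 28)},
    {"id": "B", "published": date(2025, 10, 24), "updated": date(2025, 10, 24)},
    {"id": "C", "published": date(2025, 10, 21), "updated": date(2025, 10, 21)},
]


def sort_by_latest_update(entries):
    """Return entries ordered newest-first by their latest update date."""
    return sorted(entries, key=lambda e: e["updated"], reverse=True)


ordered = sort_by_latest_update(papers)
order_ids = [p["id"] for p in ordered]
print(order_ids)
```

In this sketch, paper `A` has the oldest publication date but the most recent update, so it is listed first. Sorting by `published` instead would put `B` at the top.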
Function added: