CMander02/DailyAgentPapers

DailyAgentPapers

Daily Automated Summaries of Arxiv Agent Papers

2026-06-18 (28 papers)

| Score | Paper | Tags |
| --- | --- | --- |
| 9.5 | Beyond Global Replanning: Hierarchical Recovery for Cross-Device Agent Systems | Multi-Device Agent, Hierarchical Replanning, Agent Recovery |
| 9.5 | LLM agent safety, multi-turn red-teaming, jailbreak benchmarks, adversarial robustness, safety-critical systems | Agent Safety, Multi-Turn Red-Teaming, Jailbreak Benchmark |
| 9.5 | ScholarQuest: A Taxonomy-Guided Benchmark for Agentic Academic Paper Search in Open Literature Environments | Academic Search Agent, Agent Evaluation Benchmark, Information Retrieval Agent |
| 9.5 | Beyond Static Endpoints: Tool Programs as an Interface for Flexible Agentic Web Services | Agent Tool Use, Agent Planning, Agent Execution Optimization |
| 9.5 | MetaResearcher: Scaling Deep Research via Self-Reflective Reinforcement Learning in Adversarial Virtual Environments | Multi-Agent Collaboration, Adversarial Training, Deep Research Agent |
| 9.5 | Heterogeneous LLM Debate Under Adversarial Peers: Honest Gains, Replacement Costs, and Resilience | Multi-Agent Debate, Adversarial Robustness, Heterogeneous LLMs |
| 9.5 | SIGMA: Skill-Incidence Graphs for Compositional Multi-Agent Design | Multi-Agent Systems, Skill Composition, Graph Neural Networks |
| 9.0 | LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents | Tool-Use Agent, State Management, Policy Compliance |
| 9.0 | Marginal Advantage Accumulation for Memory-Driven Agent Self-Evolution | Agent Self-Evolution, Memory-Driven, Marginal Advantage Accumulation |
| 9.0 | MedRLM: Recursive Multimodal Health Intelligence for Long-Context Clinical Reasoning, Sensor-Guided Screening, Evidence-Grounded Decision Support, and Community-to-Tertiary Referral Optimization | Multi-Agent Collaboration, Medical AI Agent, Recursive Reasoning |
| 9.0 | Connect the Dots: Training LLMs for Long-Lifecycle Agents with Cross-Domain Generalization Via Reinforcement Learning [Code] | LLM Agent, Reinforcement Learning, Long-Lifecycle Agent |
| 9.0 | Multi-Agent Transactive Memory | Multi-Agent Systems, Memory Sharing, Retrieval-Augmented Generation |
| 9.0 | AtomMem: Building Simple and Effective Memory System for LLM Agents via Atomic Facts | Agent Memory System, Memory Augmentation, Atomic Fact Extraction |
| 9.0 | ORAgentBench: Can LLM Agents Solve Challenging Operations Research Tasks End to End? | LLM Agent, Operations Research, Benchmark |
| 9.0 | Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents | Agent Evaluation, Predictive Validity, Benchmark Design |
| 8.5 | Contagion Networks: Evaluator Bias Propagation in Multi-Agent LLM Systems | Multi-Agent Systems, Evaluation Bias Propagation, LLM Evaluators |
| 8.5 | UltraQuant: 4-bit KV Caching for Context-Heavy Agents | LLM Agent, KV-Cache Compression, 4-bit Quantization |
| 8.5 | SoftSkill: Behavioral Compression for Contextual Adaptation | Agent Skill Compression, Soft Prompt Tuning, Frozen Backbone Agent |
| 8.5 | ScaffoldAgent: Utility-Guided Dynamic Outline Optimization for Open-Ended Deep Research | LLM Agent, Deep Research Agent, Multi-Round Retrieval |
| 8.5 | When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents [Code] | Agent Safety, Tool Use, Privilege Minimization |
| 8.5 | AgentFinVQA: A Deployable Multi-Agent Pipeline for Auditable Financial Chart QA | Multi-Agent Pipeline, Financial Chart QA, Auditability |
| 8.0 | Probe-and-Refine Tuning of Repository Guidance for Coding Agents | Coding Agent, Repository Guidance, Probe-and-Refine Tuning |
| 7.5 | Efficient and Sound Probabilistic Verification for AI Agents | Agent Safety, Runtime Monitoring, Probabilistic Verification |
| 7.5 | Phoenix: Safe GitHub Issue Resolution via Multi-Agent LLMs | Multi-Agent Systems, Code Agent, Software Engineering Agent |
| 7.5 | N-Version Programming with Coding Agents | Coding Agent, N-Version Programming, Agent Diversity |
| 7.5 | VIMPO: Value-Implicit Policy Optimization for LLMs | LLM Agent Reasoning Enhancement, Reinforcement Learning, Policy Optimization |
| 7.5 | A Systematic Evaluation of Black-Box Uncertainty Estimation Methods for Large Language Models | Black-Box UE, Multi-Agent, Uncertainty Estimation |
| 7.5 | Large Language Models Do Not Always Need Readable Language | LLM Agent, Cross-Agent Communication, Agent Memory |
