arXiv:2508.13167v1 Announce Type: new
Abstract: Recent advances in large language models (LLMs) and multi-agent systems have demonstrated remarkable capabilities in complex problem-solving tasks such as deep research, vibe coding, and mathematical reasoning. However, most existing multi-agent systems are built upon manual prompt/workflow engineering with sophisticated agent frameworks, making them computationally inefficient, less capable, and unable to benefit from data-centric learning. In this work, we introduce Chain-of-Agents (CoA), a novel paradigm of LLM reasoning that enables native end-to-end complex problem-solving in the same way as a multi-agent system (i.e., multi-turn problem solving with multiple tools and multiple agents) within one model. In chain-of-agents problem-solving, the model dynamically activates different tool agents and role-playing agents to simulate multi-agent collaboration in an end-to-end fashion. To elicit end-to-end chain-of-agents problem-solving abilities in LLMs, we introduce a multi-agent distillation framework that distills state-of-the-art multi-agent systems into chain-of-agents trajectories for agentic supervised fine-tuning. We then use agentic reinforcement learning on verifiable agentic tasks to further improve the models' capabilities in chain-of-agents problem solving. We call the resulting models Agent Foundation Models (AFMs). Our empirical studies demonstrate that AFM establishes new state-of-the-art performance across diverse benchmarks in both web agent and code agent settings. We make the entire research, including the model weights, the training and evaluation code, and the training data, fully open source, which offers a solid starting point for future research on agent models and agentic RL.
Expert Commentary: Innovations in the Chain-of-Agents (CoA) Paradigm for Problem-Solving
Recent developments in large language models (LLMs) and multi-agent systems have significantly advanced complex problem-solving in tasks such as deep research, vibe coding, and mathematical reasoning. However, traditional multi-agent systems typically rely on manual prompt and workflow engineering atop sophisticated agent frameworks, which makes them computationally inefficient, limits their capabilities, and prevents them from benefiting from data-centric learning.
This work introduces Chain-of-Agents (CoA), a paradigm that enables native end-to-end complex problem-solving within a single model. Rather than orchestrating separate models, CoA simulates multi-agent collaboration by having one model dynamically activate tool agents and role-playing agents over a multi-turn rollout. Because the entire collaboration lives inside one model, it can be trained end to end instead of hand-engineered; the rollout sketch below illustrates the idea.
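To make this concrete, here is a minimal sketch of what a single-model, chain-of-agents rollout loop might look like. The control tags (`<tool>`, `<observation>`), the tool registry, and the dummy model are illustrative assumptions for exposition, not AFM's actual interface.

```python
# Minimal sketch of a chain-of-agents rollout loop. Tag names, the tool
# registry, and the dummy model are assumptions, not AFM's real interface.
import re
from typing import Callable

# Hypothetical tool agents the model "activates" by emitting tagged spans,
# instead of handing control to a separately orchestrated model.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"[search results for: {q}]",  # stub web search
    "python": lambda src: f"[stdout of: {src}]",       # stub code runner
}

def chain_of_agents(task: str, generate: Callable[[str], str],
                    max_turns: int = 8) -> str:
    """Single-model rollout: generate, detect tool calls, append observations."""
    context = task
    for _ in range(max_turns):
        step = generate(context)          # one forward pass of the agent model
        context += step
        call = re.search(r'<tool name="(\w+)">(.*?)</tool>', step, re.S)
        if call is None:
            return step                   # no tool call: treat as final answer
        name, args = call.group(1), call.group(2)
        observation = TOOLS.get(name, lambda _: "[unknown tool]")(args)
        context += f"<observation>{observation}</observation>"
    return context

# Dummy model: calls search once, then answers. A real AFM replaces this.
def dummy_model(context: str) -> str:
    if "<observation>" not in context:
        return '<tool name="search">chain of agents</tool>'
    return "Final answer based on the observation."

print(chain_of_agents("What is Chain-of-Agents?", dummy_model))
```

The key design point this sketch highlights is that tool and role switching is expressed in the token stream itself, so the whole loop is learnable from data rather than fixed by an external framework.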
To elicit these abilities in LLMs, the authors train Agent Foundation Models (AFMs) in two stages. First, a multi-agent distillation framework converts trajectories from state-of-the-art multi-agent systems into CoA trajectories for agentic supervised fine-tuning; then, agentic reinforcement learning on verifiable tasks improves the models further (a sketch of both stages follows below). The resulting AFMs achieve new state-of-the-art results across web agent and code agent benchmarks.
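Here is a rough sketch of both training stages under stated assumptions: the trajectory schema, the tag format, and the reward function are hypothetical stand-ins for the paper's actual data format and task verifiers.

```python
# Sketch of the two training stages. The trajectory schema and the reward
# function are hypothetical stand-ins for the paper's released pipeline.

# Stage 1: multi-agent distillation. A trajectory recorded from a
# state-of-the-art multi-agent system is flattened into one token stream,
# so a single model can be fine-tuned (agentic SFT) to reproduce the
# whole collaboration end to end.
def distill_to_sequence(trajectory: list[dict]) -> str:
    parts = []
    for step in trajectory:
        # Assumed per-step schema: {"agent": "planner", "content": "..."}.
        parts.append(f'<agent role="{step["agent"]}">{step["content"]}</agent>')
    return "".join(parts)

# Stage 2: agentic RL on verifiable tasks. For code tasks the reward can
# be computed mechanically, e.g. from the pass rate on hidden tests.
def verifiable_reward(answer: str, pass_rate) -> float:
    # pass_rate(answer) is assumed to return the fraction of hidden tests
    # the answer passes; reward 1.0 only on a full pass.
    return 1.0 if pass_rate(answer) == 1.0 else 0.0

# Example usage with toy data:
traj = [{"agent": "planner", "content": "Break the task into steps."},
        {"agent": "coder", "content": "def solve(): return 42"}]
print(distill_to_sequence(traj))
print(verifiable_reward("def solve(): return 42", lambda a: 1.0))
```

The two stages are complementary: distillation teaches the single model the shape of multi-agent collaboration, while RL on verifiable tasks optimizes it beyond what its teachers demonstrated.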
Multi-Disciplinary Implications
The concepts presented in this research have broad multi-disciplinary implications. Integrating language models, multi-agent systems, and reinforcement learning opens new possibilities for complex problem solving across domains, and CoA shows how ideas from natural language processing, multi-agent AI, and cognitive science can be combined to push the boundaries of problem-solving capability.
Furthermore, the decision to make the research transparent and open-source sets a precedent for fostering collaboration and innovation in the field of agent models and agentic reinforcement learning. The availability of model weights, training code, and data enables researchers to build upon this work and drive further advancements in AI-driven problem-solving methodologies.
In conclusion, the Chain-of-Agents paradigm represents a significant step forward in equipping LLMs for complex problem-solving tasks. By internalizing multi-agent collaboration dynamics and training with agentic reinforcement learning, AFMs demonstrate state-of-the-art performance and pave the way for future developments at the intersection of language models, agent systems, and agentic RL.