AI在线

Disrupting Tradition! New Multi-Agent Framework OWL Gains 17K Stars, Surpassing OpenAI to Pioneer a New Era of Intelligent Collaboration

As large language models (LLMs) advance rapidly, single agents show clear limitations on complex real-world tasks. To address this, institutions including the University of Hong Kong and camel-ai jointly introduced Workforce, a new multi-agent framework, together with a training method called OWL (Optimized Workforce Learning). The system recently reached 69.70% accuracy on the GAIA benchmark, setting a new record for open-source systems and surpassing commercial systems such as OpenAI Deep Research.

All code for this work has been open-sourced on GitHub, where it has received more than 17,000 stars, a sign of strong community interest.

So how does the Workforce framework break through the limitations of multi-agent systems? The core is its "decoupled design." The framework splits the system into three components: domain-agnostic planners (Planner Agents), intelligent coordinators (Coordinator Agents), and specialized worker nodes (Worker Nodes). This separation makes the system more flexible and greatly reduces the cost of moving to a new domain: users only need to replace or add worker nodes, without reworking the core system.
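To make the decoupling concrete, here is a minimal Python sketch of the three roles. It is an illustration only and not the project's actual API: the class names `Planner`, `Coordinator`, and `Subtask`, the capability strings, and the hard-coded decomposition are all hypothetical, and a real planner would call an LLM rather than return a fixed list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subtask:
    """One unit of work produced by the planner (hypothetical structure)."""
    description: str
    capability: str  # label used to pick a worker


class Planner:
    """Domain-agnostic: decomposes a task and knows nothing about tools."""

    def plan(self, task: str) -> List[Subtask]:
        # Placeholder decomposition; a real planner would query an LLM here.
        return [
            Subtask(f"search: {task}", capability="web"),
            Subtask(f"summarize: {task}", capability="writing"),
        ]


@dataclass
class Coordinator:
    """Routes each subtask to whichever worker is registered for it."""
    workers: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, capability: str, worker: Callable[[str], str]) -> None:
        # Adapting to a new domain only touches this registry.
        self.workers[capability] = worker

    def dispatch(self, subtask: Subtask) -> str:
        worker = self.workers.get(subtask.capability)
        if worker is None:
            raise KeyError(f"no worker for capability {subtask.capability!r}")
        return worker(subtask.description)


def run(task: str, planner: Planner, coordinator: Coordinator) -> List[str]:
    """Plan the task, then dispatch every subtask in order."""
    return [coordinator.dispatch(s) for s in planner.plan(task)]


if __name__ == "__main__":
    coordinator = Coordinator()
    coordinator.register("web", lambda d: f"[web] {d}")
    coordinator.register("writing", lambda d: f"[writing] {d}")
    for line in run("example question", Planner(), coordinator):
        print(line)
```

In this sketch, swapping the `"web"` worker for a domain-specific one changes only a `register` call. `Planner` and `Coordinator` stay untouched, which is the property the article attributes to the design.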

The OWL training method is the framework's other highlight. It uses a two-stage strategy. In the first stage, the planner is initialized through supervised fine-tuning on expert demonstration data. In the second stage, reinforcement learning with the Direct Preference Optimization (DPO) algorithm further improves its decision-making. Together, the two stages are intended to let the planner handle diverse real-world tasks.
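For readers unfamiliar with DPO, the per-example loss in its standard formulation can be sketched in a few lines of plain Python. This shows the generic DPO objective, not the OWL training code: the article does not say how preference pairs are built or which hyperparameters are used, so the function name `dpo_loss`, the argument names, and `beta=0.1` are illustrative assumptions.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative value, not taken from the article
) -> float:
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals softplus(-z).
    return softplus(-logits)
```

The loss falls as the policy raises the chosen response's probability relative to the rejected one, always measured against the reference model. In a two-stage setup like the one described, the supervised fine-tuned planner would typically serve as that reference.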

On the GAIA benchmark, the Workforce framework reached 69.70% accuracy, far ahead of previous open-source systems, with its advantage most visible in multi-agent reasoning. OWL training also produced clear gains, improving the performance of the Qwen2.5-32B-Instruct model. The results suggest that multi-agent systems can handle complex tasks without being bound by earlier design assumptions, and that such a system can correct and improve itself.

The Workforce framework not only improves the overall performance of multi-agent systems but also points to a direction for the development of future intelligent assistants.
