Robots Atlas

Multi-Agent Systems

Decomposition of complex tasks across a set of autonomous agents with specialized roles that communicate and coordinate. This lets a system exceed the reasoning limits of single-model inference and complete long-horizon tasks that cannot fit within a single LLM call.

Category
Abstraction level
Operation level
01

Agent

Core computational and decision-making unit of the system.

Modular

Autonomous computational entity with its own internal state, perception, reasoning, and action capabilities. In LLM-based MAS: a language model with a system prompt defining the agent's role, goal, and constraints.

  • Orchestrator Agent
  • Specialized Agent (Worker Agent)
  • Human-in-the-Loop Agent
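A minimal sketch of this abstraction in Python, with the LLM call replaced by a stub so the structure stays visible (the `Agent` class and its fields are illustrative, not any particular framework's API):

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Autonomous unit: a role-defining prompt plus private internal state."""
    name: str
    system_prompt: str                 # defines the agent's role, goal, constraints
    memory: list = field(default_factory=list)  # private state across calls

    def act(self, message: str) -> str:
        # A real implementation would send system_prompt + message to an LLM;
        # this stub only records the input and echoes the agent's identity.
        self.memory.append(message)
        return f"[{self.name}] processed: {message}"

worker = Agent("coder", "You write Python functions. Return only code.")
print(worker.act("implement fizzbuzz"))
```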
02

Communication Channel

Enables coordination and state transfer between agents.

Modular

Mechanism for information exchange between agents. Can take the form of direct message passing, shared memory, publish-subscribe systems, or event queues.

  • Message Passing
  • Shared Memory (Blackboard)
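Both channel styles can be sketched in a few lines; the agent names and message shape here are illustrative:

```python
import queue

# Direct message passing: each agent owns an inbox queue.
inboxes = {"planner": queue.Queue(), "coder": queue.Queue()}

def send(recipient: str, sender: str, content: str) -> None:
    inboxes[recipient].put({"from": sender, "content": content})

# Shared memory (blackboard): any agent may read or write keyed entries.
blackboard: dict = {}

send("coder", "planner", "write the parser")
blackboard["plan"] = "1. parse  2. validate"

msg = inboxes["coder"].get_nowait()
print(msg["content"], "| plan on blackboard:", blackboard["plan"])
```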
03

Orchestrator / Coordinator

Managing global task progress and coordinating between agents.

Modular

Component responsible for workflow management: task decomposition, routing to agents, dependency management, error handling, and aggregating final results. Can be an LLM agent or a programmatic controller.
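A programmatic (non-LLM) controller covering these four responsibilities might look like this sketch, with a hard-coded decomposition and stub callables standing in for worker agents:

```python
# Hypothetical worker registry: name -> callable that handles a subtask.
workers = {
    "search": lambda sub: f"results for '{sub}'",
    "code":   lambda sub: f"def solve(): ...  # {sub}",
}

def orchestrate(task: str) -> str:
    # 1. Task decomposition (hard-coded plan for illustration).
    plan = [("search", f"background on {task}"), ("code", task)]
    results = []
    for agent_name, subtask in plan:          # 2. Routing to agents
        try:
            results.append(workers[agent_name](subtask))
        except Exception as exc:              # 3. Error handling
            results.append(f"FAILED {agent_name}: {exc}")
    return "\n".join(results)                 # 4. Aggregating final results

print(orchestrate("CSV de-duplication"))
```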

04

Memory Subsystem

Persists state across agent calls and system sessions.

Modular

State storage mechanisms: short-term memory (conversation context / LLM context window), long-term memory (vector database, external database), episodic memory (interaction history), and procedural memory (learned procedures).
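These tiers can be sketched with standard containers, where a bounded `deque` plays the role of the context window and a plain dict stands in for a vector database (both are illustrative stand-ins):

```python
from collections import deque

class MemorySubsystem:
    def __init__(self, window: int = 4):
        self.short_term = deque(maxlen=window)   # context-window analogue: oldest entries evicted
        self.long_term: dict = {}                # stand-in for a vector / external database
        self.episodic: list = []                 # full interaction history

    def remember(self, key: str, fact: str) -> None:
        self.short_term.append(fact)
        self.long_term[key] = fact
        self.episodic.append((key, fact))

mem = MemorySubsystem(window=2)
for i in range(3):
    mem.remember(f"fact{i}", f"observation {i}")
print(len(mem.short_term), len(mem.long_term))  # → 2 3 (oldest fact left short-term only)
```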

05

Tool Interface

Extends agent capabilities beyond language processing to actions in the external world.

Modular

Layer integrating agents with external systems: APIs, search engines, code interpreters, databases, file services. Enables agents to act beyond their textual context.
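One common shape for such a layer is a tool registry: each tool exposes a description the model can read plus a Python callable. This sketch is generic, not any specific framework's interface:

```python
# Hypothetical tool registry: name -> {description, callable}.
TOOLS = {}

def tool(name: str, description: str):
    def register(fn):
        TOOLS[name] = {"description": description, "fn": fn}
        return fn
    return register

@tool("calculator", "Evaluate a basic arithmetic expression.")
def calculator(expression: str) -> str:
    # Demo only: never eval untrusted model output in production.
    return str(eval(expression, {"__builtins__": {}}))

def invoke(name: str, **kwargs) -> str:
    # The agent layer would pick `name` based on the TOOLS descriptions.
    return TOOLS[name]["fn"](**kwargs)

print(invoke("calculator", expression="6 * 7"))  # → 42
```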

Parallelism

Conditionally parallel

MAS can be fully parallel (agents operating asynchronously on independent subtasks), partially sequential (with synchronization barriers), or hybrid, depending on the task topology and its dependency graph.
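A fan-out/fan-in pattern with a synchronization barrier can be sketched with a thread pool, where a plain function stands in for each agent call:

```python
from concurrent.futures import ThreadPoolExecutor

def worker(subtask: str) -> str:
    return f"done: {subtask}"

subtasks = ["fetch data", "draft outline", "lint code"]

# Fan-out: independent subtasks run concurrently.
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(worker, subtasks))  # map preserves input order

# Fan-in: the `with` block is the synchronization barrier before aggregation.
print(" | ".join(partials))
```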

Paradigm

Conditional

Input dependent

Not all agents are active for every task — the set of active agents depends on the nature of the task and the orchestrator's dynamic routing.
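Dynamic routing can be as simple as a rule-based dispatcher in front of the agent pool; the keywords and agent names below are illustrative (a production orchestrator would typically route with an LLM classifier instead):

```python
def route(task: str) -> str:
    # Only the relevant specialist is activated for a given task.
    if "bug" in task or "code" in task:
        return "coding_agent"
    if "find" in task or "search" in task:
        return "search_agent"
    return "generalist_agent"

for task in ["fix the bug in auth.py", "find recent MAS papers", "summarize"]:
    print(task, "->", route(task))
```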

Number of Agents

Standard
  • 2: Minimal system (one orchestrator + one worker).
  • 5–10: Typical range for complex tasks involving several specializations.
  • > 20: Large-scale systems; require advanced orchestration and flow control.

Number of agents in the system. Higher agent count can increase parallelism and specialization but complicates coordination and increases cost.

Coordination Topology

Critical
  • sequential: Agent A → Agent B → Agent C; simple, deterministic.
  • hierarchical: The orchestrator manages a group of worker agents.
  • parallel: Multiple agents operate concurrently on independent subtasks.

Communication and dependency pattern between agents: sequential chain, hierarchy (orchestrator-worker), peer-to-peer mesh, publish-subscribe, parallel fan-out/fan-in, or hybrid DAG.
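A hybrid DAG topology reduces to topological execution over the dependency graph. Python's standard `graphlib` yields a valid ordering directly; the agent names are illustrative:

```python
from graphlib import TopologicalSorter

# Edges read "depends on": e.g. the review agent consumes the drafter's output.
deps = {
    "draft":   {"research"},
    "lint":    {"draft"},
    "review":  {"draft"},
    "publish": {"review", "lint"},
}
order = list(TopologicalSorter(deps).static_order())
print(order)  # any order respecting the dependencies, starting with 'research'
```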

Memory Architecture

Standard

Type and scope of memory allocated to each agent and the system as a whole: context-only (within LLM context window), external long-term (vector database), shared across agents, or private per agent.

Human Involvement

Standard
  • fully autonomous: No human intervention.
  • human-in-the-loop on decisions: A human approves high-risk decisions.
  • human-in-the-loop on errors: A human intervenes only when errors are detected.

Mode and entry points for human participation: none (full autonomy), approval at every step, approval at key decisions, or intervention only on errors.

Agent Specialization Degree

Standard

Degree to which individual agents have specialized roles (e.g., search agent, coding agent, verification agent) versus general-purpose agents.

Common pitfalls

Error Cascading Between Agents
HIGH

Incorrect output from one agent is passed as input to the next, leading to error accumulation across the pipeline. Without a validator agent, this can result in completely incorrect final outputs.

Deploy critic/validator agents after key steps; use multiple independent execution paths with voting or result aggregation.
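A validator gate between stages can be sketched as follows; `generate` and `validate` are hypothetical stand-ins for a worker agent and a critic agent:

```python
def generate(task: str) -> str:
    # Stub worker output that contains a defect (a duplicated step).
    return "step1; step2; step2; step3"

def validate(output: str) -> list:
    # Stub critic: flags structural issues before the handoff.
    steps = output.split("; ")
    issues = []
    if len(steps) != len(set(steps)):
        issues.append("duplicate step detected")
    return issues

draft = generate("plan the migration")
problems = validate(draft)
# Only hand off to the next agent if the critic found no issues.
print("handoff" if not problems else f"revise: {problems}")
```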

Excessive Token Costs (Token Cost Explosion)
HIGH

Each inter-agent communication consumes tokens (passing conversation history, context, tools). With many agents and long communication chains, costs can grow disproportionately fast.

Use context compression; limit the history passed between agents to the necessary minimum; use lighter models for simpler subtasks.
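History truncation before each handoff is one concrete form of context compression; in a real system the bracketed summary would itself come from a cheap summarizer model rather than a fixed string:

```python
def compress(history: list, keep: int = 2) -> list:
    """Keep only the last `keep` turns plus a one-line note about what was dropped."""
    if len(history) <= keep:
        return history
    dropped = len(history) - keep
    summary = f"[summary of {dropped} earlier turns]"
    return [summary] + history[-keep:]

history = ["turn1", "turn2", "turn3", "turn4", "turn5"]
print(compress(history))  # → ['[summary of 3 earlier turns]', 'turn4', 'turn5']
```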

Infinite Loops and Non-Convergence
CRITICAL

Agents can enter loops — e.g., one agent requests revision from another, which requests clarification back — without a termination mechanism. Absence of stopping criteria causes infinite loops.

Define explicit termination criteria (maximum number of iterations, a 'TERMINATE' condition, timeout); use a supervising monitor agent.
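An iteration cap and an explicit stop token can be combined in one control loop; the stub agents below force termination on the third pass:

```python
MAX_ITERATIONS = 5

def negotiate(agent_a, agent_b):
    message = "initial proposal"
    for i in range(MAX_ITERATIONS):          # hard iteration cap
        message = agent_a(message)
        if "TERMINATE" in message:           # explicit stop token
            return i + 1, message
        message = agent_b(message)
    return MAX_ITERATIONS, "forced stop: iteration limit reached"

# Stub agents: B always asks for revisions, A terminates on its 3rd call.
calls = {"a": 0}
def agent_a(msg):
    calls["a"] += 1
    return "TERMINATE: agreed" if calls["a"] >= 3 else "revised draft"
agent_b = lambda msg: "please revise"

rounds, outcome = negotiate(agent_a, agent_b)
print(rounds, outcome)  # → 3 TERMINATE: agreed
```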

State Inconsistency Between Agents
HIGH

During parallel execution, agents may operate on inconsistent versions of shared state, leading to conflicts and race conditions, especially when shared memory is non-transactional.

Use transactional updates for shared state; design agents to be idempotent; minimize write-write contention by assigning distinct state partitions to separate agents.
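A lock around each read-modify-write makes the shared update effectively transactional; this sketch shows four concurrent "agents" incrementing a shared counter with no lost updates:

```python
import threading

state = {"counter": 0}
lock = threading.Lock()

def agent_update(n: int) -> None:
    for _ in range(n):
        with lock:                      # transactional update of shared state
            state["counter"] += 1

threads = [threading.Thread(target=agent_update, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(state["counter"])  # → 4000 (no lost updates)
```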

Role Confusion and Overlapping Responsibilities Between Agents
MEDIUM

Agents with poorly defined roles may duplicate work, conflict in decision-making, or skip tasks because each assumes another agent will handle it.

Precisely define each agent's scope of responsibility in the system prompt; apply formal handoff protocols and task-receipt acknowledgments.

1995

Wooldridge and Jennings formalize the notion of intelligent agents and MAS

breakthrough

Wooldridge and Jennings publish 'Intelligent Agents: Theory and Practice' (Knowledge Engineering Review), defining agent properties (autonomy, reactivity, pro-activeness, social ability) and foundations of MAS theory.

1999

Gerhard Weiss edits 'Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence' (MIT Press)

Textbook standardizing MAS terminology and architecture, becoming the primary academic reference for the following decade.

2023

LLM-MAS framework explosion: CAMEL, AutoGen, MetaGPT

breakthrough

2023 sees the first wave of LLM-based MAS frameworks: CAMEL (Li et al., March 2023), AutoGen (Wu et al., Microsoft, August 2023), MetaGPT (Hong et al., August 2023), transforming MAS from rule-based to language-model-based systems.

2024

Standardization of communication protocols (MCP, A2A) and orchestration frameworks (LangGraph, CrewAI)

Anthropic announces Model Context Protocol (MCP), Google announces Agent-to-Agent Protocol (A2A). More mature frameworks emerge: LangGraph (LangChain), CrewAI, supporting complex agent topologies with state persistence and controlled flow.

Hardware agnosticPRIMARY

MAS is an architectural pattern operating above the hardware layer. Individual agents may run on GPUs (LLM inference), CPUs (orchestration logic), or in the cloud, depending on the implementation.

For LLM-MAS systems using large models, the hardware bottleneck becomes GPU memory bandwidth during parallel inference when multiple agents share the same model.

Related AI models

Muse
