datarekha

A2A — the Agent2Agent Protocol

MCP connects one agent to its tools. A2A connects independent, opaque agents to each other. A deep dive on the Agent Card, the task lifecycle, streaming, and when you actually need a protocol instead of a function call.

8 min read Intermediate Agentic AI Lesson 9 of 29

What you'll learn

  • Why A2A exists — delegating tasks across opaque agents from different vendors and frameworks
  • The Agent Card — a JSON manifest published at /.well-known/agent-card.json for discovery
  • The task lifecycle — submitted, working, input-required, completed and the terminal-state rule
  • Streaming over SSE vs push notifications for long-running, disconnected tasks
  • When A2A earns its weight versus an in-process sub-agent call

Before you start

In the MCP lesson you saw how one agent reaches down to its tools and data — a USB-C port for models. A2A (the Agent2Agent Protocol) solves the orthogonal problem: how one agent reaches across to another independent agent as a peer. The two are not rivals. The official spec calls them highly complementary: MCP is agent-to-tools; A2A is agent-to-agent. A real system uses both — A2A between agents, MCP inside each agent for its own tools.

The word that does all the work here is opaque. A2A’s central design principle is Opaque Execution: the agents “collaborate effectively without exposing their internal logic, memory, or proprietary tools.” That single constraint is what makes A2A a protocol and not just a function call — and it dictates every design choice that follows.

Two protocols, two directions

A2A — agent ↔ agent (horizontal)independent, opaque peers delegate tasksClient agent(your travel agent)MCP ↓its own tools / dataA2A taskartifactsRemote agent(currency specialist)MCP ↓FX rate API
A2A runs horizontally between agents; each agent still uses MCP vertically for its own tools. Different layers, not competitors.

For the full landscape of MCP vs A2A vs ACP vs ANP, see the agent protocols overview. This lesson goes deep on A2A alone.

Step 1 — discovery via the Agent Card

You cannot delegate to an agent you cannot describe. So A2A starts with a discovery document: the Agent Card, a JSON “business card” that advertises an agent’s identity and what it can do. The canonical way to publish it is at a well-known URL — a plain HTTP GET away, following the RFC 8615 convention:

https://currency.example.com/.well-known/agent-card.json

A note on the path. The current spec uses /.well-known/agent-card.json. Earlier v0.x drafts used /.well-known/agent.json, and tooling often still accepts the old name for backward compatibility — but write new agents against agent-card.json.

A card carries the agent’s name, description, and provider; its A2A service url; a version; the capabilities it supports (notably streaming and pushNotifications); its security schemes (Bearer, OAuth2, API keys); default input/output modes; and a list of skills — each with an id, name, description, and example invocations. That skills array is the menu a client reads to decide whether this is the right agent for the job.

The well-known URL is only one of three discovery mechanisms the spec defines. For enterprises there are curated registries — a catalog you query by skill or tag — and for tightly-coupled systems there is direct configuration, a hardcoded URL or env var. The card format is identical; only how the client finds it changes.

Step 2–4 — the task, its lifecycle, and the artifact

Once the client has the card and picks a skill, it sends a task. This is the heart of A2A and the cleanest way to see how it differs from a tool call. A tool call is request → response: synchronous, stateless, done. A task is a stateful object with a lifecycle. It has an id, a contextId that groups related tasks, a status, an optional history of messages, and the artifacts it eventually produces.

Each message between the agents is built from typed partstext, a file/url pointer, structured data (arbitrary JSON), or raw bytes. That is what makes A2A modality-independent: the same envelope carries a sentence, a spreadsheet, or an image. When the work finishes, the remote agent returns its results as artifacts (each an artifactId, a name, and its own parts) — the outputs of the task, as distinct from the conversational messages along the way.

The status walks a fixed set of states. These string values are fixed by the spec — you do not invent your own:

submittedworkinginput-requiredclient replies →completedterminal • immutableOther terminal states• failed• canceled• rejectedalso: auth-required
The happy path is submitted → working → completed. input-required and auth-required pause for the client. completed, failed, canceled and rejected are terminal.

The terminal-state rule is the gotcha worth memorizing: once a task is completed, failed, canceled, or rejected, it is immutable — you cannot reopen it. Follow-up work starts a new task in the same contextId, linked back via referenceTaskIds so the remote agent can infer continuity. This is deliberate: it keeps an auditable, append-only history across organizational boundaries.

Long-running tasks — stream or get called back

A currency conversion is instant. A “render this 90-second video” task is not. A2A gives a remote agent two ways to report progress on work that outlives a single request.

Streaming over SSE. The client calls the JSON-RPC method message/stream; the server replies 200 with Content-Type: text/event-stream and pushes Server-Sent Events. Each event’s data field carries a complete JSON-RPC response delivering an incremental status change or an artifact chunk. If the connection drops, tasks/resubscribe reconnects to the live stream.

Push notifications. Holding an SSE connection open for an hour is fragile. So for long-running or disconnected work, the client registers a webhook with tasks/pushNotificationConfig/set, and the remote agent makes server-initiated HTTP POST callbacks when the task updates — no polling, no held-open socket. The Agent Card’s capabilities advertise which of these (streaming, pushNotifications) an agent supports.

A2A is built on boring, sturdy web tech on purpose — HTTP, JSON-RPC 2.0, and SSE — with v1.0 adding gRPC and HTTP+JSON/REST bindings and version negotiation. Auth follows suit: schemes are declared in the Agent Card and negotiated out-of-band like any HTTP API, so credentials are never put inside the A2A message payload. v1.0 also adds cryptographically signed Agent Cards, so a client can verify an agent’s identity before trusting it across a trust boundary.

When do you actually need A2A?

Here is the honest engineering answer, because A2A is not free — it is a network hop, a contract, and an auth dance.

  • Reach for a plain in-process call (or MCP for tools) when the sub-agent is your own code: same process, same framework, sharing state. That is faster and simpler. A function call beats a protocol every time you are allowed to make one.
  • Reach for A2A when the other agent is a separate, independently deployed, opaque service — a different team, vendor, or framework, across a network or trust boundary — and you need standardized discovery (the Agent Card), long-running async task semantics with streaming or webhooks, and enterprise auth. In short: when you cannot, or should not, share internal state, and you need a vendor-neutral contract instead of a function signature.

That boundary — can I just call its function? — is the whole decision. A2A exists for every time the answer is no.

A note on governance

A2A is not a single vendor’s project. Google announced it on April 9, 2025, then donated it to the Linux Foundation on June 23, 2025, forming the Agent2Agent Protocol Project with founding members including AWS, Cisco, Google, Microsoft, Salesforce, SAP, and ServiceNow. The first stable release, A2A v1.0, shipped recently. Sources differ on the exact month — Google’s anniversary blog says March, some write-ups say January — so treat it as soft; the version jump from the v0.x drafts to a production 1.0 is solid. It is Apache-2.0 licensed and vendor-neutral.

Quick check

Quick check

0/3
Q1What is the key difference between MCP and A2A?
Q2A client delegated a task that reached the 'completed' state. The user now asks a follow-up. What does A2A prescribe?
Q3Transfer: you're building a research agent. It needs to (a) read your internal Postgres and (b) hand a sub-question to a partner company's specialist agent for an hour-long analysis. Which protocols fit each, and how should progress on the long task be reported?

Next

You now know how one agent delegates to another. For the wider map of competing and complementary standards — MCP, A2A, ACP, ANP and where each fits — read the agent protocols overview, and revisit MCP to see the tool-facing half of the same picture.

Practice this in an interview

All questions
How do function/tool calling and LLM agents work at a high level?

Tool calling extends the LLM's output space to include structured function invocations. The model emits a JSON object naming a tool and its arguments; the runtime executes the tool and feeds the result back as a new message. An agent is a loop that repeats this cycle — observe, think, act — until the task is complete or a stopping condition is met.

How do you read ACF and PACF plots, and what do they tell you about AR and MA orders?

The ACF measures correlation between a series and its own lags including indirect effects; the PACF strips out those indirect effects to show direct correlation at each lag. A cut-off in the PACF after lag p signals an AR(p) process; a cut-off in the ACF after lag q signals an MA(q) process.

How does the Spark Catalyst optimizer work, and what does Adaptive Query Execution add?

Catalyst is a rule-based and cost-based query optimizer that transforms a logical plan through four phases — analysis, logical optimization, physical planning, and code generation — before any data is touched. Adaptive Query Execution (AQE), introduced in Spark 3, extends this by re-optimizing the physical plan at runtime using actual shuffle statistics rather than stale estimates.

When would you choose gRPC over REST for model serving, and what are the practical trade-offs?

gRPC uses HTTP/2 and Protocol Buffers to deliver lower latency, strongly typed contracts, and built-in streaming, making it the better choice for high-throughput internal model services. REST remains the standard for public-facing APIs where broad client compatibility and human-readable payloads matter more than raw performance.

Sign in to track your progress

Completed lessons, your XP, level, and streak save to your account — it's free and takes a few seconds.

Explore further

Related lessons

Skip to content