OpenAI Agents SDK: handoffs & guardrails
A lightweight, production-minded agent framework: Agents, Runner, Tools, Handoffs, Guardrails, and Sessions. The default starting point for many production agentic systems in 2026.
What you'll learn
- The core primitives — Agent, Runner, tools, handoffs, guardrails, sessions
- How handoffs route a task between specialized agents
- Where input/output guardrails sit and why they matter
Before you start
After a wave of heavy frameworks, the OpenAI Agents SDK went the other way: small, explicit, few abstractions. In 2026 it’s a common default for production-grade agents, precisely because there’s so little magic — you can read the whole mental model in one lesson. (It’s provider-flexible too, not OpenAI-only.)
Five primitives
- Agent — an LLM plus instructions, a set of tools, and optionally some handoffs. That’s it.
- Runner — runs the agent loop: call the model, run any tool it picked, feed the result back, repeat until a final answer.
- Tools — Python functions exposed to the agent (plus hosted tools and MCP servers).
- Handoffs — one agent can delegate the conversation to another, more
specialized agent. A handoff is literally implemented as a tool call
(
transfer_to_X), so it shows up in the trace like any other action. - Guardrails — input and output checks that run alongside the agent and can halt it (e.g. block off-topic input, validate output) — see prompt injection & guardrails.
- Sessions — automatic conversation history across runs, so you don’t hand-thread state.
Handoffs: a triage agent routing to specialists
The signature pattern is handoffs: a cheap triage agent classifies the request and transfers to the right specialist.
from agents import Agent, Runner, input_guardrail, GuardrailFunctionOutput
billing = Agent(name="Billing", instructions="Handle refunds and invoices.")
technical = Agent(name="Technical", instructions="Handle bugs and how-tos.")
@input_guardrail
async def on_topic(ctx, agent, user_input) -> GuardrailFunctionOutput:
ok = "support" in user_input.lower() or True # your real check here
return GuardrailFunctionOutput(tripwire_triggered=not ok, output_info={})
triage = Agent(
name="Triage",
instructions="Route the user to Billing or Technical.",
handoffs=[billing, technical], # delegate to a specialist
input_guardrails=[on_topic], # block off-topic before running
)
# result = await Runner.run(triage, "I want a refund on last month's invoice")
# → triage calls transfer_to_billing; Billing answers. The handoff is in the trace.
Quick check
Quick check
Next
Whatever framework you pick, production agents need measurement and limits: evaluating agents, observability, and cost control.
Practice this in an interview
All questionsKey risks include prompt injection, especially indirect injection via tool or retrieval outputs, hijacking the agent, excessive tool permissions enabling damaging actions, data exfiltration, confused-deputy privilege escalation, and unbounded loops driving cost or harm. Mitigations include least-privilege tools, sandboxing, input and output guardrails, human-in-the-loop approval for sensitive actions, and audit logging.
An agent is an LLM placed in a loop where it reasons, chooses and calls tools or actions, observes the results, and repeats until a goal is met, rather than producing one response and stopping. The key differences are autonomy, tool use, memory and state, and multi-step control flow driven by the model's own decisions.
Operationalizing responsible AI means turning principles like fairness, transparency, and accountability into concrete, automated controls: bias and fairness tests in the pipeline, data and model documentation, human oversight, and continuous monitoring with audit trails. Under the EU AI Act, high-risk systems carry specific obligations including data governance and bias assessment, risk management, technical documentation, logging, human oversight, and post-market monitoring. The practical shift is that fairness and governance become gated, evidenced requirements rather than optional add-ons.
Tool calling extends the LLM's output space to include structured function invocations. The model emits a JSON object naming a tool and its arguments; the runtime executes the tool and feeds the result back as a new message. An agent is a loop that repeats this cycle — observe, think, act — until the task is complete or a stopping condition is met.