NLP & LLMs | Medium | Asked at OpenAI, Anthropic, Google, LangChain

How do function/tool calling and LLM agents work at a high level?

For AI / LLM Engineer, ML Engineer, Data Scientist

The short answer

Tool calling extends the LLM's output space to include structured function invocations. The model emits a JSON object naming a tool and its arguments; the runtime executes the tool and feeds the result back as a new message. An agent is a loop that repeats this cycle (observe, think, act) until the task is complete or a stopping condition is met.

How to think about it

Tool calling

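Most chat-completion APIs let you describe the available functions in the request, typically as a name, a description, and a JSON Schema for the parameters. The model decides when to call one and returns a structured call rather than prose. The model never runs the function itself: the application executes it, sends the result back in a message with the tool role, and calls the model again to get the final answer.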

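import json

from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    }
]


def get_weather(city, unit="celsius"):
    # Placeholder with made-up data; replace with a real weather lookup.
    return {"city": city, "temperature": 21, "unit": unit}


messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

response = client.chat.completions.create(
    model="gpt-4o", messages=messages, tools=tools
)
message = response.choices[0].message

if message.tool_calls:  # None when the model answers directly
    messages.append(message)  # keep the assistant turn that requested the call
    for tool_call in message.tool_calls:
        # Arguments arrive as a JSON string and must be parsed
        args = json.loads(tool_call.function.arguments)
        result = get_weather(**args)
        messages.append(
            {"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result)}
        )
    # Second call: the model turns the tool result into a final answer
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )

print(response.choices[0].message.content)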
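
With more than one tool registered, the application has to route each call by the name the model emitted. The following is a minimal sketch of that routing step, not part of any SDK: `REGISTRY`, `dispatch`, and the second tool `get_time` are hypothetical names introduced for illustration, and both tools return made-up data. It runs offline on plain strings shaped like the name and JSON arguments of a tool call.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> dict:
    # Placeholder data; a real implementation would query a weather service.
    return {"city": city, "temperature": 21, "unit": unit}


def get_time(city: str) -> dict:
    # Hypothetical second tool, included only to show routing.
    return {"city": city, "time": "12:00"}


# Hypothetical registry mapping tool names to Python callables.
REGISTRY = {"get_weather": get_weather, "get_time": get_time}


def dispatch(name: str, arguments: str) -> str:
    """Route a tool call by name; `arguments` is the JSON string the model emitted."""
    if name not in REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments)
        return json.dumps(REGISTRY[name](**kwargs))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or wrong/missing arguments: report instead of crashing.
        return json.dumps({"error": str(exc)})


print(dispatch("get_weather", '{"city": "Tokyo"}'))
print(dispatch("get_stock", "{}"))
```

Returning errors as JSON strings rather than raising keeps the conversation going: the error text can be sent back as the tool result so the model can correct its call.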

The agent loop

An agent wraps tool calling in an observe-think-act loop:

Observe: receive the user goal or the previous tool output.
Think: the LLM decides the next action (or “finish”).
Act: run the chosen tool and capture its result.
Repeat until the model emits a final answer.
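
The loop above can be sketched in a few lines. This is an illustrative skeleton under stated assumptions rather than a production agent: `scripted_model` is a hypothetical stand-in for the LLM call (it requests one tool call, then finishes), and the `TOOLS` registry and the message and decision shapes are invented for the example so that it runs without network access.

```python
import json
from typing import Callable

# Hypothetical tool registry; names and functions are illustrative only.
TOOLS: dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(a + b),
}


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM call: requests one tool call, then finishes."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"type": "final", "content": f"The answer is {last['content']}."}
    return {"type": "tool_call", "name": "add", "arguments": json.dumps({"a": 2, "b": 3})}


def run_agent(goal: str, model=scripted_model, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": goal}]  # observe: the user goal
    for _ in range(max_steps):
        decision = model(messages)  # think: the model picks the next action
        if decision["type"] == "final":
            return decision["content"]
        # act: run the chosen tool and feed the result back as an observation
        args = json.loads(decision["arguments"])
        result = TOOLS[decision["name"]](**args)
        messages.append({"role": "tool", "name": decision["name"], "content": result})
    return "Stopped: step limit reached."


print(run_agent("What is 2 + 3?"))  # prints "The answer is 5."
```

Swapping `scripted_model` for a function that calls a real chat API leaves the structure unchanged; only the think step differs.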

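Two widely used patterns are ReAct, where the model interleaves reasoning text with actions one step at a time, and plan-and-execute, where a planning step produces the full sequence of steps up front and a separate execution step carries them out, which suits long-horizon tasks.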
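
To make the contrast concrete, here is a rough plan-and-execute sketch. It rests on assumptions: `plan` is a hypothetical stand-in for the planning LLM call and returns a fixed plan, the `TOOLS` entries are toy functions, and the `"$prev"` placeholder is a convention invented for this example. Unlike a ReAct-style loop, the model is not consulted between steps.

```python
from typing import Callable

# Hypothetical toy tools for illustration.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for '{q}'",
    "summarize": lambda text: f"summary of ({text})",
}


def plan(goal: str) -> list[dict]:
    """Stand-in for the planning LLM call: returns the full plan up front."""
    return [
        {"tool": "search", "input": goal},
        {"tool": "summarize", "input": "$prev"},  # "$prev" = previous step's output
    ]


def execute(steps: list[dict]) -> str:
    """Run each planned step in order, threading outputs forward."""
    prev = ""
    for step in steps:
        arg = prev if step["input"] == "$prev" else step["input"]
        prev = TOOLS[step["tool"]](arg)
    return prev


print(execute(plan("LLM agents")))  # prints "summary of (results for 'LLM agents')"
```

The trade-off is visible in the code: planning once is cheaper and easier to inspect, but the executor cannot adapt if an early step returns something unexpected unless a re-planning step is added.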

Key design decisions

Max iterations: cap the number of loop steps so a confused model cannot cycle forever.
Tool output truncation: tool results can be large; truncate or summarize them before re-injecting them into the context.
Error handling: pass tool errors back to the model as tool results so it can retry or fail gracefully.
Memory: for long sessions, summarize prior turns to avoid overflowing the context window.
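
Three of these guards are small enough to sketch directly. The helper names (`truncate`, `safe_call`, `compact`) and the 200-character budget are assumptions made for illustration. Plain truncation stands in for summarization here, and `compact` inserts a placeholder where a real system would likely ask an LLM to write the summary.

```python
from typing import Callable

MAX_TOOL_CHARS = 200  # assumed budget; tune to the model's context window


def truncate(text: str, limit: int = MAX_TOOL_CHARS) -> str:
    """Clip long tool output and note how much was dropped."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"... [truncated {len(text) - limit} chars]"


def safe_call(tool: Callable[..., object], **kwargs) -> str:
    """Run a tool; return errors as text so the model can react to them."""
    try:
        return truncate(str(tool(**kwargs)))
    except Exception as exc:
        return f"ERROR: {type(exc).__name__}: {exc}"


def compact(messages: list[dict], keep_last: int = 4) -> list[dict]:
    """Replace older turns with a single placeholder summary message."""
    if len(messages) <= keep_last:
        return messages
    older = messages[:-keep_last]
    # Stub: a real system would generate this summary from `older`.
    summary = {"role": "system", "content": f"Summary of {len(older)} earlier messages."}
    return [summary] + messages[-keep_last:]


print(safe_call(lambda x: 1 / x, x=0))
print(len(compact([{"role": "user", "content": str(i)} for i in range(10)])))
```

One caveat when compacting real chat histories: some APIs expect each tool result to follow the assistant message that requested it, so the cut point should not separate such a pair.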

Multi-agent systems

Orchestrator agents delegate subtasks to specialist agents (e.g., a web search agent, a code execution agent). Communication happens through message passing; the orchestrator aggregates results.
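
A toy version of this delegation pattern is sketched below. Everything in it is hypothetical: the specialist functions just format strings, whereas in practice each would wrap its own LLM and tools, and the list of `(specialist, task)` pairs would come from the orchestrator's own model call rather than being hard-coded.

```python
from typing import Callable


# Hypothetical specialist agents; each would wrap its own LLM and tools in practice.
def search_agent(task: str) -> str:
    return f"[search] findings for: {task}"


def code_agent(task: str) -> str:
    return f"[code] output for: {task}"


SPECIALISTS: dict[str, Callable[[str], str]] = {
    "search": search_agent,
    "code": code_agent,
}


def orchestrate(subtasks: list[tuple[str, str]]) -> str:
    """Send each (specialist, task) message to its agent and aggregate the replies."""
    replies = []
    for name, task in subtasks:
        agent = SPECIALISTS.get(name)
        if agent is None:
            replies.append(f"[orchestrator] no agent named '{name}'")
            continue
        replies.append(agent(task))
    return "\n".join(replies)


print(orchestrate([("search", "latest Python release"), ("code", "parse the version")]))
```

Keeping the interface between agents to plain messages (a task string in, a reply string out) means specialists can be swapped or tested in isolation.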
