datarekha

ReAct, Plan-Execute, Reflexion

The three agent reasoning loops every engineer should know in 2026 — interleaved ReAct, upfront Plan-and-Execute, and self-correcting Reflexion. How each works and when to reach for it.

8 min read Intermediate Agentic AI Lesson 3 of 42

What you'll learn

  • The three core agent reasoning loops and how each is structured
  • The tradeoffs — LLM calls, adaptivity, and predictability
  • Which loop fits a dynamic task, a task with known steps, or a hard task whose result can be verified

Before you start

The design-patterns lesson covered the workflow shapes (prompt chaining, routing, parallelization). This lesson is about the reasoning loops an agent runs inside those shapes — how it actually decides what to do next. Three dominate the field, and knowing which to reach for is the difference between an agent that’s reliable and one that’s slow, expensive, or brittle.

ReAct — reason and act, interleaved

ReAct (Reason + Act) is the workhorse. The agent loops through Thought → Action → Observation → Thought → …: it reasons about what to do, takes one action (a tool call), observes the result, and reasons again with that new information. Because it reacts to each observation, it handles dynamic situations gracefully and doesn’t need to know the steps in advance.

The cost is one LLM call per step, so a long task means many calls (more latency and more money), and the agent can wander or loop if the stop condition is loose. ReAct is the default shape behind most tool-using agents.
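The loop can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `scripted_model` is a hypothetical stand-in for a real LLM call, `lookup_population` is an invented tool with made-up data, and `max_steps` is one simple way to tighten the stop condition.

```python
from typing import Callable, Optional

# Hypothetical tool registry; the tool name and its data are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_population": lambda city: {"Paris": "2100000"}.get(city, "unknown"),
}

def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM call: chooses the next step from the history so far."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return {"thought": "I need the population first.",
                "action": "lookup_population", "input": "Paris"}
    value = observations[-1].split(": ", 1)[1]
    return {"thought": "I have what I need.",
            "final": f"Paris has about {value} people."}

def react(question: str, model: Callable[[list[str]], dict],
          max_steps: int = 5) -> tuple[Optional[str], list[str]]:
    history = [f"Question: {question}"]
    for _ in range(max_steps):              # hard cap guards against wandering
        step = model(history)               # one model call per step
        history.append(f"Thought: {step['thought']}")
        if "final" in step:                 # the model decided it is done
            return step["final"], history
        observation = TOOLS[step["action"]](step["input"])
        history.append(f"Action: {step['action']}({step['input']})")
        history.append(f"Observation: {observation}")
    return None, history                    # step budget exhausted

answer, trace = react("How many people live in Paris?", scripted_model)
```

Here the scripted model finishes on its second call, so `trace` holds one Thought, Action, Observation round followed by a final Thought. A model that never emits `final` would stop after `max_steps` rounds and return `None`.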

Plan-and-Execute — decide everything first

Plan-and-Execute splits reasoning from doing. A planner LLM lays out all the steps up front; an executor then runs them, often without calling the planner again. Compared with reasoning at every step, this is cheaper and more predictable (one expensive planning call, then mechanical execution), and it is easier to debug because the plan is explicit.

The weakness: it’s blind to surprises. If step 2 returns something the plan didn’t anticipate, a pure plan-execute agent plows ahead. In practice you add a re-plan step when execution deviates — a hybrid that recovers some of ReAct’s adaptivity.
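A rough sketch of the hybrid described above, under stated assumptions: `stub_planner` and `stub_executor` are hypothetical stand-ins for a planner LLM and a step runner, the step names are invented, and "deviation" is simplified to a step reporting failure.

```python
from typing import Callable, Optional

def stub_planner(goal: str, completed: list, failed: Optional[str] = None) -> list[str]:
    """Stand-in for the planner LLM: returns the remaining steps as a list."""
    if failed is None:
        return ["fetch_report", "summarize"]       # initial plan
    return ["fetch_cached_report", "summarize"]    # revised plan after a failure

def stub_executor(step: str) -> tuple[bool, str]:
    """Stand-in for the executor: returns (succeeded, result)."""
    if step == "fetch_report":
        return False, "404"                        # the surprise the plan missed
    return True, f"done:{step}"

def plan_and_execute(goal: str, planner: Callable, executor: Callable,
                     max_replans: int = 1) -> list[tuple[str, str]]:
    plan = planner(goal, [])                       # one planning call up front
    completed: list[tuple[str, str]] = []
    replans = 0
    while plan:
        step = plan.pop(0)
        ok, result = executor(step)                # no planner call on the happy path
        if ok:
            completed.append((step, result))
            continue
        if replans >= max_replans:
            raise RuntimeError(f"step failed with no re-plans left: {step}")
        replans += 1
        plan = planner(goal, completed, failed=step)   # re-plan only on deviation
    return completed

completed = plan_and_execute("summarize the report", stub_planner, stub_executor)
```

With `max_replans=0` this behaves like a pure plan-execute agent that gives up at the first surprise; with a budget of one, the planner is called a second time only because `fetch_report` failed.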

Reflexion — try, critique, retry

Reflexion adds a self-correction loop: the agent makes an attempt, then a reflection step critiques its own output (“I cited the wrong policy section”), and it retries with that feedback. It trades extra passes for quality, and it shines when first attempts often fail and the result can be checked — code that must pass tests, structured extraction you can validate, math you can verify.
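The try, critique, retry cycle might look like the sketch below. Everything here is an assumption for illustration: `stub_attempt` and `stub_reflect` stand in for LLM calls, and the task (sort a list in descending order) was chosen only because its result is trivially checkable.

```python
from typing import Callable, Optional

def stub_attempt(task: str, reflections: list[str]) -> Callable[[list[int]], list[int]]:
    """Stand-in for the actor LLM. The first try is wrong on purpose;
    once a reflection is available, the stub returns a corrected solution."""
    if not reflections:
        return lambda xs: sorted(xs)
    return lambda xs: sorted(xs, reverse=True)

def evaluate(candidate: Callable[[list[int]], list[int]]) -> tuple[bool, str]:
    """The verifiable check: a tiny test the candidate must pass."""
    got, expected = candidate([1, 3, 2]), [3, 2, 1]
    return got == expected, f"expected {expected}, got {got}"

def stub_reflect(feedback: str) -> str:
    """Stand-in for the reflection LLM: turns test feedback into a critique."""
    return f"The output was ascending ({feedback}); sort in descending order."

def reflexion(task: str, attempt: Callable, check: Callable, reflect: Callable,
              max_trials: int = 3) -> tuple[Optional[Callable], int, list[str]]:
    reflections: list[str] = []
    for trial in range(1, max_trials + 1):
        candidate = attempt(task, reflections)   # retry sees earlier critiques
        passed, feedback = check(candidate)
        if passed:
            return candidate, trial, reflections
        reflections.append(reflect(feedback))    # critique the failed attempt
    return None, max_trials, reflections         # trial budget exhausted

solution, trials, notes = reflexion("sort descending", stub_attempt, evaluate, stub_reflect)
```

The extra passes only pay off because `evaluate` gives a trustworthy pass or fail signal; without a check like that, the reflection step has nothing concrete to critique.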

Quick check

Q1. What characterizes the ReAct loop?
Q2. When is Plan-and-Execute the better choice over ReAct?
Q3. What problem does Reflexion specifically address?

Next

These loops run inside a single agent. When one agent isn’t enough, see multi-agent orchestration — and the equally important question of when not to go multi-agent.

Practice this in an interview

Explain the ReAct agent pattern and how it compares to Plan-and-Execute and Reflexion.

ReAct interleaves reasoning traces with actions step by step, deciding the next tool call based on the latest observation. Plan-and-Execute first drafts a full multi-step plan and then executes it, which needs fewer LLM calls and is more predictable when the steps are known up front, but is less adaptive. Reflexion adds a self-reflection step where the agent critiques past failures and retries with that feedback.

What is an AI agent, and how does it differ from a single LLM call?

An agent is an LLM placed in a loop where it reasons, chooses and calls tools or actions, observes the results, and repeats until a goal is met, rather than producing one response and stopping. The key differences are autonomy, tool use, memory and state, and multi-step control flow driven by the model's own decisions.

What prompt engineering techniques should every LLM practitioner know?

The core toolkit is: system prompts (role and constraints), few-shot examples (format and tone anchoring), chain-of-thought (step-by-step reasoning), and output constraints (JSON schema, stop sequences). Combining these predictably closes the gap between a capable base model and a production-ready feature.

How do function/tool calling and LLM agents work at a high level?

Tool calling extends the LLM's output space to include structured function invocations. The model emits a JSON object naming a tool and its arguments; the runtime executes the tool and feeds the result back as a new message. An agent is a loop that repeats this cycle — observe, think, act — until the task is complete or a stopping condition is met.
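The runtime side of that cycle can be sketched as follows. The message shape (`role`, `name`, `content`), the `get_weather` tool, and the registry are all hypothetical; real providers each define their own schema for tool calls and tool results.

```python
import json
from typing import Callable

def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call an external service."""
    return f"Sunny in {city}"

# Maps tool names the model may emit to the Python functions that run them.
REGISTRY: dict[str, Callable[..., str]] = {"get_weather": get_weather}

def handle_tool_call(raw: str, messages: list[dict]) -> list[dict]:
    """Parse the model's JSON tool call, run the tool, and append the
    result as a new message for the next model turn."""
    call = json.loads(raw)
    tool = REGISTRY.get(call["name"])
    if tool is None:
        content = f"error: unknown tool {call['name']!r}"
    else:
        content = tool(**call["arguments"])
    return messages + [{"role": "tool", "name": call["name"], "content": content}]

model_output = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
messages = handle_tool_call(model_output, [])
```

Returning the error as a message, rather than raising, lets the model see the failure on its next turn and choose a different tool.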
