Running a retrospective that changes things

The sprint ends on Friday. The team books ninety minutes for the retrospective — the retro — the meeting where Agile (the iterative, incremental software development methodology) promises you will inspect your process and adapt it. Someone writes three columns on a whiteboard: What went well. What didn’t. What we’ll do differently.

For the next hour, the team produces a careful, honest, sometimes uncomfortable account of everything that went wrong. Sprint two was too ambitious. The definition of done kept shifting. The platform team did not deliver the dependency on time. The QA loop at the end is always the crunch. Everyone nods. The facilitator thanks people for their candor. Someone writes “improve communication with platform team” under the third column. The meeting ends.

Three sprints later, the same whiteboard. The same columns. The same complaints.

That retro was theater. It felt productive because speaking honestly in a room takes courage, and the room rewarded that courage. But courage without a committed decision is just testimony. It changes nothing.

This post is about the structural difference between a retro that produces change and one that produces a running archive of the team’s suffering.

The problem is not honesty — it is the back half

Most teams have learned to be reasonably honest in retros. Psychological safety (the belief that you can speak up without being punished) has been a real focus in engineering and product organizations over the last decade, and most teams now have some version of it. The honest observation of what went wrong is not where retros fail.

They fail in the transition from observation to commitment. Specifically, they fail in two places:

First, discussion that does not reach a decision. The team identifies that the QA crunch is a recurring problem. They discuss why: late scope changes, unclear acceptance criteria, testing left until the last two days. The discussion is accurate, and it goes in circles for twenty minutes. Then time runs out. The facilitator writes “address QA crunch” on the board and moves on.

Second, action items without owners or dates. “We will try to write acceptance criteria earlier.” Try. Earlier. No who, no when, no measurable definition of done. This kind of sentence has a half-life of about four days. Everyone leaves the meeting believing someone else will follow up.

The antidote to both failures is not more honesty, more discussion, or a better facilitator. It is a structural rule applied after every discussion: no item leaves the retro without a named human owner and a specific date.

The two loops

The dead-end retro loops back to the same observations. The live retro exits the discussion with exactly one committed change — named owner, committed date — and advances to the next cycle.

The loops look similar from the outside. Both involve honest observation and discussion. The difference is whether anything escapes the conversation and enters the world.

The structure: what went well, what didn’t, what we will change

This is the three-column format nearly every team uses, and it is a sound structure. The problem is how teams execute the third column.

What went well is not optional. Teams under time pressure often skip it or spend three minutes on it to get to the “real” work. This is a mistake. The things that went well are your anchors — the practices worth protecting when the next crisis hits. Naming them explicitly also prevents the retro from becoming purely extractive, which burns people out over time. Spend real time here.

What didn’t is where most teams are actually skilled. The honest naming of things that hurt — unrealistic scope, unclear ownership, a bottleneck that everyone felt and no one said anything about — is uncomfortable and necessary. The rule in this section is that you stay at the team-process level, not the personal-fault level. “The deployment process took four hours and blocked two engineers” is a process problem. “Arjun’s code broke the build again” is not a retro item; it is a one-on-one conversation the manager should be having separately.

What we will change is where retros die. The key structural rule: the team picks one item — not five, not ten — and assigns it to one person with a specific date.

One item, not a list. Five action items with no priority are the same as no action items. Human attention and organizational follow-through are finite. When you leave a retro with five things to improve, each person silently ranks them differently, and three weeks later nothing has moved because everyone was waiting for someone else to start.

One item with one owner focuses the available energy. The owner does not have to do the work alone. They have to do the work of making the change happen — finding the people, setting the meeting, writing the proposal, whatever the specific change requires. The accountability is theirs. The execution can be shared.

Why “we will try to do better” is not an action

It is worth being specific about the language, because the language is where the commitment disappears.

These are not actions:

“We’ll try to communicate more proactively.”
“We should start writing tickets earlier.”
“Let’s see if we can get the platform team more aligned.”

The word “try” is a hedge. It builds in the permission to fail before any attempt has been made. “We should” is an aspiration. “Let’s see if” is even softer — it defers the decision to an undefined future moment when the conditions are right. The conditions are never right.

A real action has a verb in the simple future or imperative, a person’s name, and a date:

“Priya will draft the acceptance criteria template and share it in Slack by next Thursday.”
“Rohit will book a thirty-minute sync with the platform team lead before sprint planning on the fourteenth.”
“Meera will lead a thirty-minute review at the start of next sprint planning, comparing the proposed scope against the actual capacity number from this sprint.”

Notice that none of these guarantee the outcome. Priya might draft a template that the team ends up not using. Rohit’s sync might not fix the dependency problem. But they are events that will either happen or not happen by a specific date. That is auditable. You can open the next retro with “Did the template get drafted?” and get a yes or a no.

“Did we communicate more proactively?” produces a discussion. That discussion is where the next forty minutes go.
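The language test above is mechanical enough to sketch in code. What follows is a minimal illustration, not a tool this post prescribes: the names ActionItem, HEDGES, NON_OWNERS, and problems are all hypothetical, the hedge list is just the three phrases quoted above, and the due date in the example is made up.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from datetime import date

# Hedge phrases called out in the text above (hypothetical starter list).
HEDGES = ("try", "we should", "let's see if")
# Values that look like an owner but name no single person (an assumption).
NON_OWNERS = {"", "we", "the team", "everyone", "someone"}


@dataclass
class ActionItem:
    text: str
    owner: str | None = None  # one named person
    due: date | None = None   # a specific calendar date


def problems(item: ActionItem) -> list[str]:
    """Return the reasons this item is not yet a committed action."""
    found = []
    # Normalize curly apostrophes so "let’s" matches "let's".
    lowered = item.text.lower().replace("\u2019", "'")
    for hedge in HEDGES:
        # Word boundaries keep "try" from matching inside "country".
        if re.search(r"\b" + re.escape(hedge) + r"\b", lowered):
            found.append(f'hedged wording: "{hedge}"')
    if (item.owner or "").strip().lower() in NON_OWNERS:
        found.append("no named owner")
    if item.due is None:
        found.append("no due date")
    return found


vague = ActionItem("We'll try to communicate more proactively.")
real = ActionItem(
    "Draft the acceptance criteria template and share it in Slack.",
    owner="Priya",
    due=date(2025, 3, 13),  # illustrative date only
)
for item in (vague, real):
    print(item.text, "->", problems(item) or "committed")
```

An empty list does not mean the action is a good one. It only means the item can be audited with a yes or a no at the next retro, which is the whole point.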

Keeping it blameless — and what that actually means

Blameless retros have become a norm in most engineering organizations, borrowed from the Site Reliability Engineering (SRE) tradition at companies like Google. The principle is sound: when blame enters the room, honest observation exits. People describe what happened in ways that protect themselves rather than in ways that are accurate. You lose the data.

But “blameless” is frequently misread as “no one is accountable for anything.” That misread is where teams end up with action items that have no owners.

The distinction is straightforward. Blame says: “Rahul’s mistake caused the outage.” Accountability says: “Rahul is the owner of the post-incident review, due Friday.” The first is backward-looking and personal. The second is forward-looking and structural. Both name a person. Only one of them produces a document by Friday.

In practice, keeping a retro blameless means the facilitator has to catch specific kinds of drift. When someone says “the QA crunch happens because some people treat testing as an afterthought,” they are making a judgment about other people’s values. Redirect: “What in our current process creates the conditions for testing to get compressed?” Now you are talking about a system, and the system can be changed.

When someone says “we just need people to take ownership,” the facilitator needs to hear that as a symptom. “People are not taking ownership” usually means the ownership was never explicitly assigned in the first place. The retro action is to assign it explicitly — to a specific person — next time.

Running the meeting

The mechanics matter because most retros drift in predictable ways.

Set a time limit on the discussion phase. Twenty-five minutes per item is enough for most problems. When discussion exceeds that, it is usually because the team is working on the diagnosis rather than the response. The diagnosis is useful, but it is not the product of the retro. The decision is the product. Move to decision-making even if the diagnosis feels incomplete.

Use silent writing before spoken discussion. The most common pathology in retros is that the first two or three people to speak set the frame and the rest of the room anchors to it. If you go around the table and ask what went wrong, you will hear the opinions of the extroverts and the most senior people in the room. Give everyone five minutes to write before anyone speaks. Then discuss. The range of observations is usually richer.

Close by reading the one action item back to the room. Out loud. With the owner’s name and the date. This sounds performative. It is not. It confirms that the owner heard the commitment and that the room shares one understanding of what the action actually is, and it creates a brief moment of social contract. People feel the weight of their name being said in the room in a way they do not feel the weight of their name in a document that gets filed somewhere.

What to do with the accumulation

Most teams have a backlog of retro items — problems that were named in previous retros and never acted on. This backlog is corrosive. Every time the same problem appears in a retro, two things happen: the team loses a small amount of confidence that the retro process is real, and the emotional cost of raising the problem gets slightly higher.

The practical repair is a retro on your retros. Book ninety minutes outside the sprint cycle. Go through the list of items from the last six months. For each item: was it acted on? If yes, did it help? If no, why not? Was it too vague? No owner? Wrong priority? Classify them and notice the pattern.

Most teams will find that the items that did not move share the same two features: no specific owner, and no date. That is the structural diagnosis. The fix is the same one described above — applied retroactively to the backlog and enforced going forward.
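If the backlog lives in a spreadsheet or a tracker export, the classification step can be tallied rather than eyeballed. This is a rough sketch under assumptions: the PastItem record, its fields, and the three sample rows are hypothetical stand-ins for whatever your team actually recorded.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class PastItem:
    text: str
    owner: str | None
    due: date | None
    acted_on: bool


def stall_pattern(backlog: list[PastItem]) -> Counter:
    """Count which structural features the items that never moved were missing."""
    counts: Counter = Counter()
    for item in backlog:
        if item.acted_on:
            continue  # only the stalled items are of interest here
        if not item.owner:
            counts["no owner"] += 1
        if item.due is None:
            counts["no date"] += 1
        if item.owner and item.due is not None:
            # Stalled despite owner and date: look at priority or scope instead.
            counts["had owner and date"] += 1
    return counts


# Hypothetical backlog for illustration only.
backlog = [
    PastItem("improve communication with platform team", None, None, False),
    PastItem("address QA crunch", None, None, False),
    PastItem("draft acceptance criteria template", "Priya", date(2025, 3, 13), True),
]
print(stall_pattern(backlog))
```

The tally does not replace the conversation about why items stalled. It makes the pattern hard to argue with before that conversation starts.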

The long game

A team that takes one real action per sprint (one genuine process improvement, executed, evaluated, refined) makes roughly thirteen to twenty-six improvements per year, depending on whether its sprints run four weeks or two. That sounds obvious. It is not common.

The teams that actually do this look different after a year. Not because they have solved everything. Because they have a working evidence base: things they tried, results they observed, adjustments they made. They know what their process can and cannot do. They know which problems are systemic and which were situational. They trust the retro because the retro has produced things.

The teams that run theatrical retros know their problems very well. They have been naming them for a year. The gap between naming and changing is the exact distance between a team that stays frustrated and a team that gets better.

One item. One owner. One date. Open the next retro by asking whether it happened.

Everything else is commentary.