What is constrained decoding and how does it guarantee structured outputs like valid JSON?
Constrained decoding masks the model's next-token logits at each step so only tokens permitted by a grammar or JSON schema can be sampled, guaranteeing structurally valid output without changing the model's weights. It is how structured-output and function-calling features enforce schema conformance; placing reasoning fields before answer fields lets the model think before it commits.
How to think about it
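At each decoding step the model produces a logit for every token in its vocabulary. A constrained decoder tracks where the partial output sits in the grammar or JSON schema, sets the logits of every token that would violate it to negative infinity, and samples from what remains. The model's weights are never changed; only the sampling step is. This is how structured-output and function-calling features enforce schema conformance. The guarantee is structural: the output parses and matches the schema as long as generation is not cut off (for example by a token limit), but the values inside it can still be wrong. Because tokens are generated left to right, field order matters: placing reasoning fields before answer fields lets the model think before it commits.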
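The masking step can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular library implements it: the eight-token vocabulary, the hand-written state machine in GRAMMAR, and the random fake_logits function standing in for a real model are all hypothetical. Production systems compile a JSON schema or grammar into a much larger automaton over the real tokenizer's vocabulary.

```python
import json
import math
import random

# Toy vocabulary (hypothetical); a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"answer"', ':', '"yes"', '"no"', ',', 'hello']

# Hypothetical grammar as a state machine: state -> {allowed token: next state}.
# It accepts exactly {"answer":"yes"} or {"answer":"no"}.
GRAMMAR = {
    "start": {'{': "key"},
    "key": {'"answer"': "colon"},
    "colon": {':': "value"},
    "value": {'"yes"': "close", '"no"': "close"},
    "close": {'}': "done"},
}


def fake_logits(rng):
    """Stand-in for a model forward pass: one arbitrary score per token."""
    return [rng.uniform(-2.0, 2.0) for _ in VOCAB]


def mask_logits(logits, allowed):
    """Set the logit of every token the grammar forbids to negative infinity."""
    return [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]


def sample(logits, rng):
    """Softmax sampling; masked tokens get weight exp(-inf) == 0.0."""
    m = max(logits)  # finite, because at least one token is always allowed
    weights = [math.exp(x - m) for x in logits]
    return rng.choices(VOCAB, weights=weights, k=1)[0]


def constrained_generate(seed=0):
    """Generate one string, masking the logits at every step."""
    rng = random.Random(seed)
    state, out = "start", []
    while state != "done":
        allowed = GRAMMAR[state]
        tok = sample(mask_logits(fake_logits(rng), allowed), rng)
        out.append(tok)
        state = allowed[tok]  # advance the grammar state
    return "".join(out)


if __name__ == "__main__":
    text = constrained_generate()
    print(text, json.loads(text))
```

The fake model would happily emit "hello" or a stray comma, but those tokens never survive the mask, so every result parses with json.loads. Note that the mask decides only which tokens are legal; which of "yes" or "no" comes out is still up to the model's scores.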