Stop Parsing Broken Outputs: Generate Valid Structure Directly From Any LLM
You know the pain. You ask an LLM for JSON, and it gives you almost JSON. Or JSON with trailing commas. Or a fenced code block tagged json that never closes. So you write regex parsers, wrap json.loads() in error handling, or resort to asking the model to "please fix your output" in the next turn. It works, but it's fragile and slow.
What if you could skip the parsing entirely? What if you could tell the LLM "only output a valid Python function call" or "generate a string that matches this regex" — and have it actually do that, reliably?
Outlines does exactly that. It's a library that lets you constrain an LLM's output to match a structured format (JSON, Pydantic models, regex, a grammar, or even a function signature) during generation rather than after it. No post-processing. No retries. Just valid output, the first time.
What It Does
Outlines is a Python library that works with a range of text generation backends (OpenAI, Anthropic, local models via transformers or llama.cpp). Instead of letting the model generate free text and then parsing it, you define the structure you want upfront. Where the library can see the model's next-token scores, as it can with local models, it masks out every token that would break the schema, so the output follows it grammatically, structurally, and type-safely.
You can constrain output to:
- JSON matching a Pydantic model or dataclass
- A specific regex pattern
- A context-free grammar (CFG)
- A function call with typed arguments (great for tool use)
- Any combination of the above
The magic is that the approach applies to any autoregressive model that exposes its next-token scores, and it adds little overhead. The constraints are applied at token sampling time, so an invalid token is never sampled in the first place.
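To make "applied at token sampling time" concrete, here is a minimal sketch of the idea in plain Python. It is not Outlines' implementation: the vocabulary, the random `fake_logits` stand-in for a model, and the tiny state machine are all assumptions made up for illustration.

```python
import math
import random

# Toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["{", "}", '"ok"', ":", "true", "false", "hello", ","]

# Hypothetical constraint: the output must be {"ok":true} or {"ok":false}.
# Each state maps the tokens allowed next to the state they lead to.
TRANSITIONS = {
    0: {"{": 1},
    1: {'"ok"': 2},
    2: {":": 3},
    3: {"true": 4, "false": 4},
    4: {"}": 5},
}
FINAL_STATE = 5


def fake_logits(rng):
    """Stand-in for a model forward pass: one random score per token."""
    return [rng.uniform(-5.0, 5.0) for _ in VOCAB]


def constrained_sample(rng):
    state, output = 0, []
    while state != FINAL_STATE:
        logits = fake_logits(rng)
        allowed = TRANSITIONS[state]
        # Mask: every token the constraint forbids gets a score of -inf.
        masked = [s if t in allowed else -math.inf for t, s in zip(VOCAB, logits)]
        # Softmax numerators; exp(-inf) is 0.0, so forbidden tokens get weight 0.
        peak = max(masked)
        weights = [math.exp(s - peak) for s in masked]
        token = rng.choices(VOCAB, weights=weights, k=1)[0]
        output.append(token)
        state = allowed[token]
    return "".join(output)


result = constrained_sample(random.Random(0))
print(result)
```

Even though the scores are random, the loop can only ever emit one of the two valid strings, because tokens like `hello` always end up with zero probability.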
Why It’s Cool
The "parse broken output" approach is a kludge. It works most of the time, then breaks silently. Outlines flips the script: it makes structure a first-class part of generation. Here's why that matters:
No more “please fix your JSON” prompts. You define a Pydantic model once, and Outlines guarantees the output matches it. No post-hoc validation, no fallback logic.
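If you want to check that guarantee in your own test suite, a small conformance check is enough. The sketch below uses only the standard library, with a dataclass standing in for a Pydantic model; the `Ticket` schema and the `conforms` helper are hypothetical names, not part of Outlines.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Ticket:  # hypothetical schema, stands in for a Pydantic model
    title: str
    priority: int


def conforms(raw: str, schema=Ticket) -> bool:
    """Return True if raw is valid JSON with exactly the schema's fields and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    expected = {f.name: f.type for f in fields(schema)}
    if set(data) != set(expected):
        return False
    # Exact type match, so a JSON boolean does not pass as an int.
    return all(type(data[name]) is expected[name] for name in expected)


print(conforms('{"title": "Fix login", "priority": 2}'))
print(conforms('{"title": "Fix login", "priority": 2,}'))
```

With unconstrained generation this kind of check sits in the hot path with retry logic around it. With constrained generation it can move into a test as a plain assertion.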
Potentially faster generation. Invalid tokens are never sampled, so no time is spent producing output that has to be thrown away or regenerated. Once you count the retries you no longer need, end-to-end latency is often lower.
Tool calling without the wrapper. If you want the LLM to call send_email(to: str, body: str), Outlines can force the output to be exactly that function call — no need for custom OpenAI function-calling middleware.
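One plausible way to think about "exactly that function call" is as a regex derived from the function's signature, since regex is one of the supported constraint types. The sketch below is an assumption about how such a pattern could be built, not how Outlines does it; `call_pattern` and `TYPE_PATTERNS` are hypothetical names, and only `str` and `int` arguments are handled.

```python
import inspect
import re


def send_email(to: str, body: str) -> None:
    """Hypothetical tool the model is allowed to call."""


# Regex fragment per argument type; only str and int are covered in this sketch.
TYPE_PATTERNS = {
    str: r'"(?:[^"\\]|\\.)*"',  # double-quoted string with backslash escapes
    int: r"-?\d+",
}


def call_pattern(func) -> re.Pattern:
    """Build a regex that matches a keyword-argument call to func."""
    params = inspect.signature(func).parameters.values()
    args = r",\s*".join(
        rf"{re.escape(p.name)}={TYPE_PATTERNS[p.annotation]}" for p in params
    )
    return re.compile(rf"{re.escape(func.__name__)}\({args}\)")


pattern = call_pattern(send_email)
good = 'send_email(to="a@example.com", body="Hi there")'
bad = 'Sure! send_email(to="a@example.com", body="Hi there")'
print(bool(pattern.fullmatch(good)))
print(bool(pattern.fullmatch(bad)))
```

A pattern like this could be handed to a regex-constrained generator. Here it is only used with `fullmatch` to show which strings would be allowed.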
Works with local models. You're n