LangChain JSON Output Parser: Fix Parsing Errors and Ensure Valid Output
LangChain returning "invalid json output: none" or OutputParserException? Fix malformed output, Python literals, and streaming issues, with exact code fixes for each case.
LangChain's output parsers abstract away the messiness of LLM text responses, but they don't eliminate it. JsonOutputParser raises OutputParserException when the model returns malformed JSON, partial output, or prose mixed with data. This guide covers the main parser options in LangChain's ecosystem: how each works, when it fails, and how to build a robust pipeline that handles edge cases without crashing.
How LangChain's JsonOutputParser Works
JsonOutputParser is the simplest JSON parser in LangChain. It receives the model's text output, strips markdown fences, then calls json.loads(). That's essentially it.
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o", temperature=0)
parser = JsonOutputParser()
chain = model | parser
result = chain.invoke("Return a JSON object with name and age for a fictional user.")
# result is a dict if the model cooperated
# raises OutputParserException if it didn't
The parser does handle one edge case automatically: it strips Markdown code fences (with or without a json tag) before parsing. But it won't handle trailing commas, Python literals (True/False/None), comments, or truncated output.
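To make the "strip the fence, then json.loads()" behaviour concrete, here is a minimal standard-library sketch. The helper name strip_fence_and_load and its regex are assumptions made for illustration; this is not LangChain's actual implementation, only an approximation of the behaviour described above.

```python
import json
import re

# Hypothetical helper: approximates "strip the fence, then json.loads()".
# Not LangChain's code; the regex is an assumption for illustration.
FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)\s*```", re.DOTALL)


def strip_fence_and_load(text: str):
    """Remove one Markdown code fence if present, then parse strictly."""
    match = FENCE_RE.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload.strip())


fenced = '```json\n{"name": "Alice", "age": 30}\n```'
print(strip_fence_and_load(fenced))  # the fenced payload parses

try:
    # A trailing comma is still rejected, fence or no fence.
    strip_fence_and_load('{"name": "Alice", "age": 30,}')
except json.JSONDecodeError as exc:
    print("still fails:", exc.msg)
```

The point of the sketch is the second call: fence stripping does nothing for syntax errors inside the payload, which is why the failure cases in the next section need separate handling.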
Common OutputParserException Errors
1. Model Returned Prose with JSON Embedded
OutputParserException: Failed to parse.
Got: "Here is the JSON you requested: {"name": "Alice", "age": 30}"
The parser found {...} but the surrounding text confused the extraction. Fix: strengthen the prompt with explicit format instructions.
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

parser = JsonOutputParser()

# Include format instructions in the prompt
format_instructions = parser.get_format_instructions()
# Returns a short instruction string telling the model to respond with JSON

prompt = ChatPromptTemplate.from_template(
    "Answer the user query.\n{format_instructions}\n\nQuery: {query}"
)

# model is the ChatOpenAI instance created in the first example
chain = prompt | model | parser
result = chain.invoke({
    "query": "Give me a user profile for Alice",
    "format_instructions": format_instructions
})
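If a stronger prompt is not enough, the JSON can also be pulled out of the surrounding prose before parsing. The sketch below uses the standard library's json.JSONDecoder.raw_decode to find the first decodable object or array; extract_first_json is a hypothetical helper name, and the approach assumes the first bracketed value in the reply is the one you want.

```python
import json


def extract_first_json(text: str):
    """Return the first decodable JSON object or array embedded in text."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one value starting at index and ignores
            # whatever follows it, so trailing prose is not a problem.
            value, _end = decoder.raw_decode(text, index)
        except json.JSONDecodeError:
            continue  # not valid JSON from this position, keep scanning
        return value
    raise ValueError("no JSON object or array found")


reply = 'Here is the JSON you requested: {"name": "Alice", "age": 30} Hope that helps!'
print(extract_first_json(reply))
```

One caveat: prose such as "see [1]" before the real payload would be returned first, so this is best used as a fallback after the normal parser has failed.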
2. Trailing Commas
OutputParserException: Failed to parse. Got: {"name": "Alice", "age": 30,}
json.JSONDecodeError: Expecting property name enclosed in double quotes
JsonOutputParser does not fix trailing commas. Options:
- Use a pre-processing step (shown below)
- Use the JSON Fixer to clean the output manually
- Switch to with_structured_output() (see below)
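As one possible shape for the pre-processing option, the sketch below removes trailing commas with a small character scanner instead of a regex, so commas inside string values are left alone. strip_trailing_commas is a hypothetical helper, not part of LangChain, and it only targets this one defect.

```python
import json


def strip_trailing_commas(text: str) -> str:
    """Drop commas that directly precede } or ], ignoring string contents."""
    out = []
    in_string = False
    escaped = False
    for char in text:
        if in_string:
            out.append(char)
            if escaped:
                escaped = False
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
            continue
        if char == '"':
            in_string = True
        elif char in "}]":
            # Walk back over whitespace; delete a comma if that is what precedes.
            i = len(out) - 1
            while i >= 0 and out[i].isspace():
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
        out.append(char)
    return "".join(out)


raw = '{"note": "a,}", "tags": ["x", "y", ], }'
print(json.loads(strip_trailing_commas(raw)))
```

Run the function on the raw model text before handing it to json.loads or to the parser; the string value "a,}" in the example is preserved because the scanner tracks whether it is inside quotes.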
3. Partial / Truncated JSON
OutputParserException: Failed to parse.
Got: {"users": [{"id": 1, "name": "Alic
json.JSONDecodeError: Unterminated string starting at...
This is common when your output hits the model's token limit. For streaming use cases, LangChain provides JsonOutputParser with partial parse support.
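Before attempting any repair, it can help to tell truncation apart from other defects, because a truncated response usually calls for a higher token limit and a retry. The following sketch is a heuristic under that assumption; looks_truncated is a hypothetical helper that only checks whether the text ends inside a string or with unclosed brackets.

```python
def looks_truncated(text: str) -> bool:
    """True if the text ends inside a string or with unclosed brackets."""
    depth = 0
    in_string = False
    escaped = False
    for char in text:
        if in_string:
            if escaped:
                escaped = False
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
        elif char == '"':
            in_string = True
        elif char in "{[":
            depth += 1
        elif char in "}]":
            depth -= 1
    return in_string or depth > 0


print(looks_truncated('{"users": [{"id": 1, "name": "Alic'))
print(looks_truncated('{"users": [{"id": 1, "name": "Alice"}]}'))
```

A True result suggests the output was cut off, so re-running with more room for output is likely to work better than patching the text.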
4. Python-Style Values
OutputParserException: Failed to parse.
Got: {"active": True, "data": None}
Some models (especially smaller ones) output Python booleans. JsonOutputParser won't fix this. Use a pre-processing wrapper or switch to a repair-aware approach. See Why LLMs Output True Instead of true for a full fix guide including ast.literal_eval and provider-specific prevention.
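A minimal sketch of the ast.literal_eval route mentioned above: try strict JSON first, then fall back to evaluating the text as a Python literal. loads_lenient is a hypothetical name. The fallback only succeeds when the whole payload is valid Python literal syntax, so output that mixes true and True will still fail.

```python
import ast
import json


def loads_lenient(text: str):
    """Parse strict JSON, falling back to Python literal syntax."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals (dicts, lists, strings,
        # numbers, True/False/None); it never executes arbitrary code.
        value = ast.literal_eval(text)
        if not isinstance(value, (dict, list)):
            raise ValueError("expected a JSON object or array")
        return value


print(loads_lenient('{"active": True, "data": None}'))
```

Because the fallback accepts Python syntax, it also tolerates single-quoted strings, which may or may not be what you want in a strict pipeline.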
with_structured_output() — The Recommended Approach
For models that support it (OpenAI, Anthropic, Google), with_structured_output() is the most reliable path. It uses tool/function calling under the hood, bypassing the text-parsing problem entirely.
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from pydantic import BaseModel, Field
from typing import Optional

class UserProfile(BaseModel):
    name: str = Field(description="Full name of the user")
    age: int = Field(description="Age in years", ge=0, le=150)
    email: Optional[str] = Field(default=None, description="Email address")
    active: bool = Field(description="Whether the account is active")

# OpenAI
openai_model = ChatOpenAI(model="gpt-4o", temperature=0)
structured = openai_model.with_structured_output(UserProfile)
result = structured.invoke("Create a profile for Bob, age 25, [email protected]")
# result is a UserProfile instance, fully typed and validated

# Anthropic
anthropic_model = ChatAnthropic(model="claude-opus-4-5", temperature=0)
structured = anthropic_model.with_structured_output(UserProfile)
result = structured.invoke("Create a profile for Alice, age 30")
with_structured_output() returns a Pydantic model instance, not a dict. Validation happens automatically — if the model returns the wrong type for a field, Pydantic raises a ValidationError with a clear message.
Pydantic Output Parsers for Type Safety
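When the model lacks native structured output but you still want type validation, PydanticOutputParser parses the text response into a validated Pydantic model (call model_dump() on the result if you need a dict):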
from langchain.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, field_validator
from typing import List

class ProductList(BaseModel):
    products: List[str]
    count: int

    @field_validator('count')
    @classmethod
    def count_must_match(cls, v, info):
        # info.data holds the fields validated so far; products is declared
        # before count, so it is available here
        products = info.data.get('products', [])
        if v != len(products):
            raise ValueError(f"count {v} doesn't match products length {len(products)}")
        return v

parser = PydanticOutputParser(pydantic_object=ProductList)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a product catalog assistant.\n{format_instructions}"),
    ("human", "{input}")
])

chain = prompt | model | parser
result = chain.invoke({
    "input": "List 3 popular programming frameworks",
    "format_instructions": parser.get_format_instructions()
})

print(result.products)  # e.g. ['React', 'Django', 'Spring Boot']
print(result.count)     # e.g. 3
The format instructions generated by PydanticOutputParser.get_format_instructions() include the full JSON schema, which significantly improves compliance from less capable models.
RetryOutputParser — Automatic Self-Correction
When the first parse attempt fails, RetryOutputParser sends the model's output back to the LLM with a correction request:
from langchain.output_parsers import RetryOutputParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
base_parser = JsonOutputParser()
retry_model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

retry_parser = RetryOutputParser.from_llm(
    parser=base_parser,
    llm=retry_model,
    max_retries=3
)
Use with a chain:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

prompt = ChatPromptTemplate.from_template(
    "Return a JSON object with: name, score (0-100), passed (boolean).\n"
    "Subject: {subject}"
)

completion = prompt | model

# Keep the original input next to the completion so the retry parser
# can see both the prompt and the failed output
chain = RunnablePassthrough.assign(
    completion=completion
) | RunnableLambda(
    lambda x: retry_parser.parse_with_prompt(
        x["completion"].content,
        prompt.invoke({"subject": x["subject"]})
    )
)

result = chain.invoke({"subject": "Mathematics"})
RetryOutputParser is useful when you're working with models that occasionally slip up. It costs extra tokens per correction but saves you from handling exceptions manually.
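The control flow behind this kind of self-correction can be sketched without any LLM at all. The version below is an illustration only, not RetryOutputParser's implementation: parse_with_retry and fake_fix are hypothetical names, and fix stands in for the model call that would receive the failed output and return a corrected one. Passing the error message to fix is an assumption made for the example.

```python
import json
from typing import Callable


def parse_with_retry(completion: str, fix: Callable[[str, str], str], max_retries: int = 3):
    """Parse completion as JSON; on failure ask fix() for a corrected version."""
    attempt = completion
    for _ in range(max_retries):
        try:
            return json.loads(attempt)
        except json.JSONDecodeError as exc:
            # In a real pipeline this is where the extra LLM call (and its
            # token cost) would happen.
            attempt = fix(attempt, exc.msg)
    return json.loads(attempt)  # final try; raises if still invalid


def fake_fix(bad_output: str, error: str) -> str:
    """Stand-in for the LLM: only knows how to remove one trailing comma."""
    return bad_output.replace(",}", "}")


print(parse_with_retry('{"name": "Alice", "score": 91,}', fake_fix))
```

Each loop iteration corresponds to one correction request, which is why a small max_retries keeps the worst-case cost bounded.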
Robust Pipeline Pattern with Fallback Repair
For production pipelines that can't afford to fail, combine multiple strategies:
import json
import re
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.exceptions import OutputParserException

class RobustJSONParser:
    """
    A LangChain-compatible JSON parser with multi-stage repair.

    Stage 1: Standard JsonOutputParser (handles markdown fences)
    Stage 2: Manual extraction + common fixes
    Stage 3: Raise with detailed error for logging
    """

    def __init__(self):
        self._base_parser = JsonOutputParser()

    def parse(self, text: str) -> dict:
        # Stage 1: try standard parser
        try:
            return self._base_parser.parse(text)
        except OutputParserException:
            pass

        # Stage 2: manual repair
        cleaned = self._repair(text)
        try:
            return json.loads(cleaned)
        except json.JSONDecodeError as e:
            # Stage 3: give up, but keep the raw text for logging
            raise OutputParserException(
                f"Failed to parse JSON after repair attempt.\n"
                f"Original: {text[:300]}\n"
                f"After repair: {cleaned[:300]}\n"
                f"Error: {e.msg} at pos {e.pos}",
                llm_output=text,
            ) from e

    def _repair(self, text: str) -> str:
        # 1. Extract from markdown fences
        fence = re.search(r'```(?:json)?\s*\n?([\s\S]*?)\n?```', text)
        if fence:
            text = fence.group(1)

        # 2. Find first JSON structure
        start = min(
            (text.find('{') if text.find('{') != -1 else float('inf')),
            (text.find('[') if text.find('[') != -1 else float('inf'))
        )
        if start != float('inf'):
            text = text[int(start):]

        # 3. Remove trailing commas (simplified: also touches commas inside strings)
        text = re.sub(r',\s*([}\]])', r'\1', text)

        # 4. Normalize Python literals (simplified: also rewrites them inside strings)
        text = re.sub(r'\bTrue\b', 'true', text)
        text = re.sub(r'\bFalse\b', 'false', text)
        text = re.sub(r'\bNone\b', 'null', text)

        return text.strip()

    # Allows `parser | next_step`. The class is not itself a Runnable, so wrap
    # parse() in RunnableLambda when placing it after a model.
    def __or__(self, other):
        from langchain_core.runnables import RunnableLambda
        return RunnableLambda(self.parse) | other

# Usage in a chain
from langchain_core.runnables import RunnableLambda

parser = RobustJSONParser()
chain = prompt | model | RunnableLambda(lambda x: parser.parse(x.content))
Note: The _repair method above uses simplified regex. For production use with complex JSON (nested structures, strings containing commas), paste the raw output into the JSON Fixer — its 16-stage repair engine handles all edge cases safely.
Streaming JSON with LangChain
For streaming use cases, JsonOutputParser supports incremental output:
import asyncio
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()
chain = prompt | model | parser

async def main():
    # Stream partial JSON as it arrives
    async for chunk in chain.astream({"query": "..."}):
        # chunk is a partial dict that grows as tokens arrive
        print(chunk)

asyncio.run(main())

# Output sequence might be:
# {}
# {"name": ""}
# {"name": "Alice"}
# {"name": "Alice", "age": 30}
This works because JsonOutputParser internally uses a streaming-aware JSON parser that yields partial objects. For this to work, the model must output JSON directly (no markdown fences around the stream).
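To illustrate how partial objects can be produced from an incomplete stream, here is a simplified sketch of the general idea: close any open string and brackets, then try to parse. It is not LangChain's internal parser. complete_partial and stream_partial are hypothetical names, and prefixes that end mid-token (for example right after a key) are simply skipped until more text arrives.

```python
import json


def complete_partial(buffer: str) -> str:
    """Append the closers needed to make a JSON prefix syntactically complete."""
    closers = []
    in_string = False
    escaped = False
    for char in buffer:
        if in_string:
            if escaped:
                escaped = False
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
        elif char == '"':
            in_string = True
        elif char == "{":
            closers.append("}")
        elif char == "[":
            closers.append("]")
        elif char in "}]" and closers:
            closers.pop()
    tail = '"' if in_string else ""
    return buffer + tail + "".join(reversed(closers))


def stream_partial(chunks):
    """Yield a growing parsed value each time the buffered text becomes parseable."""
    buffer = ""
    last = None
    for chunk in chunks:
        buffer += chunk
        try:
            value = json.loads(complete_partial(buffer))
        except json.JSONDecodeError:
            continue  # prefix ends mid-token; wait for more text
        if value != last:
            last = value
            yield value


for partial in stream_partial(['{"na', 'me": "Al', 'ice", "age"', ': 30}']):
    print(partial)
```

The intermediate values are best treated as previews: a field such as "name" may still be growing, so only the final value should be used for anything that needs complete data.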
Diagnosing OutputParserException
When OutputParserException is raised, the message contains the raw model output. Extract it and paste it into the JSON Fixer to see exactly what's wrong:
from langchain_core.exceptions import OutputParserException

try:
    result = chain.invoke(input)
except OutputParserException as e:
    # e.llm_output contains the raw model text
    print("Raw LLM output that failed to parse:")
    print(e.llm_output)
    # Paste this into aijsonmedic.com for repair + diagnosis
    raise
For comparison and debugging, use the JSON Diff Tool to compare the broken output against a known-good example. Use the JSON Formatter to pretty-print messy one-liner JSON before reading it.
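For a quick local check before reaching for a tool, the standard library already reports where parsing stopped. The sketch below turns a json.JSONDecodeError into a short report with a caret under the offending character; describe_json_error is a hypothetical helper name, and you would pass it the raw text taken from e.llm_output.

```python
import json


def describe_json_error(text: str, width: int = 20) -> str:
    """Return a short report pointing at the first JSON syntax error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        start = max(exc.pos - width, 0)
        # Flatten newlines so the caret on the next line stays aligned.
        snippet = text[start:exc.pos + width].replace("\n", " ")
        pointer = " " * (exc.pos - start) + "^"
        return f"{exc.msg} (line {exc.lineno}, column {exc.colno})\n{snippet}\n{pointer}"
    return "valid JSON"


print(describe_json_error('{"active": True, "data": None}'))
```

The reported position is where the decoder gave up, which is usually at or just after the real defect, so read a few characters to the left of the caret as well.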
Choosing the Right Parser
| Use Case | Recommended Parser |
| --- | --- |
| Simple JSON dict, modern model | with_structured_output() + Pydantic |
| Need a dict, not Pydantic model | JsonOutputParser + strong prompt |
| Need type validation + dict | PydanticOutputParser |
| Occasional model mistakes, want auto-fix | RetryOutputParser wrapping above |
| Streaming JSON | JsonOutputParser (streaming mode) |
| Legacy or unreliable model | RobustJSONParser pattern above |
Quick Reference: Checklist for Reliable LangChain JSON
- Use with_structured_output() whenever the model supports it, since it avoids text parsing entirely
- Include parser.get_format_instructions() in the prompt for text-based parsers
- Set temperature=0 for structured output tasks
- Wrap parser invocation in try/except OutputParserException
- Log e.llm_output on failure for debugging
- Use RetryOutputParser for models that occasionally produce bad JSON
- For complex repair needs, the JSON repair tool handles trailing commas, Python literals, truncation, comments, and markdown fences
For interactive debugging, the JSON Validator lets you check whether a repaired output is structurally valid, and the JSON Formatter makes complex nested output readable at a glance. For a deeper look at production LangChain JSON error patterns and end-to-end pipeline examples, see the LangChain JSON use case guide. For a broader guide to all the ways LLMs break JSON output across providers, see LLM JSON Repair Guide.
FAQ
What is the most reliable way to get JSON from LangChain?
Use JsonOutputParser with Pydantic's model_json_schema() in the prompt instructions, combined with a low-temperature model (0.1–0.3). For maximum reliability, use OpenAI functions/tools or structured output mode rather than free-form JSON generation — the model fills a schema rather than generating text that happens to look like JSON.
Why does LangChain's JsonOutputParser fail on valid-looking JSON?
Usually because the output contains prose before or after the JSON, a trailing comma, or Python-style literals (True/False/None). The parser strips Markdown code fences, but beyond that it expects plain JSON. Add "Return only a valid JSON object. No explanation, no markdown fences, nothing before or after the JSON." to your prompt.
How do I fix JsonOutputParser when the model returns partial JSON?
Use RetryOutputParser to automatically retry with a correction message when parsing fails. For truncation specifically (output cut off mid-JSON), check your max_tokens setting — increase it so the model has room to close all brackets. For a one-off repair, pass the truncated output to AI JSONMedic to reconstruct the complete structure.
What is the difference between JsonOutputParser and PydanticOutputParser in LangChain?
JsonOutputParser returns a raw Python dict with no validation of the structure. PydanticOutputParser validates the parsed JSON against a Pydantic model and raises an OutputParserException carrying the Pydantic validation error if fields are missing or types are wrong. Use PydanticOutputParser in production where you need schema compliance checked on every response; use JsonOutputParser for exploratory work or schemas that vary.
How do I handle streaming JSON output in LangChain?
Use JsonOutputParser with a streaming chain: chain = prompt | llm | JsonOutputParser(), then async for chunk in chain.astream(input). The parser accumulates partial output and emits dict chunks as they become parseable. For structured output that must be complete before use, don't stream — collect the full response and parse once.