Extract valid JSON from an LLM response — even when it's wrapped in reasoning/thinking tags, markdown fences or prose. Zero dependencies.
Security posture is tracked in docs/security-posture.md, including CodeQL, OpenSSF Scorecard, Dependabot and branch rules.
You asked for JSON. The model gave you:
<think>
Let me reason about this. The score should reflect... maybe {draft: 6}?
</think>
Sure! Here's the result:
```json
{"score": 8, "reason": "clear"}
```
Hope that helps!
JSON.parse throws on all of that. json-from-llm returns { score: 8, reason: "clear" }.
import { extractJson } from 'json-from-llm';
const data = extractJson<{ score: number }>(modelOutput);

- Reasoning-model aware. Strips `<think>`/`<thinking>` blocks first, including unclosed reasoning prefixes, so brace-laden reasoning (a real cause of `No object generated` failures with DeepSeek R1, Gemini 2.5 thinking, prompted Claude) never gets mistaken for the payload.
- Handles the real wrappers. Markdown fences (`json`-tagged and bare), conversational prose before/after, and the JSON sitting bare in the text.
- String-aware, delimiter-aware, never corrupts. The scanner and the trailing-comma repair both respect string contents: a `}` or `,` inside `"a string value"` is left alone, and mismatched or truncated JSON-looking drafts are skipped.
- Conservative repair. Removes trailing commas (the most common malformation); it will never rewrite your data.
- Fixture-backed edge cases. Public fixtures cover reasoning tags, fenced JSON, prose wrappers, trailing commas, top-level type expectations and no-JSON failures.
- Two library entry points + CLI. `extractJson` throws on failure; `tryExtractJson` returns `{ found }`; `json-from-llm` reads stdin for shell pipelines.
- Zero dependencies, ESM + CJS, fully typed.
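To make "string-aware" repair concrete, here is a minimal sketch of a trailing-comma remover that tracks whether it is inside a string. The name `stripTrailingCommas` is hypothetical and this is not the library's implementation, only one way the idea could be written.

```typescript
// Hypothetical helper for illustration; not the library's code.
function stripTrailingCommas(text: string): string {
  let out = '';
  let inString = false;
  let escaped = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);

    if (inString) {
      out += ch; // string contents are copied verbatim
      if (escaped) escaped = false; // character after a backslash is literal
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }

    if (ch === '"') {
      inString = true;
      out += ch;
      continue;
    }

    if (ch === ',') {
      // Look past whitespace: a comma followed by } or ] is a trailing comma.
      let j = i + 1;
      while (j < text.length && /\s/.test(text.charAt(j))) j++;
      const next = text.charAt(j);
      if (next === '}' || next === ']') continue; // drop the comma
    }

    out += ch;
  }
  return out;
}

const repaired = stripTrailingCommas('{"items": [1, 2,], "note": "keep ,} here",}');
// repaired === '{"items": [1, 2], "note": "keep ,} here"}'
const parsed = JSON.parse(repaired);
```

The `,}` inside the string value survives untouched, while the two structural trailing commas are removed.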
npm install json-from-llm

Pipe model output directly into the binary:

cat response.txt | npx json-from-llm

Example:
printf '%s\n' '<think>{draft: true}</think>```json
{"score":8,"reason":"clear"}
```' | npx json-from-llm
# {"score":8,"reason":"clear"}

Useful flags:
# Skip an earlier array and require the first object that parses
cat response.txt | npx json-from-llm --expect object
# Disable trailing-comma repair when you want strict parsing
cat response.txt | npx json-from-llm --no-repair

Exit codes:

- `0`: JSON extracted and printed to stdout.
- `1`: no matching JSON value found.
- `2`: invalid CLI options.
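A wrapper script that spawns the CLI may want to branch on these codes. The helper below is a small sketch of that mapping; `interpretExitCode` and the outcome labels are hypothetical names, and only the three numeric codes come from the list above.

```typescript
// Hypothetical mapping from the documented exit codes to outcome labels.
type CliOutcome = 'extracted' | 'no-json' | 'bad-options' | 'unknown';

function interpretExitCode(code: number): CliOutcome {
  switch (code) {
    case 0:
      return 'extracted'; // JSON was printed to stdout
    case 1:
      return 'no-json'; // nothing matching was found
    case 2:
      return 'bad-options'; // the flags were invalid
    default:
      return 'unknown'; // anything else is not documented
  }
}
```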
`extractJson` returns the extracted JSON value, or throws `JsonExtractionError` if none can be recovered.

`tryExtractJson` is the non-throwing variant.
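The throwing and non-throwing pair follows a common pattern, sketched here with plain `JSON.parse` so the example stays self-contained. The names `tryParse` and `parseOrThrow` are hypothetical, and the `value` field on the result is an assumption made for illustration: the only documented part of the library's result is `found`.

```typescript
// Illustrative only: this result shape is an assumption, not the library's type.
type ParseResult<T> = { found: true; value: T } | { found: false };

// Non-throwing variant: failure is reported in the return value.
function tryParse<T>(text: string): ParseResult<T> {
  try {
    return { found: true, value: JSON.parse(text) as T };
  } catch {
    return { found: false };
  }
}

// Throwing variant, built on top of the non-throwing one.
function parseOrThrow<T>(text: string): T {
  const result = tryParse<T>(text);
  if (!result.found) throw new Error('no JSON value could be parsed');
  return result.value;
}

const ok = tryParse<{ score: number }>('{"score": 8}');
const score = ok.found ? ok.value.score : -1; // 8
const missing = tryParse('Hope that helps!'); // { found: false }
```

The non-throwing form suits request handlers that want to fall back or retry, while the throwing form keeps happy-path code short.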
interface ExtractOptions {
repair?: boolean; // remove trailing commas (default true)
expect?: 'object' | 'array' | 'any'; // restrict the top-level type (default 'any')
}

`expect` is handy when prose contains a stray array but you want the object:

extractJson('[1,2] then the answer {"a":1}', { expect: 'object' }); // { a: 1 }

OpenAI-style fenced output:
const value = extractJson<{ score: number }>(
`Here is the JSON:
```json
{"score":8,"reason":"clear"}
````,
{ expect: 'object' },
);

Anthropic-style prose around the object:
const result = tryExtractJson<{ safe: boolean }>(
'I will return the object first.\n{"safe":true}\nLet me know if you need more.',
{ expect: 'object' },
);

Gemini-style thinking plus a top-level array:
const items = extractJson<Array<{ id: string }>>(
'<thinking>{draft: true}</thinking>\n[{"id":"a"}]',
{ expect: 'array' },
);

- Strip `<think>`/`<thinking>`/`<reasoning>` blocks. If a reasoning tag is opened and never closed, treat the rest as reasoning.
- Prefer complete contents of fenced `json` (or bare) code blocks.
- If a fence contains prose, scan inside those fences for balanced JSON after complete fence payloads have been tried.
- Otherwise scan for the first balanced `{…}`/`[…]` that parses, string-aware and delimiter-aware.
- If parsing fails, apply conservative repair (trailing commas) and retry.
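The first two steps above can be sketched roughly as follows. The function names, the tag list constant and the regular expressions are assumptions made for this example and may differ from what the package does internally.

```typescript
// Hypothetical sketch of reasoning stripping and fence extraction.
const REASONING_TAGS = ['think', 'thinking', 'reasoning'];
// Three backticks, built indirectly so this example contains no literal fence.
const FENCE = '`'.repeat(3);

function stripReasoningSketch(text: string): string {
  let out = text;
  for (const tag of REASONING_TAGS) {
    // Remove closed blocks such as <think>...</think>.
    out = out.replace(new RegExp(`<${tag}>[\\s\\S]*?</${tag}>`, 'gi'), '');
    // An opener with no closer: treat everything after it as reasoning.
    const open = out.search(new RegExp(`<${tag}>`, 'i'));
    if (open !== -1) out = out.slice(0, open);
  }
  return out;
}

function fencedPayloadsSketch(text: string): string[] {
  // Matches json-tagged and bare fenced blocks, in document order.
  const pattern = new RegExp(
    FENCE + '(?:json)?[ \\t]*\\r?\\n([\\s\\S]*?)' + FENCE,
    'gi',
  );
  const payloads: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    payloads.push((m[1] ?? '').trim());
  }
  return payloads;
}

const raw =
  '<think>\nmaybe {draft: 6}?\n</think>\nSure!\n' +
  FENCE + 'json\n{"score": 8}\n' + FENCE +
  '\nHope that helps!';
const payloads = fencedPayloadsSketch(stripReasoningSketch(raw));
// payloads is ['{"score": 8}']
```

Stripping reasoning before looking at fences is what keeps the `{draft: 6}` text from ever becoming a candidate.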
The low-level pieces (stripReasoning, fencedBlocks, balancedSpans, removeTrailingCommas) are exported too.
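To illustrate what a string-aware, delimiter-aware scan for balanced spans involves, here is a minimal sketch. `firstBalancedJson` and `matchingClose` are hypothetical names written for this example; they are not the exported helpers and their behaviour in edge cases may differ from the package's.

```typescript
type Expect = 'object' | 'array' | 'any';

// Hypothetical sketch: return the first balanced {...} or [...] that parses,
// or undefined when nothing qualifies.
function firstBalancedJson(text: string, expect: Expect = 'any'): unknown {
  for (let start = 0; start < text.length; start++) {
    const opener = text.charAt(start);
    if (opener !== '{' && opener !== '[') continue;
    if (expect === 'object' && opener !== '{') continue;
    if (expect === 'array' && opener !== '[') continue;

    const end = matchingClose(text, start);
    if (end === -1) continue; // mismatched or truncated: skip this candidate
    try {
      return JSON.parse(text.slice(start, end + 1));
    } catch {
      // Balanced but not valid JSON (for example {draft: 6}); keep scanning.
    }
  }
  return undefined;
}

// Index of the delimiter that closes the one at `start`, or -1 if none does.
function matchingClose(text: string, start: number): number {
  const stack: string[] = [];
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      // Braces and brackets inside strings are ignored.
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') stack.push('}');
    else if (ch === '[') stack.push(']');
    else if (ch === '}' || ch === ']') {
      if (stack.pop() !== ch) return -1; // wrong kind of closer
      if (stack.length === 0) return i;
    }
  }
  return -1; // ran out of text before the span closed
}

const answer = firstBalancedJson('[1,2] then the answer {"a":1}', 'object');
// answer is { a: 1 }
```

Tracking a stack of expected closers, rather than a single depth counter, is what lets the scan reject spans such as `{"a": [1}` instead of treating them as balanced.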
- TypeScript generics do not validate runtime shape. Pair this with your schema validator when fields matter.
- Repair is intentionally narrow: trailing commas only. It will not convert JSON5, comments, single quotes or unquoted keys.
- Candidate order is deterministic: JSON-ish fences first, then balanced objects/arrays in document order, filtered by `expect`.
- Unclosed reasoning tags return no JSON from that suffix instead of risking a draft extraction.
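The first note is worth a concrete case: a generic type argument is erased at compile time, so a model that returns `"8"` instead of `8` will still satisfy `<{ score: number }>`. Below is a minimal sketch of a hand-written runtime guard. `Score` and `isScore` are hypothetical names, and in practice a schema validator of your choice would take the place of this function.

```typescript
interface Score {
  score: number;
  reason: string;
}

// Hypothetical hand-written guard; checks the shape at runtime.
function isScore(value: unknown): value is Score {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record['score'] === 'number' && typeof record['reason'] === 'string';
}

// Stand-ins for extracted values: valid JSON, but only one has the right shape.
const wrongShape: unknown = JSON.parse('{"score": "8", "reason": "clear"}');
const rightShape: unknown = JSON.parse('{"score": 8, "reason": "clear"}');

const wrongIsValid = isScore(wrongShape); // false: score is a string
const rightIsValid = isScore(rightShape); // true
```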
The package includes a small public corpus under fixtures/:
- `deepseek-thinking-object.txt`
- `gemini-reasoning-array.txt`
- `openai-fenced-object.txt`
- `multiple-fenced-final.txt`
- `anthropic-prose-object.txt`
- `prose-trailing-commas.txt`
- `malformed-draft-valid-final.txt`
- `expect-object-skips-array.txt`
- `truncated-stream-no-json.txt`
- `unclosed-thinking-no-json.txt`
- `no-json.txt`
- expected `tryExtractJson` outputs under `fixtures/expected/`
The tests read these files directly, so parser changes are checked against stable, reusable examples. The fixtures are synthetic and safe for public CI: they contain no prompts, secrets, user data or live provider responses.
- `tool-schema`: turn a JSON Schema into a provider tool/function schema (define the shape you then extract).
- `llm-sse` · `llm-messages` · `llm-errors`: the provider-portability suite.
MIT © Sebastian Legarraga