Turn Chaos Into Structure. A Type-Safe AI Agent that extracts valid JSON from unstructured data using PydanticAI, FastHTML, and Gemini 2.5.
Updated Jan 10, 2026 - Python
Interactive Phoenix LiveView demonstrations of the Crucible Framework - showcasing ensemble voting, request hedging, statistical analysis, and more with mock LLMs
Public artifact bundle for the preprint 'Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents'
Reliability and hallucination mitigation research for tool-augmented legal AI agents using QC-Sentinel verification architecture.
Reference implementation of CAAF — three-pillar agent framework with monotonic convergence.
Production-style LLM evaluation harness for structured clinical extraction — compares prompt strategies across accuracy, cost, and hallucination.
Preprint paper package — Lightweight Evaluation and Operational Scorecards for Tool-Using AI Agents (Zenodo DOI 10.5281/zenodo.20034550)
CrucibleFramework: A scientific platform for LLM reliability research on the BEAM