The classical-liberal canon from the Online Library of Liberty (199 PD works)#5

Merged: adoistic merged 4 commits into main from feat/oll-liberty-canon on Jun 11, 2026

@adoistic

Owner

The richest single addition available, and your home tradition: 199 public-domain works from Liberty Fund's Online Library of Liberty — Bastiat, Herbert Spencer, Tocqueville, Bentham, Lysander Spooner, William Graham Sumner, Nassau Senior, Jevons, McCulloch, Malthus, Maine, Acton, Erasmus, Marshall, Menger, Harrington, the Levellers, and the long line of political economy. David Hart, already a named carrier in our atlas, directed OLL for eighteen years.

Curation + licensing. An agent selected the canon from OLL's 1,017 titles with an adversarial check; an authoritative per-title license scan then kept only the ones OLL marks "the text is in the public domain" — 404 candidates to 202 PD, then 199 after dropping duplicates of works we hold and Calvin's off-theme Institutes. Every Liberty Fund / FEE copyrighted edition and modern translation was excluded.
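A minimal sketch of the license gate described above, assuming each scanned title carries its license note as a string. The `ScannedTitle` shape, the `gateTitles` helper, and the sample slugs and notes are hypothetical illustrations, not the actual scan code or OLL data.

```typescript
// Hypothetical shape of one scanned OLL title; field names are illustrative.
interface ScannedTitle {
  slug: string;
  licenseNote: string; // license text as found on the title page
}

// The marker quoted in the PR description.
const PD_MARKER = "the text is in the public domain";

// Keep only titles whose license note carries the public-domain marker,
// then drop slugs that are already held or explicitly excluded as off-theme.
function gateTitles(
  candidates: ScannedTitle[],
  alreadyHeld: Set<string>,
  excluded: Set<string>,
): ScannedTitle[] {
  return candidates
    .filter((t) => t.licenseNote.toLowerCase().includes(PD_MARKER))
    .filter((t) => !alreadyHeld.has(t.slug) && !excluded.has(t.slug));
}

// Made-up sample data, for illustration only.
const sample: ScannedTitle[] = [
  { slug: "bastiat-the-law", licenseNote: "The text is in the public domain." },
  { slug: "modern-translation", licenseNote: "Copyrighted edition." },
  { slug: "calvin-institutes", licenseNote: "The text is in the public domain." },
];

const kept = gateTitles(sample, new Set<string>(), new Set(["calvin-institutes"]));
console.log(kept.map((t) => t.slug));
```

Running the license check before the duplicate and off-theme filters mirrors the order of the counts above (404 candidates, 202 PD, 199 kept).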

The ePub ingester downloads each ePub, takes the content past the Liberty Fund front-matter, splits on chapter headings, strips the editorial apparatus (margin notes, footnotes), and repairs OLL's own ePub entity corruption (8sect; for the section sign). Verified clean — zero margin-note, footnote, boilerplate, or entity artifacts across all 199 works.
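The entity repair step might look roughly like the sketch below. It assumes the corruption is the literal string `8sect;` standing in for `&sect;`, as described above. The `repairEntities` name and the small entity table are hypothetical, not the ingester's real code.

```typescript
// Small illustrative table of named entities; a real ingester would need a fuller one.
const NAMED_ENTITIES = new Map<string, string>([
  ["sect", "§"],
  ["amp", "&"],
  ["nbsp", "\u00a0"],
]);

function repairEntities(text: string): string {
  return text
    // OLL-specific corruption: "8sect;" appears where "&sect;" was intended.
    .replace(/8sect;/g, "§")
    // Decode well-formed named entities we know; leave unknown ones untouched.
    .replace(/&([a-zA-Z]+);/g, (match: string, name: string) =>
      NAMED_ENTITIES.get(name) ?? match,
    );
}

console.log(repairEntities("8sect; 12 of the Act &amp; its schedule"));
```

Leaving unknown entities in place, rather than deleting them, keeps any that the table misses visible to a later artifact scan.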

Corpus: 1,148 works, 18,945 chapters, ~945k searchable paragraphs. Liberty Fund's OLL credited on /about#sources; each work links to its title page. MCP 0.2.3 points at the corpus-2026-06-11d release. Regenerable ingester build artifacts are now gitignored.

Merging deploys falsafa.ai.

🤖 Generated with Claude Code

…erty

199 public-domain works from Liberty Fund's Online Library of Liberty,
the curated canon of classical-liberal, economic, political and legal
thought: Bastiat, Herbert Spencer, Tocqueville, Bentham, Lysander
Spooner, William Graham Sumner, Nassau Senior, Jevons, McCulloch,
Malthus, Maine, Acton, Erasmus, Marshall, Menger, Harrington, the
Levellers and the long line of political economy. David Hart, already a
named carrier in our atlas, directed OLL for eighteen years.

Curated by an agent from OLL's 1,017 titles with an adversarial check,
then gated by a per-title license scan: only titles OLL marks 'the
text is in the public domain' (404 candidates -> 202 PD -> 199 after
dropping duplicates of works we hold and Calvin's off-theme Institutes).
Every Liberty Fund / FEE copyrighted edition and modern translation was
excluded.

scripts/oll/ingest.ts downloads each ePub, takes the content past the
Liberty Fund front-matter (the filter is token-anchored so it stops
discarding 'de-Tocqueville' files as tables of contents), splits on
chapter headings, strips the editorial apparatus (margin notes,
footnotes), and repairs OLL's own ePub entity corruption ('8sect;' for
the section sign) and unhandled named entities. Verified clean across
all 199 works.
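
A rough sketch of what "token-anchored" could mean here, assuming the earlier filter matched the substring "toc" anywhere in a file name (which would also match "Tocqueville"). The file names, `naiveIsToc`, and `isToc` are hypothetical and may differ from what scripts/oll/ingest.ts does.

```typescript
// Assumed failure mode: a plain substring test also matches "tocqueville".
const naiveIsToc = (path: string): boolean =>
  path.toLowerCase().includes("toc");

// Token-anchored: "toc" must stand alone, bounded on each side by the
// start or end of the string or by a non-alphanumeric character.
const TOC_TOKEN = /(^|[^a-z0-9])toc([^a-z0-9]|$)/i;
const isToc = (path: string): boolean => TOC_TOKEN.test(path);

// Hypothetical ePub member names, for illustration only.
const tocFile = "OEBPS/toc.xhtml";
const chapterFile = "OEBPS/de-Tocqueville_ch01.xhtml";

console.log(naiveIsToc(chapterFile), isToc(chapterFile), isToc(tocFile));
```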

Corpus: 1,148 works, 18,945 chapters, ~945k searchable paragraphs.
Liberty Fund's OLL credited on /about#sources. MCP 0.2.3 points at the
corpus-2026-06-11d release (1,148 works).

The regenerable ingester *-works.json/*-audit.json build artifacts are
now gitignored (corpus/ holds the built output); oll-works.json was
133 MB and over GitHub's file limit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| falsafa-site | Error | Error | Jun 11, 2026 5:11pm |

The Astro static build of ~28k chapter pages exceeded the default Node
heap and OOM'd on Vercel after the corpus doubled. Cap raised to 7.5 GB
(NODE_OPTIONS=--max-old-space-size=7680) so the build fits the free-tier
container's memory.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tlify

The library outgrew hobby-tier PaaS builders: ~28k pages and a pagefind
index over ~945k paragraphs OOM'd Vercel, and Cloudflare Pages caps
deployments at 20k files (we emit ~56k). The fix is to build where there
is room and serve where there is no file cap: a GitHub Actions runner
(16 GB RAM, 14 GB disk) runs the astro + pagefind build, then deploys the
finished dist to Netlify, whose free tier has no hard file or size limit.

Needs two repo secrets: NETLIFY_AUTH_TOKEN and NETLIFY_SITE_ID. On push
to main (site or corpus changes) the workflow builds and deploys; it can
also be run manually via workflow_dispatch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…fallback

Adnan connected the repo to Netlify for native continuous deployment, so
this configures Netlify to build the bun-workspace site correctly:
install at the repo root, build apps/site (prebuild reads ../../corpus),
and publish apps/site/dist, with bun pinned. The GitHub Actions deploy is
demoted to a manual fallback for the case where Netlify's own builder
cannot handle the ~28k-page / 945k-paragraph build at scale.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@netlify

netlify Bot commented Jun 11, 2026

Deploy Preview for falsafaai failed.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 0fa8851 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/falsafaai/deploys/6a2aeb184c90720007a39fe7 |

@adoistic adoistic merged commit 7f2444b into main Jun 11, 2026
2 of 7 checks passed