Merged
2 changes: 1 addition & 1 deletion .github/STANDARDS.md
@@ -48,7 +48,7 @@ Fail fast and name the problem. Silent fallbacks hide bugs.
Not a hardened production service, but a few things matter.

- **Secrets: never commit `.env`; don't log prompts with keys.**
- **Don't leak private embeddings/text into committed artifacts.** (The index is gitignored for a reason.)
- **Don't leak private embeddings/text into committed artifacts.** (The index is gitignored for a reason.) The one exception is `demo/`: it commits only the exact public-domain natural sources and flagged synthetic spire allowlisted in `demo/artifacts.test.ts`, on purpose, to reproduce the headline with no key. Some public-domain sources are routed through the no-leak layer to exercise the boundary, but that is a demo layer assignment, not a secrecy claim. Do not generalize it to genuinely-private corpora.
- **Performance: brute-force cosine is intentional at this scale.** Don't add pgvector, HTTP, or caching in a drive-by PR unless the README's "Where to take it" story is the explicit goal.

## 6. Style & Naming (Follow the Room)
21 changes: 12 additions & 9 deletions NEXT-STEPS.md
@@ -176,13 +176,16 @@ operational discipline, not something the type system enforces across time.
## C. Performance levers that trade quality for cost

This is the starter for anyone adapting the system. Each lever names the saving,
the quality cost, and the rule. **This repository is full-precision and indexes
documents whole; it pulls none of these levers.** The production deployment
behind the project pulls one of them (int8 wire-format quantization, in a
private serving adapter) and chunks its long-form inputs; the rest are
documented here so you can reason about all of them the same way, whether or not
this repo exercises them. **Every one of them is gated by the gold suite, never
special-cased.**
the quality cost, and the rule. **This repository's core is full-precision and
indexes documents whole; it pulls none of these levers.** The one exception is
the marked illustration at `demo/`: a runnable int8 miniature on a
short-whole-unit public-domain corpus, which pulls exactly one lever (int8
quantization) to show the gold suite gating it. The core's claims stay true of
the core; `demo/` is named as the exception. The production deployment behind
the project pulls int8 wire-format quantization in a private serving adapter and
chunks its long-form inputs; the rest are documented here so you can reason
about all of them the same way. **Every one of them is gated by the gold suite,
never special-cased.**

The cost concentrates almost entirely in one object: the embedding index, in its
in-memory footprint and in the latency of shipping it to a stateless serving
@@ -196,8 +199,8 @@ cosine similarity normalizes by vector norm, so a positive per-vector scale
cancels as a matter of *algebra* (guaranteed, exact); and integer rounding can
reorder near-ties, so its harmlessness is *measured* against the gold suite, not
proven. The full-precision vectors stay the source of truth, so this is a
transport encoding, not a lossy store. (See `docs/production-scaling.md` §2 and
the artifact note §7.)
transport encoding, not a lossy store. (See `docs/production-scaling.md` §2, the
artifact note §7, and the runnable miniature at `demo/`.)
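
The two halves of that argument can be seen in a few lines. This is a minimal sketch, not the repository's encoder: `quantizeInt8` and the sample vectors are hypothetical, and the symmetric max-abs scaling is an assumed scheme chosen only for illustration. It shows that a positive per-vector scale leaves cosine unchanged, while rounding to integer codes perturbs it slightly.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical symmetric per-vector int8 encoding (an assumption, not the
// project's wire format): the largest |x| maps to 127, the rest are rounded.
function quantizeInt8(v: number[]): { scale: number; codes: Int8Array } {
  const maxAbs = Math.max(...v.map(Math.abs));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const codes = Int8Array.from(v, (x) => Math.round(x / scale));
  return { scale, codes };
}

// Made-up sample vectors, for illustration only.
const doc = [0.12, -0.53, 0.31, 0.77];
const query = [0.1, -0.48, 0.35, 0.8];

const exact = cosine(doc, query);

// Algebra: a positive scale cancels in the normalization.
const afterScale = cosine(doc.map((x) => x * 3.7), query);

// Measurement: rounding changes the direction a little, so near-ties can move.
const { codes } = quantizeInt8(doc);
const afterRound = cosine(Array.from(codes), query);

console.log({ exact, afterScale, afterRound });
```

The scale never needs to be applied when ranking by cosine, since it cancels; the rounding error is the part that has to be checked against the gold suite.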

Going further trades more quality for more savings:
- **int4 / lower-bit quantization** — roughly halves the transport size again;
52 changes: 26 additions & 26 deletions demo/README.md
@@ -1,26 +1,27 @@
# The int8 scaling demo

The result this module is built to produce is a **caught failure**: the same
gold suite that owns grounding and refusal rejecting a cheaper encoding. The
*mechanism* is proven offline, in `quantize.test.ts` (run by `npm test`): on
fixture vectors searched to exhibit a near-tie, int8 preserves both the route
and the disambiguation winner, int4 flips the top slot, and the gate catches it.

Whether the **real Smith corpus** produces that flip at the int8/int4 boundary
is a separate, empirical question, settled by the build run, not asserted here.
"int8 held" on a small corpus is expected and proves little on its own; the gate
saying *no* when pushed is what shows the gold suite, not the encoding, is the
adjudicator. So: the mechanism is demonstrated; the real-corpus demonstration is
pending.

The committed vectors are not built yet (this module was written with no network
and no key), so `npm run demo:run` errors with a build pointer until then;
see **Build status**. Once built:
The result this module produces is a **caught failure**: the same gold suite that
owns grounding and refusal rejecting a cheaper encoding. The real-only headline
now holds cleanly: `npm run demo:run -- --natural` certifies int8 at 7/7 gold
verdicts, mean rho 1.0000, min rho 1.0000. The synthetic spire is broken out:
`--natural+synthetic` certifies int8 at 9/9, while `--natural+synthetic --bits 4`
holds 7/9 and is rejected because exactly two verdicts fail: both engineered
route cases flip from the private synthetic note to the public Amos record. The
seven natural cases still hold, so the caught int4 break is concentrated where
the demo constructed the near-tie. The gate says yes to int8 and no when the
encoding is pushed.

The real route case was also tested as an escape hatch. It holds through int4
and only breaks around int2, so the synthetic spire remains: real text is too
stable to demonstrate the catch at int4, and the spire constructs the controlled
near-tie in the open.
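
The pass/fail rule described above can be sketched as follows. This is an illustration under stated assumptions, not the demo's harness: the `GoldOutcome` shape, the `gate` function, and the case ids are all hypothetical, and the rho statistics are not modeled. It only shows the all-or-nothing rule: an encoding is certified when every gold verdict holds, and rejected when any one flips.

```typescript
// Hypothetical shape: one entry per gold case, pairing the verdict the gold
// suite expects with the verdict observed under the candidate encoding.
interface GoldOutcome {
  id: string;
  expected: string;
  observed: string;
}

interface GateResult {
  held: number;
  total: number;
  certified: boolean;
  flipped: string[];
}

// Certify only if no verdict flipped; report which cases failed otherwise.
function gate(outcomes: GoldOutcome[]): GateResult {
  const flipped = outcomes
    .filter((o) => o.observed !== o.expected)
    .map((o) => o.id);
  return {
    held: outcomes.length - flipped.length,
    total: outcomes.length,
    certified: flipped.length === 0,
    flipped,
  };
}

// Seven placeholder natural cases that hold, plus two placeholder synthetic
// route cases that flip from the private note to the public record.
const natural: GoldOutcome[] = Array.from({ length: 7 }, (_, i) => ({
  id: `natural-${i + 1}`,
  expected: "hold",
  observed: "hold",
}));
const syntheticInt4: GoldOutcome[] = [
  { id: "synthetic-route-1", expected: "private-note", observed: "public-record" },
  { id: "synthetic-route-2", expected: "private-note", observed: "public-record" },
];

const int4Result = gate([...natural, ...syntheticInt4]);
console.log(int4Result);
```

With these placeholder outcomes the result mirrors the 7/9 rejection reported above, and `flipped` names exactly the two engineered route cases.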

Run it:

```
npm run demo:run # int8, real corpus: the headline, keyless
npm run demo:run -- --natural+synthetic # add the spire and its gold
npm run demo:run -- --natural+synthetic --bits 4 # int4: the gate rejects the spire's route flip
npm run demo:run -- --natural+synthetic --bits 4 # int4: the gate rejects the spire route flips
npm run demo:run -- --full # also run the answer-mode pass (needs a key)
```

@@ -103,14 +104,13 @@ approximate content, which is the exposure the core's gitignored index avoids.

## Build status

The code, the gold set, the provenance manifest, and the deterministic harness
tests (`quantize.test.ts`, run by `npm test`) are committed. The real text
bodies and the committed vectors (`corpus/index.json`,
`corpus/index.synthetic.json`, `corpus/query-vectors.json`) are produced by
`demo:build`, which needs network access to the public-domain sources and an
`OPENAI_API_KEY`; the session that wrote the module had neither. See
The code, gold set, provenance manifest, real text bodies, and committed vectors
(`corpus/index.json`, `corpus/index.synthetic.json`,
`corpus/query-vectors.json`) are built. `demo:build` remains the regeneration
path and needs an `OPENAI_API_KEY`; `demo:run` is keyless because it reads the
committed source and query vectors. See
[`docs/scaling-demo/build-handoff.md`](../docs/scaling-demo/build-handoff.md)
for the exact steps, and the delta log for what is confirmed versus pending.
for the source-building steps and the delta log for the empirical findings.

## The spec and the log are kept in the open

@@ -121,8 +121,8 @@ discarded once the code landed:
- `SCALING-DEMO-spec.md`: what the demo set out to do, and why; the ticket it was
built from.
- `scaling-demo-delta-log.md`: every place the build diverged from that spec,
what is settled versus pending the keyed build run, and the prepared
reconciliations (NEXT-STEPS, STANDARDS, the paper) to apply at merge.
the empirical result the harness produced, and the reconciliations
(NEXT-STEPS, STANDARDS, the paper) applied or still owed at merge.
- `build-handoff.md`: the brief for the build run that fetches the public-domain
texts and generates the committed vectors.

51 changes: 51 additions & 0 deletions demo/artifacts.test.ts
@@ -0,0 +1,51 @@
import assert from 'node:assert/strict';
import { test } from 'node:test';

import { loadGold } from '../src/evaluate.js';
import { readIndexFile } from '../src/store.js';
import { readQueryVectors } from './query-vectors.js';

const EXPECTED_NATURAL_SOURCES = [
  'adam-smith:theory-of-moral-sentiments-justice',
  'adam-smith:theory-of-moral-sentiments-sympathy',
  'adam-smith:wealth-of-nations-division-of-labour',
  'adam-smith:wealth-of-nations-value',
  'george-adam-smith:isaiah-prophet-of-faith',
  'george-adam-smith:twelve-prophets-amos',
  'george-adam-smith:twelve-prophets-hosea',
  'george-adam-smith:twelve-prophets-micah',
  'note:forgiveness-of-sins',
  'note:temptation',
  'note:word-of-god',
].sort();

test('committed demo index matches the public-domain source allowlist', () => {
  const natural = readIndexFile('demo/corpus/index.json');
  const actual = natural
    .map((entry) => (entry.sourceType === 'record' ? entry.record.id : entry.note.id))
    .sort();
  assert.deepEqual(
    actual,
    EXPECTED_NATURAL_SOURCES,
    `committed demo index source ids changed; update the public-domain provenance and allowlist deliberately\n` +
      `actual: ${actual.join(', ')}`,
  );

  const synthetic = readIndexFile('demo/corpus/index.synthetic.json');
  assert.deepEqual(
    synthetic.map((entry) => (entry.sourceType === 'note' ? entry.note.id : entry.record.id)),
    ['note:syn-amos-justice-margin'],
    'committed synthetic spire changed; keep it to the single flagged near-tie unless the demo is recalibrated',
  );
});

test('committed demo query vectors match the gold suite ids', () => {
  const gold = [
    ...loadGold('demo/gold.yaml', 'Smith Collection'),
    ...loadGold('demo/gold.synthetic.yaml', 'Smith Collection'),
  ];
  const queryVectors = readQueryVectors('demo/corpus/query-vectors.json');
  assert.ok(queryVectors);

  assert.deepEqual([...queryVectors.byId.keys()].sort(), gold.map((g) => g.id).sort());
});
22 changes: 12 additions & 10 deletions demo/corpus/README.md
@@ -1,4 +1,4 @@
# The scaling-demo corpus
# The demo corpus

This folder holds the corpus for the int8 scaling demo. This README is the corpus's **answerable half**: the mechanism makes the unauthored move inexpressible; this document owns, in the open, every authored choice behind the data. Each entry names the choice and the reason it was made. None of it is hidden, so none of it is a concession; it is the record of decisions a maintainer signs for.

@@ -15,28 +15,30 @@ Both write dense moral prose about justice, society, and ethics, so the two bodi

## Build status

The text bodies and the embedding vectors are produced by `demo/build.ts`, which needs network access to the public-domain sources and an `OPENAI_API_KEY`. The code, the structure, the gold set, the provenance table below, and the deterministic harness tests are authored and committed; the real bodies and the committed `index.json` / `query-vectors.json` are populated by a build run with those two things. See `docs/scaling-demo/build-handoff.md` for the exact build steps. **Every ID and date below is a claim to verify against the live source during that run, not a confirmation made here.**
The text bodies are now populated from the public-domain sources below. The embedding vectors are produced by `demo/build.ts`, which needs an `OPENAI_API_KEY`; the committed `index.json` / `query-vectors.json` are populated by that build run. See `docs/scaling-demo/build-handoff.md` for the exact build steps.

## Provenance and public-domain status

Every source, with the basis for its public-domain status. Public domain is the *absence* of copyright, not a license: this corpus is not "permissively licensed," it is public-domain. State the basis in both jurisdictions cleanly, since they rest on different facts:
- **US:** published before 1931, so public domain in the USA. (As of 1 Jan 2026, works published in 1930 and earlier are PD in the US.)
- **Life-plus-70 jurisdictions:** public domain once the author has been dead 70 years. In 2026 that covers authors who died in 1955 or earlier; George Adam Smith died 1942 and Adam Smith in 1790, so both are clear.
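
The two date rules above reduce to simple year arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the offsets (96 and 71) are inferred from the 2026 figures stated in the list rather than taken from any statute text. It deliberately ignores everything else that can affect status (renewals, unpublished works, jurisdiction-specific extensions).

```typescript
// US rule as stated above: as of 1 Jan 2026, works published in 1930 or
// earlier are public domain. Assumed generalization: cutoff = asOfYear - 96.
function usPublicDomainByPublication(pubYear: number, asOfYear: number): boolean {
  return pubYear <= asOfYear - 96;
}

// Life-plus-70 rule as stated above: in 2026, authors who died in 1955 or
// earlier are covered. Assumed generalization: cutoff = asOfYear - 71.
function lifePlus70Expired(deathYear: number, asOfYear: number): boolean {
  return deathYear <= asOfYear - 71;
}

// The two authors in this corpus, checked as of 2026.
const checks = {
  georgeAdamSmithSermons: {
    us: usPublicDomainByPublication(1905, 2026),
    lifePlus70: lifePlus70Expired(1942, 2026),
  },
  adamSmithWealthOfNations: {
    us: usPublicDomainByPublication(1776, 2026),
    lifePlus70: lifePlus70Expired(1790, 2026),
  },
};

console.log(checks);
```

Keeping the two checks as separate functions reflects the point made above: the jurisdictions rest on different facts (publication year versus death year), so neither result implies the other.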

Verify each ID and date against the source before relying on it; fill OCR-quality notes from the actual file.
These IDs and dates were verified live during the build run; see the delta log,
especially row 17, for the source checks and the Internet Archive ARK
adjudication.

| Work (unit) | Author | Pub. | Layer | Source (ID) | PD basis | Notes |
|---|---|---|---|---|---|---|
| _Theory of Moral Sentiments_, §\<n\> | Adam Smith | 1759 | public | Gutenberg \<id\> | US: pre-1931 / PD in USA. Life+70: author d. 1790; term expired | _verify; fill: clean / OCR-noisy_ |
| _Wealth of Nations_, bk\<n\> ch\<n\> | Adam Smith | 1776 | public | Gutenberg \<id\> | US: pre-1931 / PD in USA. Life+70: author d. 1790; term expired | _verify_ |
| _The Book of the Twelve Prophets_, \<prophet\> | George Adam Smith | 1896-98 | public | Gutenberg 43847 / 50747 | US: pre-1931 / PD in USA. Life+70: author d. 1942; term expired | _verify against Gutenberg_ |
| _The Book of Isaiah_, ch\<n\> | George Adam Smith | 1888-90 | public | Gutenberg 39767 / 43672 | US: pre-1931 / PD in USA. Life+70: author d. 1942; term expired | _verify against Gutenberg_ |
| _The Forgiveness of Sins, and Other Sermons_, \<sermon\> | George Adam Smith | 1905 (A. C. Armstrong & Son) | **private** | Internet Archive `forgivenessofsin00smitrich` (ARK `ark:/13960/t0gt5jk4g`); HathiTrust full-view backup record 100136688 | US: pre-1931 / PD in USA. Life+70: author d. 1942; term expired | _verify NOT_IN_COPYRIGHT; OCR-noisy expected, which is fine_ |
| _Theory of Moral Sentiments_, "Of Sympathy"; "Justice and Beneficence" | Adam Smith | 1759 (Gutenberg source from 1777 printing) | public | Gutenberg 67363 | US: PD in USA per Gutenberg. Life+70: author d. 1790; term expired | verified; clean PG text |
| _Wealth of Nations_, bk. I ch. 1; bk. I ch. 5 | Adam Smith | 1776 | public | Gutenberg 3300 | US: PD in USA per Gutenberg. Life+70: author d. 1790; term expired | verified; clean PG text |
| _The Book of the Twelve Prophets_, Amos / Hosea / Micah units | George Adam Smith | 1896-98 | public | Gutenberg 43847 | US: PD in USA per Gutenberg. Life+70: author d. 1942; term expired | verified; clean PG text; vol. 1 contains Amos, Hosea, Micah |
| _The Book of Isaiah_, "This Is the Victory... Our Faith" | George Adam Smith | 1888-90 | public | Gutenberg 39767 | US: PD in USA per Gutenberg. Life+70: author d. 1942; term expired | verified; clean PG text; unit taken from vol. 1 |
| _The Forgiveness of Sins, and Other Sermons_, sermons I-III | George Adam Smith | 1904; third printing 1905 | **private** | Internet Archive `forgivenessofsin00smitrich` (ARK `ark:/13960/t0gt5jk4g`); HathiTrust/Online Books Page listing (alternate HathiTrust scan surfaced as ARK `ark:/13960/t0zp4cz00`) | US: pre-1931 / IA metadata says NOT_IN_COPYRIGHT in US; visible notice date 1904. Life+70: author d. 1942; term expired | verified against IA metadata XML + direct IA OCR; HathiTrust page view blocked here, so the alternate ARK is recorded as a separate scan/copy, not the OCR source used. OCR-noisy, kept as source character |
| \<edge-case note\> | n/a (fabricated) | n/a | **synthetic** | authored here | n/a (no copyright in fabricated demo text) | quarantined in `synthetic/`; tests \<gold id\> |

**Sourcing (resolved, pending verification).** George's *major* commentaries are listed on Project Gutenberg. The private layer rests on *The Forgiveness of Sins, and other Sermons* (1905), a single volume yielding several short, windy sermon units, which is exactly what the private layer needs: short whole units that route without restating. *Jeremiah: Being the Baird Lecture for 1922* (1923) is a further minor source if wanted. The fallback (designating a *section* of a major work private) is therefore **not** required; if a future rebuild loses these sources, that fallback keeps the private layer real rather than padding it with synthetic.
**Sourcing (resolved).** George's *major* commentaries are confirmed on Project Gutenberg. The private layer rests on *The Forgiveness of Sins, and other Sermons* (copyright 1904; third printing 1905), whose Internet Archive OCR supplies short sermon units that route without restating. *Jeremiah: Being the Baird Lecture for 1922* (1923) remains a further minor source if wanted. The fallback (designating a *section* of a major work private) is therefore **not** required; if a future rebuild loses these sources, that fallback keeps the private layer real rather than padding it with synthetic.

**OPEN: the one sourcing check that can block the build.** Confirm George's minor/windy material (the sermons) actually downloads as clean-enough public-domain text. If only the big commentaries are digitized, the private ledger is thin: use the fallback (a short *section* of a major work, designated private) rather than padding with synthetic, which would turn the spire into a column. Record the outcome here.
**Sermon-length outcome.** The private ledger uses sermons I-III ("The Forgiveness of Sins," "The Word of God," and "Temptation") as whole units. None was split into windows; the paper-reaching chunking watch-item did not fire.

## URLs: demo-canonical citations, real route targets

1 change: 1 addition & 0 deletions demo/corpus/index.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions demo/corpus/index.synthetic.json

Large diffs are not rendered by default.
