From b7c188c89f1bdacaafd596b9c0a06a3289e3c140 Mon Sep 17 00:00:00 2001
From: Tranquil-Flow
Date: Tue, 9 Jun 2026 11:58:31 +0000
Subject: [PATCH] Solution: LP-0017 whistleblower

---
 solutions/LP-0017.md | 119 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)
 create mode 100644 solutions/LP-0017.md

diff --git a/solutions/LP-0017.md b/solutions/LP-0017.md
new file mode 100644
index 0000000..33f0f9b
--- /dev/null
+++ b/solutions/LP-0017.md
@@ -0,0 +1,119 @@
+# Solution: LP-0017 — Whistleblower
+
+**Submitted by:** Tranquil-Flow
+
+## Summary
+
+Whistleblower is a Logos Basecamp app that lets a user pick a document, upload it to Logos Storage, broadcast the `(CID, metadata)` envelope over Logos Delivery, and (optionally) anchor the CID on a LEZ registry. A separate permissionless CLI batch-anchors accumulated CIDs without coordination with the publisher. The on-chain registry uses a **PDA-per-CID** layout — every anchored CID lives in its own program-derived account, so anchor cost is O(1), capacity is unbounded, and idempotency falls out of the default-state check.
+
+The registry program is **deployed and exercised on the public LEZ testnet** (`testnet.lez.logos.co`, real consensus, `RISC0_DEV_MODE=0`). Deploy, `anchor_one`, idempotent re-anchor, and a single-tx `anchor_batch` are all confirmed on chain and independently re-verifiable — see [Supporting Materials](#supporting-materials) and `TESTNET_PROOF.md`.
+
+## Repository
+
+- **Repo:**
+- **License:** MIT OR Apache-2.0, at the recipient's option.
+- **Deployed program (public LEZ testnet):** ImageID / ProgramId `54c7f793caa540408ce2ca4c22051d78c466cd5ed8db607feedd19dcb749aa91`
+- **Explorer:** · **Sequencer RPC:** `https://testnet.lez.logos.co/`
+- **Deploy tx:** `05781c3c5fa65d72d1ee9ee8f0964144f9a5688ef8ad14f445581e308026608f` (`Some(ProgramDeployment)`)
+
+## Approach
+
+The system has four components built from a shared trait surface:
+
+1. **Logos Basecamp app** (`ui/`) — Qt6/QML plugin packaged as a portable `.lgx`. Drives upload → Storage, broadcast → Delivery, optional anchor → LEZ via a Rust C-ABI (`ui/ffi/`).
+2. **Reusable indexing module** (`document-indexing`, Qt-free Rust) — exposes `Publisher`, `StorageClient`, `DeliveryClient`, `RegistryClient` traits. Mock adapter for tests, LEZ adapter for production.
+3. **Permissionless batch CLI** (`batch/whistleblower-batch`) — accumulates `(CID, metadata_hash)` tuples, dedupes via a sled-backed ledger, anchors in bounded batches against the deployed registry. CID source is pluggable (`DeliveryClient`): a live Logos Delivery subscription (the production transport) or an explicit operator-supplied envelope file (`--envelopes-from`, used for headless/CI runs). `--program-bin` points the tool at the deployed program ELF so its PDAs match the on-chain program.
+4. **LEZ registry program** (`methods/guest/`) — PDA-per-CID storage, idempotent anchors, queryable without a transaction.
+
+### Key design decisions
+
+1. **PDA-per-CID registry storage** — not a single root PDA with a `HashMap`. A single account's data field is capped (~100 KiB), which bounds registry capacity by design. PDA-per-CID gives O(1) anchor cost regardless of registry size, unbounded capacity, and idempotency-by-default-state-check (no read-modify-write race). Tradeoff: no on-chain enumeration — acceptable because the spec only requires queryability by CID.
+2. **Raw `nssa_core` guest** (not the `spel-framework` proc-macros) — the deployed guest ELF is hand-rolled `nssa_core` for a lean cross-compile. The **IDL is still machine-generated via `spel generate-idl`** (see Usability below); `spel generate-idl` only AST-parses its input, so this pulls nothing into the guest build. (An earlier submission claimed `spel-framework` forces `bonsai-sdk` into the `riscv32im` build — that was wrong; bonsai appears nowhere in LEZ, and the `host` feature is `k256`, off by default. Corrected.)
+3. **LEZ program path** (not the zone SDK). The zone SDK currently requires a single designated consensus inscriber — decentralised sequencers haven't shipped. A LEZ program is permissionless from day one, which the censorship-resistance brief demands.
+4. **Adapter-based indexing module.** The Qt-free Rust core is `dyn`-safe at three trait boundaries (`StorageClient`, `DeliveryClient`, `RegistryClient`). The Basecamp UI plugin and the batch CLI consume it identically, so the batch path's transport can be re-targeted without touching publishing logic.
+5. **Wallet-free upload + broadcast.** Only on-chain anchoring needs a wallet; the publish path satisfies the spec's "without identifying the uploader" requirement.
+
+### Why the Logos stack
+
+The brief is censorship-resistant document publication. Storage gives content-addressed retrieval; Delivery gives broadcast without a central operator; LEZ gives a permissionless on-chain registry anyone can append to. On a centralized stack the publisher's identity leaks through whichever single operator hosts upload + broadcast + indexing — the property we're avoiding. Logos lets each component live in a different trust domain.
+
+## Success Criteria Checklist
+
+Mirrored from the LP-0017 spec. Legend: **[x]** met · **[~]** met with a documented caveat · **[ ]** not yet met. Honesty note: the on-chain registry is fully real on the public testnet; the upload→broadcast leg is real inside the Basecamp UI plugin (in-process `LogosAPIClient`) but is **GUI-driven**, because a *headless* Logos Delivery client requires a Rust QtRemoteObjects client against `logos_host` or RLN-membership management against Waku directly — a separate integration, documented and filed upstream. Items that depend on that are marked **[~]** rather than overclaimed.
+
+### Functionality
+
+- [x] **Upload to Logos Storage.** `ui/src/WhistleblowerBackend.cpp::uploadToStorage` via `LogosAPIClient` → `storage_module` (`uploadUrl` / `uploadInit`+`uploadChunk`+`uploadFinalize`), resolving on the CID-bearing `storageUploadDone` event (`adapters/qt/NOTES.md`).
+- [x] **Broadcast envelope to a Logos Delivery topic.** `ui/src/WhistleblowerBackend.cpp::broadcastEnvelope` → `delivery_module.send` on `/lp0017-whistleblower/1/cids/json`. Envelope is `MetadataEnvelopeV1` (`core/`) carrying `cid`, `title`, `description`, `content_type`, `size_bytes`, `timestamp`, optional `tags`.
+- [x] **Optional on-chain anchor.** Distinct "Anchor" action in the UI → `whistleblower_anchor_one` (C ABI, `ui/ffi/`) → `LezRegistryClient`. The C-ABI anchor path is exercised live (and asserted idempotent) by `ui/ffi/tests/anchor_one_live.rs` against a local sequencer; the same `anchor_one` instruction is confirmed on the **public testnet** by `anchor_spike/src/bin/testnet_lifecycle.rs` (tx `9f6aee9c…`).
+- [x] **Permissionless batch anchor CLI.** `batch/` → `whistleblower-batch`. Accumulates `(CID, metadata_hash)` tuples, sled-backed dedupe ledger, batch window, idempotent re-anchoring; no coordination with the publisher.
+- [x] **Idempotency — re-submitting a registered CID is a no-op.** Built into the guest (`new_claimed_if_default`). **Proven on the public testnet:** `cid_a` kept its original `anchor_one` timestamp (`1780495656451`) even after being re-included in the later batch — the batch's no-op did not overwrite it (`TESTNET_PROOF.md`).
+- [x] **Registry stores `(CID, metadata_hash, anchor_timestamp)` per doc.** `AnchorEntry` borsh-encoded into one PDA per CID. Both testnet PDAs decode to a complete `AnchorEntry` with zero trailing bytes.
+- [x] **Registry queryable by CID, no transaction.** Derive the PDA from `(program_id, sha256("lp0017:cid:v1\0" ‖ cid))` and read it: `whistleblower_query_by_cid` (FFI) + `LezRegistryClient::query_by_cid_hash`; `scripts/verify-testnet.sh` decodes both live PDAs.
+- [x] **≥10 CIDs per batch tx.** A 50-CID batch is confirmed in a single tx by the live `lez_adapter_anchor_50_cids_in_one_tx` test (local sequencer) — 5× the spec's ≥10 floor; the batch instruction itself is also confirmed on the **public testnet** (`anchor_batch` tx `f5fedf29…`, a 2-CID batch). The program handles N PDAs in one tx regardless of N; the 50-CID headroom is the localnet-verified property.
+- [x] **Reusable document-indexing module.** `document-indexing` crate, no Qt deps, three `dyn`-safe trait boundaries; consumed identically by the UI plugin and the batch CLI. API doc in `indexing/API.md`.
+
+### Usability
+
+- [x] **Basecamp app GUI + build instructions + downloadable asset + loadable.** `ui/` Qt6/QML plugin → `dist/whistleblower-plugin.lgx` (2.4 MB, darwin-arm64), built via the workspace nix flake (`nix build .#lgx`) and installable with `lgs basecamp install`. Build + install steps in `README.md` / `ui/README.md`. Smoke-tested in real Basecamp 2026-05-09.
+- [x] **Indexing module as a library with a README.** `document-indexing` + `indexing/API.md` (the three adapter traits + `Publisher`); integrators implement one trait per Logos service.
+- [x] **LEZ program IDL via SPEL.** `whistleblower-registry.idl.json` is **generated by `spel generate-idl`** (spel 0.2.0) from a parse-only `#[lez_program]` mirror at `idl/whistleblower_registry.rs` (regenerate: `bash scripts/regen-idl.sh`). `spel inspect … --type AnchorEntry` decodes on-chain entries. One documented gap: SPEL's `IdlSeed` (`const|account|arg`) can't express our `sha256(domain‖cid)` PDA seed — filed upstream; the exact derivation is recorded in the IDL `provenance` block and `idl/whistleblower_registry.rs`.
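The hashed seed referred to in the checklist, `sha256("lp0017:cid:v1\0" ‖ cid)`, can be sketched as follows. This is a minimal illustration, not the code in `idl/whistleblower_registry.rs`: the names `CID_SEED_DOMAIN`, `cid_seed_preimage` and `cid_seed` are hypothetical, the CID is assumed to be hashed as the bytes of its textual form, and the hash is passed in as a parameter so the sketch builds with the standard library alone (the registry uses SHA-256, which in Rust would typically come from a crate such as `sha2`). How the seed is combined with `program_id` to form the final PDA is not shown.

```rust
/// Domain-separation tag quoted in the checklist, including its trailing NUL byte.
const CID_SEED_DOMAIN: &[u8] = b"lp0017:cid:v1\0";

/// Builds the preimage `domain || cid` that is hashed into the PDA seed.
/// Assumption: the CID is hashed as the bytes of its textual form.
fn cid_seed_preimage(cid: &str) -> Vec<u8> {
    let mut preimage = Vec::with_capacity(CID_SEED_DOMAIN.len() + cid.len());
    preimage.extend_from_slice(CID_SEED_DOMAIN);
    preimage.extend_from_slice(cid.as_bytes());
    preimage
}

/// Derives the 32-byte seed by applying a caller-supplied hash to the preimage.
/// The registry uses SHA-256 here; the hash is injected so that this sketch
/// has no external dependencies.
fn cid_seed<H>(cid: &str, hash: H) -> [u8; 32]
where
    H: Fn(&[u8]) -> [u8; 32],
{
    hash(&cid_seed_preimage(cid))
}

fn main() {
    // Placeholder hash for demonstration only: this is NOT SHA-256.
    let toy_hash = |data: &[u8]| {
        let mut out = [0u8; 32];
        for (i, byte) in data.iter().enumerate() {
            out[i % 32] ^= *byte;
        }
        out
    };

    let preimage = cid_seed_preimage("example-cid");
    println!("preimage length: {}", preimage.len());

    let seed = cid_seed("example-cid", toy_hash);
    println!("seed (toy hash): {:02x?}", seed);
}
```

Keeping the domain tag (with its NUL terminator) in front of the CID means a seed for this registry cannot collide with a hash of the bare CID used elsewhere, which is the usual reason for a domain-separated preimage.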
+
+### Reliability
+
+- [x] **Storage/anchor retry with backoff + clear final error.** `Publisher` wraps every adapter call in `with_retry` (5 retries, exponential, surfaces the final error).
+- [x] **Delivery dedup.** `DurableDedupeStore` (sled) in `batch::run_batch_loop`; re-broadcasts of the same CID are dropped before anchoring.
+- [x] **Batch resumes from last anchored.** Sled ledger persists across restarts; combined with the registry's default-state idempotency, re-runs from any point are safe — re-submitting an already-anchored CID is a no-op at the program level.
+
+### Performance
+
+- [~] **CU benchmarks for single-CID and 50-CID anchors.** `BENCHMARKS.md`. The public testnet **does not expose or persist** per-transaction compute units (`getTransactionReceipt` → `-32601`; the executor's cycle count is discarded at `nssa/src/program.rs:80`; filed upstream — `BUGS_FILED.md` #7). CU is therefore the **executor cycle count of the deployed ELF** (`54c7f793…`): the RISC0 zkVM is deterministic, so executing the deployed program for a given input yields exactly the cycles it consumes on the testnet. Shape: per-CID cost is constant (~1–2.5 ms executor time on the current numbers); a 50-CID batch is ~50× a single anchor plus fixed per-tx overhead. Testnet inclusion latency and payload/proof sizes are captured directly. **Current measured deployed-ELF values:** `anchor_one` = **100,185** user cycles; `anchor_batch(50)` = **4,506,872** user cycles (~90 K/CID), measured 2026-06-05 against the deployed rc3 ELF (`ImageID 54c7f793…`) using the same deterministic RISC0 executor path the sequencer uses. Older rc1 executor-time numbers are retained only as wall-clock corroboration, not as primary CU evidence.
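The per-CID figure quoted in the Performance item can be re-derived from the two measured cycle counts. The sketch below uses only the numbers stated above (`anchor_one` = 100,185 and `anchor_batch(50)` = 4,506,872 user cycles); the constant and function names are illustrative and do not come from `BENCHMARKS.md`.

```rust
/// Cycle figures quoted in the Performance item (deployed rc3 ELF).
const ANCHOR_ONE_CYCLES: u64 = 100_185;
const ANCHOR_BATCH_50_CYCLES: u64 = 4_506_872;
const BATCH_SIZE: u64 = 50;

/// Average user cycles per CID inside the batch (integer division).
fn cycles_per_cid(batch_cycles: u64, batch_size: u64) -> u64 {
    batch_cycles / batch_size
}

/// How many standalone anchors the whole batch costs.
fn batch_to_single_ratio(batch_cycles: u64, single_cycles: u64) -> f64 {
    batch_cycles as f64 / single_cycles as f64
}

fn main() {
    let per_cid = cycles_per_cid(ANCHOR_BATCH_50_CYCLES, BATCH_SIZE);
    let ratio = batch_to_single_ratio(ANCHOR_BATCH_50_CYCLES, ANCHOR_ONE_CYCLES);
    println!("per-CID in batch: {per_cid} cycles");
    println!("batch / single:   {ratio:.2}x");
}
```

On these figures the batch averages 90,137 cycles per CID, and the whole 50-CID batch costs roughly 45 standalone anchors, which is consistent with the "~90 K/CID" shorthand in the checklist.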
+
+### Supportability
+
+- [x] **Registry deployed and tested on LEZ testnet.** Public testnet `testnet.lez.logos.co`, program `54c7f793…aa91`; deploy + full anchor lifecycle confirmed via `wallet chain-info` (`Some(ProgramDeployment)` / `Some(Public)`) and PDA readback. `TESTNET_PROOF.md`; re-verify live with `bash scripts/verify-testnet.sh`.
+- [~] **End-to-end integration tests (upload → broadcast → batch anchor) in CI.** The **on-chain half** runs unattended: `.github/workflows/ci.yml`'s `verify-testnet` job re-queries the deployed program's deploy/anchor/dup/batch transactions from the public sequencer on **every push** and fails if any are missing (replacing the old `workflow_dispatch`+`exit 1` stub that never ran). Headless **upload → broadcast** is *not* in CI: it requires the Logos Storage/Delivery Qt modules (real Delivery is Waku + RLN over QtRemoteObjects), which run inside Basecamp, not in a Linux CI runner. The indexing pipeline that joins all three legs is covered by the `document-indexing` contract + e2e tests against in-memory adapters (`indexing/tests/`), and the batch→registry leg is covered live against the testnet. An optional **fresh-anchor-on-push** job (a real new anchor against the live testnet each push, beyond read-only re-query) is sketched as a commented, secret-gated block in `.github/workflows/ci.yml` — enabling it needs a funded CI wallet secret (a maintainer/secrets decision) and adds testnet state per push. **Honest status:** the registry e2e is real and automated on push; the Storage/Delivery leg is real but GUI-driven, not headless-in-CI.
+- [x] **CI green on the default branch.** Fast workspace tests + FFI unit tests + the on-push `verify-testnet` job are green on `main`.
+- [x] **README covers build, deploy address, running the app, running the batch tool, querying the registry.** `README.md` (+ `DEPLOYMENT.md`, `ui/README.md`).
+- [~] **Reproducible e2e demo script, `RISC0_DEV_MODE=0`.** `scripts/demo.sh` (default) runs the real registry lifecycle against the public testnet headlessly with **no mock and no GUI** — clone-and-run; needs only `curl`+`python3` (or the `wallet` binary for the richer PDA-decoding check); `--full` does a fresh build + deploy + lifecycle; the batch tool runs via `--envelopes-from` + `--program-bin` (no `--mock-delivery`). The spec's literal "real *local* sequencer with `RISC0_DEV_MODE=0`" path is retained as a documented `--localnet` mode. The upload→broadcast leg is demonstrated by the Basecamp UI plugin (GUI), per the caveat above — it is not part of the unattended script.
+- [x] **Narrated video showing terminal output incl. proof generation, `RISC0_DEV_MODE=0`.** Fresh public-testnet walkthrough recorded by Evi: . The recording follows `DEMO.md` / `scripts/record-final-video.sh`, leads with `RISC0_DEV_MODE=0` proof-mode evidence, then re-verifies the deployed public-testnet lifecycle and batch-anchor evidence.
+
+### Submission
+
+- [~] **GitHub issues filed for upstream problems.** `BUGS_FILED.md` documents every issue encountered with full root-cause + suggested-fix write-ups (LEZ template runner drift, `logos-liblogos` gtest timeout, `cargo-risczero` Metal-kernel build, `delivery_module` `librln` link, SPEL hashed-seed gap, testnet CU not persisted). One is filed (`logos-blockchain-circuits#33`); the remainder are written up and queued to file against the correct upstream repos. Ready-to-file bodies in `docs/upstream-issues/`.
+
+## FURPS Self-Assessment
+
+### Functionality
+
+Delivers the full publish flow (upload → broadcast → optional anchor) inside a real Basecamp instance, plus the permissionless batch CLI for third-party anchoring. The on-chain registry is queryable by CID hash without a transaction, deployed and exercised on the public testnet. A 50-CID batch confirms a single tx handles 5× the spec's ≥10-per-batch floor, with idempotency proven on chain. Out of scope by design: tag-based search (no on-chain enumeration — single-CID lookup only). Known limitation: a *headless* real-Delivery client for the batch CLI is not shipped (real Delivery is Waku + RLN over QtRemoteObjects, exercised through the Basecamp UI plugin); the batch CLI's CID source is the `DeliveryClient` trait, fed in headless/CI runs by an explicit envelope file rather than a live Waku subscription.
+
+### Usability
+
+End users install the `.lgx` via `lgs basecamp install` or Basecamp's "Install plugin" file picker. Onboarding is two steps: pick a document, hit Publish; anchoring is an opt-in second click. The reusable indexing module is documented in `indexing/API.md` + rustdoc on `Publisher` and the three adapter traits. The batch CLI is a single binary with a small flag set (`--topic`, `--batch-size`, `--batch-interval-secs`, `--dedupe-store-path`, `--program-bin`, and either a live Delivery subscription or `--envelopes-from `).
+
+### Reliability
+
+`Publisher` wraps every adapter call in `with_retry` (5 retries, exponential backoff, final error surfaced). The batch CLI's dedupe ledger is sled-backed so it survives restarts; combined with the registry's default-state idempotency, the batch is safe to re-run from any point. Failed anchors leave the registry unchanged — the LEZ program rejects transactions atomically before any PDA write.
+
+### Performance
+
+CU is reported as the deterministic executor-cycle cost of the deployed ELF, because the testnet neither exposes nor persists per-tx CU (filed upstream). Anchor cost is constant per CID regardless of registry size (PDA-per-CID), so a batch is ~linear in batch size with a fixed per-tx overhead. Wall-clock latency on the testnet is dominated by block cadence, not program compute. Absolute deployed-ELF cycle figures are measured against the deployed rc3 ELF: `anchor_one` = 100,185 user cycles and `anchor_batch(50)` = 4,506,872 user cycles; older rc1 localnet executor-time numbers remain only as wall-clock corroboration. Anchor transactions use the LEZ **Public** path, which is sequencer-proved — no host-side proof generation in the anchor path; the prover is shown firing on the privacy-preserving faucet/bootstrap path in the demo to evidence `RISC0_DEV_MODE=0`.
+
+### Supportability
+
+The registry is deployed and re-verifiable on the public testnet from any machine with the `wallet` binary (`scripts/verify-testnet.sh`) or with `curl` alone (`scripts/ci-verify-testnet.sh`, the on-push CI check). The codebase is split into focused crates plus the UI plugin so maintainers can swap individual layers (registry program, indexing traits, batch CLI, UI) independently. `DEPLOYMENT.md` documents both the public-testnet deploy and the local-sequencer path. The one genuine supportability gap — a headless, unattended full-pipeline e2e including upload→broadcast — is documented honestly above rather than papered over with a mock.
+
+## Supporting Materials
+
+- **Public-testnet proof (primary evidence):** `TESTNET_PROOF.md` — deploy + anchor + idempotent re-anchor + batch tx hashes, `chain-info` verdicts, PDA decodes; re-verify with `bash scripts/verify-testnet.sh`.
+- **CU benchmarks:**
+- **Deployment instructions:**
+- **Demo script (testnet-first walkthrough):**
+- **Upstream issues filed/queued during this build:**
+- **SPEL-generated IDL:** (regenerate via `scripts/regen-idl.sh`)
+- **Narrated demo video:**
+
+## Terms & Conditions
+
+By submitting this solution, I confirm that I have read and agree to the [Terms & Conditions](../TERMS.md).