The content-addressed retrieval endpoint bypasses path-scoped visibility entirely. This is the same withholding-bypass class as #98 (fork) on a different egress surface.
Where
crates/gitlawb-node/src/api/ipfs.rs, get_by_cid (route GET /ipfs/{cid} registered at server.rs:191).
- The handler takes no AuthenticatedDid extension and the route sits outside add_auth_layers, so it is unauthenticated.
- It decodes the CID to a SHA-256, calls list_all_repos(), and for every repo does repo_store.acquire(...) + store::read_object(repo_path, sha), returning the first match as raw bytes.
- There is no authorize_repo_read, no visibility_check, and no withheld_blob_oids filtering anywhere on the path.
Impact
For a public repo with a path-scoped deny rule (e.g. /secret/**), git_upload_pack serves a filtered pack: the withheld blob content is omitted, but the tree objects that name those blobs and their SHAs are kept intact (mode B). An unauthenticated caller can therefore:
- Clone the repo and read the tree under the withheld path to recover the withheld blob SHAs and filenames.
- Compute each blob CID and fetch the cleartext content via GET /ipfs/{cid}.
So every blob the read path is careful to withhold is retrievable in cleartext by anyone who can reach the node. The fork gate (#98) and the upload-pack filter do not help, because the leak is on a separate route.
Suggested remediation
get_by_cid must not serve a blob the requesting caller may not read. Options:
- Resolve the object to its owning repo(s) and apply the same per-caller withheld check the serve path uses (withheld_blob_oids / visibility_check) before returning content; fail closed (404) when the object is in any repo's withheld set for the caller. This requires threading caller identity onto the route.
- If the endpoint is meant only for already-public content, restrict it to objects reachable from a repo+path the caller passes authorize_repo_read for, rather than a blind cross-repo SHA scan.
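For illustration, the first option could look roughly like the sketch below. It is an in-memory model, not the node's code: Repo, is_withheld_for, and get_by_oid_checked are hypothetical names, the caller is a plain optional DID string rather than AuthenticatedDid, and the per-repo withheld set stands in for whatever withheld_blob_oids / visibility_check actually compute. The point it shows is the fail-closed rule: the scan visits every repo holding the object and refuses if any of them withholds it from this caller, instead of returning the first match.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory model of one repository (not the real repo_store type).
struct Repo {
    /// Object id (hex SHA) -> raw object bytes.
    objects: HashMap<String, Vec<u8>>,
    /// Blob ids covered by a path-scoped deny rule.
    withheld: HashSet<String>,
    /// Caller DIDs exempt from the deny rule (assumption for this sketch).
    allowed: HashSet<String>,
}

impl Repo {
    fn new(objects: &[(&str, &str)], withheld: &[&str], allowed: &[&str]) -> Self {
        Repo {
            objects: objects
                .iter()
                .map(|(k, v)| (k.to_string(), v.as_bytes().to_vec()))
                .collect(),
            withheld: withheld.iter().map(|s| s.to_string()).collect(),
            allowed: allowed.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// True when `oid` is withheld in this repo and `caller` is not exempt.
    /// An anonymous caller (None) is never exempt.
    fn is_withheld_for(&self, oid: &str, caller: Option<&str>) -> bool {
        if !self.withheld.contains(oid) {
            return false;
        }
        match caller {
            Some(did) => !self.allowed.contains(did),
            None => true,
        }
    }
}

/// Returns the object bytes only if no repo containing the object withholds
/// it from `caller`. A None result would map to a 404 in the handler.
fn get_by_oid_checked<'a>(
    repos: &'a [Repo],
    oid: &str,
    caller: Option<&str>,
) -> Option<&'a [u8]> {
    let mut found: Option<&'a [u8]> = None;
    for repo in repos {
        if let Some(bytes) = repo.objects.get(oid) {
            // Fail closed: one withholding repo is enough to refuse,
            // even if another repo holds the same object unrestricted.
            if repo.is_withheld_for(oid, caller) {
                return None;
            }
            if found.is_none() {
                found = Some(bytes.as_slice());
            }
        }
    }
    found
}
```

In this model, an anonymous request for a blob that one repo withholds is refused even when a second repo contains the same bytes without a deny rule. A withheld blob and a missing object both yield None, so the response does not reveal whether the object exists.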
Found while mapping egress paths during the #98 review. Pre-existing, independent of #98.