proposal: streaming evaluation #15

Merged: kanywst merged 2 commits into main from feat/streaming-evaluation on May 9, 2026

@kanywst (Member) commented May 9, 2026

Design doc only. Tracks the streaming-evaluation item listed under "longer term" in ROADMAP.

See docs/proposals/streaming-evaluation.md.

Summary:

  • Static AST classifier at configure time: no-body-refs / prefix-only / full-tree policies routed through different body paths.
  • Streaming JSON parser variant in src/json.zig that emits (path, value) events; evaluator decides as soon as every referenced prefix is resolved.
  • No-body-refs policies skip proxy_on_request_body entirely. Prefix-only policies short-circuit before full buffering.

Status: design only, no implementation. Depends on body-aware-policies (#6) landing first.
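
The routing and early-decision idea in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proposal's code: `BodyClass`, `BodyPath`, `routeBody`, and `PrefixTracker` are hypothetical names, and body paths are modelled as dotted strings for brevity.

```zig
const std = @import("std");

// All names below are hypothetical; the proposal does not show the shim's API.
pub const BodyClass = enum { no_body_refs, prefix_only, full_tree };
pub const BodyPath = enum { skip_body, stream_until_resolved, buffer_full };

/// Picks the body path for a policy class fixed at configure time.
pub fn routeBody(class: BodyClass) BodyPath {
    return switch (class) {
        .no_body_refs => .skip_body,
        .prefix_only => .stream_until_resolved,
        .full_tree => .buffer_full,
    };
}

/// Records which referenced paths the stream has produced so far.
pub const PrefixTracker = struct {
    wanted: []const []const u8, // dotted body paths, e.g. "user.role"
    seen: []bool, // caller-owned, same length as `wanted`

    pub fn init(wanted: []const []const u8, seen: []bool) PrefixTracker {
        std.debug.assert(wanted.len == seen.len);
        for (seen) |*s| s.* = false;
        return .{ .wanted = wanted, .seen = seen };
    }

    /// Marks `path` as resolved; returns true once every wanted path is.
    pub fn onField(self: PrefixTracker, path: []const u8) bool {
        for (self.wanted, 0..) |w, i| {
            if (std.mem.eql(u8, w, path)) self.seen[i] = true;
        }
        for (self.seen) |s| {
            if (!s) return false;
        }
        return true;
    }
};
```

One caveat the sketch leaves open: a referenced field that never appears in the body is only known to be absent at end of stream, so a real evaluator would presumably also need an end-of-body path.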

coderabbitai Bot commented May 9, 2026

Warning

Rate limit exceeded

@kanywst has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 28 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 956cf1c and 2198054.

📒 Files selected for processing (3)
  • docs/proposals/streaming-evaluation.md
  • src/body_deps.zig
  • src/root.zig

gemini-code-assist Bot left a comment

Code Review

This pull request proposes a design for streaming evaluation to optimize policy enforcement by avoiding full body buffering when only specific fields are required. The feedback focuses on architectural improvements for efficiency and binary size, specifically suggesting the use of a trie for path matching, refining the streaming event model to minimize allocations, and utilizing a shared lexer to reduce code duplication in the JSON parser.

```zig
const BodyDeps = struct {
    refs_body: bool = false,
    refs_paths: std.ArrayList([]const []const u8), // prefix tree
    // ... (excerpt ends here in the review view)
};
```

medium

The refs_paths field is described as a "prefix tree" but typed as a flat ArrayList of paths. For efficient matching during streaming, especially if a policy has many references, a true trie or prefix tree structure would be more appropriate to allow $O(depth)$ lookups instead of $O(N)$ comparisons against all registered paths.
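
To make the suggestion concrete, here is a minimal sketch of a segment trie, assuming a fixed node pool so no allocator is needed. `PathTrie`, `max_nodes`, `step`, and `isReferenced` are hypothetical names and do not come from the PR.

```zig
const std = @import("std");

/// Hypothetical fixed-capacity trie over path segments; not the PR's code.
pub const PathTrie = struct {
    pub const root: u16 = 0;
    pub const Error = error{TrieFull};
    const none: u16 = std.math.maxInt(u16);
    const max_nodes = 64; // arbitrary cap so the sketch needs no allocator

    const Node = struct {
        segment: []const u8 = "",
        first_child: u16 = none,
        next_sibling: u16 = none,
        terminal: bool = false, // true if a policy references exactly this path
    };

    nodes: [max_nodes]Node = [_]Node{.{}} ** max_nodes,
    len: u16 = 1, // node 0 is the root

    /// Registers one referenced path, sharing nodes with common prefixes.
    pub fn insert(self: *PathTrie, path: []const []const u8) Error!void {
        var cur: u16 = root;
        for (path) |seg| {
            if (self.step(cur, seg)) |next| {
                cur = next;
                continue;
            }
            if (self.len == max_nodes) return error.TrieFull;
            const idx = self.len;
            self.nodes[idx] = .{
                .segment = seg,
                .next_sibling = self.nodes[cur].first_child,
            };
            self.nodes[cur].first_child = idx;
            self.len += 1;
            cur = idx;
        }
        self.nodes[cur].terminal = true;
    }

    /// Descends one segment from `node`; null means no registered path
    /// continues this way, so the parser can skip the whole subtree.
    pub fn step(self: *const PathTrie, node: u16, segment: []const u8) ?u16 {
        var i = self.nodes[node].first_child;
        while (i != none) : (i = self.nodes[i].next_sibling) {
            if (std.mem.eql(u8, self.nodes[i].segment, segment)) return i;
        }
        return null;
    }

    pub fn isReferenced(self: *const PathTrie, node: u16) bool {
        return self.nodes[node].terminal;
    }
};
```

A streaming parser would carry the current node index as it enters and leaves objects, so each key costs a single `step`. With sibling lists as used here, a step is linear in the fan-out at that node rather than constant, which is likely acceptable for the small path sets a policy references.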

Comment on lines +84 to +89
```zig
pub const StreamEvent = union(enum) {
    enter_object: []const u8, // path so far
    leave_object: void,
    field: struct { path: [][]const u8, value: Value },
    done: void,
};
```

medium

The StreamEvent design appears to mix hierarchical and flat approaches. It provides enter_object but also includes a full path in each field event, which is redundant and expensive to allocate during streaming. Furthermore, using the recursive Value type for fields implies that nested objects are still fully buffered before being emitted. A more efficient streaming API would emit separate events for container boundaries and only use Value for scalar leaves.
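
One way to read this suggestion is sketched below: events carry only keys, container boundaries, and scalar leaves, and the consumer rebuilds the current path itself. This is an illustrative assumption, not the proposal's API. `Scalar` and `PathStack` are hypothetical names, the `StreamEvent` shape is a hypothetical revision, and arrays are omitted for brevity.

```zig
const std = @import("std");

/// Scalar leaves only; containers are reported through boundary events.
pub const Scalar = union(enum) {
    null_value,
    boolean: bool,
    number: f64,
    string: []const u8,
};

/// Hypothetical revised event set (arrays left out of the sketch).
pub const StreamEvent = union(enum) {
    enter_object,
    leave_object,
    key: []const u8,
    scalar: Scalar,
    done,
};

/// Consumer-side path tracking, so events need not carry a full path.
pub const PathStack = struct {
    const max_depth = 16; // arbitrary cap for the sketch

    segs: [max_depth][]const u8 = undefined,
    len: usize = 0,

    pub const Error = error{TooDeep};

    /// Applies one event. For a scalar, returns the path of that leaf;
    /// the slice is only valid until the next call to `apply`.
    pub fn apply(self: *PathStack, ev: StreamEvent) Error!?[]const []const u8 {
        switch (ev) {
            .key => |k| {
                if (self.len == max_depth) return error.TooDeep;
                self.segs[self.len] = k;
                self.len += 1;
            },
            .scalar => {
                const path: []const []const u8 = self.segs[0..self.len];
                if (self.len > 0) self.len -= 1; // pop the key of this leaf
                return path;
            },
            .leave_object => {
                // Pop the key that introduced this object (none at top level).
                if (self.len > 0) self.len -= 1;
            },
            .enter_object, .done => {},
        }
        return null;
    }
};
```

Because the stack stores borrowed key slices in a fixed array, no allocation happens per event, which is the cost the comment above is concerned with.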

Comment on lines +130 to +133
- The streaming parser nearly doubles the size of `src/json.zig`.
Worth running a sizing experiment before committing: does
`--release=small` keep zopa.wasm under 80 KB with both paths
present?

medium

To address the binary size concern, consider refactoring src/json.zig to use a shared low-level lexer. Both the existing recursive-descent parser and the new streaming evaluator can consume this lexer, avoiding duplication of complex logic like string escape decoding and number parsing.
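
A shared lexer along these lines might look like the sketch below. It is a deliberately small illustration, not the contents of `src/json.zig`: `Token` and `Lexer` are hypothetical names, strings are returned as raw slices without escape decoding, and numbers are scanned loosely without validating the JSON grammar.

```zig
const std = @import("std");

pub const Token = union(enum) {
    lbrace,
    rbrace,
    lbracket,
    rbracket,
    colon,
    comma,
    string: []const u8, // raw bytes between the quotes, escapes not decoded
    number: []const u8, // raw text; conversion is left to the consumer
    true_lit,
    false_lit,
    null_lit,
    eof,
};

/// Hypothetical pull lexer that both parser variants could consume.
pub const Lexer = struct {
    src: []const u8,
    pos: usize = 0,

    pub const Error = error{ UnexpectedByte, UnterminatedString };

    pub fn next(self: *Lexer) Error!Token {
        // Skip JSON whitespace.
        while (self.pos < self.src.len) : (self.pos += 1) {
            switch (self.src[self.pos]) {
                ' ', '\t', '\n', '\r' => {},
                else => break,
            }
        }
        if (self.pos == self.src.len) return .eof;
        switch (self.src[self.pos]) {
            '{' => return self.punct(.lbrace),
            '}' => return self.punct(.rbrace),
            '[' => return self.punct(.lbracket),
            ']' => return self.punct(.rbracket),
            ':' => return self.punct(.colon),
            ',' => return self.punct(.comma),
            '"' => return self.lexString(),
            '-', '0'...'9' => return self.lexNumber(),
            else => return self.lexLiteral(),
        }
    }

    fn punct(self: *Lexer, t: Token) Token {
        self.pos += 1;
        return t;
    }

    fn lexString(self: *Lexer) Error!Token {
        const start = self.pos + 1; // skip the opening quote
        var i = start;
        while (i < self.src.len) : (i += 1) {
            switch (self.src[i]) {
                '\\' => i += 1, // skip the escaped byte; no decoding here
                '"' => {
                    self.pos = i + 1;
                    return Token{ .string = self.src[start..i] };
                },
                else => {},
            }
        }
        return error.UnterminatedString;
    }

    fn lexNumber(self: *Lexer) Token {
        const start = self.pos;
        while (self.pos < self.src.len) : (self.pos += 1) {
            switch (self.src[self.pos]) {
                '-', '+', '.', 'e', 'E', '0'...'9' => {},
                else => break,
            }
        }
        return Token{ .number = self.src[start..self.pos] };
    }

    fn lexLiteral(self: *Lexer) Error!Token {
        const rest = self.src[self.pos..];
        if (std.mem.startsWith(u8, rest, "true")) return self.word(4, .true_lit);
        if (std.mem.startsWith(u8, rest, "false")) return self.word(5, .false_lit);
        if (std.mem.startsWith(u8, rest, "null")) return self.word(4, .null_lit);
        return error.UnexpectedByte;
    }

    fn word(self: *Lexer, n: usize, t: Token) Token {
        self.pos += n;
        return t;
    }
};
```

With this split, the recursive-descent parser and a streaming evaluator would differ only in what they do with the token sequence, so escape decoding and number conversion would need to exist once, on top of the raw slices.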

@kanywst marked this pull request as ready for review May 9, 2026 13:59
kanywst added 2 commits May 9, 2026 23:30
Configure-time analysis pass that categorises a compiled module by
how it references the request body:

  - no_body_refs:  policy never touches input.body
  - prefix_only:   policy reads specific body sub-paths
  - full_tree:     policy reads input.body whole or iterates over a
                   body-rooted ref (full body must be buffered)

Conservative classifier: when uncertain, returns full_tree.

Tests cover header-only, body-leaf, bare body, body iteration, and
the bare 'body' shorthand (no input prefix). prefix_count returns
the number of distinct body sub-paths so callers can size buffers
accordingly.

Streaming evaluation (per docs/proposals/streaming-evaluation.md)
needs the body-aware callback path landing first. This analyser is
the configure-time piece; it ships independently and the proxy-wasm
shim can call it in a follow-up to skip proxy_on_request_body when
the policy doesn't need the body.

ci.yml gains a test-unit job (parity with feat/string-builtins,
feat/composite-ref-iteration, etc.).
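
The classification rules in the commit message above can be sketched as follows. This is not the contents of `src/body_deps.zig`, which is not shown here: `Ref`, `classify`, `prefixCount`, and the `iterated` flag are hypothetical stand-ins for whatever the compiled module actually exposes.

```zig
const std = @import("std");

pub const BodyClass = enum { no_body_refs, prefix_only, full_tree };

/// Hypothetical flattened reference, e.g. input.body.user.role.
pub const Ref = struct {
    segments: []const []const u8,
    iterated: bool = false, // true if the policy iterates over this ref
};

/// Returns the path below `body`, or null if the ref is not body-rooted.
/// Accepts both `input.body...` and the bare `body...` shorthand.
fn bodySubPath(ref: Ref) ?[]const []const u8 {
    var segs = ref.segments;
    if (segs.len > 0 and std.mem.eql(u8, segs[0], "input")) segs = segs[1..];
    if (segs.len == 0 or !std.mem.eql(u8, segs[0], "body")) return null;
    return segs[1..];
}

/// Conservative: a whole-body read or any body iteration forces full_tree.
pub fn classify(refs: []const Ref) BodyClass {
    var class: BodyClass = .no_body_refs;
    for (refs) |ref| {
        const sub = bodySubPath(ref) orelse continue;
        if (sub.len == 0 or ref.iterated) return .full_tree;
        class = .prefix_only;
    }
    return class;
}

fn samePath(a: []const []const u8, b: []const []const u8) bool {
    if (a.len != b.len) return false;
    for (a, b) |x, y| {
        if (!std.mem.eql(u8, x, y)) return false;
    }
    return true;
}

/// Number of distinct body sub-paths among `refs` (quadratic, fine for
/// the handful of refs a policy typically has).
pub fn prefixCount(refs: []const Ref) usize {
    var count: usize = 0;
    for (refs, 0..) |ref, i| {
        const sub = bodySubPath(ref) orelse continue;
        var dup = false;
        for (refs[0..i]) |prev| {
            const p = bodySubPath(prev) orelse continue;
            if (samePath(p, sub)) {
                dup = true;
                break;
            }
        }
        if (!dup) count += 1;
    }
    return count;
}
```

The sketch covers the same cases the commit's tests name (header-only, body leaf, bare body, body iteration, and the `body` shorthand), but how the real analyser detects iteration or other uncertain cases is not visible from this page.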
@kanywst force-pushed the feat/streaming-evaluation branch from 918ed0b to 2198054 May 9, 2026 14:31
@kanywst merged commit dc484b1 into main May 9, 2026 (10 checks passed)
@kanywst deleted the feat/streaming-evaluation branch May 9, 2026 14:40