docs(rust): final-state cleanup — tiktoken pin + error-aware cap handling #667
Merged
Two findings from the fresh codex pass against main after PR #661 merged, plus one follow-up after round 2:

- `tiktoken-rs` version pin in the streaming tokenizer-fallback snippet was `"0.6"` while crates.io is at 0.11.0. The cited API (`o200k_base`, `encode_with_special_tokens`) is unchanged across that range, but the pin should match the current release. Bumped to `"0.11"`.
- The error-aware `run_completion` example skipped cap handling entirely: it called `.max_completion_tokens(800u32)` without consulting `guard.caps()`. The intro's loud-failure stance claimed all examples on the page error out on non-positive `caps.max_tokens`, which was false for that example. Added the same `u32::try_from(cap)` + zero-check + release-and-error block used in the ALLOW_WITH_CAPS and streaming examples, with the error variants typed as `CompletionError::Cycles(CyclesError::Validation(...))` to flow through the function's typed error contract.
- Codex round 2 then caught that the BASIC example also doesn't read caps; it takes `_ctx` to keep the minimum-viable composition compact. Rather than add cap handling and dilute that example's purpose, narrowed the intro claim: all four examples fail loud on missing `usage` and missing `content`, and the three cap-reading examples additionally fail on non-positive caps. The basic example explicitly ignores caps; production code should follow the capped pattern.

Codex round 3 returned SHIP.

Sources verified: runcycles 0.2.4 source, the shipped async_openai_completion.rs example on main, async-openai 0.38.2 docs.rs, tiktoken-rs 0.11.0 docs.rs.
What
Fresh codex review against `main` (after PR #659/#660/#661 all landed) caught two issues that slipped through the three earlier rounds:
- `tiktoken-rs` version pin stale. The streaming tokenizer-fallback subsection pinned `tiktoken-rs = "0.6"` while crates.io is at 0.11.0. The cited API (`o200k_base`, `encode_with_special_tokens`) is unchanged across that range, but the pin should match the current release.
- Error-aware example skipped cap handling. The `run_completion` function called `.max_completion_tokens(800u32)` without checking `guard.caps()`. The intro's loud-failure stance claimed all examples on the page error out on non-positive `caps.max_tokens`, which was false for this example.
A third issue surfaced in codex round 2 of this PR: the basic example also doesn't read caps (it takes `_ctx` to keep the minimum-viable composition compact). Rather than dilute that example by adding cap handling to it, I narrowed the intro claim to accurately describe what each example demonstrates.
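To make the basic-versus-capped distinction concrete, here is a minimal sketch that isolates the cap decision from the guard so it can be read and tested on its own. The function names and the `Option<i64>` cap type are assumptions for illustration, not the runcycles API; the only detail taken from the PR is the default of 800 and the `u32::try_from` + zero-check order.

```rust
use std::convert::TryFrom;

/// Default completion budget when no cap applies (800, as in the PR's snippet).
const DEFAULT_MAX_TOKENS: u32 = 800;

/// Hypothetical stand-in for the basic example: the cap is accepted but ignored.
fn basic_max_tokens(_cap: Option<i64>) -> u32 {
    DEFAULT_MAX_TOKENS
}

/// Hypothetical stand-in for the cap-reading examples.
/// Assumes the cap arrives as a signed integer (i64 here); the real type may differ.
fn capped_max_tokens(cap: Option<i64>) -> Result<u32, String> {
    match cap {
        // No cap supplied: keep the default budget.
        None => Ok(DEFAULT_MAX_TOKENS),
        Some(raw) => {
            // try_from rejects negative values (and values above u32::MAX).
            let cap_u32 = u32::try_from(raw)
                .map_err(|_| format!("caps.max_tokens out of range: {}", raw))?;
            // A zero cap cannot produce a completion, so fail loud instead of sending it.
            if cap_u32 == 0 {
                return Err("caps.max_tokens is 0".to_string());
            }
            // A positive cap can only lower the budget, never raise it.
            Ok(cap_u32.min(DEFAULT_MAX_TOKENS))
        }
    }
}
```

With these stand-ins, `basic_max_tokens(Some(0))` still returns the default, while `capped_max_tokens(Some(0))` and `capped_max_tokens(Some(-1))` both return an error. That gap is why the intro claim had to be narrowed.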
Changes
`how-to/integrating-cycles-with-async-openai.md`:
- `tiktoken-rs` pin: `"0.6"` → `"0.11"` in the streaming tokenizer-fallback Cargo.toml comment.
- Error-aware example cap handling: added the same `u32::try_from(cap)` + zero-check + release-and-error pattern used in the ALLOW_WITH_CAPS and streaming examples. The error variants are typed as `CompletionError::Cycles(CyclesError::Validation(...))` to flow through the function's typed-error contract:
```rust
let mut max_tokens: u32 = 800;
if let Some(caps) = guard.caps() {
    if let Some(cap) = caps.max_tokens {
        let cap_u32 = u32::try_from(cap).map_err(|_| {
            CompletionError::Cycles(CyclesError::Validation(
                "caps.max_tokens is negative".into(),
            ))
        })?;
        if cap_u32 == 0 {
            let _ = guard.release("caps.max_tokens is 0".to_string()).await;
            return Err(CompletionError::Cycles(CyclesError::Validation(
                "caps.max_tokens is 0".into(),
            )));
        }
        max_tokens = cap_u32.min(max_tokens);
    }
}
```
- Intro loud-failure stance narrowed: all four examples fail loud on missing `usage` and missing `content`. The three cap-reading examples (ALLOW_WITH_CAPS, streaming, error-aware) additionally fail loud on non-positive caps. The basic example deliberately ignores caps to keep the minimum-viable composition compact; production code should follow the capped pattern.
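The part of the stance that applies to all four examples, failing loud on missing `usage` and missing `content`, can be sketched as below. `Response`, `Usage`, `ExtractError`, and `extract` are hypothetical stand-ins invented for this sketch; they are not the async-openai or runcycles types, and the check order (usage first) is an assumption.

```rust
/// Hypothetical token-usage record; NOT the async-openai type.
#[derive(Debug, Clone, PartialEq)]
struct Usage {
    prompt_tokens: u32,
    completion_tokens: u32,
}

/// Hypothetical, flattened view of a chat completion response.
#[derive(Debug, Clone, PartialEq)]
struct Response {
    content: Option<String>,
    usage: Option<Usage>,
}

/// Hypothetical error type: one variant per missing field.
#[derive(Debug, PartialEq)]
enum ExtractError {
    MissingUsage,
    MissingContent,
}

/// Returns the content and usage, or an error if either is absent.
/// Nothing is defaulted: a missing `usage` is not treated as zero tokens,
/// and a missing `content` is not treated as an empty string.
fn extract(resp: Response) -> Result<(String, Usage), ExtractError> {
    let usage = resp.usage.ok_or(ExtractError::MissingUsage)?;
    let content = resp.content.ok_or(ExtractError::MissingContent)?;
    Ok((content, usage))
}
```

Returning an error rather than substituting a default keeps a malformed response from being recorded as a zero-cost call.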
Review
Verified against runcycles 0.2.4, async-openai 0.38.2, tiktoken-rs 0.11.0, and the shipped `cycles-client-rust/examples/async_openai_completion.rs` on main.
Test plan
Why a fresh PR rather than amending
PR #661 was already merged into main before the user prompted for a final-state re-review. Branching off main is the clean path; this PR is small and focused.