Streaming response support for the cffi interface + bug fixes#247
Open
Bineroflux wants to merge 8 commits into
Open
Streaming response support for the cffi interface + bug fixes#247Bineroflux wants to merge 8 commits into
Bineroflux wants to merge 8 commits into
Conversation
Adds four new cffi exports that let language bindings (Python, Node.js, C#, ...)
consume server-sent events and other long-lived streaming responses without
blocking the entire request-response round trip:
- requestStream - sends the request, returns headers + a streamId once
the response headers arrive (body not yet consumed)
- readStream - polls the next chunk of decompressed body bytes,
base64-encoded; supports blocking / non-blocking /
bounded-timeout modes via timeoutMs
- readStreamAll - drains the rest of the body in one call, returning a
standard Response envelope (for callers that decided
after inspecting the headers that the response is not
actually streaming)
- cancelStream - cancels an in-flight stream and releases its
connection; idempotent
Internally, requestStream registers a per-stream pump goroutine that reads
resp.Body in streamOutputBlockSize-sized chunks into a buffered channel.
readStream consumes one chunk per call, with a 60s back-pressure timeout
on the producer side so a misbehaving caller cannot pin a goroutine
forever.
destroyAll now also clears the stream registry alongside the session cache.
The existing request() export is unchanged; callers opt into streaming by
calling requestStream instead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
16 tests against tls_client_cffi_src exercising the streaming primitives:
* ReadStreamChunk: drain-to-EOF, block-size chunking, unknown streamId,
non-blocking poll, bounded timeout, blocking-negative-timeout, body
read errors surfaced through StreamChunkResponse.Error.
* CancelStream: cancel mid-stream removes the registry entry and
invokes the cancel func; idempotent for unknown ids and repeated calls.
* ReadStreamAll: text body, IsByteResponse base64+MIME wrapping, and
the mixed case where a caller reads one chunk via ReadStreamChunk and
then drains the rest via ReadStreamAll.
* gzip decompression is applied transparently when Content-Encoding is
set and resp.Uncompressed is false.
* BuildStreamStartResponse preserves status/headers/protocol/sessionId
and emits an empty Body + a streamId.
* ClearStreamCache removes every registered stream.
* EndToEndSSEServer drives a real tls_client.HttpClient through a
Flusher-based fhttp/httptest server and asserts all flushed events
appear in the assembled body.
All tests pass under -race.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cffi_dist/example_python/example_stream.py demonstrates the four new
streaming exports (requestStream / readStream / readStreamAll /
cancelStream) end to end:
- bind all four via ctypes alongside the existing request/freeMemory
pattern from the other examples
- issue a streaming request and inspect the response headers from the
requestStream envelope
- reassemble \n-terminated lines across chunk boundaries (the same
line-buffering required for SSE \n\n event framing)
- call cancelStream from a finally block so an early loop exit still
releases the underlying TCP/TLS connection
Includes a documented drain_all_in_one_call() helper showing the
readStreamAll alternative for when the response turns out not to be
streaming, so a future maintainer can see both branches side by side.
A module-level docstring explains at a higher level what problem these
exports solve (the regular `request` blocks until the body is fully
read) and how the loop generalises from newline-delimited JSON to true
SSE event boundaries.
Targets httpbin.org/stream/5 (NDJSON over a chunked body) because it's
deterministic and finite; a real SSE endpoint slots in by changing the
URL only. Verified end-to-end against a freshly built DLL: 5 JSON
events parsed across chunk boundaries, EOF closes cleanly, cancelStream
runs in finally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The charset-detection refactor in d0fc595 moved io.ReadAll inside the !isByteResponse branch and forgot to provide a byte-response branch, so every IsByteResponse=true request silently returned an empty payload — the response body was just "data:application/octet-stream;base64," with nothing after the prefix. Restore the missing branch so byte responses still go through io.ReadAll (or readAllBodyWithStreamToFile when StreamOutputPath is set) before being base64-encoded. The byte path intentionally skips the charset wrapper and the empty-body preview because the payload is returned as raw bytes inside a data URI; running it through charset.NewReader would corrupt non-text content. Add tests/byte_response_test.go covering the regression: binary payloads that don't survive a UTF-8 round trip, empty bodies, large payloads, gzip-encoded byte responses, header/session propagation, and a sanity check that the text path still works. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The underlying tls_client.WithTimeoutSeconds(0) is documented as "Use 0 to disable the deadline (Unlimited) for large downloads or long-polling", but the cffi layer treated TimeoutSeconds=0 / TimeoutMilliseconds=0 as "field omitted" and silently fell through to DefaultTimeoutSeconds (30s). Callers consuming SSE / long-polling streams therefore had no way to disable the deadline and saw a 30s connection reset. Flipping 0 to mean "disable" would be a silent regression for every existing caller that relies on the field default — so introduce a negative sentinel instead. ResolveTimeoutOption now returns: - WithTimeoutSeconds(0) when either field is negative (disabled) - WithTimeoutMilliseconds(ms) when ms > 0 - WithTimeoutSeconds(seconds) when seconds > 0 - WithTimeoutSeconds(default) when both are 0 (unchanged behavior) Pull the resolver out of getTlsClient into an exported helper so the contract is testable and callers can find it. Document the precedence on RequestInput's godoc and fix the misleading "set timeoutSeconds to 0 for SSE streams" guidance in example_stream.py. Add tests/timeout_test.go covering all branches via httptest: default / positive seconds / positive ms / negative-disables / negative-precedes-positive / ms-wins-over-seconds. Verified the negative-sentinel cases fail against the original buggy resolver (WithTimeoutSeconds(-1) creates a negative duration, request fires an immediate i/o timeout) and pass against the fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
For non-byte responses BuildResponse routed the body through charset.NewReader before writing to disk via readAllBodyWithStreamToFile. charset.NewReader is intended for HTML / XML and assumes the input is text — when the body is binary (e.g. an image whose Content-Type is image/png), it gets re-encoded windows-1252 → UTF-8, inflating the file by ~1.6× and corrupting every high-byte value. With an empty or unknown Content-Type the transcoder even prepends a UTF-8 BOM, which makes the corruption obvious in the saved file. The fix: skip charset.NewReader when input.StreamOutputPath != nil. The stream-to-file path's contract is "save raw bytes to a file"; charset detection has no business on that path. The in-memory text path (no StreamOutputPath, no IsByteResponse) keeps using charset.NewReader as before. Add tests/stream_to_file_test.go covering the regression: PNG-shaped high-byte payload, 256 KiB binary spanning many block writes, application/octet-stream, empty Content-Type, UTF-8 text passthrough, empty body, and a sanity check that the no-StreamOutputPath text path still routes through the charset reader. Verified the binary cases fail against the unfixed code (10 bytes → 14, 8 bytes → 15 with prepended UTF-8 BOM) and pass against the fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues kept WithDebug=true from doing its job at the cffi layer:
1) Cache bypass. clients[sessionId] caches the constructed HttpClient
per session. A session first opened with WithDebug=false cached its
noop-logger client; subsequent requests on the same session that
flipped WithDebug=true silently retrieved that cached noop client
and the flag had no effect. Skip the cache when WithDebug=true.
2) Non-Debug log levels. tls_client.WithDebug() makes NewHttpClient
auto-wrap the supplied logger via NewDebugLogger, but that wrapper
only overrides Debug() — Info / Warn / Error still forward through
to the wrapped base logger. With NewNoopLogger() as the base,
diagnostic calls like c.logger.Warn("you did not setup a cookie
jar") and c.logger.Error("critical error during request handling")
were silently dropped. Switch the base to NewLogger() so non-Debug
levels also reach stdout.
Add cffi_src/factory_with_debug_test.go (in-package so it can call
the unexported getTlsClient) covering: silence when WithDebug=false,
Debug output when WithDebug=true, Warn output when WithDebug=true
(the specific level that was silent before), and the cache-bypass
regression where a previously-cached noop client must NOT win over a
fresh WithDebug=true request on the same sessionId.
Verified the new tests fail against the unfixed code:
WithDebugTrue_AlsoEmitsWarnOutput got only "get cookies for url:"
without the "you did not setup a cookie jar" Warn line; the
cache-bypass test got an empty string from the second (debug) call.
Both pass against the fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce a singleton io.Writer that captures the SSLKEYLOGFILE
environment variable exactly once per process and attaches it as
TransportOptions.KeyLogWriter on every constructed http client. This
matches the convention Chrome, curl, Firefox and Go's standard
crypto/tls package use, so dropping the same env var that decrypts a
desktop-app capture is enough to have Wireshark decrypt the cffi
caller's TLS to the upstream origin too.
Two call sites in getTlsClient:
- When the caller passes TransportOptions, attach the writer if the
caller didn't already supply one (caller-provided always wins).
- When the caller didn't pass TransportOptions but the env var is set,
construct a single-field TransportOptions{KeyLogWriter: w} so the
capability still applies. The surrounding zero-valued fields all
happen to match the "no transportOptions" defaults in
roundtripper.go, so the only behavioural difference is the
attached writer.
The env read sits inside a sync.Once body so:
- os.Getenv runs at most once per process, not on every request.
- A failed os.OpenFile (e.g. SSLKEYLOGFILE points at an unwritable
path) is also frozen -- one error printed at first attempt instead
of one per request.
- Mid-process changes to SSLKEYLOGFILE are intentionally ignored;
chasing them would only matter for a debug feature where session
log continuity is more useful than reactivity. The singleton's
doc comment states this explicitly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary of changes
Adds streaming response support (SSE / NDJSON / chunked bodies) to the
cffi interface, plus a series of cffi-layer fixes that surfaced while
integrating the resulting DLL into a real .NET wrapper.
New: streaming exports
Four new c-shared exports let cffi consumers react to bytes as they
arrive instead of waiting for the entire body:
requestStream— sends the request and returns the responseheaders (plus a
streamId) as soon as they arrive; the body keepsflowing into a per-stream buffer on the Go side.
readStream— pulls one chunk at a time, with a heartbeattimeoutMsso the loop can check for cancellation without everblocking forever (
<0blocks,=0non-blocking,>0bounded).readStreamAll— drain the rest of the body in a single call into anormal
Responseenvelope. Useful whenrequestStreamreturned butthe response turned out not to be streaming after all.
cancelStream— tear down a stream early; idempotent, safe infinally.Chunks are returned as base64 inside JSON envelopes, matching the rest
of the cffi ABI.
cffi_dist/example_python/example_stream.pydemonstrates the four exports end-to-end against
httpbin.org/stream/5.Fixes uncovered through downstream usage
Each ships with regression tests that fail against the prior code and
pass against the fix:
IsByteResponse=truerequests. A previousrefactor (
d0fc595) movedio.ReadAllinside the!isByteResponsebranch and forgot to provide a byte-response branch, so every byte
response returned an empty payload.
TimeoutSeconds=0could not actually disable the deadline. Thecffi treated 0 as "field omitted" and fell through to
DefaultTimeoutSeconds(30 s), even though the underlying librarydocuments
WithTimeoutSeconds(0)as "unlimited". Use a negativesentinel for "disable" so existing zero-value callers keep getting
the default.
StreamOutputPathcorrupted binary content.BuildResponseranthe body through
charset.NewReaderbefore the stream-to-file write,so PNGs and other high-byte content got re-encoded
windows-1252 → UTF-8 and inflated by ~1.6× (sometimes with a UTF-8
BOM prepended). Skip the charset wrapper when stream-to-file is
requested.
WithDebug=truewas silent for Info / Warn / Error. Theauto-wrap inside
NewHttpClientonly overridesDebug(); with aNewNoopLogger()base, calls likec.logger.Warn("you did not setup a cookie jar")were silently dropped. Switch the base toNewLogger()and bypass the per-session client cache whenWithDebug=trueso a previously-cached noop client doesn't win.SSLKEYLOGFILEsupport. Drop the env var and Wireshark candecrypt the cffi caller's TLS to upstream — same convention Chrome /
curl / Firefox / Go's
crypto/tlsuse. Captured exactly once perprocess via
sync.Once.Test coverage
40 new tests across
tests/andcffi_src/:blocking / non-blocking polls, mid-stream errors, cancellation
idempotence, gzip decompression, plus an end-to-end SSE test against
an
httptestserver.in-package
getTlsClienttests forWithDebuglogger / cachebypass.
Each fix's regression tests were verified to fail against the unfixed
code via
git stash-and-rerun before being committed.Final notes
made — the root cause, the regression scenario it fixes, what the
resulting code contractually guarantees, and (where applicable) the
verification recipe used to prove the regression test catches the
bug.
a single unit, the commits are intentionally structured to stand on
their own: each fix and the streaming feature can be cherry-picked
or merged in isolation. Pick and choose whichever subset you're
comfortable taking.