Wip/modular refactor pnpm turborepo#18
Conversation
The `#wrapHandler` returned `{ ...handler, onResponseStart, onResponseError }`,
but undici's fetch builds its DispatchHandler as a class instance with methods
on the prototype. Object spread only copies own enumerable properties, so the
wrapped handler reached undici missing `onRequestStart`, `onResponseData`, and
`onResponseEnd`. Undici rejected every dispatch with
`UND_ERR_INVALID_ARG (invalid onRequestStart method)`, the rotator marked
the proxy failed, the re-probe (which builds its own handler via `request()`)
succeeded, and the proxy was restored — only for the next real request to
fail the same way. /status reported healthy while every fetch returned
"OUTBOUND_PROXY: no healthy proxy available".
Replaces the object spread with a Proxy that transparently forwards every
property to the inner handler and only intercepts the two we observe.
Regression test instantiates a class-based handler (mirroring undici's real
shape) and asserts prototype methods survive the wrap.
Reviewer's GuideAdds CDN-embedded origin URL unwrapping for Akamai-style URLs in HTML/CSS rewriting, extends redirect unwrapping to AOL dynamic.aol.com patterns, and fixes RotatingProxyDispatcher handler wrapping to preserve undici DispatchHandler prototype methods, with corresponding regression tests. Sequence diagram for RotatingProxyDispatcher Proxy-wrapped DispatchHandlersequenceDiagram
participant RotatingProxyDispatcher
participant ProxyEntry
participant InnerHandler as DispatcherDispatchHandler
participant WrappedHandler as ProxyWrappedHandler
participant Undici as UndiciDispatcher
RotatingProxyDispatcher->>RotatingProxyDispatcher: #wrapHandler(entry, handler)
RotatingProxyDispatcher->>WrappedHandler: new Proxy(handler, { get })
RotatingProxyDispatcher-->>Undici: WrappedHandler as Dispatcher.DispatchHandler
Undici->>WrappedHandler: onResponseStart(controller, statusCode, headers, statusMessage)
WrappedHandler->>RotatingProxyDispatcher: [get("onResponseStart")]
WrappedHandler->>RotatingProxyDispatcher: PROXY_ERROR_STATUS.has(statusCode)
alt [status is proxy error]
WrappedHandler->>RotatingProxyDispatcher: #markFailed(entry, "proxy returned " + statusCode)
end
WrappedHandler->>InnerHandler: onResponseStart(controller, statusCode, headers, statusMessage)
Undici->>WrappedHandler: onResponseError(controller, error)
WrappedHandler->>RotatingProxyDispatcher: [get("onResponseError")]
WrappedHandler->>RotatingProxyDispatcher: #markFailed(entry, "transport: " + error.message)
WrappedHandler->>InnerHandler: onResponseError(controller, error)
Undici->>WrappedHandler: otherMethod(...)
WrappedHandler->>InnerHandler: otherMethod(...) (via Reflect.get and bind)
Flow diagram for URL rewriting with extractCdnEmbeddedUrl and unwrapRedirectUrlflowchart TD
A[raw URL string] --> B[match RE_ARCHIVE_URL]
B -->|archive match| C[extract timestamp ts and rawOriginal]
C --> D["extractCdnEmbeddedUrl(rawOriginal)"]
D -->|found| E[originalUrl = extracted]
D -->|not found| F[originalUrl = rawOriginal]
B -->|no archive match| G["unwrapRedirectUrl(trimmed)"]
G --> H["extractCdnEmbeddedUrl(unwrapped)"]
H -->|found| I["cdnUnwrapped = extracted"]
H -->|not found| J["cdnUnwrapped = unwrapped"]
E --> K["buildProxyUrl(originalUrl, fallbackTime, lockTime)"]
F --> K
I --> L["new URL(cdnUnwrapped, targetUrl)"]
J --> L
L --> M{protocol http: or https:?}
M -->|no| N[return raw]
M -->|yes| O["buildProxyUrl(resolved.toString(), fallbackTime, lockTime)"]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="tests/lib/url-rewriter.test.ts" line_range="374-383" />
<code_context>
+describe("rewriteHtmlUrls — CDN URLs with embedded origin", () => {
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a regression test for Akamai URLs that do not contain an embedded origin hostname
You already cover Akamai URLs with an embedded origin hostname and non-CDN absolute URLs. To further harden `extractCdnEmbeddedUrl` and the rewrite logic, please add a case where the host looks like Akamai (e.g. `a388.g.akamai.net`) but the path contains no embedded origin hostname, to ensure we neither rewrite such URLs incorrectly nor break valid CDN-hosted assets that can’t be unwrapped.
</issue_to_address>
### Comment 2
<location path="tests/lib/outbound-proxy.test.ts" line_range="563-572" />
<code_context>
+ it("forwards prototype methods from the inner handler (regression: undici DispatchHandler is a class instance)", () => {
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the regression test by asserting that handler methods are invoked with the original instance as `this`
Since the implementation now uses a `Proxy` and binds functions, consider adding an assertion that `this` inside the handler methods is the original `FetchLikeHandler` instance, not the proxy. For example, have `onRequestStart` capture `this` and assert `capturedThis === userHandler` to ensure instance-state access via `this` continues to work as expected.
Suggested implementation:
```typescript
const { built, d } = buildRotator(2);
let onRequestStartThis: unknown;
let onResponseDataThis: unknown;
let onResponseEndThis: unknown;
const onRequestStart = jest.fn(function (this: unknown, ...args: unknown[]) {
onRequestStartThis = this;
// preserve jest's mock call recording
return (onRequestStart as jest.Mock).mock.calls && (onRequestStart as jest.Mock)(...args);
});
const onResponseData = jest.fn(function (this: unknown, ...args: unknown[]) {
onResponseDataThis = this;
return (onResponseData as jest.Mock)(...args);
});
const onResponseEnd = jest.fn(function (this: unknown, ...args: unknown[]) {
onResponseEndThis = this;
return (onResponseEnd as jest.Mock)(...args);
});
class FetchLikeHandler {
```
To fully implement the suggestion and assert that handler methods are invoked with the original instance as `this`, you should also:
1. Implement `onRequestStart`, `onResponseData`, and `onResponseEnd` as instance methods on `FetchLikeHandler` that delegate to the above mocks, e.g.:
```ts
class FetchLikeHandler {
onRequestStart(...args: unknown[]) {
return onRequestStart.apply(this, args);
}
onResponseData(...args: unknown[]) {
return onResponseData.apply(this, args);
}
onResponseEnd(...args: unknown[]) {
return onResponseEnd.apply(this, args);
}
}
```
2. In this test, ensure you keep a reference to the original handler instance (e.g. `const userHandler = new FetchLikeHandler();`) and that this instance is what `buildRotator`/the proxy wraps.
3. After invoking a request through `built`/`d` such that all three handler methods run, add assertions like:
```ts
expect(onRequestStartThis).toBe(userHandler);
expect(onResponseDataThis).toBe(userHandler);
expect(onResponseEndThis).toBe(userHandler);
```
This will verify that the `Proxy` and function binding in the production code preserve the original handler instance as `this` inside the methods.
</issue_to_address>
### Comment 3
<location path="src/lib/outbound-proxy.ts" line_range="374" />
<code_context>
- onResponseStart(controller, statusCode, headers, statusMessage) {
- if (PROXY_ERROR_STATUS.has(statusCode)) {
- self.#markFailed(entry, `proxy returned ${statusCode}`);
+ // Proxy (not object spread) because undici's fetch builds its handler as
+ // a class instance with methods on the prototype. `{ ...handler }` would
+ // drop them, and undici would reject the wrapped handler with
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the generic Proxy-based handler wrapper with an Object.create-based wrapper that only overrides the two relevant methods to keep behavior explicit and easier to follow.
You can avoid the `Proxy` (and its global `get` trap/binding logic) while still preserving prototype methods and `this` by using `Object.create` and explicitly overriding only the two methods you care about.
That keeps the fix (no `{ ...handler }` cloning) but makes the wrapper local and easier to reason about:
```ts
#wrapHandler(entry: ProxyEntry, handler: Dispatcher.DispatchHandler): Dispatcher.DispatchHandler {
const self = this;
// Preserve handler's prototype chain & instance methods
const wrapped = Object.create(handler) as Dispatcher.DispatchHandler;
const origOnResponseStart = handler.onResponseStart?.bind(handler);
const origOnResponseError = handler.onResponseError?.bind(handler);
wrapped.onResponseStart = function (
controller: Dispatcher.DispatchController,
statusCode: number,
headers: IncomingHttpHeaders,
statusMessage?: string,
): void {
if (PROXY_ERROR_STATUS.has(statusCode)) {
self.#markFailed(entry, `proxy returned ${statusCode}`);
}
origOnResponseStart?.(controller, statusCode, headers, statusMessage);
};
wrapped.onResponseError = function (
controller: Dispatcher.DispatchController,
error: Error,
): void {
const reason = error instanceof Error ? error.message : String(error);
self.#markFailed(entry, `transport: ${reason}`);
origOnResponseError?.(controller, error);
};
return wrapped;
}
```
This:
- Preserves prototype methods (undici still sees the class instance shape).
- Keeps `this` for the original handler methods via `bind(handler)`.
- Avoids intercepting all properties, `Proxy`, `Reflect.get`, and generic function binding.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| describe("rewriteHtmlUrls — CDN URLs with embedded origin", () => { | ||
| it("unwraps an Akamai URL on <img src> and points the proxy URL at the origin", () => { | ||
| const html = `<img src="http://a388.g.akamai.net/f/388/21/1d/www.cnn.com/images/newmain/top.main.special.report.gif">`; | ||
| const { html: r } = rewriteHtmlUrls(html, TARGET, TIME); | ||
| expect(r).toContain( | ||
| `/web/${TIME}/http://www.cnn.com/images/newmain/top.main.special.report.gif`, | ||
| ); | ||
| expect(r).not.toContain("a388.g.akamai.net"); | ||
| }); | ||
|
|
There was a problem hiding this comment.
suggestion (testing): Consider adding a regression test for Akamai URLs that do not contain an embedded origin hostname
You already cover Akamai URLs with an embedded origin hostname and non-CDN absolute URLs. To further harden extractCdnEmbeddedUrl and the rewrite logic, please add a case where the host looks like Akamai (e.g. a388.g.akamai.net) but the path contains no embedded origin hostname, to ensure we neither rewrite such URLs incorrectly nor break valid CDN-hosted assets that can’t be unwrapped.
| it("forwards prototype methods from the inner handler (regression: undici DispatchHandler is a class instance)", () => { | ||
| // Regression for the production bug where `{ ...handler }` dropped | ||
| // undici's prototype-defined methods (onRequestStart, onResponseData, | ||
| // onResponseEnd), causing undici to reject every dispatch with | ||
| // "UND_ERR_INVALID_ARG (invalid onRequestStart method)" and take | ||
| // the proxy permanently out of rotation. | ||
| const { built, d } = buildRotator(2); | ||
| const onRequestStart = jest.fn(); | ||
| const onResponseData = jest.fn(); | ||
| const onResponseEnd = jest.fn(); |
There was a problem hiding this comment.
suggestion (testing): Strengthen the regression test by asserting that handler methods are invoked with the original instance as this
Since the implementation now uses a Proxy and binds functions, consider adding an assertion that this inside the handler methods is the original FetchLikeHandler instance, not the proxy. For example, have onRequestStart capture this and assert capturedThis === userHandler to ensure instance-state access via this continues to work as expected.
Suggested implementation:
const { built, d } = buildRotator(2);
let onRequestStartThis: unknown;
let onResponseDataThis: unknown;
let onResponseEndThis: unknown;
const onRequestStart = jest.fn(function (this: unknown, ...args: unknown[]) {
onRequestStartThis = this;
// preserve jest's mock call recording
return (onRequestStart as jest.Mock).mock.calls && (onRequestStart as jest.Mock)(...args);
});
const onResponseData = jest.fn(function (this: unknown, ...args: unknown[]) {
onResponseDataThis = this;
return (onResponseData as jest.Mock)(...args);
});
const onResponseEnd = jest.fn(function (this: unknown, ...args: unknown[]) {
onResponseEndThis = this;
return (onResponseEnd as jest.Mock)(...args);
});
class FetchLikeHandler {To fully implement the suggestion and assert that handler methods are invoked with the original instance as this, you should also:
- Implement
onRequestStart,onResponseData, andonResponseEndas instance methods onFetchLikeHandlerthat delegate to the above mocks, e.g.:class FetchLikeHandler { onRequestStart(...args: unknown[]) { return onRequestStart.apply(this, args); } onResponseData(...args: unknown[]) { return onResponseData.apply(this, args); } onResponseEnd(...args: unknown[]) { return onResponseEnd.apply(this, args); } }
- In this test, ensure you keep a reference to the original handler instance (e.g.
const userHandler = new FetchLikeHandler();) and that this instance is whatbuildRotator/the proxy wraps. - After invoking a request through
built/dsuch that all three handler methods run, add assertions like:This will verify that theexpect(onRequestStartThis).toBe(userHandler); expect(onResponseDataThis).toBe(userHandler); expect(onResponseEndThis).toBe(userHandler);
Proxyand function binding in the production code preserve the original handler instance asthisinside the methods.
| onResponseStart(controller, statusCode, headers, statusMessage) { | ||
| if (PROXY_ERROR_STATUS.has(statusCode)) { | ||
| self.#markFailed(entry, `proxy returned ${statusCode}`); | ||
| // Proxy (not object spread) because undici's fetch builds its handler as |
There was a problem hiding this comment.
issue (complexity): Consider replacing the generic Proxy-based handler wrapper with an Object.create-based wrapper that only overrides the two relevant methods to keep behavior explicit and easier to follow.
You can avoid the Proxy (and its global get trap/binding logic) while still preserving prototype methods and this by using Object.create and explicitly overriding only the two methods you care about.
That keeps the fix (no { ...handler } cloning) but makes the wrapper local and easier to reason about:
#wrapHandler(entry: ProxyEntry, handler: Dispatcher.DispatchHandler): Dispatcher.DispatchHandler {
const self = this;
// Preserve handler's prototype chain & instance methods
const wrapped = Object.create(handler) as Dispatcher.DispatchHandler;
const origOnResponseStart = handler.onResponseStart?.bind(handler);
const origOnResponseError = handler.onResponseError?.bind(handler);
wrapped.onResponseStart = function (
controller: Dispatcher.DispatchController,
statusCode: number,
headers: IncomingHttpHeaders,
statusMessage?: string,
): void {
if (PROXY_ERROR_STATUS.has(statusCode)) {
self.#markFailed(entry, `proxy returned ${statusCode}`);
}
origOnResponseStart?.(controller, statusCode, headers, statusMessage);
};
wrapped.onResponseError = function (
controller: Dispatcher.DispatchController,
error: Error,
): void {
const reason = error instanceof Error ? error.message : String(error);
self.#markFailed(entry, `transport: ${reason}`);
origOnResponseError?.(controller, error);
};
return wrapped;
}This:
- Preserves prototype methods (undici still sees the class instance shape).
- Keeps
thisfor the original handler methods viabind(handler). - Avoids intercepting all properties,
Proxy,Reflect.get, and generic function binding.
src/services/time-machine.ts:
- Use crypto.timingSafeEqual for Bearer token auth (004-a668)
- Add Content-Security-Policy: SANDBOX_CSP on SSE path (005-78d7)
- Guard all ws.send() calls with readyState check; add per-connection
ws.on("error") listener (012-1901)
- Expand signal handler to call stop().catch() with fatal log (014-e360)
- Map result.cache to "HIT"|"MISS" in WsResponse (022-1b8c)
- Fix hardcoded 200 in access log after handleCacheClear (033-ab07)
- Tighten getStatus return type to Promise<SystemStatus> (035-c1be)
src/services/cache.ts:
- Validate domainFilter query param; return 400 on invalid hostname (007-6eb8)
- Add atomicWrite helper (writeFile+rename via TMP_SUFFIX); wire in
writeResolvedTimeSidecar and writeContentTypeSidecar (008-0492)
- Guard outer catch in handleCacheClear with res.headersSent (013-5379)
- Replace fs.readFile with fs.open+read(SNIFF_BYTES)+close in
sniffContentType (017-8751)
src/clients/archive-job-client.ts:
- Replace serial queue.add() loop with queue.addBulk() (018-ff86)
src/queue/archive-worker.ts:
- Delete startTimes entry in stalled event to fix Map leak (019-59bf)
tests/lib/runtime-shim.test.ts:
- Add lockTime=true describe block with 15 tests (027-6577)
tests/: update mocks for addBulk, atomicWrite, and sniff partial-read
Stories: #4-a668 #5-78d7 #7-6eb8 #8-0492 #12-1901 #13-5379
#14-e360 #17-8751 #18-ff86 #19-59bf #022-1b8c #027-6577
#033-ab07 #035-c1be
Summary by Sourcery
Handle CDN-embedded origins and additional redirect patterns in URL rewriting while fixing proxy handler wrapping for undici compatibility.
New Features:
Bug Fixes:
Enhancements:
Tests: