Skip to content

fix: retry npm view with backoff in verifyPublishedArtifact#50

Merged
lacymorrow merged 1 commit into
mainfrom
fix/verify-published-retry
May 25, 2026
Merged

fix: retry npm view with backoff in verifyPublishedArtifact#50
lacymorrow merged 1 commit into
mainfrom
fix/verify-published-retry

Conversation

@lacymorrow

Copy link
Copy Markdown
Owner

Summary

  • Fix false-positive "Could not verify" warning after every successful publish.
  • Retry npm view <pkg>@<ver> up to 5 times with backoff (0s, 2s, 5s, 10s, 20s — ~37s max) before warning.

Context

The verify step added in #49 worked correctly, but npm view was being called immediately after npm publish returned — before the npm registry CDN had propagated the new version. Result: every release saw "Could not verify @ on the registry" even though the publish succeeded.

Concrete trigger: lacy@1.8.19 released cleanly today, but the user got the warning, then npm view lacy@1.8.19 returned the correct artifact a few seconds later. Same race that produces the post-publish ETARGET issue with npx ("No matching version found") — npm registry edge nodes lag.

Behavior

  • First attempt is still immediate (zero delay), so packages that have already propagated by the time we get here see no slowdown.
  • Subsequent retries spaced 2s, 5s, 10s, 20s — total ~37s before giving up.
  • The warning text changed slightly: "...after retries" instead of "...(npm view failed)".

Test plan

  • bun run typecheck clean
  • bun run test — 60 pass, 0 fail
  • Manual: release a real package and confirm no spurious warning

The post-publish verify step was firing immediately after `npm publish`
and warning whenever the registry CDN hadn't yet surfaced the new
version — a 100% false-positive rate for the common-case where the
package actually did publish successfully (e.g. lacy v1.8.19).

Retry `npm view` up to 5 times with backoff (0s, 2s, 5s, 10s, 20s —
~37s max). Only warn if every attempt fails. Verifying when the
registry says the version is present is the whole point of the step;
warning during a normal propagation window just trains people to ignore
the warning.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a retry mechanism with backoff delays when executing npm view to verify published artifacts, helping to mitigate issues caused by registry propagation lag. However, the synchronous sleep helper sleepSync uses Atomics.wait on a SharedArrayBuffer, which is not allowed on the main thread in Node.js or Bun and will throw a runtime TypeError. It is recommended to implement a safe synchronous sleep by spawning a short-lived child process instead.

Comment thread src/steps/npm.ts
Comment on lines +91 to +94
function sleepSync(ms: number): void {
if (ms <= 0) return;
Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Using Atomics.wait on the main thread in Node.js or Bun throws a TypeError: Atomics.wait cannot be called on the main thread at runtime, which will crash the CLI tool during the verification step.

Since verifyPublishedArtifact is synchronous and we cannot modify its callers (as they are outside the diff hunks), we can implement a safe synchronous sleep by spawning a short-lived child process using the already-imported exec helper. This blocks the main thread synchronously without pegging the CPU (unlike a busy-wait loop) and avoids the Atomics.wait restriction.

function sleepSync(ms: number): void {
	if (ms <= 0) return;
	try {
		exec(process.execPath, ["-e", `setTimeout(() => {}, ${ms})`]);
	} catch {
		// Ignore
	}
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Respectfully pushing back — verified locally that Atomics.wait on the main thread works in both Node and Bun:

$ node -e "const buf=new Int32Array(new SharedArrayBuffer(4));const t=Date.now();Atomics.wait(buf,0,0,200);console.log('elapsed:',Date.now()-t)"
elapsed: 205

$ bun -e "const buf=new Int32Array(new SharedArrayBuffer(4));const t=Date.now();Atomics.wait(buf,0,0,200);console.log('elapsed:',Date.now()-t)"
elapsed: 205

The "cannot be called on the main thread" restriction is a Web Platform rule (browsers prohibit it to keep the UI thread responsive). Node.js explicitly permits it on the main thread — that's documented behavior, not an accident.

Bun matches Node here (it ships V8 Atomics; same semantics).

The proposed spawn(process.execPath, ["-e", ...]) workaround would fork a full Node process per sleep — ~50-100ms of overhead before the actual sleep starts, plus extra CPU/memory. The Atomics.wait approach is the standard idiomatic sync-sleep in Node and is correct here.

Keeping the current implementation.

@lacymorrow lacymorrow merged commit 57b8b2e into main May 25, 2026
3 checks passed
@lacymorrow lacymorrow deleted the fix/verify-published-retry branch May 25, 2026 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant