Skip to content

bundler: let DCE drop unreachable top-level await before rejecting CJS#29246

Open
robobun wants to merge 9 commits intomainfrom
farm/38b7d2c8/dce-top-level-await-cjs
Open

bundler: let DCE drop unreachable top-level await before rejecting CJS#29246
robobun wants to merge 9 commits intomainfrom
farm/38b7d2c8/dce-top-level-await-cjs

Conversation

@robobun
Copy link
Copy Markdown
Collaborator

@robobun robobun commented Apr 13, 2026

Problem

bun build --format=cjs rejected any top-level await at parse time, even when the await sat inside a branch that constant folding / dead code elimination would have removed:

// entry.js
if (typeof TEST === "undefined" ? false : TEST) {
    await import("node:fs");
    // Do something with the filesystem
}
foo();
$ bun build entry.js --minify --format=cjs --define TEST=false
2 |     await import("node:fs");
              ^
error: "await" can only be used inside an "async" function

esbuild handles this fine: the if (false) { ... } block is dropped by DCE and the unreachable await is never reported. That blocks users who use --define to gate test-only code paths behind an import("node:fs") that never runs in production.

Cause

At module scope, when the target output format didn't support top-level await (--format=cjs), the parser used allow_await = .allow_ident. That treats await as a plain identifier and relies on a lexer-level prev_token_was_await_keyword flag to emit the TLA error the moment a follow-up token can't be interpreted as an identifier continuation (e.g. the import keyword). The diagnostic fires at lex/parse time — well before constant folding has a chance to eliminate the dead branch.

Fix

Parse await EXPR as an await expression at module scope even when the output format is CJS, as long as the following token unambiguously starts an expression rather than continuing an identifier (so var await = 42 and console.log(await) still work as identifier uses). Track "live" top-level awaits separately in the visit pass (has_live_top_level_await), and only let them imply .esm after DCE has run. If a live top-level await survives DCE in a CJS target, emit the diagnostic after visit:

error: Top-level await is currently not supported with the "cjs" output format

This matches esbuild's behaviour.

Verification

test/regression/issue/29243.test.ts covers four cases:

  • CJS + dead await → builds to foo();, no error (the bug).
  • CJS + live await → rejected with the new diagnostic, not the old lex-level one.
  • var await = 42; console.log(await) in a CJS file → still parses as identifier use.
  • await inside a non-async function → still reports the existing "await" can only be used inside an "async" function error with the "Consider adding the async keyword here" note.

The first two regress on the installed bun but pass on this build.

Fixes #29243

…S output

Previously, `bun build --format=cjs` would error on any top-level `await`
at parse time, even when the `await` sat inside a branch that constant
folding / dead code elimination would have removed (e.g. behind a
`--define`-controlled `if (false) { await import(...) }`). That blocked
use cases like "import this file only when tests are running" that
esbuild handles cleanly.

Now at module scope in a non-ESM target we still parse `await EXPR` as an
`await` expression (as long as the following token unambiguously starts
an expression — so `var await = 42` and `console.log(await)` keep
working as identifier uses). The parse-time `top_level_await_keyword`
range no longer implies `.esm` on its own. Instead, the visit pass sets
`has_live_top_level_await` only when a real `await` is reached in live
control flow, and the classifier uses that. If a live top-level await
survives DCE in a CJS target, we emit the diagnostic from after visit:

    error: Top-level await is currently not supported with the "cjs" output format

Fixes #29243
@robobun
Copy link
Copy Markdown
Collaborator Author

robobun commented Apr 13, 2026

Updated 10:40 PM PT - Apr 12th, 2026

❌ Your commit 6642ab5c has some failures in Build #45444 (All Failures)


🧪   To try this PR locally:

bunx bun-pr 29246

That installs a local version of the PR into your bun-29246 executable, so you can run:

bun-29246 --bun

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Distinguishes parse-time TLA tokens from live top-level await discovered during the visit pass, records live TLA, removes unreachable TLA prior to ESM classification and diagnostics, refines parsing disambiguation for await vs identifier at module scope, and adds regression tests for CJS/TLA behavior.

Changes

Cohort / File(s) Summary
Parser state & behavior
src/ast/P.zig, src/ast/Parser.zig
Added has_live_top_level_await: bool = false to parser state; compute tla_implies_esm from parse-time keyword + feature flag; use it for ESM/export classification and strict-mode inference; after visit, clear or report surviving TLA based on DCE and options.features.top_level_await.
Parsing disambiguation (await vs identifier)
src/ast/parsePrefix.zig, src/ast/parseStmt.zig
Added tokenStartsAwaitExpr and AwaitOrYield binding; conditionally reinterpret await identifier as an await expression at module top-level in non-ESM CJS contexts when next token permits, temporarily overriding fn_or_arrow_data_parse.allow_await; relaxed parse-time for await validation to allow tolerated module-scope for await when allow_await == .allow_ident and is_top_level.
Visit-time live-TLA recording
src/ast/visitExpr.zig, src/ast/visitStmt.zig
e_await and s_for_of now set p.top_level_await_keyword and p.has_live_top_level_await for the first live module-scope await encountered during reachable control-flow outside functions/arrows.
Tests (regression)
test/regression/issue/29243.test.ts
Added suite exercising bun build --format=cjs with minification across reachable vs unreachable TLA and for await scenarios, await as identifier/tagged-template cases, and asserts on build success/failure and diagnostics.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: allowing DCE to drop unreachable top-level await before CJS format rejection.
Description check ✅ Passed The description includes both required sections (What does this PR do and How did you verify your code works), with detailed explanations of problem, cause, fix, and verification.
Linked Issues check ✅ Passed The PR directly addresses issue #29243: it defers top-level await validation until after DCE runs, allows await parsing at module scope when unambiguous, and emits diagnostics post-visit matching esbuild behavior.
Out of Scope Changes check ✅ Passed All changes scope to the DCE + top-level await fix: parser modifications, visit pass tracking, test coverage, and for-await handling are directly related to the linked issue objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ast/visitExpr.zig`:
- Around line 988-990: When marking a live top-level await inside the visitor
(the branch that sets p.has_live_top_level_await via
p.fn_or_arrow_data_visit.is_outside_fn_or_arrow), also record the location/token
for that live await so downstream CJS diagnostics use the surviving await
location instead of the original parse-time one; update
p.top_level_await_keyword (or the parse-state field that holds the await
span/token) to the current await token/span when you set
p.has_live_top_level_await, ensuring you only overwrite it when the new location
corresponds to a live await discovered during this visit.

In `@test/regression/issue/29243.test.ts`:
- Line 41: Remove the flaky empty-stderr assertions in the test by deleting the
two occurrences of expect(stderr).toBe("") (the assertions at lines that
reference stderr in test/regression/issue/29243.test.ts); keep the existing
exit-code and stdout/content assertions intact (do not change assertions using
exitCode or stdout), and ensure the variable stderr is still collected if used
elsewhere but no longer asserted to be exactly "" to avoid ASAN-induced flakes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: dde02832-03b2-4a9d-8a33-8b3e382e96e5

📥 Commits

Reviewing files that changed from the base of the PR and between d7526e2 and 86d144e.

📒 Files selected for processing (5)
  • src/ast/P.zig
  • src/ast/Parser.zig
  • src/ast/parsePrefix.zig
  • src/ast/visitExpr.zig
  • test/regression/issue/29243.test.ts

@dragoncoder047
Copy link
Copy Markdown

Copying my comment in the linked issue in case @robobun hasn't seen it:

This bug has nothing to do with --define since omitting the --defines and using the code if (false) { await import("foo"); } still triggers the bug!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
test/regression/issue/29243.test.ts (1)

29-29: ⚠️ Potential issue | 🟠 Major

Remove empty-stderr assertions from success-path subprocess tests.

Line 29 and Line 70 make these tests flaky on ASAN CI, where benign warnings may appear on stderr even when behavior is correct.

Proposed patch
-  const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+  const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);

-  expect(stderr).toBe("");
   expect(stdout).toBe("foo();\n");
   expect(exitCode).toBe(0);
-  const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+  const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);

-  expect(stderr).toBe("");
   expect(stdout).toContain("var await = 42");
   expect(stdout).toContain("globalThis.output = await");
   expect(exitCode).toBe(0);

Based on learnings: in test/regression/issue/*.test.ts, avoid asserting spawned Bun subprocess stderr is empty because ASAN may emit benign warnings.

Also applies to: 70-70

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/regression/issue/29243.test.ts` at line 29, Remove the brittle
empty-stderr assertions in the regression test by deleting the
expect(stderr).toBe("") checks (e.g., the assertion at expect(stderr).toBe("")
in test/regression/issue/29243.test.ts and the similar one around line 70),
leaving the success-path subprocess tests to only assert on stdout/exit
behavior; locate the assertions by searching for expect(stderr).toBe("") in the
file and remove them so ASAN benign warnings no longer cause flakes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/regression/issue/29243.test.ts`:
- Line 29: Remove the brittle empty-stderr assertions in the regression test by
deleting the expect(stderr).toBe("") checks (e.g., the assertion at
expect(stderr).toBe("") in test/regression/issue/29243.test.ts and the similar
one around line 70), leaving the success-path subprocess tests to only assert on
stdout/exit behavior; locate the assertions by searching for
expect(stderr).toBe("") in the file and remove them so ASAN benign warnings no
longer cause flakes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 309a8c18-d823-45aa-964f-06b5d39f07bd

📥 Commits

Reviewing files that changed from the base of the PR and between 86d144e and 43d383a.

📒 Files selected for processing (1)
  • test/regression/issue/29243.test.ts

robobun and others added 2 commits April 13, 2026 03:26
- tokenStartsAwaitExpr no longer upgrades on template literals
  (`var await = String.raw; await``hello`` ` is a tagged-template
  call on the identifier) and now does upgrade on `{` to match esbuild,
  which treats `await { ... }` as an await expression.
- visitExpr.e_await records the first live top-level await's range into
  `top_level_await_keyword` so the CJS-TLA diagnostic points at a live
  await, not one the DCE pass dropped.
- The CJS-TLA error message now quotes the actual output format via
  `@tagName(output_format)`, matching esbuild, so `--format=iife`
  reports iife and not cjs.
- Regression test sweep: removed `expect(stderr).toBe("")` on the
  success path (ASAN noise), added coverage for the literal `if (false)`
  case (no --define), for tagged-template `await` identifier, and for
  `await { ... }`.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ast/Parser.zig`:
- Around line 649-663: The parser currently clears top_level_await_keyword when
p.has_live_top_level_await is false because s_for_of never sets that flag for
top-level "for await" loops; update the s_for_of visitor (visitStmt.zig) so that
when encountering a for-of statement with data.is_await == true AND the
statement is at module scope (p.current_scope == p.module_scope) you set
p.has_live_top_level_await = true (and ensure p.top_level_await_keyword remains
pointing to the live await range), so the diagnostic in Parser.zig that checks
p.has_live_top_level_await will be preserved for top-level for-await loops.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 57506c01-8895-4880-b9db-3c8aec88640b

📥 Commits

Reviewing files that changed from the base of the PR and between 43d383a and 481e42c.

📒 Files selected for processing (4)
  • src/ast/Parser.zig
  • src/ast/parsePrefix.zig
  • src/ast/visitExpr.zig
  • test/regression/issue/29243.test.ts

robobun and others added 2 commits April 13, 2026 03:45
`for await (x of y)` at module scope used to error at parse time (in
`parseStmt.t_for`) with the same "Cannot use await outside an async
function" diagnostic that blocked the plain `await` case — even when
the loop sat inside an `if (false)` branch that DCE would drop. That
path didn't go through `parsePrefix`'s upgrade, so the previous fix
missed it.

Relax the parse-time check to tolerate `for await` at module top-level
in non-ESM targets, and mark the loop as a live top-level `await` from
`visitStmt.s_for_of` when it survives DCE in reachable module-scope
control flow. The existing post-visit CJS-TLA error then fires against
the loop's range when the format doesn't support TLA.
Two follow-ups on the top-level `await` parse upgrade:

1. `parseFn` saves `fn_or_arrow_data_parse` and partially overwrites
   it when entering a function's argument list, but it never reset
   `is_top_level`. Before this change that was harmless because
   `is_top_level` was only set when the TLA feature was already on.
   Now that we unconditionally set it at module scope, the flag leaked
   into nested function arg contexts, so
   `function foo(x = await import("mod")) {}` silently upgraded the
   default-parameter `await` to an expression inside a non-async
   function, emitting invalid JS. Clear `is_top_level = false` alongside
   the other overrides in the argument-setup block.

2. `visitStmt.s_return` uses `top_level_await_keyword` to locate the
   "Top-level return cannot be used inside an ECMAScript module"
   diagnostic. With the parse-time upgrade landing dead awaits into that
   slot in CJS targets, a file like
   `if (false) { await import("x") } ... return;` was spuriously
   rejected. Gate the check on `features.top_level_await` so dead
   parse-time awaits in non-TLA targets don't trip the ESM diagnostic.

Regression tests cover both.
robobun added 2 commits April 13, 2026 04:47
Two more items uncovered by review:

1. Arrow function default parameters use `parseParenExpr` (parse.zig),
   not `parseFn`. That path also saves `fn_or_arrow_data_parse` and
   partially overwrites it, and also never reset `is_top_level`. So
   `const fn = (x = await import("mod")) => x` in CJS mode was
   silently emitting invalid JS for the same reason `parseFn` was.
   The `track_arrow_arg_errors` safety net doesn't catch it because
   the marker gets written to `p.fn_or_arrow_data_parse` and then
   discarded on restore, while the diagnostic check inspects a local
   copy that never sees the write. Clear `is_top_level = false`
   alongside the other overrides in `parseParenExpr`, mirroring the
   `parseFn` fix.

2. `visitStmt.s_for_of` was overwriting `p.top_level_await_keyword`
   with a range spanning `for` (3 chars at `stmt.loc`) instead of
   the correct `await` range `parseStmt.t_for` already stored.
   The result was a live-for-await CJS-TLA diagnostic whose squiggle
   landed under `for` instead of `await`. Drop the overwrite — the
   flag is all we need to set here — and let the parse-time range stand.

Regression tests pin both: arrow default param with a top-level await,
and a `1:5` column assertion on the live `for await` diagnostic.
The previous gate rejected because the release+with-fix lane couldn't
fetch WebKit (TLS cert failure on the in-container proxy). I populated
`/root/.bun/build-cache/webkit-42f80a684c5df571` via curl and verified
the release build produces a working binary that passes all 13
regression tests. No code change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bun doesn't dead-code eliminate the same way ESBuild does

2 participants