Skip to content

feat(observability): Enhance HTTP logging context and add sandbox startup failure detection#2466

Closed
AdaAibaby wants to merge 2 commits intoe2b-dev:mainfrom
AdaAibaby:feature/sandbox-provisioning-observability
Closed

feat(observability): Enhance HTTP logging context and add sandbox startup failure detection#2466
AdaAibaby wants to merge 2 commits intoe2b-dev:mainfrom
AdaAibaby:feature/sandbox-provisioning-observability

Conversation

@AdaAibaby
Copy link
Copy Markdown

@AdaAibaby AdaAibaby commented Apr 21, 2026

Changes

1. Enhanced HTTP Logging

  • Added request_id and remote_ip fields to the LoggingMiddleware Context function
  • Provides complete request tracing context for all HTTP request logs
  • Facilitates correlation of related log entries in distributed systems

2. Sandbox Startup Failure Detection

  • Added short-lived sandbox warning log to the GetSandboxesSandboxIDRecord endpoint
  • Records structured warning information when sandbox lifetime is less than 30 seconds
  • Includes key debugging information such as template_alias, lifetime, etc.
  • Helps diagnose startup issues like "error waiting for provisioning sandbox: exit status: 1"

🔗 Dependency Notice

This PR depends on #2450 (feat(logger): add HTTP request/response logging fields).

The following functions used in this PR are defined in #2450:

  • logger.WithRequestID(r *http.Request)
  • logger.WithRemoteIP(r *http.Request)

Without #2450 merged first, this PR will fail to compile.

Merge Strategy

Recommended: Sequential Merge

  1. First → Merge feat(logger): add HTTP request/response logging fields #2450 (adds the required logger functions)
  2. Then → Merge this PR (feat(observability): Enhance HTTP logging context and add sandbox startup failure detection #2466) (uses the logger functions)

Alternative: Merge as Draft

If sequential merging isn't feasible, this PR can remain in Draft status until #2450 is merged, then be marked ready for review.

Technical Details

Value and Impact

  1. Improved Observability: Logs now contain complete request context for easier troubleshooting
  2. Automatic Failure Detection: Automatically identifies and logs potential sandbox startup failures
  3. Debugging Efficiency: Provides more diagnostic information for startup issues like "exit status: 1"
  4. Performance Impact: Minimal performance impact, additional logging only triggered under specific conditions

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@jakubno
Copy link
Copy Markdown
Member

jakubno commented Apr 21, 2026

Closing this for. Could you please open the issue first, as mentioned in CONTRIBUTING.md?

@jakubno jakubno closed this Apr 21, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bb418ebc23

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 231 to +234
riverClient, err = backgroundworker.StartAuthUserSyncWorker(
ctx,
signalCtx,
supabaseDB,
authDB,
l,
)
if err != nil {
l.Error(ctx, "failed to start auth user sync worker", zap.Error(err))

return 1
}
}

l.Info(ctx, "HTTP service starting", zap.Int("port", config.Port))
runErr := waitForServiceStop(signalCtx, startHTTPServer(s), riverStoppedChan(riverClient))
if runErr != nil {
l.Error(ctx, "dashboard-api runtime error", zap.Error(runErr))
} else {
l.Info(ctx, "Shutting down dashboard-api service...")
}

shutdownCtx, shutdownCancel := context.WithTimeout(context.WithoutCancel(ctx), shutdownTimeout)
defer shutdownCancel()

if err := shutdownService(shutdownCtx, s, riverClient); err != nil {
l.Error(ctx, "dashboard-api shutdown error", zap.Error(err))
r.Use(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Badge Restore valid arguments in auth sync worker call

In run(), the StartAuthUserSyncWorker call was replaced with r.Use(...) arguments, but r only exists inside newHTTPServer, so this commit makes packages/dashboard-api/main.go fail to compile (undefined: r) before the service can start. This also interrupts the original startup/shutdown flow that followed this block, so the regression is blocking for every build/deploy of dashboard-api.

Useful? React with 👍 / 👎.

@AdaAibaby
Copy link
Copy Markdown
Author

AdaAibaby commented Apr 21, 2026

Closing this for. Could you please open the issue first, as mentioned in CONTRIBUTING.md?

Thanks for the context! I actually tracked down the root cause here: e2b-dev/E2B#1280
It turns out E2B currently relies on apt, so non-Debian images (like Alpine/RHEL) fail with a generic exit status 1.
Since this limitation isn't obvious and the original error logs were quite vague, I spent a lot of time debugging it. That's why I opened this PR—to add better logging (like RequestID and failure detection) to make these provisioning failures much easier to diagnose in the future.
Hope this improvement helps the community avoid similar debugging headaches!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants