schedstat: add runnable goroutine spike detection#3

Merged
tbg merged 1 commit into main from runnable-threshold on Feb 16, 2026

tbg commented Feb 16, 2026

Detect anomalies based on runnable goroutine count in addition to p99
latency spikes. The --runnable-threshold flag sets the count above which
a window is flagged (default: 5*GOMAXPROCS).
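
As a rough illustration of the thresholding described above, here is a minimal sketch. The `window` type, its fields, and the function names are hypothetical and not taken from the schedstat source; only the default of 5*GOMAXPROCS and the "above the threshold" rule come from this description.

```go
package main

import (
	"fmt"
	"runtime"
)

// window is a hypothetical per-interval sample; the real tool's
// data model may look quite different.
type window struct {
	StartNs  int64 // start of the window, in nanoseconds
	Runnable int   // runnable goroutines observed in this window
}

// defaultRunnableThreshold mirrors the default stated in the PR
// description: 5*GOMAXPROCS.
func defaultRunnableThreshold() int {
	return 5 * runtime.GOMAXPROCS(0)
}

// runnableSpikes returns the windows whose runnable count is strictly
// above the threshold, preserving input order.
func runnableSpikes(ws []window, threshold int) []window {
	var out []window
	for _, w := range ws {
		if w.Runnable > threshold {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	fmt.Println("default threshold:", defaultRunnableThreshold())

	ws := []window{{0, 3}, {100, 50}, {200, 40}, {300, 41}}
	// A fixed threshold keeps the result independent of the host's CPU count.
	for _, w := range runnableSpikes(ws, 40) {
		fmt.Printf("[gorunnable] start=%dns runnable=%d\n", w.StartNs, w.Runnable)
	}
}
```

Because the default depends on GOMAXPROCS, a fixed threshold is the safer choice wherever output must be reproducible across machines.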

Latency spikes are listed first in the output, followed by runnable
count spikes marked with [gorunnable].

Also adds a deterministic sort to the heavy unblocker query to avoid
flaky golden test output.
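
The PR makes this change inside a query, where the usual fix is an extra tie-breaking column in the ordering. The same idea, sketched in Go with hypothetical row fields (`GoID`, `Count`) that are assumptions rather than the tool's actual schema, looks roughly like this:

```go
package main

import (
	"fmt"
	"sort"
)

// unblocker is a hypothetical result row; the field names are
// illustrative only.
type unblocker struct {
	GoID  int64 // goroutine ID, assumed unique per row
	Count int   // number of goroutines this one unblocked
}

// sortUnblockers orders rows by Count descending and breaks ties by
// GoID ascending. With unique GoIDs this is a total order, so the
// result does not depend on the input order.
func sortUnblockers(rows []unblocker) {
	sort.Slice(rows, func(i, j int) bool {
		if rows[i].Count != rows[j].Count {
			return rows[i].Count > rows[j].Count
		}
		return rows[i].GoID < rows[j].GoID
	})
}

func main() {
	rows := []unblocker{{GoID: 9, Count: 4}, {GoID: 2, Count: 7}, {GoID: 5, Count: 4}}
	sortUnblockers(rows)
	for _, r := range rows {
		fmt.Printf("goid=%d count=%d\n", r.GoID, r.Count)
	}
}
```

Without the tie-breaker, rows with equal counts can come back in any order, which is enough to make a golden file comparison fail intermittently.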

tbg force-pushed the runnable-threshold branch 2 times, most recently from 6758305 to b4b7cba, on February 16, 2026 15:32
Add detection of time windows where the runnable goroutine count exceeds
a threshold (default 5*GOMAXPROCS, configurable via --runnable-threshold).
Restructure the anomaly output into separate "Latency Spikes" and
"Runnable Spikes" sections, each with per-spike root cause detail
(burst breakdown, heavy unblockers, longest run, queue activity).

Add experiment-upsert1000 gateway and leaseholder traces to the test
corpus with golden file output.

Code quality improvements from review:
- Replace string-typed spikeType with const/iota enum
- Fix fragile append (allSpikes could alias latencySpikes)
- Pass top as parameter instead of reading global opts in query functions
- Deduplicate count+data SQL queries using COUNT(*) OVER()
- Assign spike display indices in analyze() instead of mutating in print
- Log burst breakdown errors unconditionally instead of only under --verbose
- Extract testOpts() helper to reduce test boilerplate
- Use fixed goroutineThreshold in golden tests for determinism
- Add comments on cumulative runnable count assumption and time semantics
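
Two of the items above (the const/iota enum and the aliasing append) can be illustrated with a short sketch. The identifiers mirror the names mentioned in the list, but the types, fields, and the `combine` helper are assumptions made for illustration, not the code from this commit.

```go
package main

import "fmt"

// spikeType replaces a string-typed kind with a small integer enum.
type spikeType int

const (
	spikeLatency spikeType = iota
	spikeRunnable
)

// String makes the enum readable in logs and printed output.
func (t spikeType) String() string {
	switch t {
	case spikeLatency:
		return "latency"
	case spikeRunnable:
		return "runnable"
	default:
		return fmt.Sprintf("spikeType(%d)", int(t))
	}
}

// spike is a hypothetical record; the real struct likely has more fields.
type spike struct {
	Kind spikeType
}

// combine returns a freshly allocated slice holding latency spikes
// followed by runnable spikes, so it never shares a backing array
// with either input.
func combine(latency, runnable []spike) []spike {
	all := make([]spike, 0, len(latency)+len(runnable))
	all = append(all, latency...)
	all = append(all, runnable...)
	return all
}

func main() {
	// A slice with spare capacity, as often results from earlier appends.
	latency := make([]spike, 1, 4)
	latency[0] = spike{Kind: spikeLatency}
	runnable := []spike{{Kind: spikeRunnable}}

	// Fragile: append reuses latency's backing array when capacity allows,
	// so the result aliases latency.
	aliased := append(latency, runnable...)
	fmt.Println("aliased shares storage:", &aliased[0] == &latency[0])

	// Safe: combine copies into new storage.
	all := combine(latency, runnable)
	fmt.Println("combined shares storage:", &all[0] == &latency[0])
	fmt.Println(all[0].Kind, all[1].Kind)
}
```

Whether a bare `append` aliases depends on the spare capacity of its first argument, which is why the bug is easy to miss: it may only show up for certain input sizes.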

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tbg force-pushed the runnable-threshold branch from b4b7cba to eb992fe on February 16, 2026 15:36
tbg marked this pull request as ready for review on February 16, 2026 15:36
tbg merged commit a2f486b into main on Feb 16, 2026
1 check passed
tbg deleted the runnable-threshold branch on February 16, 2026 15:39