wall profiler: use thread-local for the active profiler#322
Draft
szegedi wants to merge 1 commit into
Draft
Conversation
…er map The signal handler previously looked up the per-isolate WallProfiler via ProtectedProfilerMap — an unordered_map shielded by an atomic-exchange dance so the SIGPROF handler could read it without taking a mutex. In Node's threading model each V8 isolate is pinned to a single thread, so a thread_local pointer captures exactly the same fact with no map, no atomics and no copy-on-write churn on every profiler start/stop. The cross-thread "gather worker CPU on the main thread" use case still needs a registry (thread-locals are per-thread), but that path has no signal-safety requirement, so it collapses to a plain mutex around an unordered_set plus the terminated-worker CPU bucket. Dispose no longer needs the removeFromMap parameter — it always removes itself from the registry and clears the thread-local.
Overall package sizeSelf size: 2 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | source-map | 0.7.6 | 185.63 kB | 185.63 kB | | pprof-format | 2.2.1 | 163.06 kB | 163.06 kB | | node-gyp-build | 4.8.3 | 13.81 kB | 13.81 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ProtectedProfilerMap(anunordered_map<Isolate*, WallProfiler*>shielded by an atomic-exchange dance so the SIGPROF handler could read it lock-free) with athread_local WallProfiler* t_active_profiler. In Node's threading model each V8 isolate is pinned to a single thread, so a per-thread pointer captures exactly the fact the signal handler needs without a map, or atomics, or copy-on-write churn on every start/stop. This is the same approach we adopted in the equivalent Node.js core port (Attach values to CPU samples nodejs/node#63163).ActiveProfilersregistry (plainstd::mutex+std::unordered_set<WallProfiler*>) for the one cross-thread path that survives: the main thread gathering worker-thread CPU while building a profile. That path has no signal-safety requirement, so the concurrency machinery was eliminated from there as well except a simple mutex.Disposeloses itsremoveFromMapparameter — it now always removes itself from the registry and clears the TLS pointer.CreateV8CpuProfilerchecks the TLS to refuse a duplicate start instead of relying on the map'semplacereturningfalse.This simplifies the SIGPROF handler, the start/stop hot path no longer copies a map under a mutex. This also solves the problem of one thread's SIGPROF handler not being able to find its own map while another is added to it, which would've lost us processing of samples.
Test plan
npm run build— cleannpm test— 88/88 passing, includingWorker Threads("should work", "should not crash when worker is terminated") which exercises cross-thread CPU gather and destroy-during-profilingnpm run format— clean