8385651: HotCodeSampler crashes with JFR enabled#31326
Conversation
|
👋 Welcome back crakoczy! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
The total number of required reviews for this PR has been set to 2 based on the presence of this label: |
Webrevs
|
|
What is the HotCode sampler? How does it interact with JFR in ways that make the JVM crash? Can you provide more details in the bug (full hs_err, for example) |
|
/label add hotspot-jfr |
|
@mgronlun |
The HotCodeSampler was added in JDK-8326205: Grouping frequently called C2 nmethods in CodeCache. It is responsible for sampling Java threads to find the hottest nmethods which are then grouped into a dedicated hot code heap.
JFR and the HotCodeSampler both use |
| * @bug 8383605 | ||
| * @summary test some out of bounds parameters for createInterleavedRaster() | ||
| * @summary Verify the HotCodeSampler and JFR do not attempt to suspend the same JavaThread and crash | ||
| * @requires vm.compiler2.enabled |
There was a problem hiding this comment.
* @requires vm.compiler2.enabled & vm.hasJFR
|
|
||
| /** | ||
| /* | ||
| * @test |
The JVM already has two profilers that report which code is hot: the default, out-of-the-box, “wall-clock” JFR sampler, and the newly introduced, experimental JFR CPUTime sampler. Introducing a third “Hot Code” sampler is gratuitous. Code is duplicated, as it's clearly a copy of the default JFR sampler. Because it uses the same internal mechanisms as the default JFR sampler, the two samplers share resources and therefore cannot be run simultaneously—they are mutually exclusive. As a consequence, and most importantly, the runtime performance overhead of sampling is now duplicated: two samplers do the same, or at least very similar, work twice. If you are interested in reporting what code is hot, I suggest you instead hook into the existing JFR samplers that already do this work and let them report back. If you are only interested in the top frame, you could have a callback placed at a single location: On the receiving end, using a ConcurrentHashTable, you will have your histogram immediately. |
We had a good reason for the “Hot Code” sampler. JFR is designed to collect data and to provide it to an external observer. This is its main purpose. It is designed for observability use cases as a lightweight tool. I don't think its design considered internal uses. The sampler is very simple. It does not have much code, less than 100 lines. It is not designed to provide data outside JVM. It does not need to do much work JFR does, e.g. constructing frames or traversing stacks. We don't want to pay for something we don't use. If I am correct, JFR will cost us 1-2% overhead.
Thank you for the suggestion. I think the HotCodeCollector might detect JFR running and use it as the sampler instead. We really want to have some sampling engine HotCodeCollector and JFR both can use. Is it possible to extract sampling functionality? I don't think hooking into JFR might be a good idea. The HotCodeCollector is experimental. We can replace the current sampler with the better one at any time. |
But instead you will have the end user pay twice for two samplers that essentially produce the same data? Those JFR ballpark numbers come when you are running with the full default.jfc and the profile.jfc configurations, respectively, but JFR is highly configurable; here, you only need to enable a single event. If you are concerned, you can also reduce the length of the stack traces to 1, like so:
The main problem I have is that the two implementations are mutually exclusive and so cannot run concurrently. JFR needs to wait, and this sampler needs to wait for JFR. I don't see any refactoring of sampler functionality in the near future. The Threads_lock is really too coarse to hold during sampling, and attempts have been made to remove it; see JDK-8358429. The problem with holding it during sampling is that it stalls safe pointing. That is one of the reasons the JFR samples only a subset of threads, whereas I see your sampler samples the entire set, which can take some time to complete. Taking a step back, is the 'hot code' functionality really best addressed through coarse-grained sampling? Isn't this better addressed internally, using something like MethodData and the compiler's internals? What is lacking there? There is no tracking of method invocations for methods compiled at tier 4? |
JDK-8385651
When running with
-XX:StartFlightRecording -XX:+HotCodeHeapthe JVM crashes when both samplers attempt to suspend the same Java thread. By holdingThreads_lockthe HotCodeSampler is able to coordinate with other samplers. This change only effects the HotCodeCollector which is currently experimental so the change is low risk. All hot code tests undertest/hotspot/jtreg/compiler/hotcode/passProgress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/31326/head:pull/31326$ git checkout pull/31326Update a local copy of the PR:
$ git checkout pull/31326$ git pull https://git.openjdk.org/jdk.git pull/31326/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 31326View PR using the GUI difftool:
$ git pr show -t 31326Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/31326.diff
Using Webrev
Link to Webrev Comment