8385632: ZGC: Incorrect object undo in relocation race for relocation workers#31322
jsikstro wants to merge 1 commit into
stefank left a comment:
Looks good! Thanks for fixing!
Thank you for the reviews everyone!
Going to push as commit 64d6cba.
Your commit was automatically rebased without conflicts.
Hello,
When a relocation worker races with a mutator and the relocation worker loses, it should attempt to undo its last allocation as a way to potentially save some space.
However, when the relocation worker loses the race, it passes the object allocated by the mutator to ZPage::undo_alloc_object, not the object the relocation worker just allocated.
Since relocation workers and mutators have completely separate target pages, a mutator allocation can never end up on a page that a relocation worker also allocates on. ZPage::undo_alloc_object only undoes an allocation if it was the most recent one, which it determines by checking the object's offset into the heap against the page's top. This means the undo will never succeed if the object is not on the page that undo_alloc_object is called on. In practice, this makes the bug benign: it results in one wasted relocation worker allocation whenever the race occurs.
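To illustrate why the undo silently fails, here is a minimal sketch of a bump-pointer page with an undo operation. ToyPage, its fields, and demo_wrong_object_undo_fails are hypothetical names made up for this example. This is not the HotSpot ZPage code, only a simplified model of the check described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for a page: a bump-pointer region
// described by heap offsets. This is NOT the HotSpot ZPage class.
struct ToyPage {
  std::uintptr_t start;
  std::uintptr_t top;
  std::uintptr_t end;

  ToyPage(std::uintptr_t s, std::size_t size) : start(s), top(s), end(s + size) {}

  // Bump-allocate size bytes; returns 0 if the page is full.
  std::uintptr_t alloc(std::size_t size) {
    if (end - top < size) {
      return 0;
    }
    const std::uintptr_t addr = top;
    top += size;
    return addr;
  }

  // Succeeds only if [addr, addr + size) ends exactly at top, which is
  // how this model recognises "the most recent allocation".
  bool undo_alloc(std::uintptr_t addr, std::size_t size) {
    if (addr + size != top) {
      return false;
    }
    top = addr;
    return true;
  }
};

// Models the benign bug: undoing with an object from another page fails
// and leaves the worker's page untouched, while undoing with the worker's
// own object succeeds.
bool demo_wrong_object_undo_fails() {
  ToyPage worker_page(0x1000, 0x1000);
  ToyPage mutator_page(0x4000, 0x1000);
  const std::uintptr_t worker_obj = worker_page.alloc(64);
  const std::uintptr_t mutator_obj = mutator_page.alloc(64);

  const bool wrong = worker_page.undo_alloc(mutator_obj, 64);
  const bool right = worker_page.undo_alloc(worker_obj, 64);
  return !wrong && right && worker_page.top == worker_page.start;
}
```

In this model the failed undo has no side effects, which matches the observation that the bug only costs the space of one allocation.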
On the mutator side, if the mutator loses the race it calls ZHeap::undo_alloc_object_for_relocation, which gets the correct page for an object via ZHeap::page, so we don't have the same issue there.
The fix is simple: we make sure to pass the object allocated by the relocation worker in the call to undo_alloc_object.
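The shape of the fix can be sketched with a toy model of the race. ToyPage, ToyForwardingSlot, and relocate are hypothetical names for illustration only, and the compare-and-swap on a single slot is an assumed stand-in for however the real forwarding is installed. The point is only which address the loser hands to the undo.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bump-pointer page; not the HotSpot ZPage class.
struct ToyPage {
  std::uintptr_t start;
  std::uintptr_t top;
  std::uintptr_t end;

  ToyPage(std::uintptr_t s, std::size_t size) : start(s), top(s), end(s + size) {}

  std::uintptr_t alloc(std::size_t size) {
    if (end - top < size) {
      return 0;
    }
    const std::uintptr_t addr = top;
    top += size;
    return addr;
  }

  bool undo_alloc(std::uintptr_t addr, std::size_t size) {
    if (addr + size != top) {
      return false;
    }
    top = addr;
    return true;
  }
};

// Hypothetical forwarding slot for one object; 0 means "not relocated yet".
struct ToyForwardingSlot {
  std::atomic<std::uintptr_t> to_addr{0};

  // Tries to publish candidate; returns the address that actually won.
  std::uintptr_t install(std::uintptr_t candidate) {
    std::uintptr_t expected = 0;
    if (to_addr.compare_exchange_strong(expected, candidate)) {
      return candidate;
    }
    return expected;  // another thread published its copy first
  }
};

// Relocates one object of the given size into target and returns the
// address every thread must agree on.
std::uintptr_t relocate(ToyForwardingSlot& slot, ToyPage& target, std::size_t size) {
  const std::uintptr_t my_copy = target.alloc(size);
  assert(my_copy != 0);
  const std::uintptr_t winner = slot.install(my_copy);
  if (winner != my_copy) {
    // Lost the race: undo OUR allocation (my_copy), not the winner's
    // address, which lives on a different page.
    target.undo_alloc(my_copy, size);
  }
  return winner;
}
```

Passing winner instead of my_copy to undo_alloc would reproduce the behaviour described above: the undo would fail and the loser's allocation would stay on its page.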
This might be relevant to backport, so I suggest we keep this fix minimal and add more robustness to the undo paths in a follow-up RFE, making sure that we never try to undo an object allocation in a page the object doesn't belong to, so that bugs like this are easier to catch in the future.
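One possible form such a robustness check could take is sketched below. This is only an assumption about what a follow-up might look like, not the actual RFE. ToyPage, UndoResult, contains, and undo_alloc_checked are hypothetical names, and real code would more likely assert than return a status.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type so a misuse is distinguishable from a
// legitimate "not the most recent allocation" failure.
enum class UndoResult { Undone, NotLastAllocation, WrongPage };

// Hypothetical bump-pointer page; not the HotSpot ZPage class.
struct ToyPage {
  std::uintptr_t start;
  std::uintptr_t top;
  std::uintptr_t end;

  ToyPage(std::uintptr_t s, std::size_t size) : start(s), top(s), end(s + size) {}

  std::uintptr_t alloc(std::size_t size) {
    if (end - top < size) {
      return 0;
    }
    const std::uintptr_t addr = top;
    top += size;
    return addr;
  }

  // True if [addr, addr + size) lies entirely inside this page.
  bool contains(std::uintptr_t addr, std::size_t size) const {
    return addr >= start && addr < end && size <= end - addr;
  }

  // Like a plain undo, but reports an object from another page as a
  // distinct error instead of silently failing.
  UndoResult undo_alloc_checked(std::uintptr_t addr, std::size_t size) {
    if (!contains(addr, size)) {
      return UndoResult::WrongPage;
    }
    if (addr + size != top) {
      return UndoResult::NotLastAllocation;
    }
    top = addr;
    return UndoResult::Undone;
  }
};
```

With a check like this, the bug fixed here would have surfaced as a WrongPage result (or an assertion failure in a debug build) rather than as a quietly wasted allocation.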
Testing: