GenShen: WhiteBox full GC promotion is not reliable#5

Draft
pf0n wants to merge 2 commits into master from 8384221

@pf0n pf0n commented May 20, 2026

This is a draft PR to get feedback on my implementation that forces all objects to be promoted for a WhiteBox full GC.

Ticket: https://bugs.openjdk.org/browse/JDK-8384221

Testing with linux-x86_64-server-fastdebug

These tests were previously failing or timing out. With this change, each passes 100 consecutive iterations:

gc/TestReferenceClearDuringMarking.java
gc/TestNativeReferenceGet.java
gc/TestReferenceRefersTo.java
gc/TestReferenceRefersToDuringConcMark.java
gc/TestJNIWeak/TestJNIWeak.java

Notes

This change introduces an always-tenure mode, used only for WhiteBox full GCs, that promotes objects regardless of whether they meet the tenuring threshold.
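
To make the intent concrete, here is a minimal sketch of the promotion decision under an always-tenure mode. `TenuringPolicy`, `always_tenure`, and `should_promote` are hypothetical names chosen for illustration; they are not the identifiers used in the patch, and the real check in Shenandoah involves more state than an age comparison.

```cpp
#include <cstdint>

// Hypothetical model of the promotion decision; names are illustrative only.
struct TenuringPolicy {
  uint32_t tenuring_threshold;  // age at which an object normally becomes promotable
  bool always_tenure;           // assumed to be set only during a WhiteBox full GC
};

// Promote if the mode is forced, otherwise fall back to the usual age test.
constexpr bool should_promote(const TenuringPolicy& policy, uint32_t object_age) {
  return policy.always_tenure || object_age >= policy.tenuring_threshold;
}
```

With `always_tenure` set, an object of age 0 is promotable; without it, only objects at or above the threshold are.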

PLABs are constructed with promotions disabled and remain so until the first retire; ShenandoahRetireGCLABClosure enables promotions at the end of evacuation. A PLAB may also disable its own promotions mid-cycle in response to the budget. We let it reconfigure itself this way during the cycle, so there is no need to disable promotions in ShenandoahEnablePlabPromotionsClosure. Here is a log from an attempt to force promotions on the first cycle without enabling promotions for the PLAB:

[0.261s][debug][gc,plab     ] GC(0) Promotion failed, size 112, has plab? yes, PLAB remaining: 0, plab promotions disabled, promotion reserve: 25690112, promotion expended: 0, old capacity: 0, old_used: 0, old unaffiliated regions: 0
[0.261s][debug][gc,plab     ] GC(0) Promotion failed, size 24, has plab? yes, PLAB remaining: 0, plab promotions disabled, promotion reserve: 25690112, promotion expended: 0, old capacity: 0, old_used: 0, old unaffiliated regions: 0
[0.261s][debug][gc,plab     ] GC(0) Promotion failed, size 112, has plab? yes, PLAB remaining: 0, plab promotions disabled, promotion reserve: 25690112, promotion expended: 0, old capacity: 0, old_used: 0, old unaffiliated regions: 0
[0.261s][debug][gc,plab     ] GC(0) Promotion failed, size 136, has plab? yes, PLAB remaining: 0, plab promotions disabled, promotion reserve: 25690112, promotion expended: 0, old capacity: 0, old_used: 0, old unaffiliated regions: 0

adjust_evacuation_budgets can set the promotion reserve smaller than a min-size PLAB, in which case no thread can allocate a promotion PLAB. To avoid this, we keep at least one min-size PLAB per worker in the promotion reserve.
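
The floor described above amounts to a simple clamp. The sketch below assumes sizes in bytes and uses a hypothetical helper name, `floor_promotion_reserve`; the actual change lives in `adjust_evacuation_budgets` and may use different units and additional constraints (for example, the memory actually available to old).

```cpp
#include <cstddef>

// Hypothetical helper: keep the promotion reserve large enough that every
// worker can allocate at least one minimum-size PLAB.
constexpr size_t floor_promotion_reserve(size_t computed_reserve,
                                         size_t min_plab_bytes,
                                         size_t num_workers) {
  return computed_reserve < min_plab_bytes * num_workers
             ? min_plab_bytes * num_workers  // raise to the per-worker floor
             : computed_reserve;             // already large enough, keep as-is
}
```

A reserve that is already above the floor is left unchanged, so the clamp only affects cycles where the computed budget would otherwise starve the workers.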

Not enabling promotions for the Java thread's PLAB can make the WhiteBox tests flaky, because the PLAB's promotions are disabled on the first cycle.

Abbreviated cycles can also occur. In an abbreviated cycle no evacuation takes place, so if no regions are eligible for promote-in-place, no objects are promoted. One solution I came up with is to force the cset to be non-empty, which prevents abbreviated cycles, by reconfiguring the thresholds in ShenandoahGlobalHeuristics::choose_global_collection_set.
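
The reasoning above can be written down as a small model. This is a simplification under stated assumptions: `CycleInputs`, `is_abbreviated`, and `can_promote_anything` are hypothetical names, and the model treats "empty cset" as the only trigger for an abbreviated cycle.

```cpp
#include <cstddef>

// Hypothetical summary of what a cycle has to work with after marking.
struct CycleInputs {
  size_t cset_regions;              // regions selected for evacuation
  size_t promote_in_place_regions;  // regions eligible for promote-in-place
};

// Simplified assumption: an empty collection set means evacuation is skipped.
constexpr bool is_abbreviated(const CycleInputs& c) {
  return c.cset_regions == 0;
}

// Promotion needs either evacuation or at least one promote-in-place region.
constexpr bool can_promote_anything(const CycleInputs& c) {
  return !is_abbreviated(c) || c.promote_in_place_regions > 0;
}
```

Forcing a non-empty cset makes `is_abbreviated` false, which is why it removes the case where nothing can be promoted.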



@earthling-amzn earthling-amzn left a comment

Let's see if we can simplify this further. I don't want to add too much complexity and/or runtime costs to satisfy these tests.

Comment thread src/hotspot/share/gc/shenandoah/heuristics/shenandoahGenerationalHeuristics.cpp Outdated
Comment thread src/hotspot/share/gc/shenandoah/shenandoahAgeCensus.hpp Outdated
Comment thread src/hotspot/share/gc/shenandoah/shenandoahPLAB.cpp Outdated
}

// Only used for a WhiteBox full GC to enable promotions for PLABs before evacuation. PLAB
// construction defaults _allows_promotion to be false. ShenandoahRetireGCLABClosure will

This is probably a bug. I can't think of a good reason not to default _allows_promotion to true. Then we don't need this extra code and extra steps.

Owner Author

Ok, I'll set it to true and retest again.
