8385648: PPC64: Improve receiver type profiling reliability #31325

Open
TheRealMDoerr wants to merge 2 commits into openjdk:master from TheRealMDoerr:8385648_PPC64_profile_receiver_type

Conversation


@TheRealMDoerr TheRealMDoerr commented May 29, 2026

This is the PPC64 implementation. x64 and aarch64 are already integrated.
Unfortunately, I was one register short in C1, so the C1 version (and only the C1 version) reuses the recv register as a temp. I've kept the better code for the interpreter. Maybe we can clean up the C1 version if we free up another register in the future.



Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8385648: PPC64: Improve receiver type profiling reliability (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/31325/head:pull/31325
$ git checkout pull/31325

Update a local copy of the PR:
$ git checkout pull/31325
$ git pull https://git.openjdk.org/jdk.git pull/31325/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 31325

View PR using the GUI difftool:
$ git pr show -t 31325

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/31325.diff

Using Webrev

Link to Webrev Comment


bridgekeeper Bot commented May 29, 2026

👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.


openjdk Bot commented May 29, 2026

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk Bot added the hotspot hotspot-dev@openjdk.org label May 29, 2026

openjdk Bot commented May 29, 2026

@TheRealMDoerr The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.


openjdk Bot commented May 29, 2026

The total number of required reviews for this PR has been set to 2 based on the presence of this label: hotspot. This can be overridden with the /reviewers command.

@openjdk openjdk Bot added the rfr Pull request is ready for review label May 29, 2026

mlbridge Bot commented May 29, 2026

Webrevs

@shipilev shipilev left a comment

It seems okay, but I suspect that killing recv is not really fine for this path:

 if (op->should_profile_receiver_type()) {
   assert(op->recv()->is_single_cpu(), "recv must be allocated");
   Register recv = op->recv()->as_register(); // <--- not a temp
   ...
   type_profile_helper(mdo, mdo_offset_bias, md, data, recv, tmp1); // kills recv
 }

Can you instead ask for another tmp?


TheRealMDoerr commented May 30, 2026

> It seems okay, but I suspect that killing recv is not really fine for this path:
>
>  if (op->should_profile_receiver_type()) {
>    assert(op->recv()->is_single_cpu(), "recv must be allocated");
>    Register recv = op->recv()->as_register(); // <--- not a temp
>    ...
>    type_profile_helper(mdo, mdo_offset_bias, md, data, recv, tmp1); // kills recv
>  }
>
> Can you instead ask for another tmp?

Thank you for reviewing!

I had the same concern initially. However, recv has a temp and no output effect:

if (opProfileCall->_recv->is_valid()) do_temp(opProfileCall->_recv);

The register is allocated and the receiver value copied here:

recv = new_register(T_OBJECT);

Here's an example: The new sequence is ...0868 to ...08d4. recv = r4 is loaded before it and immediately reused for something else after it.

  0x0000668958ff0860:   lis     r4,2820
  0x0000668958ff0864:   addi    r4,r4,15360
  0x0000668958ff0868:   li      r0,2
  0x0000668958ff086c:   mtctr   r0
  0x0000668958ff0870:   li      r6,248
  0x0000668958ff0874:   ldx     r0,r6,r5
  0x0000668958ff0878:   cmpd    r0,r4
  0x0000668958ff087c:   beq-    0x0000668958ff08c8,bo=0b01100[no_hint]
  0x0000668958ff0880:   addi    r6,r6,16
  0x0000668958ff0884:   bdnz+   0x0000668958ff0874,bo=0b10000[no_hint]
  0x0000668958ff0888:   li      r0,2
  0x0000668958ff088c:   mtctr   r0
  0x0000668958ff0890:   li      r6,248
  0x0000668958ff0894:   ldx     r0,r6,r5
  0x0000668958ff0898:   cmpdi   r0,0
  0x0000668958ff089c:   beq-    0x0000668958ff08b0,bo=0b01100[no_hint]
  0x0000668958ff08a0:   addi    r6,r6,16
  0x0000668958ff08a4:   bdnz+   0x0000668958ff0894,bo=0b10000[no_hint]
  0x0000668958ff08a8:   li      r6,240
  0x0000668958ff08ac:   b       0x0000668958ff08cc
  0x0000668958ff08b0:   add     r6,r5,r6
  0x0000668958ff08b4:   ldarx   r0,0,r6
  0x0000668958ff08b8:   cmpdi   r0,0
  0x0000668958ff08bc:   bne-    0x0000668958ff08c4,bo=0b00110[not_taken]
  0x0000668958ff08c0:   stdcx.  r4,0,r6
  0x0000668958ff08c4:   b       0x0000668958ff0868
  0x0000668958ff08c8:   addi    r6,r6,8
  0x0000668958ff08cc:   ldx     r4,r6,r5
  0x0000668958ff08d0:   addi    r4,r4,1
  0x0000668958ff08d4:   stdx    r4,r6,r5
  0x0000668958ff08d8:   lis     r4,2820
  0x0000668958ff08dc:   addi    r4,r4,-9128
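
For readers less fluent in PPC64 assembly, the listing above can be read roughly as the C++ model below. It is only a sketch under stated assumptions: the type and function names (`ReceiverRow`, `ReceiverProfile`, `profile_receiver`) are hypothetical and not HotSpot code, the two-row table and field order are inferred from the offsets 240, 248 and the stride of 16 in the listing, and `std::atomic` with relaxed ordering stands in for the plain loads and stores and the `ldarx`/`stdcx.` pair.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of one receiver-type row: receiver at +0, counter at +8.
struct ReceiverRow {
  std::atomic<const void*> receiver{nullptr};
  std::atomic<std::uint64_t> count{0};
};

// Hypothetical model of the profile data: a fallback counter followed by
// two rows, mirroring offsets 240 and 248, 264 seen in the listing.
struct ReceiverProfile {
  std::atomic<std::uint64_t> polymorphic_count{0};
  std::array<ReceiverRow, 2> rows;
};

// Load, add one, store: intentionally not an atomic read-modify-write,
// like the ldx/addi/stdx sequence at the end of the listing.
inline void bump(std::atomic<std::uint64_t>& counter) {
  counter.store(counter.load(std::memory_order_relaxed) + 1,
                std::memory_order_relaxed);
}

inline void profile_receiver(ReceiverProfile& profile, const void* recv) {
  assert(recv != nullptr && "a null receiver would match an empty row");
  for (;;) {  // corresponds to the restart target at ...0868
    // Pass 1: look for a row that already holds this receiver.
    for (auto& row : profile.rows) {
      if (row.receiver.load(std::memory_order_relaxed) == recv) {
        bump(row.count);
        return;
      }
    }
    // Pass 2: look for an empty (null) row.
    ReceiverRow* empty = nullptr;
    for (auto& row : profile.rows) {
      if (row.receiver.load(std::memory_order_relaxed) == nullptr) {
        empty = &row;
        break;
      }
    }
    if (empty == nullptr) {
      // Table is full: count the call on the fallback counter instead.
      bump(profile.polymorphic_count);
      return;
    }
    // Try to swing the row from null to recv, then restart the search
    // whether or not the exchange succeeded; another thread may have won.
    const void* expected = nullptr;
    empty->receiver.compare_exchange_strong(expected, recv,
                                            std::memory_order_relaxed);
  }
}
```

Restarting after the exchange, rather than branching on its result, keeps a single counter-update path: the next pass either finds the receiver that was just installed or sees a row claimed by another thread and moves on.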

In general, I'm not very happy with C1's register usage for some LIR instructions, either. In particular, the temp effect is missing for some LIR nodes here:

if (op->code() == lir_store_check && opTypeCheck->_object->is_valid()) {
That required a terrible hack on some architectures because obj may be the same register as one of the temps.
In addition, reg2mem and emit_profile_type could be implemented better if we had one more temp register.

So, I think that may be worth a separate RFE.
Initial test results for this PR look good so far, but I'll run more tests.
What do you think?

@TheRealMDoerr

I've run millions of tests over the weekend plus some tests with -XX:-C1OptimizeVirtualCallProfiling and the results are good.

@shipilev shipilev left a comment

Looks okay, with nits.

}

Label L_loop_search_receiver, L_loop_search_empty;
Label L_restart, L_found_recv, L_found_empty, L_polymorphic, L_count_update;

shipilev (Member):

L_polymorphic is dead. It is actually smarter than the original version I did in x86 -- saves a branch, right? -- so we might want to touch up those, separately.

TheRealMDoerr (Contributor, Author):

Exactly. Yeah, improving it for x86 would be nice, too.


// Atomically swing receiver slot: null -> recv.
//
// The update uses CAS, which clobbers tmp. Therefore, rscratch2

shipilev (Member):

I don't think this is about rscratch2 anymore?

TheRealMDoerr (Contributor, Author):

Good catch! I've removed the whole comment. It's already described in the other comment that we kill the offset.

@TheRealMDoerr

Thanks a lot for reviewing! I appreciate it.

Labels

hotspot hotspot-dev@openjdk.org rfr Pull request is ready for review
