8385648: PPC64: Improve receiver type profiling reliability#31325
8385648: PPC64: Improve receiver type profiling reliability#31325TheRealMDoerr wants to merge 2 commits into
Conversation
|
👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
@TheRealMDoerr The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
|
The total number of required reviews for this PR has been set to 2 based on the presence of this label: |
Webrevs
|
There was a problem hiding this comment.
It seems okay, but I suspect that killing recv is not really fine for this path:
if (op->should_profile_receiver_type()) {
assert(op->recv()->is_single_cpu(), "recv must be allocated");
Register recv = op->recv()->as_register(); // <--- not a temp
...
type_profile_helper(mdo, mdo_offset_bias, md, data, recv, tmp1); // kills recv
}
Can you instead ask for another tmp?
Thank you for reviewing! I had the same concern initially. However, jdk/src/hotspot/share/c1/c1_LIR.cpp Line 880 in d3073b5 The register is allocated and the receiver value copied here: jdk/src/hotspot/share/c1/c1_LIRGenerator.cpp Line 3056 in d3073b5 Here's an example: The new sequence is ...0868 to ...08d4. In general, I'm not very happy with C1's register usage for some LIR instructions, either. Especially that the temp effect is missing for some LIR nodes here: jdk/src/hotspot/share/c1/c1_LIR.cpp Line 807 in d3073b5 obj may be same register as one of the temps.In addition, reg2mem and emit_profile_type could be implemented better if we had one more temp register.
So, I think that may be worth a separate RFE. |
|
I've run millions of tests over the weekend plus some tests with -XX:-C1OptimizeVirtualCallProfiling and the results are good. |
| } | ||
|
|
||
| Label L_loop_search_receiver, L_loop_search_empty; | ||
| Label L_restart, L_found_recv, L_found_empty, L_polymorphic, L_count_update; |
There was a problem hiding this comment.
L_polymorphic is dead. It is actually smarter than the original version I did in x86 -- saves a branch, right? -- so we might want to touch up those, separately.
There was a problem hiding this comment.
Exactly. Yeah, improving it for x86 would be nice, too.
There was a problem hiding this comment.
|
|
||
| // Atomically swing receiver slot: null -> recv. | ||
| // | ||
| // The update uses CAS, which clobbers tmp. Therefore, rscratch2 |
There was a problem hiding this comment.
I don't think this is about rscratch2 anymore?
There was a problem hiding this comment.
Good catch! I've removed the whole comment. It's already described in the other comment that we kill the offset.
|
Thanks for a lot for reviewing! I appreciate it. |
This is the PPC64 implementation. x64 and aarch64 are already integrated.
Unfortunately, I had one register too few in C1, so I have reused the
recvregister only for C1. I've kept the better code for the interpreter. Maybe we can clean up the C1 version if we free up another register in the future.Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/31325/head:pull/31325$ git checkout pull/31325Update a local copy of the PR:
$ git checkout pull/31325$ git pull https://git.openjdk.org/jdk.git pull/31325/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 31325View PR using the GUI difftool:
$ git pr show -t 31325Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/31325.diff
Using Webrev
Link to Webrev Comment