8358329: AArch64: emit direct branches in static stubs for small code caches #25702
… caches
In the A64 ISA, the B (direct branch) instruction can encode a target within a ±128MB range relative to the instruction. Due to this limitation, when generating static stubs, HotSpot conservatively emits indirect branches for calls to c2i interface stubs. These indirect branches are implemented using a four-instruction sequence: three instructions to materialize the target address in a register, followed by a BR instruction to perform the jump.
This patch optimizes static stub generation when the code cache is small enough to guarantee that the target entry point of the c2i interface stub lies within the direct branch range. In such cases, a single direct B instruction can be used instead of the indirect sequence, saving 3 instructions (12 bytes) per static stub.
Below is an example of the optimization's impact, measured using the movie-lens benchmark from the Renaissance benchmark suite:

| Metric      | Before        | After         | Difference |
|-------------|---------------|---------------|------------|
| totalInHeap | Avg: 1883.875 | Avg: 1871.667 | -0.65%     |
|             | Sum: 6653848  | Sum: 6616344  | -0.56%     |
| stubCode    | Avg: 103.164  | Avg: 87.285   | -15.38%    |
|             | Sum: 364376   | Sum: 308552   | -15.33%    |
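The range reasoning in the description can be sketched in a few lines of C++. This is only an illustration, not HotSpot code: the names `reachable_by_direct_branch`, `code_cache_fits_branch_range`, and `encode_b` are hypothetical, and the sketch assumes the A64 encoding of B (opcode `000101` in bits [31:26], followed by a signed 26-bit word offset).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; none of these are HotSpot APIs.
constexpr int64_t kBranchRange = 128LL * 1024 * 1024;  // B reaches +/-128 MB

// True if a single B placed at branch_pc can reach target.
// The byte offset must be 4-byte aligned and lie in [-2^27, 2^27 - 4].
bool reachable_by_direct_branch(uint64_t branch_pc, uint64_t target) {
    int64_t offset = static_cast<int64_t>(target - branch_pc);
    return (offset & 3) == 0 && offset >= -kBranchRange && offset < kBranchRange;
}

// Conservative check: if the whole code cache is no larger than the branch
// range, any two addresses inside it are within reach of each other.
bool code_cache_fits_branch_range(uint64_t code_cache_size) {
    return code_cache_size <= static_cast<uint64_t>(kBranchRange);
}

// Encode B for a given byte offset: opcode 000101, imm26 = offset / 4.
uint32_t encode_b(int64_t offset) {
    assert((offset & 3) == 0 && offset >= -kBranchRange && offset < kBranchRange);
    return 0x14000000u | (static_cast<uint32_t>(offset >> 2) & 0x03FFFFFFu);
}

// Size arithmetic from the description: 4 instructions shrink to 1.
constexpr int kInstrBytes = 4;
constexpr int kIndirectInstrs = 4;  // three to materialize the address + BR
constexpr int kDirectInstrs = 1;    // a single B
constexpr int kSavedBytes = (kIndirectInstrs - kDirectInstrs) * kInstrBytes;
```

Checking the cache size once, rather than each branch offset, matches the description's wording that the code cache must be "small enough to guarantee" reachability; how the actual patch performs this check is not shown here.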
MacroAssembler::pd_patch_instruction can distinguish between the `b` and `movz movk movk br` sequences. Strictly speaking, the method patches not a single instruction but a semantically related sequence of instructions. Use it directly instead of the `NativeJump` and `NativeGeneralJump` wrapper classes to simplify the implementation and eliminate an extra icache invalidation. The other changes in the patch simply clean up code that became redundant.
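To make the idea of patching either form concrete, here is a minimal sketch of how a patcher might tell the two jump shapes apart and rewrite the target in place. It is not the implementation of `pd_patch_instruction`: the names `classify_jump`, `patch_jump`, and `with_imm16` are hypothetical, and the sketch assumes the address is materialized as a 48-bit value by one MOVZ followed by two MOVK instructions. It also leaves out instruction cache maintenance, which real code patching requires.

```cpp
#include <cstdint>

// Hypothetical names for illustration; this is not HotSpot's patching code.
enum class JumpKind { Direct, Indirect, Unknown };

// A64 encodings (64-bit register forms for MOVZ/MOVK).
inline bool is_b(uint32_t insn)    { return (insn & 0xFC000000u) == 0x14000000u; }
inline bool is_movz(uint32_t insn) { return (insn & 0xFF800000u) == 0xD2800000u; }
inline bool is_movk(uint32_t insn) { return (insn & 0xFF800000u) == 0xF2800000u; }
inline bool is_br(uint32_t insn)   { return (insn & 0xFFFFFC1Fu) == 0xD61F0000u; }

// Look at the first instruction to decide which jump shape is present.
JumpKind classify_jump(const uint32_t* code) {
    if (is_b(code[0])) return JumpKind::Direct;
    if (is_movz(code[0]) && is_movk(code[1]) && is_movk(code[2]) && is_br(code[3]))
        return JumpKind::Indirect;
    return JumpKind::Unknown;
}

// Replace the imm16 field (bits [20:5]) of a MOVZ/MOVK instruction.
inline uint32_t with_imm16(uint32_t insn, uint64_t value) {
    return (insn & ~(0xFFFFu << 5)) | (static_cast<uint32_t>(value & 0xFFFF) << 5);
}

// Retarget the jump at `code`, which is assumed to execute at address `pc`.
// For the direct form the caller must ensure the target is within B's range.
bool patch_jump(uint32_t* code, uint64_t pc, uint64_t target) {
    switch (classify_jump(code)) {
    case JumpKind::Direct: {
        int64_t offset = static_cast<int64_t>(target - pc);
        code[0] = 0x14000000u | (static_cast<uint32_t>(offset >> 2) & 0x03FFFFFFu);
        return true;
    }
    case JumpKind::Indirect:
        code[0] = with_imm16(code[0], target);        // bits 15:0
        code[1] = with_imm16(code[1], target >> 16);  // bits 31:16
        code[2] = with_imm16(code[2], target >> 32);  // bits 47:32
        return true;
    default:
        return false;
    }
}
```

Dispatching on the first instruction word means one routine can serve both stub layouts, which is the property the commit message relies on when it drops the wrapper classes.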
Looks good. Please fix the copyright date.
The error in java/lang/Thread/virtual/stress/GetStackTraceALotWhenBlocking.java#id0 looks similar to what has been previously reported here: https://bugs.openjdk.org/browse/JDK-8344577. @theRealAph, do you think the patch may cause the error? Or should I open a similar JBS ticket to report it?
Co-authored-by: Andrew Haley <[email protected]>
Thanks.
That bug is macOS/x86. So, is the failure you're seeing repeatable?
/integrate
Hey @eastig, when you have a moment, could you take a look at this as a second reviewer? I'd appreciate your feedback!
LGTM
/sponsor
Going to push as commit ba32b78.
Your commit was automatically rebased without conflicts.
@eastig @mikabl-arm Pushed as commit ba32b78. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
Sorry @theRealAph, I've re-requested a review by mistake. Please ignore it.
This is causing failures in Oracle tier5 testing. See JDK-8359963.
Full jtreg passed on AArch64.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25702/head:pull/25702
$ git checkout pull/25702
Update a local copy of the PR:
$ git checkout pull/25702
$ git pull https://git.openjdk.org/jdk.git pull/25702/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25702
View PR using the GUI difftool:
$ git pr show -t 25702
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25702.diff