Skip to content

Avoid 64bit division in multilist index functions. #12288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 29, 2021

Conversation

amotin
Copy link
Member

@amotin amotin commented Jun 27, 2021

Number of sublists in multilists equal to number of CPUs is typically
small even or even power of 2. It makes almost no sense to use full
slow 64bit division.

How Has This Been Tested?

On 80-thread FreeBSD system doing ~630K mostly uncached 4KB ZVOLs reads profiler shows reduction of CPU time spent in arc_state_multilist_index_func() from 3.7% to 3.3% of ARC eviction thread, bottle-necking this test.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Jun 27, 2021
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could potentially see this intentional 64-bit to 32-bit overflow causing some static analyzer warnings. It looks like by default both cppcheck and coverity have these checkers disabled, so we're not seeing any warnings today, but in the future we might. I'd suggest we extend to comments to make it clear why the casts are here and that they are intentional. Otherwise, this optimization might get removed if the default behavior for one of the static analyzers changes to complain about this.

The change itself though looks good to me.

Number of sublists in multilists equal to number of CPUs is typically
small even or even power of 2.  It makes almost no sense to use full
slow 64bit division.

Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
@amotin
Copy link
Member Author

amotin commented Jun 28, 2021

I am not sure why would static analyzer complained about explicit type change. And I believe we have commit messages for the reasons why the code is what it is. But I've added/extended comments to explain it explicitly.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's an explicit cast to throw away those bits I suppose you're right it probably wouldn't. But I think explaining why it's there, since it's subtle, has value. Thanks for adding the comments.

@mmaybee mmaybee added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 28, 2021
@mmaybee mmaybee merged commit 5b7053a into openzfs:master Jun 29, 2021
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Jun 29, 2021
The number of sublists in a multilist is relatively small. We dont need
64 bits to calculate an index. 32 bits is sufficient and makes the
code more efficient.

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]> 
Reviewed-by: Mark Maybee <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes openzfs#12288
@amotin amotin deleted the ml_idx branch August 24, 2021 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants