Skip to content

Fix l2arc_apply_transforms ztest crash #15248

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 19, 2023

Conversation

pcd1193182
Copy link
Contributor

@pcd1193182 pcd1193182 commented Sep 7, 2023

Motivation and Context

We have a crash ocurring pretty regularly in zloop that indicates corruption in the buffer allocated by l2arc_apply_transforms. This issue is documented in #15177.

Description

In #13375 we modified the allocation size of the buffer that we use to apply l2arc transforms to be the size of the arc hdr we're using, rather than the allocation size that will be in place on the disk, because sometimes the hdr size is larger. Unfortunately, sometimes the allocation size is larger, which means that we overflow the buffer in that case. This change modifies the allocation to be the max of the two values.

How Has This Been Tested?

Two hours of zloop runs with asan, and the issue did not recur. Previously it reliably occurred within 30 minutes. I'm not sure of the easiest way to test to make sure that the original issue is still fixed; @rincebrain is there a test in the test suite that covers it? If not I can try to do some manual testing of the early abort logic.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@rincebrain
Copy link
Contributor

There's no specific tests around the EA code, but all the l2arc compressed_arc disabled tests were what was exploding then, which was likely just this problem, but that fix happened to work because it wasn't simulating a mixed ashift environment.

So I think you could probably make a trivial modification of those tests to test this breaking with the previous patch reverted in both mixed ashift and not cases, if you'd like, watch that explode, then watch one but not the other explode with current git, and both not explode with this.

@behlendorf behlendorf requested a review from don-brady September 8, 2023 01:00
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Sep 8, 2023
Signed-off-by: Paul Dagnelie <[email protected]>
Copy link
Contributor

@mmaybee mmaybee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Note that I got a bit carried away with feedback as I was distracted by some elements of the function that are not technically part of this fix.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 19, 2023
@behlendorf behlendorf merged commit 741c215 into openzfs:master Sep 19, 2023
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Sep 20, 2023
In openzfs#13375 we modified the allocation size of the buffer that we use 
to apply l2arc transforms to be the size of the arc hdr we're using, 
rather than the allocation size that will be in place on the disk, 
because sometimes the hdr size is larger. Unfortunately, sometimes 
the allocation size is larger, which means that we overflow the buffer 
in that case. This change modifies the allocation to be the max of 
the two values

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15177
Closes openzfs#15248
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Sep 21, 2023
In openzfs#13375 we modified the allocation size of the buffer that we use 
to apply l2arc transforms to be the size of the arc hdr we're using, 
rather than the allocation size that will be in place on the disk, 
because sometimes the hdr size is larger. Unfortunately, sometimes 
the allocation size is larger, which means that we overflow the buffer 
in that case. This change modifies the allocation to be the max of 
the two values

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15177
Closes openzfs#15248
behlendorf pushed a commit that referenced this pull request Sep 22, 2023
In #13375 we modified the allocation size of the buffer that we use 
to apply l2arc transforms to be the size of the arc hdr we're using, 
rather than the allocation size that will be in place on the disk, 
because sometimes the hdr size is larger. Unfortunately, sometimes 
the allocation size is larger, which means that we overflow the buffer 
in that case. This change modifies the allocation to be the max of 
the two values

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #15177
Closes #15248
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023
In openzfs#13375 we modified the allocation size of the buffer that we use 
to apply l2arc transforms to be the size of the arc hdr we're using, 
rather than the allocation size that will be in place on the disk, 
because sometimes the hdr size is larger. Unfortunately, sometimes 
the allocation size is larger, which means that we overflow the buffer 
in that case. This change modifies the allocation to be the max of 
the two values

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#15177
Closes openzfs#15248
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants