Fix zrele race in zrele_async that can cause hang #11530

Merged 1 commit on Jan 28, 2021

Conversation

pcd1193182
Contributor

Signed-off-by: Paul Dagnelie [email protected]

Motivation and Context

See #11527. There is a race condition in zfs_zrele_async when it checks whether the caller would be the one to evict the inode. This can lead to a txg sync deadlock.

Description

Instead of calling iput directly, we attempt to perform the atomic decrement ourselves, unless that would set i_count to zero. In that case, we dispatch a call to iput to run later, which prevents the deadlock from occurring.
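
For readers who want to see the shape of the "decrement unless it would hit zero" pattern, here is a minimal userland sketch using C11 atomics. It is not the code from this PR: `struct fake_inode`, `dec_unless_last`, `release_async`, and the `deferred_releases` counter are hypothetical names, and the deferred path only bumps a counter where the real change dispatches iput to run later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; only the reference count matters here. */
struct fake_inode {
	atomic_int i_count;
};

/*
 * Decrement *count unless it currently equals 1, i.e. unless the
 * decrement would bring it to zero. Returns true if the decrement
 * happened. The caller is assumed to hold a reference (count >= 1).
 */
static bool
dec_unless_last(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 1) {
		assert(old > 1);
		/* On failure, old is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return (true);
	}
	return (false);
}

/* Assumption: counts deferred final releases instead of queueing real work. */
static int deferred_releases;

static void
release_async(struct fake_inode *ip)
{
	if (!dec_unless_last(&ip->i_count)) {
		/*
		 * We appear to hold the last reference. Leave the count
		 * untouched and hand the final release to another context.
		 */
		deferred_releases++;
	}
}
```

The point of the compare-and-swap loop is that the check ("is this the last reference?") and the decrement happen as one atomic step, so no other thread can change the count between them.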

How Has This Been Tested?

ZFS test suite passes, along with some minor stress testing.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

pcd1193182 added the "Status: Code Review Needed" (ready for review and testing) label Jan 27, 2021
behlendorf added "Status: Accepted" (ready to integrate; reviewed, tested) and removed "Status: Code Review Needed" labels Jan 27, 2021
behlendorf merged commit 2921ad6 into openzfs:master Jan 28, 2021
behlendorf pushed a commit that referenced this pull request Jan 28, 2021
There is a race condition in zfs_zrele_async when we are checking if 
we would be the one to evict an inode. This can lead to a txg sync 
deadlock.

Instead of calling into iput directly, we attempt to perform the atomic 
decrement ourselves, unless that would set the i_count value to zero. 
In that case, we dispatch a call to iput to run later, to prevent a 
deadlock from occurring.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #11527 
Closes #11530
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021