
Avoid posting duplicate zpool events #10861

Merged: 3 commits merged into openzfs:master on Sep 4, 2020

Conversation

don-brady (Contributor)

Motivation and Context

Duplicate io and checksum ereport events can make things seem worse than they are. Ideally, the zpool events and the corresponding vdev stat error counts in a zpool status should reflect unique errors -- not the same error being counted over and over. This can be demonstrated with a simple example: with a single bad block in a datafile and just 5 reads of the file, we end up with a degraded vdev, even though there is only one unique error in the pool.

Example

$ truncate -s 512M /tmp/vdev1 /tmp/vdev2
$ sudo zpool create demo /tmp/vdev1 /tmp/vdev2
$ sudo dd if=/dev/urandom of=/demo/data bs=128K count=1

$ sudo zpool export demo
$ sudo zpool import -d /tmp demo
$ sudo zinject -t data -e checksum -T read /demo/data
Added handler 4 with the following properties:
  pool: demo
objset: 54
object: 2
  type: 0
 level: 0
 range: all
  dvas: 0x0

$ for i in {1..5}; do dd if=/demo/data of=/dev/null bs=128K; done
dd: error reading '/demo/data': Input/output error
0+0 records in
0+0 records out
0 bytes copied, 0.00159632 s, 0.0 kB/s

$ sudo zinject -c all
removed all registered handlers

$ sudo zpool status -v demo
  pool: demo
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: none requested
config:

	NAME          STATE     READ WRITE CKSUM
	demo          DEGRADED     0     0     0
	  /tmp/vdev1  ONLINE       0     0     0
	  /tmp/vdev2  DEGRADED     0     0    10  too many errors

errors: Permanent errors have been detected in the following files:

        /demo/data

Note that the zfs diagnosis agent will diagnose a vdev as degraded if it encounters 10 errors of the same type (io | checksum) within 10 minutes. These errors do not have to be unique! Any runtime condition that is retrying a failed block read can easily generate a stream of error events that will trigger a degrade diagnosis.
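
For reference, the sketch below shows the kind of N-errors-in-T-seconds threshold the diagnosis agent applies. It is a minimal illustration only -- the names and data layout are hypothetical, not the actual SERD engine in the zed diagnosis code:

#include <stdio.h>
#include <time.h>

#define SERD_N 10		/* errors needed to trigger a degrade diagnosis */
#define SERD_T (10 * 60)	/* time window, in seconds */

/* Ring buffer of the most recent error timestamps for one (vdev, class). */
typedef struct serd {
	time_t events[SERD_N];
	int count;	/* timestamps recorded so far, capped at SERD_N */
	int head;	/* next slot to overwrite */
} serd_t;

/* Record one error; return 1 when SERD_N errors fall within SERD_T seconds. */
static int
serd_record(serd_t *s, time_t now)
{
	s->events[s->head] = now;
	s->head = (s->head + 1) % SERD_N;
	if (s->count < SERD_N)
		s->count++;
	if (s->count < SERD_N)
		return (0);
	/* s->head now indexes the oldest of the last SERD_N timestamps. */
	return (now - s->events[s->head] <= SERD_T);
}

int
main(void)
{
	serd_t s = { 0 };
	time_t now = time(NULL);

	/* Ten rapid retries of one bad block are enough to fire the engine. */
	for (int i = 0; i < SERD_N; i++) {
		if (serd_record(&s, now + i))
			printf("degrade diagnosis after error %d\n", i + 1);
	}
	return (0);
}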

Ideally, a single bad block should not be enough to degrade a vdev and it should only contribute once to the vdev's stat error count.

Description

The proposed solution is to eliminate duplicates when posting events and when updating vdev error stats. We now save recent error events of interest when posting events so that we can easily check for duplicates when posting an error. The bulk of the implementation changes are in module/zfs/zfs_fm.c.
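
At a high level, the mechanism keeps a small, time-bounded table of recently posted events and suppresses a new ereport that matches an entry. The following is a minimal sketch under assumed names (recent_event_t, its key fields, and the constants are all illustrative -- see module/zfs/zfs_fm.c for the real code):

#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define RETAIN_MAX 256		/* cf. the zfs_zevent_retain_max tunable */
#define RETAIN_EXPIRE_SECS 900	/* cf. zfs_zevent_retain_expire_secs */

/* Hypothetical key identifying an error event for duplicate detection. */
typedef struct recent_event {
	uint64_t re_vdev_guid;	/* which vdev reported the error */
	uint64_t re_offset;	/* offset of the bad block */
	int re_class;		/* io vs. checksum */
	time_t re_posted;	/* when the event was posted */
} recent_event_t;

static recent_event_t recent[RETAIN_MAX];
static int nrecent;	/* entries in use, capped at RETAIN_MAX */
static int rhead;	/* next slot to overwrite once full */

/*
 * Return 1 if an equivalent, unexpired event was already posted;
 * otherwise remember this one and return 0 so it gets posted.
 */
static int
event_is_duplicate(const recent_event_t *ev, time_t now)
{
	for (int i = 0; i < nrecent; i++) {
		if (now - recent[i].re_posted > RETAIN_EXPIRE_SECS)
			continue;	/* expired entries don't suppress */
		if (recent[i].re_vdev_guid == ev->re_vdev_guid &&
		    recent[i].re_offset == ev->re_offset &&
		    recent[i].re_class == ev->re_class)
			return (1);
	}
	recent[rhead] = *ev;
	recent[rhead].re_posted = now;
	rhead = (rhead + 1) % RETAIN_MAX;
	if (nrecent < RETAIN_MAX)
		nrecent++;
	return (0);
}

int
main(void)
{
	recent_event_t ev = { .re_vdev_guid = 0xabc, .re_offset = 4096,
	    .re_class = 1 };
	time_t now = time(NULL);

	/* Repeated reads of the same bad block post only one event. */
	for (int i = 0; i < 5; i++) {
		if (!event_is_duplicate(&ev, now))
			printf("posting ereport for read %d\n", i + 1);
	}
	return (0);
}

The expiry keeps a genuinely recurring error from being suppressed forever: once an entry ages out, the same error will post again.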

Note: some function prototype changes (like dropping unused args) had ripple effects across a number of files. Don't let the number of changed files scare you away from offering a review!

Two tunables are introduced: zfs_zevent_retain_max and zfs_zevent_retain_expire_secs. The duplicate-checking mechanism can be disabled by setting zfs_zevent_retain_max to zero.
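
On Linux the tunables are exposed as module parameters, so the duplicate checking can be toggled at runtime, e.g.:

$ echo 0 | sudo tee /sys/module/zfs/parameters/zfs_zevent_retain_max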

A zio_priority field was also added to the ereport payload, since it provides the read/write and sync/async context that is useful when evaluating duplicate errors.
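
Once events carry the field, it can be inspected with the existing CLI by dumping the verbose payload, e.g.:

$ sudo zpool events -v | grep zio_priority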

How Has This Been Tested?

  1. Ran ztest
  2. Ran the ZTS functional/cli_root/zpool_events suite of tests that cover zpool events.
  3. Added a new test, zpool_events_duplicates, that generates duplicate block errors and confirms that duplicate events are not posted (a simplified sketch follows this list).
    Note that the existing test, zpool_events_errors, verifies that the event count matches the vdev error stat counts, so this test confirms that we have not broken that constraint and adds adequate code coverage.
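
A simplified sketch of the scenario the new test exercises (illustrative, not the actual ZTS script), reusing the pool from the example above:

$ sudo zinject -t data -e checksum -T read /demo/data
$ for i in {1..5}; do dd if=/demo/data of=/dev/null bs=128K; done
$ sudo zinject -c all
$ sudo zpool events | grep -c ereport.fs.zfs.checksum   # expect 1 per unique error, not one per read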

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

don-brady requested a review from tonyhutter on August 31, 2020.
behlendorf added the "Status: Code Review Needed" label on Aug 31, 2020.
don-brady (Contributor, Author)

Address merge conflicts after PR-10857 landed.

behlendorf (Contributor) left a comment

Looks nice! This will definitely be useful, it'll be great to catch the duplicates.

brad-lewis (Contributor) left a comment

This is a really nice addition. I had a couple of questions about the scheduling of the cleaner.

ghost commented on Sep 3, 2020

The FreeBSD stable/12 failure should be fixed by a rebase.

behlendorf added the "Status: Accepted" label and removed the "Status: Code Review Needed" label on Sep 3, 2020.
behlendorf merged commit 4f07282 into openzfs:master on Sep 4, 2020.
behlendorf pushed a commit that referenced this pull request Sep 9, 2020
Duplicate io and checksum ereport events can make things seem worse
than they are. Ideally the zpool events and the corresponding vdev
stat error counts in a zpool status should reflect unique errors --
not the same error being counted over and over. This can be
demonstrated in a simple example: with a single bad block in a
datafile and just 5 reads of the file, we end up with a degraded
vdev, even though there is only one unique error in the pool.

The proposed solution is to eliminate duplicates when posting events
and when updating vdev error stats. We now save recent error events
of interest when posting events so that we can easily check for
duplicates when posting an error.

Reviewed-by: Brad Lewis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #10861
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021