Skip to content

Posting checksum events fails with ENOMEM on FreeBSD #14717

Closed
@asomers

Description

@asomers

System information

Type Version/Name
Distribution Name FreeBSD
Distribution Version 14.0-CURRENT
Kernel Version bef81bc0aef (main as of 3-April-2023)
Architecture amd64
OpenZFS Version zfs-2.1.99-FreeBSD_g57cfae4a2 zfs-kmod-2.1.99-FreeBSD_g431083f75

Describe the problem you're observing

On FreeBSD, devctl events must fit within a 1016 byte string buffer. If they are too large, they will be silently dropped and never sent to userland. ZFS's ereport.fs.zfs.checksum events are far too large: 1862 bytes in my test case. So they never get sent to userland. This means that zfsd is unable to degrade a vdev that produces too many checksum errors.

Describe how to reproduce the problem

vim /etc/kyua/kyua.conf # Set test_suites.FreeBSD.disks
cd /usr/tests/sys/cddl/zfs/tests/zfsd
sudo kyua debug zfsd_test:zfsd_degrade_001_pos

Optionally, run a dtrace script like this, which will show various ZFS events get queued for userland, but not any ereport.fs.zfs.checksum events.

sudo dtrace -i 'fbt:kernel:devctl_queue:entry {printf("%s", stringof(args[0]->dei_data));}'

Possible solutions

PR #14716 partially fixes the problem by making the report format less verbose. However, it isn't enough. We still need to trim another ~75 bytes from the report (the exact number depends on the size of the pool and vdev names). Here are some possibilities for how to fully fix it:

  • Combine the bad_cleared_histogram and bad_set_histogram into a single bad_bit_histogram. That would save about 150 bytes.
  • Widen the histogram buckets from one bit wide to two bits wide.
  • Simply remove the histogram fields. zpool-events(8) suggests that they are intended for use with IDE and parallel SCSI, which are obsolete.
  • Remove several other fields. But which ones? zio_err will probably always be zero. But I'm not sure which of the others can go.
  • Double the size of FreeBSD's buffer, at an uncertain performance cost

Current Report format

Here's an example of a current report, including the changes from PR #14716 .

 class=ereport.fs.zfs.checksum ena=96599319807790081 pool=testpool.1933 pool_guid=1236902063710799041 pool_state=0 pool_context=0 pool_failmode=wait vdev_guid=2774253874431514999 vdev_type=file vdev_path=/tmp/kyua.6Temlq/2/work/file1.1933 vdev_ashift=9 vdev_complete_ts=92124283803 vdev_delta_ts=46670 vdev_read_errors=0 vdev_write_errors=0 vdev_cksum_errors=20 vdev_delays=0 parent_guid=8090931855087882905 parent_type=raidz vdev_spare_guids= zio_err=0 zio_flags=1048752 zio_stage=4194304 zio_pipeline=65011712 zio_delay=0 zio_timestamp=0 zio_delta=0 zio_priority=4 zio_offset=1028608 zio_size=512 zio_objset=0 zio_object=0 zio_level=0 zio_blkid=4 bad_ranges=0000000000000200 bad_ranges_min_gap=8 bad_range_sets=0000061f bad_range_clears=000001f4 bad_set_histogram=1719161c1c1c101618171a151a1a19161e1c171d1816161c191f1a18192117191c131d171b1613151a171419161a1b1319101b14171b18151e191a1b141a1c17 bad_cleared_histogram=06090a0808070a0b020609060506090a01090a050a0a0509070609080d050d0607080d060507080c04070807070a0608020c080c080908040808090a05090a07 time=00000016806477050000000604157480 eid=62

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type: DefectIncorrect behavior (e.g. crash, hang)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions