ZFS allows I/O even when pool is UNAVAIL and a vdev has no redundancy #17262

Open
@arturpzol

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  Any
Kernel Version        5.15
Architecture          x86_64
OpenZFS Version       2.3.1

Describe the problem you're observing

The pool is composed of eight 3-disk raidz1 vdevs.
One of them (raidz1-1) lost 2 of its 3 disks:

zpool status -L
  pool: Regression-Pool
 state: UNAVAIL
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 12K in 00:00:00 with 0 errors on Tue Apr 22 15:39:11 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        Regression-Pool             UNAVAIL      0     0     0  insufficient replicas
          raidz1-0                  ONLINE       0     0     0
            sdl                     ONLINE       0     0     0
            sdj                     ONLINE       0     0     0
            sdk                     ONLINE       0     0     0
          raidz1-1                  UNAVAIL      0     0     0  insufficient replicas
            wwn-0x50000394f803873c  REMOVED      0     0     0
            wwn-0x50000394f809f478  REMOVED      0     0     0
            sdy                     ONLINE       0     0     0
          raidz1-2                  ONLINE       0     0     0
            sde                     ONLINE       0     0     0
            sdd                     ONLINE       0     0     0
            sdf                     ONLINE       0     0     0
          raidz1-3                  ONLINE       0     0     0
            sds                     ONLINE       0     0     0
            sdt                     ONLINE       0     0     0
            sdu                     ONLINE       0     0     0
          raidz1-4                  ONLINE       0     0     0
            sdn                     ONLINE       0     0     0
            sdm                     ONLINE       0     0     0
            sdo                     ONLINE       0     0     0
          raidz1-5                  ONLINE       0     0     0
            sdp                     ONLINE       0     0     0
            sdq                     ONLINE       0     0     0
            sdr                     ONLINE       0     0     0
          raidz1-6                  ONLINE       0     0     0
            sdv                     ONLINE       0     0     0
            sdw                     ONLINE       0     0     0
            sdx                     ONLINE       0     0     0
          raidz1-7                  ONLINE       0     0     0
            sdh                     ONLINE       0     0     0
            sdi                     ONLINE       0     0     0
            sdg                     ONLINE       0     0     0
        logs
          nvme0n1                   ONLINE       0     0     0
        cache
          nvme1n1                   ONLINE       0     0     0

errors: No known data errors

As expected, raidz1-1 is marked UNAVAIL, and the pool as a whole is also UNAVAIL with "insufficient replicas".
However, the pool does not enter the SUSPENDED state, and I/O operations are still possible.
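
The mismatch can be checked from a script. This is a minimal sketch, assuming `zpool status` output in the same layout as the listing above. The helper names `pool_state` and `unhealthy_vdevs` are hypothetical and not part of ZFS:

```shell
#!/bin/sh
# Hypothetical helpers (not part of ZFS); both read `zpool status` output on stdin.

# Print the value of the pool-level "state:" line (e.g. ONLINE, DEGRADED, UNAVAIL).
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Print "name state" for every config entry whose state is not ONLINE.
# Assumption: config rows are the only lines whose 3rd, 4th and 5th fields
# (READ, WRITE, CKSUM) all start with a digit.
unhealthy_vdevs() {
    awk '$3 ~ /^[0-9]/ && $4 ~ /^[0-9]/ && $5 ~ /^[0-9]/ && $2 != "ONLINE" { print $1, $2 }'
}
```

For example, `zpool status -L Regression-Pool | pool_state` would print the pool-level state, and piping the same output into `unhealthy_vdevs` lists the entries that are not ONLINE.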

Logically, a pool state of UNAVAIL suggests that I/O should be blocked.

Yet in practice, the pool continues to serve I/O on the other vdevs. This makes it unclear what UNAVAIL actually represents at the pool level.

If the pool is marked UNAVAIL, shouldn't it behave as unavailable (SUSPENDED) to the system?

There were changes in this area introduced in ZFS 2.3, for example in PR #16864.
However, testing shows that ZFS 2.2 and 2.3 behave similarly in this scenario: the pool remains active and I/O continues.

Is there a tunable to control this behavior, or is this a bug?
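
One existing knob in this area is the `failmode` pool property (`wait`, `continue`, or `panic`), which governs behavior on catastrophic pool failure. Whether it has any effect on a pool that is reported UNAVAIL but not SUSPENDED is part of the open question here. The sketch below only reads the current value, and the wrapper name `pool_failmode` is hypothetical:

```shell
#!/bin/sh
# Hypothetical wrapper (not part of ZFS): print the failmode property
# (wait, continue, or panic) of the pool named in $1.
# -H drops the header line; -o value prints only the value column.
pool_failmode() {
    zpool get -H -o value failmode "$1"
}
```

For the pool above, this would be called as `pool_failmode Regression-Pool`.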

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs

Labels: Type: Defect (Incorrect behavior, e.g. crash, hang)
