### Description

### System information

Type | Version/Name |
---|---|
Distribution Name | Debian |
Distribution Version | Any |
Kernel Version | 5.15 |
Architecture | x86_64 |
OpenZFS Version | 2.3.1 |
### Describe the problem you're observing

The pool is composed of multiple raidz1 vdevs. One of them (raidz1-1) has lost 2 out of 3 disks:
```
zpool status -L
  pool: Regression-Pool
 state: UNAVAIL
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 12K in 00:00:00 with 0 errors on Tue Apr 22 15:39:11 2025
config:

        NAME                        STATE     READ WRITE CKSUM
        Regression-Pool             UNAVAIL      0     0     0  insufficient replicas
          raidz1-0                  ONLINE       0     0     0
            sdl                     ONLINE       0     0     0
            sdj                     ONLINE       0     0     0
            sdk                     ONLINE       0     0     0
          raidz1-1                  UNAVAIL      0     0     0  insufficient replicas
            wwn-0x50000394f803873c  REMOVED      0     0     0
            wwn-0x50000394f809f478  REMOVED      0     0     0
            sdy                     ONLINE       0     0     0
          raidz1-2                  ONLINE       0     0     0
            sde                     ONLINE       0     0     0
            sdd                     ONLINE       0     0     0
            sdf                     ONLINE       0     0     0
          raidz1-3                  ONLINE       0     0     0
            sds                     ONLINE       0     0     0
            sdt                     ONLINE       0     0     0
            sdu                     ONLINE       0     0     0
          raidz1-4                  ONLINE       0     0     0
            sdn                     ONLINE       0     0     0
            sdm                     ONLINE       0     0     0
            sdo                     ONLINE       0     0     0
          raidz1-5                  ONLINE       0     0     0
            sdp                     ONLINE       0     0     0
            sdq                     ONLINE       0     0     0
            sdr                     ONLINE       0     0     0
          raidz1-6                  ONLINE       0     0     0
            sdv                     ONLINE       0     0     0
            sdw                     ONLINE       0     0     0
            sdx                     ONLINE       0     0     0
          raidz1-7                  ONLINE       0     0     0
            sdh                     ONLINE       0     0     0
            sdi                     ONLINE       0     0     0
            sdg                     ONLINE       0     0     0
        logs
          nvme0n1                   ONLINE       0     0     0
        cache
          nvme1n1                   ONLINE       0     0     0

errors: No known data errors
```
As expected, the vdev is marked UNAVAIL, and the pool as a whole is also reported UNAVAIL with "insufficient replicas".

However, the pool does not enter the SUSPENDED state, and I/O operations are still possible.

Logically, a pool state of UNAVAIL suggests that I/O should be blocked, yet in practice the pool continues to serve I/O from the remaining vdevs (see the example below). This makes it unclear what UNAVAIL actually represents at the pool level: if the pool is marked UNAVAIL, shouldn't it behave as unavailable (i.e. SUSPENDED) to the system?
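To illustrate what "I/O operations are still possible" means, a check along these lines succeeds while the pool health reads UNAVAIL (the file name is just an example, and the path assumes the pool's default mountpoint):

```sh
# Pool-level health is reported as UNAVAIL:
zpool list -H -o health Regression-Pool

# ...yet a synchronous write into the pool still completes instead of
# blocking or failing with EIO:
dd if=/dev/zero of=/Regression-Pool/iotest bs=1M count=64 oflag=sync
```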
There were changes in this area in ZFS 2.3, for example in PR #16864. In practice, however, testing shows that ZFS 2.2 and 2.3 behave the same way in this scenario: the pool remains active and I/O continues.

Is there a tunable to control this behavior, or is this a bug?
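For context, the only pool-level knob I'm aware of in this area is the failmode property (wait / continue / panic), but as documented it only takes effect once the pool actually suspends, which is exactly what is not happening here:

```sh
# failmode controls how a pool behaves after it suspends due to
# catastrophic failure; the pool name is from the status output above:
zpool get failmode Regression-Pool
```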