Description
Hi there,
System info
I am using the latest stable zfs-dkms release on Artix Linux with s6-init.
| Type | Version |
| --- | --- |
| System | Artix Linux with s6-init |
| System Version | 5.10.4-artix2-1 |
| System Architecture | Arch linux |
| ZFS Version | zfs-2.0.1-1 |
Current behavior
When I boot the system, the virtual console shows the following panic message regarding the ZFS AVL tree:
VERIFY(avl_find(tree, new_node, &where) == NULL) failed
PANIC at avl.c:641:avl_add()
CPU: 6 PID: 1019 Comm: z_livelist_dest Tainted: P OE 5.10.4-artix2-1 #1
Call Trace:
dump_stack+0x6b/0x83
spl_panic+0xef/0x117 [spl]
? allocate_slab+0x30b/0x4c0
? spl_kmem_alloc_impl+0xae/0xf0 [spl]
? __kmalloc_node+0x180/0x380
avl_add+0x9b/0xb0 [zavl]
dsl_livelist_iterate+0x142/0x1f0 [zfs]
bpobj_iterate_blkptrs+0xfe/0x360 [zfs]
? dsl_deadlist_load_cache+0x250/0x250 [zfs]
bpobj_iterate_impl+0x280/0x4f0 [zfs]
? dsl_deadlist_load_cache+0x250/0x250 [zfs]
dsl_process_sub_livelist+0xb5/0xf0 [zfs]
? dsl_deadlist_cache_compare+0x20/0x20 [zfs]
spa_livelist_delete_cb+0x1ea/0x350 [zfs]
zthr_procedure+0x135/0x150 [zfs]
? zrl_is_locked+0x20/0x20 [zfs]
thread_generic_wrapper+0x6f/0x80 [spl]
? __thread_exit+0x20/0x20 [spl]
kthread+0x133/0x150
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x22/0x30
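If I read the OpenZFS sources correctly, the failing VERIFY is simply the duplicate check in avl_add(): the livelist iterator tries to insert a block pointer that compares equal to one already in the tree. Roughly paraphrased (not the exact upstream code) it looks like this:

void
avl_add(avl_tree_t *tree, void *new_node)
{
	avl_index_t where = 0;

	/* Panics when a node that compares equal is already in the tree. */
	VERIFY(avl_find(tree, new_node, &where) == NULL);
	avl_insert(tree, new_node, where);
}

So dsl_livelist_iterate() apparently runs into the same block pointer twice while destroying the livelist of a deleted clone, which also matches the Duplicate ALLOC errors from zdb further below.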
Additionally, the kernel's hung-task detector complains about the z_livelist_dest thread every now and then (it stays stuck in spl_panic after the failed VERIFY), resulting in the following stack trace:
INFO: task z_livelist_dest:1019 blocked for more than 491 seconds.
Tainted: P OE 5.10.4-artix2-1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:z_livelist_dest state:D stack: 0 pid: 1019 ppid: 2 flags:0x00004000
Call Trace:
__schedule+0x295/0x810
schedule+0x5b/0xc0
spl_panic+0x115/0x117 [spl]
? spl_kmem_alloc_impl+0xae/0xf0 [spl]
? __kmalloc_node+0x180/0x380
avl_add+0x9b/0xb0 [zavl]
dsl_livelist_iterate+0x142/0x1f0 [zfs]
bpobj_iterate_blkptrs+0xfe/0x360 [zfs]
? dsl_deadlist_load_cache+0x250/0x250 [zfs]
bpobj_iterate_impl+0x280/0x4f0 [zfs]
? dsl_deadlist_load_cache+0x250/0x250 [zfs]
dsl_process_sub_livelist+0xb5/0xf0 [zfs]
? dsl_deadlist_cache_compare+0x20/0x20 [zfs]
spa_livelist_delete_cb+0x1ea/0x350 [zfs]
zthr_procedure+0x135/0x150 [zfs]
? zrl_is_locked+0x20/0x20 [zfs]
thread_generic_wrapper+0x6f/0x80 [spl]
? __thread_exit+0x20/0x20 [spl]
kthread+0x133/0x150
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x22/0x30
Since I am using ZFS on a mirrored rotational vdev, I can hear that the ZFS driver is issuing a huge number of random write requests to the disks. The zpool iostat output below shows over 3K write operations per second. But I am not doing anything and have not opened any program that writes to the system, so there should be essentially no write requests going to the disk.
> zpool iostat
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot       48.6G   189G     11  3.15K   320K  31.6M
Additionally, iotop shows heavy disk utilization by the txg_sync kernel thread, again with no other processes accessing the disk at that time.
> iotop
Total DISK READ : 164.10 M/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 95.16 M/s | Actual DISK WRITE: 16.63 M/s
PID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
1521 be/4 root 0.00 B/s 0.00 B/s 0.00 % 83.48 % [txg_sync]
1552 be/0 root 0.00 B/s 0.00 B/s 0.00 % 69.83 % [vdev_autotrim]
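If it helps with the diagnosis, the per-txg write activity can presumably also be watched through the kstats (assuming the txgs kstat is available on this kernel module build), which should show how much data gets dirtied and written for each transaction group:
❯ tail -n 5 /proc/spl/kstat/zfs/zroot/txgs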
Expected behavior
ZFS should not issue this constant stream of random write requests while the system is idle.
Additional info
I am currently using Docker (20.10.1) with the zfs storage driver, so Docker constantly creates and deletes ZFS clones. Those datasets seem to leave behind ZFS errors that cannot be cleared by removing them; the crashing z_livelist_dest thread appears to be the asynchronous destruction of the livelists of those deleted clones. After running docker system prune -af, zdb shows:
❯ sudo zdb -d zroot
Dataset mos [META], ID 0, cr_txg 4, 606M, 515 objects
Dataset zroot/ROOT/default@znap_2021-01-13-1004_monthly [ZPL], ID 2064, cr_txg 934007, 11.0G, 353740 objects
Dataset zroot/ROOT/default [ZPL], ID 144, cr_txg 38, 12.1G, 360045 objects
Dataset zroot/ROOT@znap_2021-01-13-1004_monthly [ZPL], ID 2060, cr_txg 934007, 96K, 6 objects
Dataset zroot/ROOT [ZPL], ID 259, cr_txg 34, 96K, 6 objects
Dataset zroot/swap [ZVOL], ID 150, cr_txg 157, 72K, 2 objects
Dataset zroot/HOME/user@znap_2021-01-13-1004_monthly [ZPL], ID 1469, cr_txg 934006, 34.9G, 822515 objects
Dataset zroot/HOME/user [ZPL], ID 515, cr_txg 91, 34.0G, 837682 objects
Dataset zroot/HOME@znap_2021-01-13-1004_monthly [ZPL], ID 1466, cr_txg 934006, 96K, 6 objects
Dataset zroot/HOME [ZPL], ID 387, cr_txg 46, 96K, 6 objects
Dataset zroot@znap_2021-01-13-1004_monthly [ZPL], ID 1342, cr_txg 934005, 96K, 6 objects
Dataset zroot [ZPL], ID 54, cr_txg 1, 96K, 6 objects
ERROR: Duplicate ALLOC: 0:3700c19000:1000 200L/200P F=0 B=1077805/500933 cksum=3e2115db729c012:2d8e864d83b6e7e1:b98e28f20bc247f7:3ee97a4be2bc609c
ERROR: Duplicate ALLOC: 0:3700c19000:1000 200L/200P F=0 B=1077805/500933 cksum=3e2115db729c012:2d8e864d83b6e7e1:b98e28f20bc247f7:3ee97a4be2bc609c
ERROR: Duplicate ALLOC: 0:3700c19000:1000 200L/200P F=0 B=1077805/500933 cksum=3e2115db729c012:2d8e864d83b6e7e1:b98e28f20bc247f7:3ee97a4be2bc609c
zfs_btree_find(tree, node, &where) == NULL (0x5583989d7f10 == 0x0)
ASSERT at ../../module/zfs/btree.c:1296:zfs_btree_add()
[1]  9397 abort  sudo zdb -d zroot
Things I've tried already
Since this looks to me like a problem in the ZFS on-disk structures, I searched through the existing issues and some old Solaris documentation.
I have already run a zpool scrub to verify data integrity; the scrub completes successfully with no errors. Additionally, I ran the zdb -bcsvL -AAA command as I read in this article. However, I am not completely sure whether I have set the zfs:zfs_recover=1 flag correctly (I set it using /etc/system as described in this post).
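For reference, /etc/system is the Solaris mechanism; on Linux I assume the equivalent would be the zfs module parameter of the same name, set roughly like this (assuming zfs_recover is exposed on this build, which can be checked under /sys/module/zfs/parameters/):
❯ echo 1 | sudo tee /sys/module/zfs/parameters/zfs_recover
❯ echo "options zfs zfs_recover=1" | sudo tee -a /etc/modprobe.d/zfs.conf
The first takes effect immediately for the loaded module, the second on the next module load.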
The zdb command also exits with a panic report, which is why I had to append the -AAA flag in the first place. The output looks like this:
Cached configuration:
        version: 5000
        name: 'zroot'
        state: 0
        txg: 2400861
        pool_guid: 9365715525508061175
        errata: 0
        hostname: 'artixlinux'
        com.delphix:has_per_vdev_zaps
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 9365715525508061175
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11552697276303360577
                path: '/dev/sda'
                whole_disk: 0
                metaslab_array: 131
                metaslab_shift: 31
                ashift: 12
                asize: 255765250048
                is_log: 0
                create_txg: 4
                com.delphix:vdev_zap_leaf: 129
                com.delphix:vdev_zap_top: 130
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data

MOS Configuration:
        version: 5000
        name: 'zroot'
        state: 0
        txg: 2400861
        pool_guid: 9365715525508061175
        errata: 0
        hostname: 'artixlinux'
        com.delphix:has_per_vdev_zaps
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 9365715525508061175
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11552697276303360577
                path: '/dev/sda'
                whole_disk: 0
                metaslab_array: 131
                metaslab_shift: 31
                ashift: 12
                asize: 255765250048
                is_log: 0
                DTL: 3041
                create_txg: 4
                com.delphix:vdev_zap_leaf: 129
                com.delphix:vdev_zap_top: 130
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data
l->blk_birth == r->blk_birth (0x107a75 == 0x107934)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107a75 == 0x107934)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x10790a == 0x107934)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
avl_find(tree, new_node, &where) == NULL
ASSERT at ../../module/avl/avl.c:641:avl_add()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107905 == 0x107932)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
avl_find(tree, new_node, &where) == NULL
ASSERT at ../../module/avl/avl.c:641:avl_add()
l->blk_birth == r->blk_birth (0x107932 == 0x107905)
zdb_blkptr_cb: Got error 52 reading <0, 5260, 0, 2> DVA[0]=<0:10c8636000:d000> DVA[1]=<0:1228e6c000:d000> DVA[2]=<0:f4408e000:d000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/d000P birth=2551273L/2551273P fill=1 cksum=b96b4337e5c:12a9c6c2ef60af2:272200e1a4124a6d:a3236cbd3e6d4de5 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 7183, 0, 3> DVA[0]=<0:a08de6000:a000> DVA[1]=<0:3a01cd6000:a000> DVA[2]=<0:371011a000:a000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/a000P birth=2552118L/2552118P fill=1 cksum=8aa7c63331d:ab5b227864148e:d52a86dbe0754906:dbe896479a0074f4 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 7190, 0, 3> DVA[0]=<0:120db42000:c000> DVA[1]=<0:3a04324000:c000> DVA[2]=<0:8202a000:c000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/c000P birth=2552190L/2552190P fill=1 cksum=9d641bed268:f445df807aac09:73715b7fc334ba78:8581d0f6e2d0ce15 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 10299, 0, 2> DVA[0]=<0:a3e0e8000:11000> DVA[1]=<0:3a0de9f000:11000> DVA[2]=<0:826b8000:11000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/11000P birth=2552130L/2552130P fill=1 cksum=da0ba031ed9:1dbdab2ce810cf1:e63b6ae3058778df:af2b0a9bbbd4efed -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 0, 1, 3> DVA[0]=<0:c1582d000:2000> DVA[1]=<0:14031b5000:2000> DVA[2]=<0:8085a000:2000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/2000P birth=2552203L/2552203P fill=49 cksum=1bc4cfc2d1e:812f98b82925e:15efd46b335fd978:d69c46d1d7aee47d -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 0, 1, 4> DVA[0]=<0:c15859000:2000> DVA[1]=<0:14031d9000:2000> DVA[2]=<0:8085c000:2000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/2000P birth=2552220L/2552220P fill=141 cksum=2150faf7dfb:8b65362478f73:174a37f76e7aedf1:b51ee644db7d97b4 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 153479, 0, 0> DVA[0]=<0:10c9f29000:1000> DVA[1]=<0:11be3b8000:1000> DVA[2]=<0:6cbbea000:1000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/1000P birth=2552175L/2552175P fill=1 cksum=94871a0d72:1f4400678e74a:34f39294858a255:c232ba0ff4a0390a -- skipping
zdb_blkptr_cb: Got error 52 reading <144, 0, 1, 16> DVA[0]=<0:1424b51000:7000> DVA[1]=<0:c5ef7e000:7000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/7000P birth=2552108L/2552108P fill=3434 cksum=8b559a74703:7e5aa2f6b61e0d:a6fe050ed7b89717:fdca16de56b456df -- skipping
zdb_blkptr_cb: Got error 52 reading <515, 0, 2, 0> DVA[0]=<0:120ca2e000:2000> DVA[1]=<0:3a01ce2000:2000> [L2 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/2000P birth=2552220L/2552220P fill=837415 cksum=1a2b7f91d87:93464a419dc7f:1b3cbbf438232892:4422824b5791f0b5 -- skipping
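As far as I understand the sources, the repeated assertions at dsl_deadlist.c:933 come from livelist_compare(), the comparator used for the livelist tree: it sorts entries by the vdev and offset of DVA[0] and expects two entries that land on the same vdev/offset to carry the same birth txg. A simplified paraphrase (not the exact upstream code, and it relies on the ZFS headers for blkptr_t, DVA_GET_* and ASSERT3U):

/* Comparator for livelist entries, simplified sketch. */
static int
livelist_compare(const void *larg, const void *rarg)
{
	const blkptr_t *l = larg;
	const blkptr_t *r = rarg;

	/* Sort by the vdev of DVA[0] first ... */
	uint64_t l_vdev = DVA_GET_VDEV(&l->blk_dva[0]);
	uint64_t r_vdev = DVA_GET_VDEV(&r->blk_dva[0]);
	if (l_vdev < r_vdev)
		return (-1);
	if (l_vdev > r_vdev)
		return (+1);

	/* ... then by the offset within that vdev. */
	uint64_t l_offset = DVA_GET_OFFSET(&l->blk_dva[0]);
	uint64_t r_offset = DVA_GET_OFFSET(&r->blk_dva[0]);
	if (l_offset == r_offset) {
		/*
		 * Same vdev and offset means the same on-disk block, so the
		 * birth txgs are expected to match.  In the output above they
		 * do not (e.g. 0x107934 vs 0x10790a), which trips the ASSERT.
		 */
		ASSERT3U(l->blk_birth, ==, r->blk_birth);
		return (0);
	}
	return (l_offset < r_offset ? -1 : +1);
}

In other words, zdb keeps finding the same block recorded in a livelist with two different birth times, which fits the Duplicate ALLOC messages earlier in the output.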
I have no other ideas for how to fix the panics in the driver. The last time this happened to me on another pool, I had to manually copy all the files from the corrupt zpool to a newly created one, and I would like to avoid doing that again.
Is there any way to force ZFS to check its AVL tree for errors and correct them by itself?
I would be glad for any hint on how to fix this corrupt vdev!