
ZFS Kernel Panic at boot on Arch Linux #11480

Closed
@TheDome

Description


Hi there,

System info

I am using the latest stable zfs-dkms release on Artix Linux with s6-init.

Type                  Version
System                Artix Linux with s6-init
Kernel Version        5.10.4-artix2-1
System Architecture   Arch Linux
ZFS Version           zfs-2.0.1-1

Current behavior

When I boot the system, the virtual console shows a panic message from the ZFS AVL code:

VERIFY(avl_find(tree, new_node, &where) == NULL) failed
PANIC at avl.c:641:avl_add()
CPU: 6 PID: 1019 Comm: z_livelist_dest Tainted: P           OE     5.10.4-artix2-1 #1
Call Trace:
dump_stack+0x6b/0x83
spl_panic+0xef/0x117 [spl]
? allocate_slab+0x30b/0x4c0
? spl_kmem_alloc_impl+0xae/0xf0 [spl]
? __kmalloc_node+0x180/0x380
avl_add+0x9b/0xb0 [zavl]
dsl_livelist_iterate+0x142/0x1f0 [zfs]
 bpobj_iterate_blkptrs+0xfe/0x360 [zfs]
? dsl_deadlist_load_cache+0x250/0x250 [zfs]
bpobj_iterate_impl+0x280/0x4f0 [zfs]
? dsl_deadlist_load_cache+0x250/0x250 [zfs]
 dsl_process_sub_livelist+0xb5/0xf0 [zfs]
? dsl_deadlist_cache_compare+0x20/0x20 [zfs]
spa_livelist_delete_cb+0x1ea/0x350 [zfs]
zthr_procedure+0x135/0x150 [zfs]
? zrl_is_locked+0x20/0x20 [zfs]
thread_generic_wrapper+0x6f/0x80 [spl]
? __thread_exit+0x20/0x20 [spl]
 kthread+0x133/0x150
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x22/0x30
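
In case the full trace is needed again, it can also be pulled from the kernel log instead of the virtual console; a small sketch (the grep patterns simply match the PANIC/VERIFY lines shown above):

# From the kernel ring buffer, with some surrounding context
dmesg --ctime | grep -B 2 -A 40 'PANIC at avl.c'

# Or from the journal for the current boot
journalctl -k -b 0 | grep -B 2 -A 40 'VERIFY(avl_find'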

Additionally, the kernel's hung-task detector periodically reports the blocked ZFS thread, resulting in the following stack trace:

INFO: task z_livelist_dest:1019 blocked for more than 491 seconds.
 Tainted: P           OE     5.10.4-artix2-1 #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:z_livelist_dest state:D stack:    0 pid: 1019 ppid:     2 flags:0x00004000
 Call Trace:
 __schedule+0x295/0x810
 schedule+0x5b/0xc0
 spl_panic+0x115/0x117 [spl]
 ? spl_kmem_alloc_impl+0xae/0xf0 [spl]
 ? __kmalloc_node+0x180/0x380
 avl_add+0x9b/0xb0 [zavl]
 dsl_livelist_iterate+0x142/0x1f0 [zfs]
 bpobj_iterate_blkptrs+0xfe/0x360 [zfs]
 ? dsl_deadlist_load_cache+0x250/0x250 [zfs]
 bpobj_iterate_impl+0x280/0x4f0 [zfs]
 ? dsl_deadlist_load_cache+0x250/0x250 [zfs]
 dsl_process_sub_livelist+0xb5/0xf0 [zfs]
 ? dsl_deadlist_cache_compare+0x20/0x20 [zfs]
 spa_livelist_delete_cb+0x1ea/0x350 [zfs]
 zthr_procedure+0x135/0x150 [zfs]
 ? zrl_is_locked+0x20/0x20 [zfs]
 thread_generic_wrapper+0x6f/0x80 [spl]
 ? __thread_exit+0x20/0x20 [spl]
 kthread+0x133/0x150
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30
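
As far as I can tell this is the same z_livelist_dest thread (PID 1019) that hit the VERIFY above: spl_panic() never returns, so the thread stays in uninterruptible sleep and keeps tripping the hung-task detector. A sketch for confirming that (the PID is taken from the messages above and will differ between boots):

# Dump the in-kernel stack of the stuck livelist-destroy thread (as root)
cat /proc/1019/stack

# Its scheduler state should show "D" (uninterruptible sleep)
ps -o pid,stat,comm -p 1019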

Since I am using ZFS on a mirrored rotational vdev, I can literally hear the zfs driver issuing a huge amount of random write requests to the disks. As the zpool iostat output below shows, over 3K write operations are queued. But I am not doing anything and have not opened any program that writes to the system, so there should be essentially no write requests hitting the disk.

>zpool iostat
               capacity     operations     bandwidth
pool         alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot        48.6G   189G     11  3.15K   320K  31.6M
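
To narrow down where those writes are going, zpool iostat can break the numbers down per vdev and per request size; this is only a diagnostic sketch using standard flags:

# Per-vdev breakdown, refreshed every 5 seconds
zpool iostat -v zroot 5

# Request-size histograms, to confirm these really are many small random writes
zpool iostat -r zroot 5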

Additionally, iotop shows heavy disk utilization by the txg_sync kernel thread, again with no other processes accessing the disk at that time.

> iotop
Total DISK READ :     164.10 M/s | Total DISK WRITE :       0.00 B/s
Actual DISK READ:      95.16 M/s | Actual DISK WRITE:      16.63 M/s
  PID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1521 be/4 root        0.00 B/s    0.00 B/s  0.00 % 83.48 % [txg_sync]
 1552 be/0 root        0.00 B/s    0.00 B/s  0.00 % 69.83 % [vdev_autotrim]
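
My suspicion is that this background I/O is the pool endlessly retrying the livelist/async-destroy work that panics above. A sketch for checking whether background freeing is still pending (assuming that is indeed what the delete thread is doing):

# Space still queued for background freeing
zpool get freeing zroot

# Block until background freeing completes (will never return if the delete thread keeps panicking)
zpool wait -t free zroot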

Expected behavior

ZFS should not issue these random write requests while the system is idle.

Additional info

I am currently using Docker (20.10.1) with the zfs storage driver, so Docker performs ZFS clones and writes. Those datasets seem to produce ZFS errors that cannot be cleared by removing them. After running docker system prune -af, zdb shows:

❯ sudo zdb -d zroot
Dataset mos [META], ID 0, cr_txg 4, 606M, 515 objects
Dataset zroot/ROOT/default@znap_2021-01-13-1004_monthly [ZPL], ID 2064, cr_txg 934007, 11.0G, 353740 objects
Dataset zroot/ROOT/default [ZPL], ID 144, cr_txg 38, 12.1G, 360045 objects
Dataset zroot/ROOT@znap_2021-01-13-1004_monthly [ZPL], ID 2060, cr_txg 934007, 96K, 6 objects
Dataset zroot/ROOT [ZPL], ID 259, cr_txg 34, 96K, 6 objects
Dataset zroot/swap [ZVOL], ID 150, cr_txg 157, 72K, 2 objects
Dataset zroot/HOME/user@znap_2021-01-13-1004_monthly [ZPL], ID 1469, cr_txg 934006, 34.9G, 822515 objects
Dataset zroot/HOME/user [ZPL], ID 515, cr_txg 91, 34.0G, 837682 objects
Dataset zroot/HOME@znap_2021-01-13-1004_monthly [ZPL], ID 1466, cr_txg 934006, 96K, 6 objects
Dataset zroot/HOME [ZPL], ID 387, cr_txg 46, 96K, 6 objects
Dataset zroot@znap_2021-01-13-1004_monthly [ZPL], ID 1342, cr_txg 934005, 96K, 6 objects
Dataset zroot [ZPL], ID 54, cr_txg 1, 96K, 6 objects
	ERROR: Duplicate ALLOC: 0:3700c19000:1000 200L/200P F=0 B=1077805/500933 cksum=3e2115db729c012:2d8e864d83b6e7e1:b98e28f20bc247f7:3ee97a4be2bc609c
	ERROR: Duplicate ALLOC: 0:3700c19000:1000 200L/200P F=0 B=1077805/500933 cksum=3e2115db729c012:2d8e864d83b6e7e1:b98e28f20bc247f7:3ee97a4be2bc609c
	ERROR: Duplicate ALLOC: 0:3700c19000:1000 200L/200P F=0 B=1077805/500933 cksum=3e2115db729c012:2d8e864d83b6e7e1:b98e28f20bc247f7:3ee97a4be2bc609c
zfs_btree_find(tree, node, &where) == NULL (0x5583989d7f10 == 0x0)
ASSERT at ../../module/zfs/btree.c:1296:zfs_btree_add()
[1]    9397 abort      sudo zdb -d zroot
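
Since the damaged livelists presumably belong to clones created by the Docker zfs driver, listing every dataset together with its clone origin might show which ones are still around; this is only a sketch, and the exact layout on another pool will differ:

# All datasets and snapshots with their clone relationships
zfs list -t all -o name,origin,used -r zroot

# For a specific snapshot, show which clones still depend on it
zfs get clones zroot/ROOT/default@znap_2021-01-13-1004_monthly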

Things I've tried already

Since this looks to me like a problem in ZFS's internal data structures, I searched through the existing issues and some old Solaris documentation.
I have already run a zpool scrub to verify data integrity; it completes successfully with no errors. I also ran zdb -bcsvL -AAA as described in this article, though I am not completely sure whether I set the zfs:zfs_recover=1 flag correctly (I set it using /etc/system as said in this post; see the sketch below for what I believe is the Linux equivalent).
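
For reference, /etc/system is the Solaris mechanism; on Linux zfs_recover is a module parameter, so this is how I believe it is meant to be set (a sketch, and I am not certain it changed anything here):

# Persistent: pass the parameter whenever the zfs module is loaded
echo "options zfs zfs_recover=1" >> /etc/modprobe.d/zfs.conf
# With root on ZFS the module is loaded from the initramfs, so it may need regenerating (Arch/Artix):
mkinitcpio -P

# At runtime, once the module is already loaded
echo 1 > /sys/module/zfs/parameters/zfs_recover
cat /sys/module/zfs/parameters/zfs_recover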
zdb -bcsvL also exited with a panic report at first, which is why I had to append -AAA to demote the otherwise fatal assertions to warnings. The output looks like this:

Cached configuration:
        version: 5000
        name: 'zroot'
        state: 0
        txg: 2400861
        pool_guid: 9365715525508061175
        errata: 0
        hostname: 'artixlinux'
        com.delphix:has_per_vdev_zaps
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 9365715525508061175
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11552697276303360577
                path: '/dev/sda' 
                whole_disk: 0
                metaslab_array: 131
                metaslab_shift: 31
                ashift: 12
                asize: 255765250048
                is_log: 0
                create_txg: 4
                com.delphix:vdev_zap_leaf: 129
                com.delphix:vdev_zap_top: 130
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data

MOS Configuration:
        version: 5000
        name: 'zroot'
        state: 0
        txg: 2400861
        pool_guid: 9365715525508061175
        errata: 0
        hostname: 'artixlinux'
        com.delphix:has_per_vdev_zaps
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 9365715525508061175
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11552697276303360577
                path: '/dev/sda'
                whole_disk: 0
                metaslab_array: 131
                metaslab_shift: 31
                ashift: 12
                asize: 255765250048
                is_log: 0
                DTL: 3041
                create_txg: 4
                com.delphix:vdev_zap_leaf: 129
                com.delphix:vdev_zap_top: 130
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data

l->blk_birth == r->blk_birth (0x107a75 == 0x107934)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x107a75 == 0x107934)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
l->blk_birth == r->blk_birth (0x10790a == 0x107934)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
avl_find(tree, new_node, &where) == NULL
ASSERT at ../../module/avl/avl.c:641:avl_add()
l->blk_birth == r->blk_birth (0x107934 == 0x10790a)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
[the previous two lines repeat about 30 more times]
l->blk_birth == r->blk_birth (0x107905 == 0x107932)
ASSERT at ../../module/zfs/dsl_deadlist.c:933:livelist_compare()
avl_find(tree, new_node, &where) == NULL
ASSERT at ../../module/avl/avl.c:641:avl_add()
l->blk_birth == r->blk_birth (0x107932 == 0x107905)
zdb_blkptr_cb: Got error 52 reading <0, 5260, 0, 2> DVA[0]=<0:10c8636000:d000> DVA[1]=<0:1228e6c000:d000> DVA[2]=<0:f4408e000:d000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/d000P birth=2551273L/2551273P fill=1 cksum=b96b4337e5c:12a9c6c2ef60af2:272200e1a4124a6d:a3236cbd3e6d4de5 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 7183, 0, 3> DVA[0]=<0:a08de6000:a000> DVA[1]=<0:3a01cd6000:a000> DVA[2]=<0:371011a000:a000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/a000P birth=2552118L/2552118P fill=1 cksum=8aa7c63331d:ab5b227864148e:d52a86dbe0754906:dbe896479a0074f4 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 7190, 0, 3> DVA[0]=<0:120db42000:c000> DVA[1]=<0:3a04324000:c000> DVA[2]=<0:8202a000:c000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/c000P birth=2552190L/2552190P fill=1 cksum=9d641bed268:f445df807aac09:73715b7fc334ba78:8581d0f6e2d0ce15 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 10299, 0, 2> DVA[0]=<0:a3e0e8000:11000> DVA[1]=<0:3a0de9f000:11000> DVA[2]=<0:826b8000:11000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/11000P birth=2552130L/2552130P fill=1 cksum=da0ba031ed9:1dbdab2ce810cf1:e63b6ae3058778df:af2b0a9bbbd4efed -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 0, 1, 3> DVA[0]=<0:c1582d000:2000> DVA[1]=<0:14031b5000:2000> DVA[2]=<0:8085a000:2000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/2000P birth=2552203L/2552203P fill=49 cksum=1bc4cfc2d1e:812f98b82925e:15efd46b335fd978:d69c46d1d7aee47d -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 0, 1, 4> DVA[0]=<0:c15859000:2000> DVA[1]=<0:14031d9000:2000> DVA[2]=<0:8085c000:2000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/2000P birth=2552220L/2552220P fill=141 cksum=2150faf7dfb:8b65362478f73:174a37f76e7aedf1:b51ee644db7d97b4 -- skipping
zdb_blkptr_cb: Got error 52 reading <0, 153479, 0, 0> DVA[0]=<0:10c9f29000:1000> DVA[1]=<0:11be3b8000:1000> DVA[2]=<0:6cbbea000:1000> [L0 SPA space map] fletcher4 lz4 unencrypted LE contiguous unique triple size=20000L/1000P birth=2552175L/2552175P fill=1 cksum=94871a0d72:1f4400678e74a:34f39294858a255:c232ba0ff4a0390a -- skipping
zdb_blkptr_cb: Got error 52 reading <144, 0, 1, 16> DVA[0]=<0:1424b51000:7000> DVA[1]=<0:c5ef7e000:7000> [L1 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/7000P birth=2552108L/2552108P fill=3434 cksum=8b559a74703:7e5aa2f6b61e0d:a6fe050ed7b89717:fdca16de56b456df -- skipping
zdb_blkptr_cb: Got error 52 reading <515, 0, 2, 0> DVA[0]=<0:120ca2e000:2000> DVA[1]=<0:3a01ce2000:2000> [L2 DMU dnode] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/2000P birth=2552220L/2552220P fill=837415 cksum=1a2b7f91d87:93464a419dc7f:1b3cbbf438232892:4422824b5791f0b5 -- skipping
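
If I read zdb_blkptr_cb correctly, error 52 is EBADE, which ZFS uses as ECKSUM, i.e. checksum failures while reading those space map blocks. To cross-check whether the pool itself records any permanent errors outside of zdb:

# Does the pool report any data errors or per-vdev checksum counters?
zpool status -v zroot

# Re-run a scrub and check the counters again afterwards
zpool scrub zroot
zpool status -v zroot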

I have no other ideas for how to fix these panics in the driver. The last time this happened to me on another vdev, I had to manually copy all the files from the corrupted zpool to a newly created one, and I would like to avoid doing that again.

Is there any way to force ZFS to check its AVL tree for errors and repair them by itself?
I would be glad for any hint on how to fix this corrupted vdev!
