System information
Type | Version/Name |
---|---|
Distribution Name | Arch Linux |
Distribution Version | rolling |
Kernel Version | 6.6.3-arch1-1 |
Architecture | amd64 |
OpenZFS Version | zfs-2.2.99-241_g3e4bef52b0 / zfs-kmod-2.2.99-241_g3e4bef52b0 - git as of 01.12.23 |
Describe the problem you're observing
I get this oops when compiling OpenWrt on a pool running current git with:
zfs_bclone_enabled=1
zfs_dmu_offset_next_sync=1
Dec 01 12:50:47 futro2 kernel: PANIC: zfs: adding existent segment to range tree (offset=11f694000 size=7000)
Dec 01 12:50:47 futro2 kernel: Showing stack for process 288
Dec 01 12:50:47 futro2 kernel: CPU: 3 PID: 288 Comm: txg_sync Tainted: P U OE 6.6.3-arch1-1 #1 6156c717f7d423f5954ce718462aaaaa43b9110d
Dec 01 12:50:47 futro2 kernel: Hardware name: FUJITSU FUTRO S740/D3544-A1, BIOS V5.0.0.13 R1.13.0 for D3544-A1x 09/23/2022
Dec 01 12:50:47 futro2 kernel: Call Trace:
Dec 01 12:50:47 futro2 kernel: <TASK>
Dec 01 12:50:47 futro2 kernel: dump_stack_lvl+0x47/0x60
Dec 01 12:50:47 futro2 kernel: vcmn_err+0xdf/0x120 [spl 8e72ae35b64a0f5a2b6fea420c9c9e09f33fc00d]
Dec 01 12:50:47 futro2 kernel: zfs_panic_recover+0x79/0xa0 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: range_tree_add_impl+0x28f/0xea0 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: ? __pfx_range_tree_add+0x10/0x10 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: range_tree_vacate+0x85/0x230 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: metaslab_sync_done+0x149/0x540 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: vdev_sync_done+0x3a/0x90 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: spa_sync+0x893/0x1070 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: txg_sync_thread+0x1fe/0x3a0 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: ? __pfx_txg_sync_thread+0x10/0x10 [zfs 90d504f36e61841082f23aea7ae276b260ab21d6]
Dec 01 12:50:47 futro2 kernel: ? __pfx_thread_generic_wrapper+0x10/0x10 [spl 8e72ae35b64a0f5a2b6fea420c9c9e09f33fc00d]
Dec 01 12:50:47 futro2 kernel: thread_generic_wrapper+0x5b/0x70 [spl 8e72ae35b64a0f5a2b6fea420c9c9e09f33fc00d]
Dec 01 12:50:47 futro2 kernel: kthread+0xe5/0x120
Dec 01 12:50:47 futro2 kernel: ? __pfx_kthread+0x10/0x10
Dec 01 12:50:47 futro2 kernel: ret_from_fork+0x31/0x50
Dec 01 12:50:47 futro2 kernel: ? __pfx_kthread+0x10/0x10
Dec 01 12:50:47 futro2 kernel: ret_from_fork_asm+0x1b/0x30
Dec 01 12:50:47 futro2 kernel: </TASK>
I/O hangs, and after a reboot the pool can no longer be imported.
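To make the panic line easier to compare against other reports, the offset and size can be converted to decimal bytes. This is only a rough sketch: `decode_range_panic` is a made-up helper name, and it assumes the two values in the message are hexadecimal (the `f` in the offset suggests so).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of OpenZFS.
# Reads kernel log lines on stdin and, for every range-tree panic line,
# prints offset, size and end (offset + size) in decimal bytes.
# Assumption: offset= and size= are printed in hexadecimal.
decode_range_panic() {
    local re='range tree \(offset=([0-9a-fA-F]+) size=([0-9a-fA-F]+)\)'
    local line off size
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            off=${BASH_REMATCH[1]}
            size=${BASH_REMATCH[2]}
            # 16#<digits> tells bash arithmetic to read the digits as base 16.
            printf 'offset=%d bytes, size=%d bytes, end=%d bytes\n' \
                "$((16#$off))" "$((16#$size))" "$((16#$off + 16#$size))"
        fi
    done
}
```

It could be fed from the journal of the previous boot, for example with `journalctl -k -b -1 | decode_range_panic`. For the line above, size=7000 works out to 28672 bytes, i.e. 28 KiB.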
Describe how to reproduce the problem
This is unfortunately somewhat tricky. It happens during the kernel build when the vDSO library is generated, which is done via a C program. I've already detailed all the steps in #15513 (comment), but this appears to be a slightly different bug. #15485 might also be related.
I can reproduce it reliably by building OpenWrt:
$ git clone https://github.com/openwrt/openwrt
$ cd openwrt
$ ./scripts/feeds update -a && ./scripts/feeds install -a
$ make defconfig
$ make -j$(nproc)
...
machine hangs
...
Unfortunately, I still haven't figured out how to isolate the vDSO generation, but building OpenWrt until the bug is triggered doesn't take that long. Requirements for the build are documented here: https://openwrt.org/docs/guide-developer/toolchain/install-buildsystem#linux_gnu-linux_distributions
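Before starting a reproduction run, it may be worth confirming that the two tunables from the description are actually set on the test machine. A minimal sketch, assuming the module parameters are exposed under the usual Linux sysfs path `/sys/module/zfs/parameters` (`show_zfs_params` is just an illustrative name):

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of OpenZFS.
# Prints name=value for the two module parameters used in this report.
# The optional first argument overrides the parameter directory; the
# default is the assumed standard sysfs location on Linux.
show_zfs_params() {
    local dir=${1:-/sys/module/zfs/parameters} name
    for name in zfs_bclone_enabled zfs_dmu_offset_next_sync; do
        if [[ -r $dir/$name ]]; then
            printf '%s=%s\n' "$name" "$(<"$dir/$name")"
        else
            # Module not loaded, or this build does not have the parameter.
            printf '%s=unavailable\n' "$name"
        fi
    done
}
```

Running `show_zfs_params` with no argument reads the live values. For this report both should print `1`.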