Description
System information
Type | Version/Name
--- | ---
Distribution Name | Debian
Distribution Version | Sid
Linux Kernel | 5.10.13
Architecture | ppc64le
ZFS Version | 2.0.2
SPL Version | 2.0.2
I can also reproduce this using:
- kernel 5.10.13 with ZFS 2.0.1
- kernel 5.9.11 with ZFS 2.0.1, 2.0.2
But not with ZFS 0.8.6. I can't reproduce it at all on a similar x86 system.
Describe the problem you're observing
When scrubbing a dataset (4 drive raidz2) memory usage rises until all system memory is exhausted, and the kernel panics.
If the scrub is stopped before the kernel panics (`zpool scrub -s`), memory usage drops back to the same level as before the scrub was started.
Describe how to reproduce the problem
This script reproduces the problem:
```bash
#!/bin/bash
# Dump memory statistics, tagging each output file with a label
# (before/during/after).
dump() {
    free -m > free."$1".txt
    cat /proc/spl/kmem/slab > spl-slab."$1".txt
    sudo slabtop -o > slabtop."$1".txt
    sudo cat /proc/slabinfo > slabinfo."$1".txt
    cat /proc/meminfo > meminfo."$1".txt
}

dump before
sudo zpool scrub data
sleep 30
dump during
sudo zpool scrub -s data
dump after
```
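For finer-grained data than the three point-in-time dumps, a small sampling helper can track vmalloc growth while the scrub runs. This is only a sketch; the `sample_vmalloc` name is mine, not part of the report:

```shell
#!/bin/sh
# Sketch: print the current VmallocUsed value (in kB) from /proc/meminfo,
# so it can be sampled repeatedly while the scrub is running.
sample_vmalloc() {
    awk '/^VmallocUsed:/ {print $2}' /proc/meminfo
}

# One sample; in practice loop it, e.g.:
#   while true; do echo "$(date +%s) $(sample_vmalloc)"; sleep 5; done
sample_vmalloc
```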
Used memory increases from 8 GB to 72 GB within 30 seconds, and returns to 8 GB after the scrub is stopped. vmalloc appears to account for the majority of this:
 | VmallocUsed
--- | ---
Before | 2.4 GB (2389248 kB)
During | 68 GB (68183296 kB)
After | 2.4 GB (2408192 kB)
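These figures come straight from the `VmallocUsed` lines of the attached meminfo dumps; a helper along these lines (a sketch, assuming the file naming used by the dump script) extracts and diffs them:

```shell
#!/bin/sh
# Sketch: pull the VmallocUsed value (kB) out of a saved meminfo dump.
# File names assume the dump script's scheme (meminfo.<label>.txt).
vmalloc_kb() {
    awk '/^VmallocUsed:/ {print $2}' "$1"
}

# Example against the attached dumps:
#   echo "$(( $(vmalloc_kb meminfo.during.txt) - $(vmalloc_kb meminfo.before.txt) )) kB growth"
```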
- meminfo.before.txt
- slabinfo.before.txt
- slabtop.before.txt
- spl-slab.before.txt
- free.before.txt
- meminfo.during.txt
- slabinfo.during.txt
- slabtop.during.txt
- spl-slab.during.txt
- free.during.txt
- meminfo.after.txt
- slabinfo.after.txt
- slabtop.after.txt
- spl-slab.after.txt
- free.after.txt
Include any warning/errors/backtraces from the system logs
Last kernel logs (including the OOM killer running) before the kernel panic; unfortunately the panic itself does not get logged to disk:
oom.txt