### System information
Type | Version/Name |
---|---|
Distribution Name | Proxmox VE (Debian GNU/Linux 12 (bookworm)) |
Distribution Version | proxmox-ve 8.2.4 |
Kernel Version | Linux erpband-pve1 6.8.8-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.8-2 (2024-06-24T09:00Z) x86_64 GNU/Linux |
Architecture | amd64 |
OpenZFS Version | zfs-2.2.4-pve1 zfs-kmod-2.2.4-pve1 |
### Describe the problem you're observing
Immediately after importing the pool, the ARC size (current) starts growing and quickly exceeds the Max size (zfs_arc_max).
It keeps growing until there is no free RAM left on the server, which ends in an "Out of memory: Killed process" kill. The server has 64 GB of RAM in total.
In arc_summary I can see the Anonymous metadata size growing, and the ARC size (current) growing along with it. By the time the Anonymous metadata size reaches 50.1 GiB and the Target size (adaptive) reaches 52.2 GiB, there is no free RAM left in the system.
arc_summary | grep -E 'ARC size \(current\)|Min size \(hard limit\)|Max size \(high water\)|Anonymous metadata size'
ARC size (current):                     2616.7 %   52.3 GiB
Min size (hard limit):                    50.0 %  1024.0 MiB
Max size (high water):                     2:1       2.0 GiB
Anonymous metadata size:                  99.5 %   50.9 GiB
Why is the ARC max limit (/sys/module/zfs/parameters/zfs_arc_max), set to 2 GiB, being ignored? How is this possible?
What is "Anonymous metadata size"? I couldn't find any documentation on it.
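The closest raw counters I could relate it to are the anon_* entries in arcstats (assuming arc_summary derives its "Anonymous ..." lines from these):

grep '^anon_' /proc/spl/kstat/zfs/arcstats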
### Describe how to reproduce the problem
A complete video demonstrating the problem.
https://youtu.be/ygukivJmaKw?si=jF5vZMDQm7FyfUmw&t=436
I import the pool:
root@erpband-pve1:~# zpool import -d /dev/disk/by-partlabel/ zp-erp1-hdd
After 3-5 minutes, the ARC size starts to grow:
arc_summary | grep -E 'ARC size \(current\)|Min size \(hard limit\)|Max size \(high water\)|Anonymous metadata size'
ARC size (current):                     2616.7 %   52.3 GiB
Min size (hard limit):                    50.0 %  1024.0 MiB
Max size (high water):                     2:1       2.0 GiB
Anonymous metadata size:                  99.5 %   50.9 GiB
The maximum ARC size is set to 2 GiB, but it is ignored.
ARC min (1 GiB):
root@erpband-pve1:~# cat /sys/module/zfs/parameters/zfs_arc_min
1073741823
ARC max (2 GiB):
root@erpband-pve1:~# cat /sys/module/zfs/parameters/zfs_arc_max
2147483647
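As a cross-check, the same limits as the ARC itself sees them (c_min/c_max in arcstats should mirror the module parameters):

grep -wE 'c_min|c_max' /proc/spl/kstat/zfs/arcstats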
I wrote a script (arc_monitor.txt) that prints the ARC size (current), Anonymous metadata size, Max size (high water), and Min size (hard limit) as a table. It clearly shows the Anonymous metadata size growing, and the ARC size (current) growing along with it.
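A minimal sketch of the same idea (not the attached script; it assumes the zfs 2.2 kstat names size, anon_metadata, c_max and c_min in /proc/spl/kstat/zfs/arcstats, which appear to back the four arc_summary lines above):

#!/bin/sh
# Minimal monitor sketch: sample the ARC counters once per second.
# Assumed kstat names: size, anon_metadata, c_max, c_min (zfs 2.2).
printf '%-9s %14s %14s %14s %14s\n' time size anon_meta c_max c_min
while sleep 1; do
    awk -v ts="$(date +%H:%M:%S)" '
        $1 == "size"          { size = $3 }   # ARC size (current)
        $1 == "anon_metadata" { anon = $3 }   # Anonymous metadata size
        $1 == "c_max"         { cmax = $3 }   # Max size (high water)
        $1 == "c_min"         { cmin = $3 }   # Min size (hard limit)
        END { printf "%-9s %10.1f GiB %10.1f GiB %10.1f GiB %10.1f GiB\n",
              ts, size/2^30, anon/2^30, cmax/2^30, cmin/2^30 }
    ' /proc/spl/kstat/zfs/arcstats
done

Its output during the reproduction: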
time      ARC size (current)  Anonymous metadata size  Max size (high water)  Min size (hard limit)
14:50:22             2.0 GiB                813.8 MiB                2.0 GiB             1024.0 MiB
14:50:23             2.0 GiB                  1.1 GiB                2.0 GiB             1024.0 MiB
14:50:24             2.3 GiB                  1.3 GiB                2.0 GiB             1024.0 MiB
14:50:26             2.5 GiB                  1.6 GiB                2.0 GiB             1024.0 MiB
14:50:27             2.8 GiB                  1.8 GiB                2.0 GiB             1024.0 MiB
14:50:28             3.1 GiB                  2.1 GiB                2.0 GiB             1024.0 MiB
14:50:30             3.3 GiB                  2.4 GiB                2.0 GiB             1024.0 MiB
14:50:31             3.6 GiB                  2.6 GiB                2.0 GiB             1024.0 MiB
14:50:32             3.8 GiB                  2.9 GiB                2.0 GiB             1024.0 MiB
...
14:54:26            50.6 GiB                 49.1 GiB                2.0 GiB             1024.0 MiB
14:54:27            50.8 GiB                 49.3 GiB                2.0 GiB             1024.0 MiB
14:54:28            51.0 GiB                 49.5 GiB                2.0 GiB             1024.0 MiB
14:54:29            51.2 GiB                 49.7 GiB                2.0 GiB             1024.0 MiB
14:54:31            51.3 GiB                 49.9 GiB                2.0 GiB             1024.0 MiB
14:54:32            51.5 GiB                 50.1 GiB                2.0 GiB             1024.0 MiB
14:54:33            51.7 GiB                 50.2 GiB                2.0 GiB             1024.0 MiB
14:54:34            51.8 GiB                 50.4 GiB                2.0 GiB             1024.0 MiB
14:54:36            52.0 GiB                 50.5 GiB                2.0 GiB             1024.0 MiB
14:54:37            52.2 GiB                 50.7 GiB                2.0 GiB             1024.0 MiB
14:54:38            52.3 GiB                 50.9 GiB                2.0 GiB             1024.0 MiB
At this point, the system froze due to lack of RAM.
### Additional information
root@erpband-pve1:~# zpool status -v -t -D
  pool: zp-erp1-hdd
 state: ONLINE
config:

        NAME                               STATE     READ WRITE CKSUM
        zp-erp1-hdd                        ONLINE       0     0     0
          raidz2-0                         ONLINE       0     0     0
            p-zp-erp1-hdd-rz2-d1           ONLINE       0     0     0  (trim unsupported)
            p-zp-erp1-hdd-rz2-d2           ONLINE       0     0     0  (trim unsupported)
            p-zp-erp1-hdd-rz2-d3           ONLINE       0     0     0  (trim unsupported)
            p-zp-erp1-hdd-rz2-d4           ONLINE       0     0     0  (trim unsupported)
            p-zp-erp1-hdd-rz2-d5           ONLINE       0     0     0  (trim unsupported)
            p-zp-erp1-hdd-rz2-d6           ONLINE       0     0     0  (trim unsupported)
        special
          mirror-1                         ONLINE       0     0     0
            p-zp-erp1-hdd-nvme-spec-mi-d1  ONLINE       0     0     0  (untrimmed)
            p-zp-erp1-hdd-nvme-spec-mi-d2  ONLINE       0     0     0  (untrimmed)
        logs
          mirror-2                         ONLINE       0     0     0
            p-zp-erp1-hdd-nvme-log-mi-d1   ONLINE       0     0     0  (untrimmed)
            p-zp-erp1-hdd-nvme-log-mi-d2   ONLINE       0     0     0  (untrimmed)
        cache
          p-zp-erp1-hdd-nvme-cache-d1      ONLINE       0     0     0  (untrimmed)
          p-zp-erp1-hdd-nvme-cache-d2      ONLINE       0     0     0  (untrimmed)

errors: No known data errors

 dedup: DDT entries 211037580, size 313B on disk, 173B in core

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     189M   23.6T   15.5T   16.3T     189M   23.6T   15.5T   16.3T
     2    10.0M   1.25T    928G    962G    21.9M   2.74T   1.96T   2.04T
     4    1.75M    224G    125G    135G    8.76M   1.10T    618G    667G
     8     688K   86.1G   40.2G   44.9G    6.73M    861G    400G    448G
    16    46.6K   5.83G   3.36G   3.63G     978K    122G   71.0G   76.5G
    32    3.96K    507M    259M    284M     165K   20.6G   10.7G   11.7G
    64      624     78M   26.3M   31.5M    51.8K   6.48G   2.13G   2.56G
   128      158   19.8M   6.51M   7.57M    27.2K   3.40G   1.11G   1.29G
   256      199   24.9M     12M   12.8M    71.6K   8.95G   4.56G   4.83G
   512       95   11.9M   6.52M   6.76M    77.1K   9.64G   5.61G   5.77G
    1K       32      4M   2.51M   2.58M    34.7K   4.34G   2.60G   2.68G
    2K        7    896K     56K    112K    17.9K   2.24G    143M    287M
    4K        1    128K      8K   16.0K    5.39K    690M   43.1M   86.1M
    8K        1    128K      8K   16.0K    11.7K   1.47G   93.9M    188M
   64K        1    128K      8K   16.0K    70.1K   8.77G    561M   1.09G
 Total     201M   25.2T   16.6T   17.4T     228M   28.5T   18.6T   19.5T

  pool: zp-erp1-nvme
 state: ONLINE
config:

        NAME                                                    STATE     READ WRITE CKSUM
        zp-erp1-nvme                                            ONLINE       0     0     0
          mirror-0                                              ONLINE       0     0     0
            nvme-eui.002538ba31b4e232-part4                     ONLINE       0     0     0  (100% trimmed, completed at Sun Jul 7 00:24:42 2024)
            nvme-Samsung_SSD_980_PRO_1TB_S5GXNU0WC31300E_1-part4  ONLINE     0     0     0  (100% trimmed, completed at Sun Jul 7 00:24:42 2024)

errors: No known data errors

 dedup: no DDT entries
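For scale (plain arithmetic on the zp-erp1-hdd DDT numbers above, not a claim about what the ARC is actually holding): a fully in-core copy of that dedup table would be on the same order as the anonymous metadata growth seen above.

echo $((211037580 * 173))   # DDT entries * in-core entry size = 36509501340 bytes ≈ 34.0 GiB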
root@erpband-pve1:~# zpool list -v
NAME                                                      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zp-erp1-hdd                                              44.0T  26.3T  17.7T        -         -     0%    59%  1.12x  ONLINE  -
  raidz2-0                                               43.7T  26.2T  17.5T        -         -     0%  60.0%      -  ONLINE
    p-zp-erp1-hdd-rz2-d1                                 7.28T      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-rz2-d2                                 7.28T      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-rz2-d3                                 7.28T      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-rz2-d4                                 7.28T      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-rz2-d5                                 7.28T      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-rz2-d6                                 7.28T      -      -        -         -      -      -      -  ONLINE
special                                                      -      -      -        -         -      -      -      -  -
  mirror-1                                                348G  83.7G   264G        -         -    53%  24.0%      -  ONLINE
    p-zp-erp1-hdd-nvme-spec-mi-d1                         350G      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-nvme-spec-mi-d2                         350G      -      -        -         -      -      -      -  ONLINE
logs                                                         -      -      -        -         -      -      -      -  -
  mirror-2                                               6.50G      0  6.50G        -         -     0%  0.00%      -  ONLINE
    p-zp-erp1-hdd-nvme-log-mi-d1                            7G      -      -        -         -      -      -      -  ONLINE
    p-zp-erp1-hdd-nvme-log-mi-d2                            7G      -      -        -         -      -      -      -  ONLINE
cache                                                        -      -      -        -         -      -      -      -  -
  p-zp-erp1-hdd-nvme-cache-d1                              404G   122G   282G       -         -     0%  30.2%      -  ONLINE
  p-zp-erp1-hdd-nvme-cache-d2                              404G   122G   282G       -         -     0%  30.3%      -  ONLINE
zp-erp1-nvme                                             99.5G  41.0G  58.5G        -         -    50%    41%  1.00x  ONLINE  -
  mirror-0                                               99.5G  41.0G  58.5G        -         -    50%  41.2%      -  ONLINE
    nvme-eui.002538ba31b4e232-part4                       100G      -      -        -         -      -      -      -  ONLINE
    nvme-Samsung_SSD_980_PRO_1TB_S5GXNU0WC31300E_1-part4  100G      -      -        -         -      -      -      -  ONLINE
root@erpband-pve1:~# lsblk -o NAME,PARTLABEL,SIZE,PATH,mountpoint
NAME            PARTLABEL                       SIZE PATH                  MOUNTPOINT
sda                                             7.3T /dev/sda
└─sda1          p-zp-erp1-hdd-rz2-d1            7.3T /dev/sda1
sdb                                             7.3T /dev/sdb
└─sdb1          p-zp-erp1-hdd-rz2-d2            7.3T /dev/sdb1
sdc                                             7.3T /dev/sdc
└─sdc1          p-zp-erp1-hdd-rz2-d3            7.3T /dev/sdc1
sdd                                             7.3T /dev/sdd
└─sdd1          p-zp-erp1-hdd-rz2-d4            7.3T /dev/sdd1
sde                                             7.3T /dev/sde
└─sde1          p-zp-erp1-hdd-rz2-d5            7.3T /dev/sde1
sdf                                             7.3T /dev/sdf
└─sdf1          p-zp-erp1-hdd-rz2-d6            7.3T /dev/sdf1
zd0                                               1M /dev/zd0
zd16                                             80G /dev/zd16
├─zd16p1        EFI system partition            100M /dev/zd16p1
├─zd16p2        Microsoft reserved partition     16M /dev/zd16p2
└─zd16p3        Basic data partition           79.9G /dev/zd16p3
nvme0n1                                       931.5G /dev/nvme0n1
├─nvme0n1p1     BIOS boot partition               3M /dev/nvme0n1p1
├─nvme0n1p2     EFI system partition            512M /dev/nvme0n1p2
├─nvme0n1p3     Linux RAID                       70G /dev/nvme0n1p3
│ └─md0                                        69.9G /dev/md0
│ ├─pve-root                                     50G /dev/mapper/pve-root  /
│ └─pve-swap                                     16G /dev/mapper/pve-swap  [SWAP]
├─nvme0n1p4     p-zp-erp1-nvme-mi-d1            100G /dev/nvme0n1p4
├─nvme0n1p5     p-zp-erp1-hdd-nvme-spec-mi-d1   350G /dev/nvme0n1p5
├─nvme0n1p6     p-zp-erp1-hdd-nvme-cache-d1     404G /dev/nvme0n1p6
└─nvme0n1p7     p-zp-erp1-hdd-nvme-log-mi-d1      7G /dev/nvme0n1p7
nvme1n1                                       931.5G /dev/nvme1n1
├─nvme1n1p1     BIOS boot partition               3M /dev/nvme1n1p1
├─nvme1n1p2     EFI system partition            512M /dev/nvme1n1p2
├─nvme1n1p3     Linux RAID                       70G /dev/nvme1n1p3
│ └─md0                                        69.9G /dev/md0
│ ├─pve-root                                     50G /dev/mapper/pve-root  /
│ └─pve-swap                                     16G /dev/mapper/pve-swap  [SWAP]
├─nvme1n1p4     p-zp-erp1-nvme-mi-d2            100G /dev/nvme1n1p4
├─nvme1n1p5     p-zp-erp1-hdd-nvme-spec-mi-d2   350G /dev/nvme1n1p5
├─nvme1n1p6     p-zp-erp1-hdd-nvme-cache-d2     404G /dev/nvme1n1p6
└─nvme1n1p7     p-zp-erp1-hdd-nvme-log-mi-d2      7G /dev/nvme1n1p7
root@erpband-pve1:~# zfs version
zfs-2.2.4-pve1
zfs-kmod-2.2.4-pve1
root@erpband-pve1:~# uname -a
Linux erpband-pve1 6.8.8-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.8-2 (2024-06-24T09:00Z) x86_64 GNU/Linux
root@erpband-pve1:~# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 12 (bookworm)
Release: 12
Codename: bookworm
root@erpband-pve1:~# zpool get all zp-erp1-hdd
NAME PROPERTY VALUE SOURCE
zp-erp1-hdd size 44.0T -
zp-erp1-hdd capacity 59% -
zp-erp1-hdd altroot - default
zp-erp1-hdd health ONLINE -
zp-erp1-hdd guid 12804097997330970578 -
zp-erp1-hdd version - default
zp-erp1-hdd bootfs - default
zp-erp1-hdd delegation on default
zp-erp1-hdd autoreplace off default
zp-erp1-hdd cachefile - default
zp-erp1-hdd failmode wait default
zp-erp1-hdd listsnapshots off default
zp-erp1-hdd autoexpand off default
zp-erp1-hdd dedupratio 1.12x -
zp-erp1-hdd free 17.9T -
zp-erp1-hdd allocated 26.1T -
zp-erp1-hdd readonly off -
zp-erp1-hdd ashift 13 local
zp-erp1-hdd comment - default
zp-erp1-hdd expandsize - -
zp-erp1-hdd freeing 0 -
zp-erp1-hdd fragmentation 0% -
zp-erp1-hdd leaked 0 -
zp-erp1-hdd multihost off default
zp-erp1-hdd checkpoint - -
zp-erp1-hdd load_guid 14916355245013050689 -
zp-erp1-hdd autotrim off default
zp-erp1-hdd compatibility off default
zp-erp1-hdd bcloneused 0 -
zp-erp1-hdd bclonesaved 0 -
zp-erp1-hdd bcloneratio 1.00x -
zp-erp1-hdd feature@async_destroy enabled local
zp-erp1-hdd feature@empty_bpobj active local
zp-erp1-hdd feature@lz4_compress active local
zp-erp1-hdd feature@multi_vdev_crash_dump enabled local
zp-erp1-hdd feature@spacemap_histogram active local
zp-erp1-hdd feature@enabled_txg active local
zp-erp1-hdd feature@hole_birth active local
zp-erp1-hdd feature@extensible_dataset active local
zp-erp1-hdd feature@embedded_data active local
zp-erp1-hdd feature@bookmarks enabled local
zp-erp1-hdd feature@filesystem_limits enabled local
zp-erp1-hdd feature@large_blocks enabled local
zp-erp1-hdd feature@large_dnode enabled local
zp-erp1-hdd feature@sha512 enabled local
zp-erp1-hdd feature@skein enabled local
zp-erp1-hdd feature@edonr enabled local
zp-erp1-hdd feature@userobj_accounting active local
zp-erp1-hdd feature@encryption enabled local
zp-erp1-hdd feature@project_quota active local
zp-erp1-hdd feature@device_removal enabled local
zp-erp1-hdd feature@obsolete_counts enabled local
zp-erp1-hdd feature@zpool_checkpoint enabled local
zp-erp1-hdd feature@spacemap_v2 active local
zp-erp1-hdd feature@allocation_classes active local
zp-erp1-hdd feature@resilver_defer enabled local
zp-erp1-hdd feature@bookmark_v2 enabled local
zp-erp1-hdd feature@redaction_bookmarks enabled local
zp-erp1-hdd feature@redacted_datasets enabled local
zp-erp1-hdd feature@bookmark_written enabled local
zp-erp1-hdd feature@log_spacemap active local
zp-erp1-hdd feature@livelist enabled local
zp-erp1-hdd feature@device_rebuild enabled local
zp-erp1-hdd feature@zstd_compress active local
zp-erp1-hdd feature@draid enabled local
zp-erp1-hdd feature@zilsaxattr enabled local
zp-erp1-hdd feature@head_errlog active local
zp-erp1-hdd feature@blake3 enabled local
zp-erp1-hdd feature@block_cloning enabled local
zp-erp1-hdd feature@vdev_zaps_v2 active local
### Include any warning/errors/backtraces from the system logs
There are no errors or warnings in the logs; the pool imports successfully but then consumes all the RAM. Here is the log file covering the period from the pool import to the system freeze:
logs_2024-07-07_18-40-36.txt
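If it helps, any OOM-killer messages can be pulled from the kernel log with a standard journalctl filter:

journalctl -k | grep -iE 'out of memory|oom'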