### System information

Type | Version/Name |
---|---|
Distribution Name | Debian |
Distribution Version | 12 |
Kernel Version | 6.1.0-22 |
Architecture | amd64 |
OpenZFS Version | zfs-2.3.0-rc2 |
The testing VM has 64GB of RAM, with the ZFS ARC pinned to 32GB via the min and max module parameters. A RAIDZ1 pool is configured with deduplication, and the fast dedup feature is enabled and active.
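For reference, the ARC limits correspond to a module configuration along these lines (34359738368 bytes = 32 GiB, matching `c_min`/`c_max` in the arcstat output below; the exact config file path is just an example):

```
# /etc/modprobe.d/zfs.conf -- pin the ARC at 32 GiB
options zfs zfs_arc_min=34359738368 zfs_arc_max=34359738368
```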
**zpool status**

```
  pool: zpool16k
 state: ONLINE
config:

        NAME          STATE     READ WRITE CKSUM
        zpool16k      ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            nvme0n1   ONLINE       0     0     0
            nvme1n1   ONLINE       0     0     0
            nvme2n1   ONLINE       0     0     0
            nvme3n1   ONLINE       0     0     0
            nvme4n1   ONLINE       0     0     0

errors: No known data errors
```
**zpool config**

```
NAME PROPERTY VALUE SOURCE
zpool16k size 14.5T -
zpool16k capacity 84% -
zpool16k altroot - default
zpool16k health ONLINE -
zpool16k guid 6048011872382422538 -
zpool16k version - default
zpool16k bootfs - default
zpool16k delegation on default
zpool16k autoreplace off default
zpool16k cachefile - default
zpool16k failmode wait default
zpool16k listsnapshots off default
zpool16k autoexpand off default
zpool16k dedupratio 1.00x -
zpool16k free 2.23T -
zpool16k allocated 12.3T -
zpool16k readonly off -
zpool16k ashift 12 local
zpool16k comment - default
zpool16k expandsize - -
zpool16k freeing 0 -
zpool16k fragmentation 1% -
zpool16k leaked 0 -
zpool16k multihost off default
zpool16k checkpoint - -
zpool16k load_guid 15783377096588126353 -
zpool16k autotrim off default
zpool16k compatibility off default
zpool16k bcloneused 0 -
zpool16k bclonesaved 0 -
zpool16k bcloneratio 1.00x -
zpool16k dedup_table_size 212G -
zpool16k dedup_table_quota auto default
zpool16k feature@async_destroy enabled local
zpool16k feature@empty_bpobj enabled local
zpool16k feature@lz4_compress active local
zpool16k feature@multi_vdev_crash_dump enabled local
zpool16k feature@spacemap_histogram active local
zpool16k feature@enabled_txg active local
zpool16k feature@hole_birth active local
zpool16k feature@extensible_dataset active local
zpool16k feature@embedded_data active local
zpool16k feature@bookmarks enabled local
zpool16k feature@filesystem_limits enabled local
zpool16k feature@large_blocks enabled local
zpool16k feature@large_dnode enabled local
zpool16k feature@sha512 enabled local
zpool16k feature@skein enabled local
zpool16k feature@edonr enabled local
zpool16k feature@userobj_accounting active local
zpool16k feature@encryption enabled local
zpool16k feature@project_quota active local
zpool16k feature@device_removal enabled local
zpool16k feature@obsolete_counts enabled local
zpool16k feature@zpool_checkpoint enabled local
zpool16k feature@spacemap_v2 active local
zpool16k feature@allocation_classes enabled local
zpool16k feature@resilver_defer enabled local
zpool16k feature@bookmark_v2 enabled local
zpool16k feature@redaction_bookmarks enabled local
zpool16k feature@redacted_datasets enabled local
zpool16k feature@bookmark_written enabled local
zpool16k feature@log_spacemap active local
zpool16k feature@livelist enabled local
zpool16k feature@device_rebuild enabled local
zpool16k feature@zstd_compress enabled local
zpool16k feature@draid enabled local
zpool16k feature@zilsaxattr enabled local
zpool16k feature@head_errlog active local
zpool16k feature@blake3 enabled local
zpool16k feature@block_cloning enabled local
zpool16k feature@vdev_zaps_v2 active local
zpool16k feature@redaction_list_spill enabled local
zpool16k feature@raidz_expansion enabled local
zpool16k feature@fast_dedup active local
zpool16k feature@longname enabled local
zpool16k feature@large_microzap enabled local
```

**zfs config**

```
NAME PROPERTY VALUE SOURCE
zpool16k type filesystem -
zpool16k creation Mon Oct 28 13:24 2024 -
zpool16k used 9.84T -
zpool16k available 1.66T -
zpool16k referenced 9.63T -
zpool16k compressratio 1.00x -
zpool16k mounted yes -
zpool16k quota none default
zpool16k reservation none default
zpool16k recordsize 16K local
zpool16k mountpoint /zpool16k default
zpool16k sharenfs off default
zpool16k checksum on default
zpool16k compression off local
zpool16k atime on default
zpool16k devices on default
zpool16k exec on default
zpool16k setuid on default
zpool16k readonly off default
zpool16k zoned off default
zpool16k snapdir hidden default
zpool16k aclmode discard default
zpool16k aclinherit restricted default
zpool16k createtxg 1 -
zpool16k canmount on default
zpool16k xattr on local
zpool16k copies 1 default
zpool16k version 5 -
zpool16k utf8only on -
zpool16k normalization formD -
zpool16k casesensitivity sensitive -
zpool16k vscan off default
zpool16k nbmand off default
zpool16k sharesmb off default
zpool16k refquota none default
zpool16k refreservation none default
zpool16k guid 3860442583779050184 -
zpool16k primarycache all default
zpool16k secondarycache all default
zpool16k usedbysnapshots 0B -
zpool16k usedbydataset 9.63T -
zpool16k usedbychildren 212G -
zpool16k usedbyrefreservation 0B -
zpool16k logbias latency default
zpool16k objsetid 54 -
zpool16k dedup on local
zpool16k mlslabel none default
zpool16k sync disabled local
zpool16k dnodesize legacy default
zpool16k refcompressratio 1.00x -
zpool16k written 9.63T -
zpool16k logicalused 8.08T -
zpool16k logicalreferenced 8.02T -
zpool16k volmode default default
zpool16k filesystem_limit none default
zpool16k snapshot_limit none default
zpool16k filesystem_count none default
zpool16k snapshot_count none default
zpool16k snapdev hidden default
zpool16k acltype posix local
zpool16k context none default
zpool16k fscontext none default
zpool16k defcontext none default
zpool16k rootcontext none default
zpool16k relatime on local
zpool16k redundant_metadata all default
zpool16k overlay on default
zpool16k encryption off default
zpool16k keylocation none default
zpool16k keyformat none default
zpool16k pbkdf2iters 0 default
zpool16k special_small_blocks 0 default
zpool16k prefetch all default
zpool16k direct standard default
zpool16k longname off default
```

**zpool status with DDT (`zpool status -D`)**

```
  pool: zpool16k
 state: ONLINE
config:

        NAME          STATE     READ WRITE CKSUM
        zpool16k      ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            nvme0n1   ONLINE       0     0     0
            nvme1n1   ONLINE       0     0     0
            nvme2n1   ONLINE       0     0     0
            nvme3n1   ONLINE       0     0     0
            nvme4n1   ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 536870912, size 212G on disk, 136G in core

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     512M      8T      8T   9.59T     512M      8T      8T   9.59T
 Total     512M      8T      8T   9.59T     512M      8T      8T   9.59T
```
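For scale, these numbers are self-consistent: 536870912 DDT entries at 16K recordsize corresponds exactly to the 8T of unique data written, and the reported 136G of in-core DDT (about 272 bytes per entry) is more than four times the 32GB ARC limit configured above.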
**arcstat** (contents of `/proc/spl/kstat/zfs/arcstats`)

```
21 1 0x01 147 39984 64501438472 54609823965886
name type data
hits 4 3178411426
iohits 4 1496
misses 4 64419903
demand_data_hits 4 0
demand_data_iohits 4 0
demand_data_misses 4 0
demand_metadata_hits 4 3178410242
demand_metadata_iohits 4 1496
demand_metadata_misses 4 64418378
prefetch_data_hits 4 0
prefetch_data_iohits 4 0
prefetch_data_misses 4 0
prefetch_metadata_hits 4 1184
prefetch_metadata_iohits 4 0
prefetch_metadata_misses 4 1525
mru_hits 4 225453614
mru_ghost_hits 4 11470786
mfu_hits 4 2952957812
mfu_ghost_hits 4 18958526
uncached_hits 4 0
deleted 4 572151552
mutex_miss 4 4315505
access_skip 4 0
evict_skip 4 5
evict_not_enough 4 457326
evict_l2_cached 4 0
evict_l2_eligible 4 11090623758336
evict_l2_eligible_mfu 4 1735208350208
evict_l2_eligible_mru 4 9355415408128
evict_l2_ineligible 4 8192
evict_l2_skip 4 0
hash_elements 4 3515346
hash_elements_max 4 3754460
hash_collisions 4 175363571
hash_chains 4 562112
hash_chain_max 4 9
meta 4 4294389235
pd 4 2147483648
pm 4 1925069853
c 4 34359738368
c_min 4 34359738368
c_max 4 34359738368
size 4 34373397632
compressed_size 4 31970503680
uncompressed_size 4 68214341120
overhead_size 4 1515084288
hdr_size 4 844770080
data_size 4 147505152
metadata_size 4 33338082816
dbuf_size 4 13324032
dnode_size 4 2105824
bonus_size 4 310400
anon_size 4 0
anon_data 4 0
anon_metadata 4 0
anon_evictable_data 4 0
anon_evictable_metadata 4 0
mru_size 4 17549727232
mru_data 4 147505152
mru_metadata 4 17402222080
mru_evictable_data 4 147505152
mru_evictable_metadata 4 15175630848
mru_ghost_size 4 24628658176
mru_ghost_data 4 16587096064
mru_ghost_metadata 4 8041562112
mru_ghost_evictable_data 4 16587096064
mru_ghost_evictable_metadata 4 8041562112
mfu_size 4 15935860736
mfu_data 4 0
mfu_metadata 4 15935860736
mfu_evictable_data 4 0
mfu_evictable_metadata 4 15934646272
mfu_ghost_size 4 6871023616
mfu_ghost_data 4 0
mfu_ghost_metadata 4 6871023616
mfu_ghost_evictable_data 4 0
mfu_ghost_evictable_metadata 4 6871023616
uncached_size 4 0
uncached_data 4 0
uncached_metadata 4 0
uncached_evictable_data 4 0
uncached_evictable_metadata 4 0
l2_hits 4 0
l2_misses 4 0
l2_prefetch_asize 4 0
l2_mru_asize 4 0
l2_mfu_asize 4 0
l2_bufc_data_asize 4 0
l2_bufc_metadata_asize 4 0
l2_feeds 4 0
l2_rw_clash 4 0
l2_read_bytes 4 0
l2_write_bytes 4 0
l2_writes_sent 4 0
l2_writes_done 4 0
l2_writes_error 4 0
l2_writes_lock_retry 4 0
l2_evict_lock_retry 4 0
l2_evict_reading 4 0
l2_evict_l1cached 4 0
l2_free_on_write 4 0
l2_abort_lowmem 4 0
l2_cksum_bad 4 0
l2_io_error 4 0
l2_size 4 0
l2_asize 4 0
l2_hdr_size 4 0
l2_log_blk_writes 4 0
l2_log_blk_avg_asize 4 0
l2_log_blk_asize 4 0
l2_log_blk_count 4 0
l2_data_to_meta_ratio 4 0
l2_rebuild_success 4 0
l2_rebuild_unsupported 4 0
l2_rebuild_io_errors 4 0
l2_rebuild_dh_errors 4 0
l2_rebuild_cksum_lb_errors 4 0
l2_rebuild_lowmem 4 0
l2_rebuild_size 4 0
l2_rebuild_asize 4 0
l2_rebuild_bufs 4 0
l2_rebuild_bufs_precached 4 0
l2_rebuild_log_blks 4 0
memory_throttle_count 4 0
memory_direct_count 4 0
memory_indirect_count 4 0
memory_all_bytes 4 67418071040
memory_free_bytes 4 15142215680
memory_available_bytes 3 12786138880
arc_no_grow 4 0
arc_tempreserve 4 0
arc_loaned_bytes 4 0
arc_prune 4 0
arc_meta_used 4 34198593152
arc_dnode_limit 4 3435973836
async_upgrade_sync 4 0
predictive_prefetch 4 2709
demand_hit_predictive_prefetch 4 1180
demand_iohit_predictive_prefetch 4 1507
prescient_prefetch 4 0
demand_hit_prescient_prefetch 4 0
demand_iohit_prescient_prefetch 4 0
arc_need_free 4 0
arc_sys_free 4 2356076800
arc_raw_size 4 0
cached_only_in_progress 4 0
abd_chunk_waste_size 4 27299328
```

### Describe the problem you're observing

When writing large files to a pool with dedup enabled and the fast dedup feature active (in my test, four 2TB files, 8TB in total), the entire ARC is used and total RAM consumption sits at around 47GB. When the files are then deleted, RAM usage grows until the system goes into OOM. This reproduces with other recordsizes as well (tested with 16K and 128K), and also with less data and less RAM.

The same can be observed with many small files occupying the same total space on the pool: removing small files one by one works, but attempting to remove many 1GB files simultaneously results in OOM. After a reset, the pool cannot be imported; the import runs into the same OOM condition.
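Memory growth during the delete can be watched with something like the following (my ad-hoc monitoring approach, not taken from the attached logs):

```sh
# print free memory plus the main ARC counters every 5 seconds
watch -n 5 'free -m; grep -E "^(size|arc_meta_used|dnode_size) " /proc/spl/kstat/zfs/arcstats'
```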
### Describe how to reproduce the problem

Write several large files to a pool with deduplication and fast dedup enabled; in my experiment, four 2TB files with 64GB of total RAM (or four 1TB files with 32GB of RAM). Then try to remove the files with `rm`. A sketch of the steps follows below.
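A minimal sketch of the setup and the trigger, assuming the five NVMe devices and the dataset properties shown above (`/dev/urandom` as the data source is an illustration; any unique, non-deduplicatable data should behave the same, matching the 1.00x dedupratio in the report):

```sh
zpool create -o ashift=12 zpool16k raidz1 nvme0n1 nvme1n1 nvme2n1 nvme3n1 nvme4n1
zfs set recordsize=16K compression=off dedup=on sync=disabled zpool16k

# write four 2 TiB files of unique data
for i in 1 2 3 4; do
  dd if=/dev/urandom of=/zpool16k/file$i bs=1M count=2097152
done

# deleting the files together is what drives the system into OOM
rm -f /zpool16k/file1 /zpool16k/file2 /zpool16k/file3 /zpool16k/file4
```

The OOM shows up during the `rm`, not during the writes.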
### Include any warning/errors/backtraces from the system logs

I cannot find the OOM messages in the journal after the reset, so I am attaching a screenshot here.
From the journal log, I see events like the following:

```
Oct 29 04:45:17 zfs-rc2-test kernel: Large kmem_alloc(74904, 0x1000), please file an issue at: https://github.com/openzfs/zfs/issues/new
```
Attaching the full journal and dmesg logs just in case.
log.txt
dmesg.txt