OOM after files remove with dedup on and fast dedup enabled #16697

Closed
@jtblck90

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  12
Kernel Version        6.1.0-22
Architecture          amd64
OpenZFS Version       zfs-2.3.0-rc2

The test VM has 64GB of RAM, with 32GB assigned to the ZFS ARC via the min and max parameters. A RAIDZ1 pool is configured with deduplication, and the fast dedup feature is enabled and active.

zpool status

  pool: zpool16k
 state: ONLINE
config:

	NAME         STATE     READ WRITE CKSUM
	zpool16k     ONLINE       0     0     0
	  raidz1-0   ONLINE       0     0     0
	    nvme0n1  ONLINE       0     0     0
	    nvme1n1  ONLINE       0     0     0
	    nvme2n1  ONLINE       0     0     0
	    nvme3n1  ONLINE       0     0     0
	    nvme4n1  ONLINE       0     0     0

errors: No known data errors

zpool config

NAME      PROPERTY                       VALUE                          SOURCE
zpool16k  size                           14.5T                          -
zpool16k  capacity                       84%                            -
zpool16k  altroot                        -                              default
zpool16k  health                         ONLINE                         -
zpool16k  guid                           6048011872382422538            -
zpool16k  version                        -                              default
zpool16k  bootfs                         -                              default
zpool16k  delegation                     on                             default
zpool16k  autoreplace                    off                            default
zpool16k  cachefile                      -                              default
zpool16k  failmode                       wait                           default
zpool16k  listsnapshots                  off                            default
zpool16k  autoexpand                     off                            default
zpool16k  dedupratio                     1.00x                          -
zpool16k  free                           2.23T                          -
zpool16k  allocated                      12.3T                          -
zpool16k  readonly                       off                            -
zpool16k  ashift                         12                             local
zpool16k  comment                        -                              default
zpool16k  expandsize                     -                              -
zpool16k  freeing                        0                              -
zpool16k  fragmentation                  1%                             -
zpool16k  leaked                         0                              -
zpool16k  multihost                      off                            default
zpool16k  checkpoint                     -                              -
zpool16k  load_guid                      15783377096588126353           -
zpool16k  autotrim                       off                            default
zpool16k  compatibility                  off                            default
zpool16k  bcloneused                     0                              -
zpool16k  bclonesaved                    0                              -
zpool16k  bcloneratio                    1.00x                          -
zpool16k  dedup_table_size               212G                           -
zpool16k  dedup_table_quota              auto                           default
zpool16k  feature@async_destroy          enabled                        local
zpool16k  feature@empty_bpobj            enabled                        local
zpool16k  feature@lz4_compress           active                         local
zpool16k  feature@multi_vdev_crash_dump  enabled                        local
zpool16k  feature@spacemap_histogram     active                         local
zpool16k  feature@enabled_txg            active                         local
zpool16k  feature@hole_birth             active                         local
zpool16k  feature@extensible_dataset     active                         local
zpool16k  feature@embedded_data          active                         local
zpool16k  feature@bookmarks              enabled                        local
zpool16k  feature@filesystem_limits      enabled                        local
zpool16k  feature@large_blocks           enabled                        local
zpool16k  feature@large_dnode            enabled                        local
zpool16k  feature@sha512                 enabled                        local
zpool16k  feature@skein                  enabled                        local
zpool16k  feature@edonr                  enabled                        local
zpool16k  feature@userobj_accounting     active                         local
zpool16k  feature@encryption             enabled                        local
zpool16k  feature@project_quota          active                         local
zpool16k  feature@device_removal         enabled                        local
zpool16k  feature@obsolete_counts        enabled                        local
zpool16k  feature@zpool_checkpoint       enabled                        local
zpool16k  feature@spacemap_v2            active                         local
zpool16k  feature@allocation_classes     enabled                        local
zpool16k  feature@resilver_defer         enabled                        local
zpool16k  feature@bookmark_v2            enabled                        local
zpool16k  feature@redaction_bookmarks    enabled                        local
zpool16k  feature@redacted_datasets      enabled                        local
zpool16k  feature@bookmark_written       enabled                        local
zpool16k  feature@log_spacemap           active                         local
zpool16k  feature@livelist               enabled                        local
zpool16k  feature@device_rebuild         enabled                        local
zpool16k  feature@zstd_compress          enabled                        local
zpool16k  feature@draid                  enabled                        local
zpool16k  feature@zilsaxattr             enabled                        local
zpool16k  feature@head_errlog            active                         local
zpool16k  feature@blake3                 enabled                        local
zpool16k  feature@block_cloning          enabled                        local
zpool16k  feature@vdev_zaps_v2           active                         local
zpool16k  feature@redaction_list_spill   enabled                        local
zpool16k  feature@raidz_expansion        enabled                        local
zpool16k  feature@fast_dedup             active                         local
zpool16k  feature@longname               enabled                        local
zpool16k  feature@large_microzap         enabled                        local

zfs config

NAME      PROPERTY              VALUE                  SOURCE
zpool16k  type                  filesystem             -
zpool16k  creation              Mon Oct 28 13:24 2024  -
zpool16k  used                  9.84T                  -
zpool16k  available             1.66T                  -
zpool16k  referenced            9.63T                  -
zpool16k  compressratio         1.00x                  -
zpool16k  mounted               yes                    -
zpool16k  quota                 none                   default
zpool16k  reservation           none                   default
zpool16k  recordsize            16K                    local
zpool16k  mountpoint            /zpool16k              default
zpool16k  sharenfs              off                    default
zpool16k  checksum              on                     default
zpool16k  compression           off                    local
zpool16k  atime                 on                     default
zpool16k  devices               on                     default
zpool16k  exec                  on                     default
zpool16k  setuid                on                     default
zpool16k  readonly              off                    default
zpool16k  zoned                 off                    default
zpool16k  snapdir               hidden                 default
zpool16k  aclmode               discard                default
zpool16k  aclinherit            restricted             default
zpool16k  createtxg             1                      -
zpool16k  canmount              on                     default
zpool16k  xattr                 on                     local
zpool16k  copies                1                      default
zpool16k  version               5                      -
zpool16k  utf8only              on                     -
zpool16k  normalization         formD                  -
zpool16k  casesensitivity       sensitive              -
zpool16k  vscan                 off                    default
zpool16k  nbmand                off                    default
zpool16k  sharesmb              off                    default
zpool16k  refquota              none                   default
zpool16k  refreservation        none                   default
zpool16k  guid                  3860442583779050184    -
zpool16k  primarycache          all                    default
zpool16k  secondarycache        all                    default
zpool16k  usedbysnapshots       0B                     -
zpool16k  usedbydataset         9.63T                  -
zpool16k  usedbychildren        212G                   -
zpool16k  usedbyrefreservation  0B                     -
zpool16k  logbias               latency                default
zpool16k  objsetid              54                     -
zpool16k  dedup                 on                     local
zpool16k  mlslabel              none                   default
zpool16k  sync                  disabled               local
zpool16k  dnodesize             legacy                 default
zpool16k  refcompressratio      1.00x                  -
zpool16k  written               9.63T                  -
zpool16k  logicalused           8.08T                  -
zpool16k  logicalreferenced     8.02T                  -
zpool16k  volmode               default                default
zpool16k  filesystem_limit      none                   default
zpool16k  snapshot_limit        none                   default
zpool16k  filesystem_count      none                   default
zpool16k  snapshot_count        none                   default
zpool16k  snapdev               hidden                 default
zpool16k  acltype               posix                  local
zpool16k  context               none                   default
zpool16k  fscontext             none                   default
zpool16k  defcontext            none                   default
zpool16k  rootcontext           none                   default
zpool16k  relatime              on                     local
zpool16k  redundant_metadata    all                    default
zpool16k  overlay               on                     default
zpool16k  encryption            off                    default
zpool16k  keylocation           none                   default
zpool16k  keyformat             none                   default
zpool16k  pbkdf2iters           0                      default
zpool16k  special_small_blocks  0                      default
zpool16k  prefetch              all                    default
zpool16k  direct                standard               default
zpool16k  longname              off                    default

zpool status with DDT (zpool status -D)

  pool: zpool16k
 state: ONLINE
config:

	NAME         STATE     READ WRITE CKSUM
	zpool16k     ONLINE       0     0     0
	  raidz1-0   ONLINE       0     0     0
	    nvme0n1  ONLINE       0     0     0
	    nvme1n1  ONLINE       0     0     0
	    nvme2n1  ONLINE       0     0     0
	    nvme3n1  ONLINE       0     0     0
	    nvme4n1  ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 536870912, size 212G on disk, 136G in core

bucket              allocated                       referenced          
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     512M      8T      8T   9.59T     512M      8T      8T   9.59T
 Total     512M      8T      8T   9.59T     512M      8T      8T   9.59T
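
As a rough cross-check of the DDT figures above, the following sketch derives the average size per DDT entry and compares the reported in-core size with the ARC limit. The numbers are copied from this report (`zpool status -D`, `recordsize`, and `c_max` from the arcstats dump); `zpool` rounds the sizes it prints, so the results are approximate. Reading "in core" as the memory needed to hold the whole table is an assumption made for illustration.

```python
# Figures copied from this report; zpool prints rounded values,
# so the derived numbers are approximations.
GIB = 1024 ** 3
TIB = 1024 ** 4

ddt_entries = 536_870_912      # "DDT entries 536870912"
ddt_on_disk = 212 * GIB        # "size 212G on disk"
ddt_in_core = 136 * GIB        # "136G in core"
arc_max = 34_359_738_368       # c_max from arcstats (32 GiB)


def per_entry(total_bytes: int, entries: int) -> float:
    """Average number of bytes per DDT entry."""
    return total_bytes / entries


# 8T of unique logical data at recordsize=16K means one DDT entry per block.
expected_entries = (8 * TIB) // (16 * 1024)

print("entries match 8T / 16K:", expected_entries == ddt_entries)
print("bytes per entry on disk:", per_entry(ddt_on_disk, ddt_entries))
print("bytes per entry in core:", per_entry(ddt_in_core, ddt_entries))
print("in-core DDT size / ARC max:", ddt_in_core / arc_max)
```

Under these assumptions the reported in-core DDT size is several times larger than the 32GB ARC limit, which is consistent with the ARC being almost entirely metadata in the arcstats output.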

arcstat

21 1 0x01 147 39984 64501438472 54609823965886
name                            type data
hits                            4    3178411426
iohits                          4    1496
misses                          4    64419903
demand_data_hits                4    0
demand_data_iohits              4    0
demand_data_misses              4    0
demand_metadata_hits            4    3178410242
demand_metadata_iohits          4    1496
demand_metadata_misses          4    64418378
prefetch_data_hits              4    0
prefetch_data_iohits            4    0
prefetch_data_misses            4    0
prefetch_metadata_hits          4    1184
prefetch_metadata_iohits        4    0
prefetch_metadata_misses        4    1525
mru_hits                        4    225453614
mru_ghost_hits                  4    11470786
mfu_hits                        4    2952957812
mfu_ghost_hits                  4    18958526
uncached_hits                   4    0
deleted                         4    572151552
mutex_miss                      4    4315505
access_skip                     4    0
evict_skip                      4    5
evict_not_enough                4    457326
evict_l2_cached                 4    0
evict_l2_eligible               4    11090623758336
evict_l2_eligible_mfu           4    1735208350208
evict_l2_eligible_mru           4    9355415408128
evict_l2_ineligible             4    8192
evict_l2_skip                   4    0
hash_elements                   4    3515346
hash_elements_max               4    3754460
hash_collisions                 4    175363571
hash_chains                     4    562112
hash_chain_max                  4    9
meta                            4    4294389235
pd                              4    2147483648
pm                              4    1925069853
c                               4    34359738368
c_min                           4    34359738368
c_max                           4    34359738368
size                            4    34373397632
compressed_size                 4    31970503680
uncompressed_size               4    68214341120
overhead_size                   4    1515084288
hdr_size                        4    844770080
data_size                       4    147505152
metadata_size                   4    33338082816
dbuf_size                       4    13324032
dnode_size                      4    2105824
bonus_size                      4    310400
anon_size                       4    0
anon_data                       4    0
anon_metadata                   4    0
anon_evictable_data             4    0
anon_evictable_metadata         4    0
mru_size                        4    17549727232
mru_data                        4    147505152
mru_metadata                    4    17402222080
mru_evictable_data              4    147505152
mru_evictable_metadata          4    15175630848
mru_ghost_size                  4    24628658176
mru_ghost_data                  4    16587096064
mru_ghost_metadata              4    8041562112
mru_ghost_evictable_data        4    16587096064
mru_ghost_evictable_metadata    4    8041562112
mfu_size                        4    15935860736
mfu_data                        4    0
mfu_metadata                    4    15935860736
mfu_evictable_data              4    0
mfu_evictable_metadata          4    15934646272
mfu_ghost_size                  4    6871023616
mfu_ghost_data                  4    0
mfu_ghost_metadata              4    6871023616
mfu_ghost_evictable_data        4    0
mfu_ghost_evictable_metadata    4    6871023616
uncached_size                   4    0
uncached_data                   4    0
uncached_metadata               4    0
uncached_evictable_data         4    0
uncached_evictable_metadata     4    0
l2_hits                         4    0
l2_misses                       4    0
l2_prefetch_asize               4    0
l2_mru_asize                    4    0
l2_mfu_asize                    4    0
l2_bufc_data_asize              4    0
l2_bufc_metadata_asize          4    0
l2_feeds                        4    0
l2_rw_clash                     4    0
l2_read_bytes                   4    0
l2_write_bytes                  4    0
l2_writes_sent                  4    0
l2_writes_done                  4    0
l2_writes_error                 4    0
l2_writes_lock_retry            4    0
l2_evict_lock_retry             4    0
l2_evict_reading                4    0
l2_evict_l1cached               4    0
l2_free_on_write                4    0
l2_abort_lowmem                 4    0
l2_cksum_bad                    4    0
l2_io_error                     4    0
l2_size                         4    0
l2_asize                        4    0
l2_hdr_size                     4    0
l2_log_blk_writes               4    0
l2_log_blk_avg_asize            4    0
l2_log_blk_asize                4    0
l2_log_blk_count                4    0
l2_data_to_meta_ratio           4    0
l2_rebuild_success              4    0
l2_rebuild_unsupported          4    0
l2_rebuild_io_errors            4    0
l2_rebuild_dh_errors            4    0
l2_rebuild_cksum_lb_errors      4    0
l2_rebuild_lowmem               4    0
l2_rebuild_size                 4    0
l2_rebuild_asize                4    0
l2_rebuild_bufs                 4    0
l2_rebuild_bufs_precached       4    0
l2_rebuild_log_blks             4    0
memory_throttle_count           4    0
memory_direct_count             4    0
memory_indirect_count           4    0
memory_all_bytes                4    67418071040
memory_free_bytes               4    15142215680
memory_available_bytes          3    12786138880
arc_no_grow                     4    0
arc_tempreserve                 4    0
arc_loaned_bytes                4    0
arc_prune                       4    0
arc_meta_used                   4    34198593152
arc_dnode_limit                 4    3435973836
async_upgrade_sync              4    0
predictive_prefetch             4    2709
demand_hit_predictive_prefetch  4    1180
demand_iohit_predictive_prefetch 4    1507
prescient_prefetch              4    0
demand_hit_prescient_prefetch   4    0
demand_iohit_prescient_prefetch 4    0
arc_need_free                   4    0
arc_sys_free                    4    2356076800
arc_raw_size                    4    0
cached_only_in_progress         4    0
abd_chunk_waste_size            4    27299328
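
To make the arcstats dump above easier to read, here is a minimal sketch that parses the `name type data` rows and derives a few ratios. `SAMPLE` holds a handful of rows copied from the dump; on a live Linux system the same text would normally come from `/proc/spl/kstat/zfs/arcstats`. The helper names `parse_arcstats` and `summarize` are hypothetical, not part of any ZFS tooling.

```python
def parse_arcstats(text: str) -> dict[str, int]:
    """Parse kstat text: keep rows of the form 'name type data' with numeric data."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Header lines either have a different field count or non-numeric data.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


# A few rows copied from the arcstats dump in this report.
SAMPLE = """\
name                            type data
c                               4    34359738368
c_min                           4    34359738368
c_max                           4    34359738368
size                            4    34373397632
data_size                       4    147505152
metadata_size                   4    33338082816
memory_all_bytes                4    67418071040
memory_available_bytes          3    12786138880
"""


def summarize(stats: dict[str, int]) -> dict[str, object]:
    """Derive a few ratios that are useful when looking at memory pressure."""
    gib = 1024 ** 3
    return {
        "arc_pinned": stats["c_min"] == stats["c_max"],   # min == max
        "arc_target_gib": stats["c"] / gib,
        "over_target_bytes": stats["size"] - stats["c"],
        "metadata_share": stats["metadata_size"] / stats["size"],
        "available_gib": stats["memory_available_bytes"] / gib,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_arcstats(SAMPLE)).items():
        print(key, value)
```

For this dump, the summary shows the ARC pinned at 32 GiB with nearly all of it holding metadata.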

Describe the problem you're observing

When writing large files (in my test, 4 files of 2TB each, 8TB total) to a zpool with dedup enabled and the fast dedup feature active, the entire ARC is used and total RAM consumption sits at around 47GB. When the files are deleted, RAM usage grows until the system runs out of memory (OOM). This also reproduces with other recordsizes (tested with 16K and 128K), and with less data and less RAM. The same behavior occurs with many small files that occupy the same total space on the zpool. Small files can be deleted one by one, but removing many 1GB files simultaneously results in OOM. After the reset, the zpool cannot be imported: the import attempt ends in the same OOM condition.

Describe how to reproduce the problem

Write several large files to a zpool with deduplication and fast dedup enabled. In my experiment, this was 4x2TB files with 64GB of total RAM, or 4x1TB files with less RAM (32GB). Then try to remove the files with rm.
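
The steps above can be sketched as a small script. This is an illustration only: the report does not say which tool wrote the files, so filling them with random data (which cannot deduplicate, matching the 1.00x dedupratio shown earlier) is an assumption, as are the file names and the function names `write_unique_file` and `write_then_remove`. The report removed the files with `rm`; `os.remove` is used here as the equivalent unlink.

```python
import os
import tempfile


def write_unique_file(path: str, size_bytes: int, chunk_bytes: int = 1 << 20) -> None:
    """Write size_bytes of random data so that no block can deduplicate."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_bytes, remaining)
            f.write(os.urandom(n))
            remaining -= n


def write_then_remove(directory: str, count: int, size_bytes: int) -> list[int]:
    """Write `count` files, record their sizes, then delete them all."""
    paths = [os.path.join(directory, f"testfile{i}.bin") for i in range(count)]
    for path in paths:
        write_unique_file(path, size_bytes)
    sizes = [os.path.getsize(path) for path in paths]
    for path in paths:
        os.remove(path)  # the removal step is where the OOM was reported
    return sizes


if __name__ == "__main__":
    # Harmless dry run with tiny files in a temporary directory.
    # The report used 4 files of 2TB each on the dedup-enabled pool.
    with tempfile.TemporaryDirectory() as scratch:
        print(write_then_remove(scratch, count=4, size_bytes=4 << 20))
```

To approximate the reported scenario, `directory` would point at the dataset mountpoint (`/zpool16k` in this report) and `size_bytes` would be scaled up accordingly, ideally on a disposable test system.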

Include any warning/errors/backtraces from the system logs

I cannot find the OOM messages in the journal after the reset, so I am attaching a screenshot here.

[Screenshot: OOMscreen]

From the journal log, I see the following events:

Oct 29 04:45:17 zfs-rc2-test kernel: Large kmem_alloc(74904, 0x1000), please file an issue at: https://github.com/openzfs/zfs/issues/new

Attaching full journal logs and dmesg logs just in case.
log.txt
dmesg.txt
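
When scanning the attached logs for further warnings like the `Large kmem_alloc` line quoted above, a small filter such as the following may help. It is a sketch that assumes the message format matches the single line shown in this report; the function name `find_large_allocs` is hypothetical, and the second argument is returned as a raw integer without interpreting its meaning.

```python
import re

# Matches e.g. "Large kmem_alloc(74904, 0x1000)" as seen in the journal line.
LARGE_ALLOC = re.compile(r"Large kmem_alloc\((\d+), (0x[0-9a-fA-F]+)\)")


def find_large_allocs(log_text: str) -> list[tuple[int, int]]:
    """Return (size_in_bytes, second_argument) for every warning found."""
    return [
        (int(size), int(arg, 16))
        for size, arg in LARGE_ALLOC.findall(log_text)
    ]


if __name__ == "__main__":
    line = (
        "Oct 29 04:45:17 zfs-rc2-test kernel: Large kmem_alloc(74904, 0x1000), "
        "please file an issue at: https://github.com/openzfs/zfs/issues/new"
    )
    print(find_large_allocs(line))
```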

    Labels

    Type: Defect (Incorrect behavior, e.g. crash, hang)