
Memory pressure may result in violated assertion in arc_wait_for_eviction() #11285

Closed
@gamanakis


System information

Type | Version/Name
--- | ---
Distribution Name | Archlinux
Distribution Version | rolling
Linux Kernel | 5.9.10
Architecture | x64
ZFS Version | 8f158ae
SPL Version | 8f158ae

Describe the problem you're observing

Memory stressing a system with mprime results in a panic in arc_wait_for_eviction().

Describe how to reproduce the problem

On a system with 128GB of memory, of which 40GB was consumed by the ARC, running mprime triggered a panic at the assertion in arc_wait_for_eviction(). mprime was run in torture-test mode with test type "4. Blend (tests all of the above)" (i.e. both large and small fast Fourier transforms). In this test mode mprime tries to consume all available memory.

I believe what happened is that the system was running so low on memory that the if condition at https://github.com/openzfs/zfs/blob/master/module/zfs/arc.c#L4166 was false, so the waiters on arc_evict_waiters were not woken. However, arc_evict_count had still been incremented at https://github.com/openzfs/zfs/blob/master/module/zfs/arc.c#L4164, so the next call to arc_wait_for_eviction() found the last waiter's aew_count no longer greater than arc_evict_count and tripped the assertion.

Just a thought though.
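
To make the suspected sequence concrete, here is a minimal stand-alone sketch in C. It is a toy model, not the code from arc.c: model_t, model_evict(), model_wait() and the memory_ok flag are hypothetical names that only loosely mirror arc_evict_count, the arc_evict_waiters list and the free-memory check mentioned above. It assumes the eviction counter is always incremented while waiters are woken only when memory is not low.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; all names here are hypothetical, not taken from arc.c. */
#define MAX_WAITERS 8

typedef enum { WAIT_OK, WAIT_INVARIANT_BROKEN, WAIT_QUEUE_FULL } wait_result_t;

typedef struct {
	uint64_t target[MAX_WAITERS]; /* evict count each queued waiter waits for */
	size_t head, tail;            /* live waiters are target[head..tail) */
	uint64_t evict_count;         /* running total of evicted bytes */
} model_t;

/* Account for evicted bytes; wake satisfied waiters only if memory is not low. */
static void
model_evict(model_t *m, uint64_t bytes, bool memory_ok)
{
	m->evict_count += bytes;      /* incremented unconditionally */
	if (!memory_ok)
		return;               /* low memory: waiters stay queued */
	while (m->head < m->tail && m->target[m->head] <= m->evict_count)
		m->head++;            /* "wake" and dequeue the waiter */
}

/* Queue a new waiter, checking the invariant that the assertion relies on. */
static wait_result_t
model_wait(model_t *m, uint64_t amount)
{
	if (m->tail == MAX_WAITERS)
		return (WAIT_QUEUE_FULL);
	if (m->head < m->tail) {
		uint64_t last = m->target[m->tail - 1];
		if (!(last > m->evict_count))
			return (WAIT_INVARIANT_BROKEN); /* assertion would fire */
		m->target[m->tail++] = last + amount;
	} else {
		m->target[m->tail++] = m->evict_count + amount;
	}
	return (WAIT_OK);
}
```

In this model, queueing a waiter for 100 bytes, then calling model_evict() with 150 bytes and memory_ok set to false, leaves the waiter queued with a target below the counter, so the next model_wait() reports WAIT_INVARIANT_BROKEN. With memory_ok set to true the waiter is dequeued first and the next model_wait() succeeds.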

Include any warning/errors/backtraces from the system logs

VERIFY3(last->aew_count > arc_evict_count) failed (850903552 > 868262400)
PANIC at arc.c:5255:arc_wait_for_eviction()
Showing stack for process 1429448
CPU: 4 PID: 1429448 Comm: mprime Tainted: P           OE     5.9.10-1-vfio #1
Call Trace:
 dump_stack+0x6b/0x83
 spl_panic+0xef/0x117 [spl]
 ? sysvec_call_function+0x36/0x80
 ? asm_sysvec_call_function+0x12/0x20
 arc_wait_for_eviction+0x1db/0x1f0 [zfs]
 arc_shrinker_scan+0x36/0xd0 [zfs]
 do_shrink_slab+0x146/0x290
 shrink_slab+0xd0/0x2f0
 shrink_node+0x2c0/0x6e0
 do_try_to_free_pages+0xda/0x4c0
 try_to_free_pages+0xef/0x1c0
 __alloc_pages_slowpath.constprop.0+0x384/0xce0
 ? tick_nohz_next_event+0x8f/0x180
 __alloc_pages_nodemask+0x2e6/0x310
 alloc_pages_vma+0x80/0x250
 handle_mm_fault+0xebd/0x1930
 do_user_addr_fault+0x1b8/0x3f0
 exc_page_fault+0x82/0x1a0
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
