Skip to content

spa: clear checkpoint information during retry #17319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 11, 2025

Conversation

oshogbo
Copy link
Contributor

@oshogbo oshogbo commented May 9, 2025

Motivation and Context

During spa loading, the spa_add() function allocates the spa_t
structure. The problematic rewind logic starts in
the spa_load_best() function:

static int
spa_load_best(spa_t *spa, spa_load_state_t state, uint64_t max_request,
    int rewind_flags)
{
        // ...

        load_error = rewind_error = spa_load(spa, state, SPA_IMPORT_EXISTING);
        if (load_error == 0)
                return (0);
        if (load_error == ZFS_ERR_NO_CHECKPOINT) {
                //
                return (load_error);
        }

        // ...
        while (rewind_error && spa->spa_uberblock.ub_txg >= min_txg &&
            spa->spa_uberblock.ub_txg <= spa->spa_load_max_txg) {
                if (spa->spa_load_max_txg < safe_rewind_txg)
                        spa->spa_extreme_rewind = B_TRUE;
                rewind_error = spa_load_retry(spa, state);
        }
        // ...
}

spa_load() is called, which is a wrapper for spa_load_impl().
One of the first actions in that function is to call
spa_ld_read_checkpoint_txg().
If a checkpoint exists, this function sets spa->spa_checkpoint_txg.

However, during spa_load_impl(), many steps may fail after reading
the checkpoint, such as reading feature flags or MOS directories.
If one of these steps fails, the function returns an error.

If the failure is something other than ZFS_ERR_NO_CHECKPOINT,
a rewind is attempted in the loop shown above. But because
the checkpoint had already been read (and spa_checkpoint_txg
was set), and nothing clears this state on failure,
another call to spa_ld_read_checkpoint_txg() during rewind will
casue assertion:

static int
spa_ld_read_checkpoint_txg(spa_t *spa)
{
        //...

        ASSERT0(spa->spa_checkpoint_txg);
}

And we might end with crash like:

[ 6543.510793] VERIFY0(spa->spa_checkpoint_txg) failed (0 == 15)
  [ 6543.511146] PANIC at spa.c:5224:spa_ld_read_checkpoint_txg()
  [ 6543.511407] Showing stack for process 450801
  [ 6543.512011] CPU: 0 PID: 450801 Comm: zpool Kdump: loaded Tainted: P           OE     -------  ---  5.14.0-503.35.1.el9_5.x86_64 #1
  [ 6543.512296] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  [ 6543.512736] Call Trace:
  [ 6543.513202]  <TASK>
  [ 6543.513683]  dump_stack_lvl+0x34/0x48
  [ 6543.515313]  spl_panic+0xd1/0xe9 [spl]
  [ 6543.516916]  ? allocate_cgrp_cset_links+0x89/0xa0
  [ 6543.520874]  ? spl_kmem_alloc_impl+0xb0/0xd0 [spl]
  [ 6543.521241]  ? spl_kmem_alloc_impl+0xb0/0xd0 [spl]
  [ 6543.521597]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.522182]  ? __kmalloc_node+0x4e/0x140
  [ 6543.522598]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.522856]  ? spl_kmem_alloc_impl+0xb0/0xd0 [spl]
  [ 6543.523118]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.523358]  ? __list_add+0x12/0x30 [spl]
  [ 6543.523650]  ? __dprintf+0x120/0x190 [zfs]
  [ 6543.539257]  spa_ld_read_checkpoint_txg+0x194/0x1d0 [zfs]
  [ 6543.539724]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.539858]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.539988]  ? spa_import_progress_set_notes_impl+0x103/0x200 [zfs]
  [ 6543.540410]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.540569]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.540709]  ? spa_import_progress_set_notes+0x5b/0x80 [zfs]
  [ 6543.541124]  spa_load_impl.constprop.0+0x10d/0x720 [zfs]
  [ 6543.541558]  spa_load+0x76/0x140 [zfs]
  [ 6543.542288]  spa_load_best+0x138/0x2c0 [zfs]
  [ 6543.542930]  spa_import+0x28a/0x780 [zfs]
  [ 6543.543526]  ? free_unref_page+0xf2/0x130
  [ 6543.543762]  zfs_ioc_pool_import+0x140/0x160 [zfs]
  [ 6543.544348]  zfsdev_ioctl_common+0x690/0x760 [zfs]
  [ 6543.544959]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.545230]  ? _copy_from_user+0x27/0x60
  [ 6543.545593]  zfsdev_ioctl+0x53/0xe0 [zfs]
  [ 6543.546157]  __x64_sys_ioctl+0x8a/0xc0
  [ 6543.546526]  do_syscall_64+0x5f/0xf0
  [ 6543.546826]  ? srso_alias_return_thunk+0x5/0xfbef5
  [ 6543.547096]  ? exc_page_fault+0x62/0x150
  [ 6543.547411]  entry_SYSCALL_64_after_hwframe+0x78/0x80
  [ 6543.547966] RIP: 0033:0x7ff05670313b
  [ 6543.550135] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ad 4c 0f 00 f7 d8 64 89 01 48
  [ 6543.550491] RSP: 002b:00007ffe5bd417f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [ 6543.550779] RAX: ffffffffffffffda RBX: 000056329756ce10 RCX: 00007ff05670313b
  [ 6543.551046] RDX: 00007ffe5bd42960 RSI: 0000000000005a02 RDI: 0000000000000003
  [ 6543.551290] RBP: 00007ffe5bd45f60 R08: 0000000000000003 R09: 0000000000000000
  [ 6543.551529] R10: 0000000010000000 R11: 0000000000000246 R12: 00007ffe5bd41960
  [ 6543.551779] R13: 000056329755c2e0 R14: 00007ffe5bd42960 R15: 00007ff050002da8
  [ 6543.552096]  </TASK>

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Description

There are several potential ways to address this issue:

  1. Clear spa_checkpoint_txg during retry (proposed in this PR).
  2. Remove the assertion in spa_ld_read_checkpoint_txg().
  3. Perform broader spa_t structure cleanup before retry.

How Has This Been Tested?

I have hit this error during work on #16853. In that PR, we add one
additional TXG during export, which seems to trigger this loop during
import. This was triggered by the checkpoint_zhack_feat test.
However, I don't think that this bug is limited to this PR, and it's a
more generic problem that users can hit.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@amotin
Copy link
Member

amotin commented May 9, 2025

Shouldn't it be done in spa_unload() for symmetry? spa_load_retry() seems a wrong place for it.

@oshogbo oshogbo force-pushed the oshogbo/checkpoint branch from 6b70304 to 8a6daa3 Compare May 9, 2025 13:52
@oshogbo
Copy link
Contributor Author

oshogbo commented May 9, 2025

@amotin Sounds like a better place.
Thanks!

Signed-off-by: Mariusz Zaborski <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
@amotin amotin added Status: Code Review Needed Ready for review and testing Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels May 9, 2025
@amotin amotin merged commit 8b9c4e6 into openzfs:master May 11, 2025
23 of 24 checks passed
robn pushed a commit to robn/zfs that referenced this pull request May 23, 2025
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mariusz Zaborski <[email protected]>
Closes openzfs#17319
(cherry picked from commit 8b9c4e6)
@robn robn mentioned this pull request May 23, 2025
14 tasks
robn pushed a commit to robn/zfs that referenced this pull request May 24, 2025
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mariusz Zaborski <[email protected]>
Closes openzfs#17319
(cherry picked from commit 8b9c4e6)
tonyhutter pushed a commit that referenced this pull request May 28, 2025
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mariusz Zaborski <[email protected]>
Closes #17319
(cherry picked from commit 8b9c4e6)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants