
Do not persist user/group/project quota zap objects when unneeded #14721


Merged: 1 commit into openzfs:master on Oct 20, 2023

Conversation

@atkinsam (Contributor) commented Apr 5, 2023

Motivation and Context

This change provides up to a ~300% performance improvement in the case where a user creates a zpool, adds user/group/project quotas, and then removes all of them. In that case, today the write path still does a zap_lookup on the quota object(s), even when all quotas have been deleted.

Description

In the zfs_id_over*quota functions, there is a short-circuit to skip the zap_lookup when the quota zap does not exist. If quotas are never used in a zpool, then the quota zap will never exist. But if user/group/project quotas are ever used, the zap objects will be created and will persist even if the quotas are deleted.

The quota zap_lookup in the write path can become a bottleneck for write-heavy small I/O workloads. Before this commit, it was not possible to remove this lookup without creating a new zpool.
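
For illustration, here is a minimal sketch of the short-circuit described above, with simplified control flow (the project-quota branch, the replay check, and the actual used-versus-quota comparison are omitted, and the _sketch suffix is mine; this is not the exact upstream code):

/*
 * Simplified sketch of the zfs_id_over*quota short-circuit.
 */
boolean_t
zfs_id_overblockquota_sketch(zfsvfs_t *zfsvfs, uint64_t usedobj, uint64_t id)
{
    char buf[32];
    uint64_t quota, quotaobj;

    if (usedobj == DMU_USERUSED_OBJECT)
        quotaobj = zfsvfs->z_userquota_obj;
    else if (usedobj == DMU_GROUPUSED_OBJECT)
        quotaobj = zfsvfs->z_groupquota_obj;
    else
        return (B_FALSE);

    /*
     * Short-circuit: no quota zap object means no quota can be
     * exceeded.  Before this change, quotaobj stayed nonzero forever
     * once any quota had been set, so the zap_lookup() below ran on
     * every write even after all quotas were removed.
     */
    if (quotaobj == 0)
        return (B_FALSE);

    (void) snprintf(buf, sizeof (buf), "%llx", (longlong_t)id);
    if (zap_lookup(zfsvfs->z_os, quotaobj, buf, 8, 1, &quota) != 0)
        return (B_FALSE);

    /* ... compare the space charged to this id against quota ... */
    return (B_FALSE);
}

With this change, deleting the last user/group/project quota also deletes the corresponding quota zap object (as the zdb output under "How Has This Been Tested?" shows), so quotaobj is 0 again and writes take the short-circuit.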

During a write-heavy workload I was testing, I found that these zap objects are a huge contributor to the total number of zap_lookup calls:

[root@ip-198-19-76-37 ~]# bpftrace -e 'kprobe:zap_lookup { @[arg1] = count(); }'
Attaching 1 probe...
^C

@[130]: 264
@[1]: 343
@[2]: 390000
@[3]: 390000
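
Here arg1 is the zap object number passed to zap_lookup. Objects 2 and 3 account for 390,000 calls each, and zdb confirms that they are the quota zap objects: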

[root@ip-198-19-76-37 ~]# zdb -d fsx/ 2:3
Dataset fsx [ZPL], ID 54, cr_txg 1, 204M, 25008 objects

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         2    1   128K    512      0     512    512  100.00  ZFS user/group/project quota
         3    1   128K    512      0     512    512  100.00  ZFS user/group/project quota

They are also a big contributor to CPU usage. In one of my workloads, ~90% of CPU time is spent in zfs_id_overblockquota, as the flamegraph below shows:

[flamegraph: write_heavy_flame]

How Has This Been Tested?

I created some user/group quotas and did some writes, then verified that when I set my user/group quotas back to 0, the quota zap objects are removed.

[root@ip-198-19-83-180 ~]# zdb -dd myds | grep project
        -1    1   128K    512      0     512    512  100.00  ZFS user/group/project used
        -2    1   128K    512      0     512    512  100.00  ZFS user/group/project used
        -3    1   128K    512      0     512    512  100.00  ZFS user/group/project used
         2    1   128K    512      0     512    512  100.00  ZFS user/group/project quota
         3    1   128K    512      0     512    512  100.00  ZFS user/group/project quota
[root@ip-198-19-83-180 ~]# zfs set userquota@0=0 myds
[root@ip-198-19-83-180 ~]# zfs set groupquota@0=0 myds
[root@ip-198-19-83-180 ~]# zdb -dd myds | grep project
        -1    1   128K    512      0     512    512  100.00  ZFS user/group/project used
        -2    1   128K    512      0     512    512  100.00  ZFS user/group/project used
        -3    1   128K    512      0     512    512  100.00  ZFS user/group/project used
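
With the quotas set back to 0, the quota zap objects (2 and 3) are gone; only the used-accounting objects remain.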

I ran a few test cases with ior, comparing performance before and after this change when quotas had been added and then removed on my file system:

  • No Quota Throughput: throughput achieved on a server with zfs including this commit
  • Quota Throughput: throughput achieved on a server with zfs 2.1.7
  • No Quota IOPS: IOPS achieved on a server with zfs including this commit
  • Quota IOPS: IOPS achieved on a server with zfs 2.1.7

zfs_quota_write_zstd_6x

zfs_quota_write_zstd_7000x

zfs_quota_write_no_compression

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Performance enhancement (non-breaking change which improves efficiency)
  • [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • [ ] Documentation (a change to man pages or other documentation)

@behlendorf behlendorf added Type: Performance Performance improvement or performance problem Status: Code Review Needed Ready for review and testing labels Apr 5, 2023
@behlendorf behlendorf requested review from amotin and behlendorf April 5, 2023 17:27
@behlendorf (Contributor) left a comment:

This makes sense to me as an optimization for those who enable quotas and then decide to disable them. What's pretty surprising, though, is how much overhead there is in zap_lookup(), since it should all be cached in the ARC. It seems like there's some potential for additional optimization here in a follow-up PR.

@behlendorf behlendorf added the Status: Revision Needed Changes are required for the PR to be accepted label Apr 21, 2023
In the zfs_id_over*quota functions, there is a short-circuit to skip
the zap_lookup when the quota zap does not exist. If quotas are never
used in a zpool, then the quota zap will never exist. But if
user/group/project quotas are ever used, the zap objects will be
created and will persist even if the quotas are deleted.

The quota zap_lookup in the write path can become a bottleneck for
write-heavy small I/O workloads. Before this commit, it was not
possible to remove this lookup without creating a new zpool.

Signed-off-by: Sam Atkinson <[email protected]>
@behlendorf behlendorf removed the Status: Revision Needed Changes are required for the PR to be accepted label Oct 20, 2023
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Oct 20, 2023
@behlendorf behlendorf merged commit 797f55e into openzfs:master Oct 20, 2023
@amotin (Member) commented Oct 23, 2023

It may be a coincidence, but in my #15428 I hit the following unexpected panic:

[10243.375381] VERIFY3(dn->dn_assigned_txg == tx->tx_txg) failed (0 == 26)
[10243.379940] PANIC at dmu_tx.c:715:dmu_tx_dirty_buf()
[10243.384035] Showing stack for process 1105298
[10243.387875] CPU: 1 PID: 1105298 Comm: zfs Tainted: P           OE      6.5.7-200.fc38.x86_64 #1
[10243.395239] Hardware name: Amazon EC2 m5d.large/, BIOS 1.0 10/16/2017
[10243.399833] Call Trace:
[10243.402857]  <TASK>
[10243.405749]  dump_stack_lvl+0x47/0x60
[10243.409268]  spl_panic+0x100/0x120 [spl]
[10243.412921]  ? list_head+0x9/0x30 [zfs]
[10243.416899]  dmu_tx_dirty_buf+0x27e/0x440 [zfs]
[10243.421097]  dbuf_dirty+0x60/0x1320 [zfs]
[10243.425055]  ? dbuf_verify+0xcd/0x950 [zfs]
[10243.429050]  ? dmu_buf_will_dirty_impl+0x108/0x350 [zfs]
[10243.433529]  zap_lockdir_impl+0x2c2/0x540 [zfs]
[10243.437520]  zap_lockdir+0xd6/0x130 [zfs]
[10243.441283]  zap_remove_norm+0x52/0xb0 [zfs]
[10243.445160]  zfs_set_userquota+0x414/0x5a0 [zfs]
[10243.449208]  zfs_prop_set_userquota+0x11d/0x1c0 [zfs]
[10243.453418]  zfs_prop_set_special+0xd2/0x3d0 [zfs]
[10243.457555]  zfs_set_prop_nvlist+0x2be/0x590 [zfs]
[10243.461666]  zfs_ioc_set_prop+0x11b/0x150 [zfs]
[10243.465665]  zfsdev_ioctl_common+0x667/0x7a0 [zfs]
[10243.469763]  ? __kmalloc_node+0xc3/0x150
[10243.473244]  zfsdev_ioctl+0x53/0xe0 [zfs]
[10243.477023]  __x64_sys_ioctl+0x97/0xd0
[10243.480445]  do_syscall_64+0x60/0x90
[10243.483795]  ? __count_memcg_events+0x42/0x90
[10243.487464]  ? count_memcg_events.constprop.0+0x1a/0x30
[10243.491440]  ? handle_mm_fault+0x9e/0x350
[10243.494965]  ? do_user_addr_fault+0x179/0x640
[10243.498608]  ? exc_page_fault+0x7f/0x180
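
The trace suggests that zap_remove_norm(), called from zfs_set_userquota(), dirtied the quota zap's dnode in a transaction it was never assigned to (dn_assigned_txg of 0 versus tx_txg of 26).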

@behlendorf (Contributor):

Hmm, it doesn't appear to be a coincidence. I just saw we hit the same issue in other PRs which were recently rebased. Let me see about reverting this change.

@behlendorf (Contributor):

@atkinsam I've opened PR #15438 to revert this change for now to keep the CI happy. It looks like some additional changes will be needed to handle project quotas.

@atkinsam (Contributor, Author):

Thanks for taking care of the revert, I'll take a look at this.

lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Dec 12, 2023

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sam Atkinson <[email protected]>
Closes openzfs#14721