
Commit 337a1e9

amotin authored and lundman committed
Better fill empty metaslabs
Before this change, the zfs_metaslab_switch_threshold tunable switched metaslabs each time their index dropped by two (which means the biggest contiguous chunk shrank to 1/4). This is a good way to balance fragmentation across metaslabs. But for empty metaslabs (having power-of-2 sizes) it means switching as soon as they fall just below half of their capacity. Inspection with zdb after filling a new pool to half capacity showed most of its metaslabs filled to half capacity. I consider this sub-optimal for pool fragmentation in the long run.

This change blocks metaslab switching if most of the metaslab's free space (15/16) is represented by a single contiguous range. Such a metaslab should not be considered fragmented until it actually fails some big allocation. More contiguous filling should improve data locality and increase the time before a previously filled and partially freed metaslab is touched again, giving it more time to free more contiguous chunks for lower fragmentation. It should also slightly reduce spacemap traffic.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#17081
1 parent 4364636 commit 337a1e9

File tree

1 file changed: +9 −0 lines changed

module/zfs/metaslab.c

Lines changed: 9 additions & 0 deletions
```diff
@@ -3545,6 +3545,15 @@ metaslab_segment_may_passivate(metaslab_t *msp)
 	if (WEIGHT_IS_SPACEBASED(msp->ms_weight) || spa_sync_pass(spa) > 1)
 		return;
 
+	/*
+	 * As long as a single largest free segment covers the majority of
+	 * free space, don't consider the metaslab fragmented. It should
+	 * allow us to fill new unfragmented metaslabs full before switching.
+	 */
+	if (metaslab_largest_allocatable(msp) >
+	    zfs_range_tree_space(msp->ms_allocatable) * 15 / 16)
+		return;
+
 	/*
 	 * Since we are in the middle of a sync pass, the most accurate
 	 * information that is accessible to us is the in-core range tree
```
