
Commit b4aad96

Increase default zfs_rebuild_vdev_limit to 64MB
When testing distributed rebuild performance with more capable hardware it was observed that increasing zfs_rebuild_vdev_limit to 64MB reduced the rebuild time by 17%. Beyond 64MB there was some improvement (~2%) but it was not significant when weighed against the increased memory usage. Memory usage is capped at 1/4 of arc_c_max.

Additionally, vr_bytes_inflight_max has been moved so it's updated per-metaslab, allowing the size to be adjusted while a rebuild is running.

Signed-off-by: Brian Behlendorf <[email protected]>
1 parent 83b1af4 commit b4aad96
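
For a rough sense of how the arc_c_max cap interacts with the new 64MB per-child limit, the sketch below applies the same clamp the patch uses, MIN(arc_c_max / 4, MAX(1 MiB, zfs_rebuild_vdev_limit * children)). The ARC size and child counts are made-up values for illustration, not figures from the commit.

/*
 * Illustrative userland sketch only (not kernel code): evaluates the
 * in-flight byte clamp from the patch for a few hypothetical top-level
 * vdev child counts and an assumed 4 GiB arc_c_max.
 */
#include <stdio.h>
#include <stdint.h>

#define	MIN(a, b)	((a) < (b) ? (a) : (b))
#define	MAX(a, b)	((a) > (b) ? (a) : (b))

int
main(void)
{
	uint64_t zfs_rebuild_vdev_limit = 64ULL << 20;	/* new 64 MiB default */
	uint64_t arc_c_max = 4ULL << 30;		/* assumed 4 GiB ARC maximum */

	for (uint64_t children = 1; children <= 90; children *= 3) {
		uint64_t inflight_max = MIN(arc_c_max / 4, MAX(1ULL << 20,
		    zfs_rebuild_vdev_limit * children));
		printf("%2llu children: %4llu MiB in flight\n",
		    (unsigned long long)children,
		    (unsigned long long)(inflight_max >> 20));
	}
	return (0);
}

With these assumed numbers the uncapped budget grows linearly with the child count and crosses 1 GiB once a top-level vdev has more than 16 children, at which point the arc_c_max / 4 ceiling takes over.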

File tree: 2 files changed (+16, -10 lines)

man/man4/zfs.4

Lines changed: 1 addition & 1 deletion
@@ -1769,7 +1769,7 @@ completes in order to verify the checksums of all blocks which have been
 resilvered.
 This is enabled by default and strongly recommended.
 .
-.It Sy zfs_rebuild_vdev_limit Ns = Ns Sy 33554432 Ns B Po 32 MiB Pc Pq u64
+.It Sy zfs_rebuild_vdev_limit Ns = Ns Sy 67108864 Ns B Po 64 MiB Pc Pq u64
 Maximum amount of I/O that can be concurrently issued for a sequential
 resilver per leaf device, given in bytes.
 .
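
zfs_rebuild_vdev_limit is an ordinary module parameter, so it can also be raised or lowered on a live system. The sketch below writes a new value through sysfs; the /sys/module/zfs/parameters/ path is the conventional location for ZFS module parameters on Linux but is an assumption here, not something this commit touches.

/*
 * Hedged sketch: set zfs_rebuild_vdev_limit (in bytes) at runtime on a
 * Linux system.  Assumes the conventional sysfs path for ZFS module
 * parameters and requires root privileges.
 */
#include <stdio.h>

int
main(void)
{
	const char *path =
	    "/sys/module/zfs/parameters/zfs_rebuild_vdev_limit";
	FILE *fp = fopen(path, "w");

	if (fp == NULL) {
		perror(path);
		return (1);
	}
	/* 67108864 bytes == 64 MiB, the new default from this commit. */
	fprintf(fp, "%llu\n", 64ULL << 20);
	return (fclose(fp) != 0);
}

Echoing the value into the same file from a shell accomplishes the same thing; the C form is used only to stay consistent with the other sketches here.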

module/zfs/vdev_rebuild.c

Lines changed: 15 additions & 9 deletions
@@ -34,6 +34,7 @@
 #include <sys/zio.h>
 #include <sys/dmu_tx.h>
 #include <sys/arc.h>
+#include <sys/arc_impl.h>
 #include <sys/zap.h>

 /*
@@ -116,13 +117,12 @@ static uint64_t zfs_rebuild_max_segment = 1024 * 1024;
  * segment size is also large (zfs_rebuild_max_segment=1M). This helps keep
  * the queue depth short.
  *
- * 32MB was selected as the default value to achieve good performance with
- * a large 90-drive dRAID HDD configuration (draid2:8d:90c:2s). A sequential
- * rebuild was unable to saturate all of the drives using smaller values.
- * With a value of 32MB the sequential resilver write rate was measured at
- * 800MB/s sustained while rebuilding to a distributed spare.
+ * 64MB was observed to deliver the best performance and set as the default.
+ * Testing was performed with a 106-drive dRAID HDD pool (draid2:11d:106c)
+ * and a rebuild rate of 1.2GB/s was measured to the distributed spare.
+ * Smaller values were unable to fully saturate the available pool I/O.
  */
-static uint64_t zfs_rebuild_vdev_limit = 32 << 20;
+static uint64_t zfs_rebuild_vdev_limit = 64 << 20;

 /*
  * Automatically start a pool scrub when the last active sequential resilver
@@ -786,9 +786,6 @@ vdev_rebuild_thread(void *arg)
 	vr->vr_pass_bytes_scanned = 0;
 	vr->vr_pass_bytes_issued = 0;

-	vr->vr_bytes_inflight_max = MAX(1ULL << 20,
-	    zfs_rebuild_vdev_limit * vd->vdev_children);
-
 	uint64_t update_est_time = gethrtime();
 	vdev_rebuild_update_bytes_est(vd, 0);

@@ -804,6 +801,15 @@
 		metaslab_t *msp = vd->vdev_ms[i];
 		vr->vr_scan_msp = msp;

+		/*
+		 * Calculate the max number of in-flight bytes for top vdev
+		 * scanning operations (minimum 1MB / maximum 1/4 of arc_c_max).
+		 * Limits for the issuing phase are done per top-level vdev and
+		 * are handled separately.
+		 */
+		vr->vr_bytes_inflight_max = MIN(arc_c_max / 4, MAX(1ULL << 20,
+		    zfs_rebuild_vdev_limit * vd->vdev_children));
+
 		/*
 		 * Removal of vdevs from the vdev tree may eliminate the need
 		 * for the rebuild, in which case it should be canceled. The
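
Because the budget is now recalculated at the top of every metaslab iteration rather than once when the rebuild thread starts, a change to the tunable is picked up at the next metaslab boundary. A simplified userland sketch of that behavior, with a hypothetical child count, ARC size, and metaslab count (none taken from the commit):

/*
 * Simplified sketch, not the kernel code: recomputing the clamp inside
 * the per-metaslab loop means a runtime change to the tunable takes
 * effect at the next metaslab instead of only on the next rebuild.
 */
#include <stdio.h>
#include <stdint.h>

#define	MIN(a, b)	((a) < (b) ? (a) : (b))
#define	MAX(a, b)	((a) > (b) ? (a) : (b))

static uint64_t zfs_rebuild_vdev_limit = 64ULL << 20;	/* tunable, may change */

int
main(void)
{
	uint64_t arc_c_max = 8ULL << 30;	/* assumed 8 GiB ARC maximum */
	uint64_t vdev_children = 90;		/* hypothetical dRAID width */

	for (uint64_t i = 0; i < 4; i++) {	/* pretend the vdev has 4 metaslabs */
		/* Recomputed per metaslab, mirroring the moved assignment. */
		uint64_t inflight_max = MIN(arc_c_max / 4, MAX(1ULL << 20,
		    zfs_rebuild_vdev_limit * vdev_children));

		printf("metaslab %llu: budget %llu MiB\n",
		    (unsigned long long)i,
		    (unsigned long long)(inflight_max >> 20));

		if (i == 1)	/* simulate lowering the tunable mid-rebuild */
			zfs_rebuild_vdev_limit = 8ULL << 20;
	}
	return (0);
}

Under the old placement the per-child product was computed once before the loop, so the lower value would not have been seen until the rebuild was restarted.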
