Commit 45e23ab

Update resume token at object receive.
Before this change the resume token was updated only on data receive. Usually that is enough to resume replication without much overlap. But we've got a report of a curious case where the replication source was traversed with a recursive grep, which, with atime enabled, modified every object without modifying any data. It produced several gigabytes of replication traffic without a single data write, and so without a single resume point.

While the resume token was not designed to resume from an object, I've found that the send implementation always sends the object before any data. So by requesting resume from offset 0 we are effectively resuming from the object, followed (or not) by the data at offset 0, just as we need it.

Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #15927
1 parent ef08a4d commit 45e23ab

1 file changed: +10 -0 lines changed
module/zfs/dmu_recv.c

Lines changed: 10 additions & 0 deletions
@@ -2110,6 +2110,16 @@ receive_object(struct receive_writer_arg *rwa, struct drr_object *drro,
 		dmu_buf_rele(db, FTAG);
 		dnode_rele(dn, FTAG);
 	}
+
+	/*
+	 * If the receive fails, we want the resume stream to start with the
+	 * same record that we last successfully received. There is no way to
+	 * request resume from the object record, but we can benefit from the
+	 * fact that sender always sends object record before anything else,
+	 * after which it will "resend" data at offset 0 and resume normally.
+	 */
+	save_resume_state(rwa, drro->drr_object, 0, tx);
+
 	dmu_tx_commit(tx);
 
 	return (0);
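
The effect described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the OpenZFS implementation: `struct record`, `struct resume_state`, `toy_receive()` and `toy_resend_count()` are hypothetical names invented here, the stream is reduced to two record kinds, and records are assumed to arrive in increasing object order. The `save_on_object` flag switches between the old behavior (resume point saved only on data records) and the new one (object records also save a resume point at offset 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record kinds; a real send stream has many more. */
enum rec_type { REC_OBJECT, REC_WRITE };

struct record {
	enum rec_type type;
	uint64_t object;	/* object number the record refers to */
	uint64_t offset;	/* data offset, meaningful for REC_WRITE only */
};

/* Toy stand-in for the saved resume point (object, offset). */
struct resume_state {
	bool valid;
	uint64_t object;
	uint64_t offset;
};

/*
 * Process `count` records and then pretend the receive failed.
 * Data records always save a resume point. Object records save one
 * (at offset 0) only when save_on_object is true, which models the
 * behavior added by this commit.
 */
static struct resume_state
toy_receive(const struct record *recs, size_t count, bool save_on_object)
{
	struct resume_state rs = { false, 0, 0 };

	assert(recs != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		const struct record *r = &recs[i];

		if (r->type == REC_WRITE ||
		    (r->type == REC_OBJECT && save_on_object)) {
			rs.valid = true;
			rs.object = r->object;
			rs.offset = (r->type == REC_WRITE) ? r->offset : 0;
		}
	}
	return (rs);
}

/*
 * Count how many already-received records a resumed send would repeat.
 * Assumption of this model: the sender restarts with the object record
 * of rs.object, then continues with data at or after rs.offset. With no
 * valid resume point everything is sent again.
 */
static size_t
toy_resend_count(const struct record *recs, size_t count,
    struct resume_state rs)
{
	size_t resent = 0;

	for (size_t i = 0; i < count; i++) {
		const struct record *r = &recs[i];

		if (!rs.valid || r->object > rs.object ||
		    (r->object == rs.object &&
		    (r->type == REC_OBJECT || r->offset >= rs.offset)))
			resent++;
	}
	return (resent);
}
```

For a stream made only of object records (the atime-only case from the report), the old mode leaves no valid resume point, so every record is repeated after a failure. With `save_on_object` set, only the last received object record is repeated in this model.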