Remove dedupditto functionality #8310


Merged: 1 commit into openzfs:master on Jun 19, 2019

Conversation

@ahrens (Member) commented Jan 18, 2019

Motivation and Context

If dedup is in use, the dedupditto property can be set, causing ZFS to
keep an extra copy of data that is referenced many times (>100x). The
idea was that this data is more important than other data and thus we
want to be really sure that it is not lost if the disk experiences a
small amount of random corruption.
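
Roughly speaking, the policy amounts to a reference-count threshold. The sketch below is a simplified illustration only; the function name and the single-threshold behavior are assumptions for clarity, not the actual DDT code.

#include <stdint.h>

/*
 * Simplified, hypothetical sketch of the dedupditto policy: once a
 * deduplicated block is referenced more times than the dedupditto
 * threshold, ZFS stores one extra physical copy of it.  The real
 * logic lives in the DDT code and is more involved.
 */
int
ditto_copies_wanted(uint64_t refcnt, uint64_t dedupditto)
{
    int copies = 1;    /* normally a single physical copy */

    if (dedupditto != 0 && refcnt > dedupditto)
        copies++;      /* extra "ditto" copy for heavily shared blocks */

    return (copies);
}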

ZFS (and system administrators) rely on the pool-level redundancy to
protect their data (e.g. mirroring or RAIDZ). Since the user/sysadmin
doesn't have control over what data will be offered extra redundancy by
dedupditto, this extra redundancy is not very useful. The bulk of the
data is still vulnerable to loss based on the pool-level redundancy.
For example, if particle strikes corrupt 0.1% of blocks, you will either
be saved by mirror/raidz, or you will be sad. This is true even if
dedupditto saved another 0.01% of blocks from being corrupted.

Therefore, the dedupditto functionality is rarely enabled (i.e. the
property is rarely set), and it fulfills its promise of increased
redundancy even more rarely.

Additionally, this feature does not work as advertised (on existing
releases), because scrub/resilver did not repair the extra (dedupditto)
copy (see #8270).

In summary, this seldom-used feature doesn't work, and even if it did it
wouldn't provide useful data protection. It has a non-trivial
maintenance burden (again see #8270).

Description

This change removes the dedupditto functionality. For backwards
compatibility with the existing CLI, "zpool set dedupditto" will still
"succeed" (exit code zero), but won't have any effect. For backwards
compatibility with existing pools that had dedupditto enabled at some
point, the code will still be able to understand dedupditto blocks and
free them when appropriate. However, ZFS won't write any new dedupditto
blocks.
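
As a rough sketch of the compatibility behavior (not the actual diff; the identifiers below are hypothetical), a property-validation path can simply accept the dedupditto name and discard the value, so the CLI keeps returning success:

#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch of the backwards-compatibility behavior:
 * "zpool set dedupditto=..." is still accepted (exit code zero),
 * but the value is intentionally ignored and no new dedupditto
 * blocks are ever written.
 */
int
validate_pool_prop(const char *propname, uint64_t value)
{
    if (strcmp(propname, "dedupditto") == 0) {
        (void) value;    /* accepted for CLI compatibility; no effect */
        return (0);
    }

    /* ... validate the remaining properties as before ... */
    return (0);
}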

How Has This Been Tested?

Minimal testing at this point. Wanted to get this out there to gauge response and to work on how to communicate this.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)


@ahrens added the "Status: Design Review Needed" (architecture or design is under discussion) label on Jan 18, 2019
@tcaputi (Contributor) commented Jan 18, 2019

I took a look at the patch and it looks like it will do what it advertises. For clarification, #8270 fixed the dedup ditto scrub code so it does work again as of now.

As for whether or not we should get rid of dedup-ditto I have a few thoughts / discussion points.

For example, if particle strikes corrupt 0.1% of blocks, you will either be saved by mirror/raidz, or you will be sad.

This is true, but I often get questions from people about how much data they lost. Although users will be sad if they lose any data, I think the users I talk to would argue that minimizing losses in the event of a failure is still a worthwhile goal.

Second, from a usability standpoint, I have never really understood why we enforce that dedup ditto only kicks in after 100 dedup copies. It doesn't seem unreasonable to me for a user to want 2 copies of their data if it appears twice; they may not see the value in keeping around a 3rd, 4th, or 100th copy, so they might want dedup ditto to kick in at that point.

All of this being said, we currently steer people away from the dedup code in general due to the performance problems that can arise there. So maybe, because of that, this change won't affect that many people's pools either way.

TL;DR: I could be convinced either way, but there are arguments for and against, and in the event of a stalemate I think we should keep things the way they are.
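
To make the threshold discussion above concrete, here is a purely illustrative contrast between the enforced behavior (extra copies only above a roughly 100-reference floor) and the policy a user might expect (honor whatever threshold the administrator sets, even 2). Neither function is ZFS code; the names and the floor value are assumptions.

#include <stdint.h>

#define DITTO_FLOOR    100    /* assumed minimum enforced by the existing code */

/* Enforced behavior as described above: the threshold never drops below ~100. */
int
copies_current_policy(uint64_t refcnt, uint64_t dedupditto)
{
    uint64_t threshold = (dedupditto < DITTO_FLOOR) ? DITTO_FLOOR : dedupditto;
    return (dedupditto != 0 && refcnt > threshold ? 2 : 1);
}

/* Suggested alternative: honor a low, user-chosen threshold such as 2. */
int
copies_suggested_policy(uint64_t refcnt, uint64_t dedupditto)
{
    return (dedupditto != 0 && refcnt >= dedupditto ? 2 : 1);
}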

@ahrens added the "Status: Code Review Needed" (ready for review and testing) label and removed the "Status: Design Review Needed" (architecture or design is under discussion) label on Jan 29, 2019
@ahrens (Member, Author) commented Jan 29, 2019

I've gotten a fair amount of positive feedback on this (on the mailing list, and the OpenZFS Leadership Meeting), so I'd like to move forward with it. Code reviews welcome!

A contributor commented on the removed ztest_ddt_repair() test case:

void
ztest_ddt_repair(ztest_ds_t *zd, uint64_t id)

With this test case removed, I'm concerned that we no longer have any test coverage for existing pools which did make use of the dedupditto functionality. We'll risk accidentally breaking this functionality again. As part of this removal, why don't we add a basic test case to the ZTS which verifies that dedupditto repair continues to work as intended? We can create a tiny custom test pool for that purpose, as was done for some of the zpool_upgrade tests.

@rlaager (Member) commented Apr 19, 2019

@behlendorf Has this been approved in principle? It seems that it has, given the Code Review label vs Design Review. If so, do you expect this to land before 0.8.0? I would assume not; in such a case, should we add a comment to the man page now that this feature is deprecated and will be removed in the future?

@behlendorf (Contributor) commented:

@rlaager that's right. The general consensus appears to be that this functionality should be removed after 0.8.0 is tagged. I do think adding a warning to the man page now would be a good idea.

@behlendorf added the "Status: Post 0.8.0" and "Status: Revision Needed" (changes are required for the PR to be accepted) labels and removed the "Status: Code Review Needed" (ready for review and testing) label on Apr 19, 2019
@rlaager mentioned this pull request on Apr 21, 2019
@behlendorf (Contributor) commented:

Since 0.8.0 was tagged, let's go ahead and remove this functionality. I believe there's still general agreement that it should be removed. The PR just needs to be updated to resolve the conflicts.

@ahrens (Member, Author) commented Jun 7, 2019

@behlendorf Rebased.

@behlendorf (Contributor) commented Jun 10, 2019

@ahrens, since this was originally authored we have a new consumer of ztest_spa_prop_set_uint64, which resulted in the build failures.

@ahrens (Member, Author) commented Jun 13, 2019

@behlendorf build errors should be fixed now.

codecov bot commented Jun 13, 2019

Codecov Report

Merging #8310 into master will decrease coverage by 0.29%.
The diff coverage is 80%.

@@            Coverage Diff            @@
##           master    #8310     +/-   ##
=========================================
- Coverage   78.84%   78.55%   -0.3%     
=========================================
  Files         382      382             
  Lines      117818   117651    -167     
=========================================
- Hits        92896    92415    -481     
- Misses      24922    25236    +314
Flag      Coverage        Δ
#kernel   79.41% <80%>    (+0.01%) ⬆️
#user     66.76% <80%>    (-0.94%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@behlendorf added the "Status: Accepted" (ready to integrate; reviewed, tested) label and removed the "Status: Revision Needed" (changes are required for the PR to be accepted) label on Jun 19, 2019
@behlendorf merged commit 050d720 into openzfs:master on Jun 19, 2019
@ahrens deleted the dedupditto branch on June 20, 2019