Skip to content

Drop path prefix workaround #11295

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 10, 2020
Merged

Conversation

sterlingjensen
Copy link
Contributor

Canonicalization, the source of the trouble, was disabled in 9000a9f.

Signed-off-by: Sterling Jensen [email protected]

Motivation and Context

Description

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Dec 6, 2020
Canonicalization, the source of the trouble, was disabled in 9000a9f.

Signed-off-by: Sterling Jensen <[email protected]>
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Dec 8, 2020
@behlendorf behlendorf merged commit 1e4667a into openzfs:master Dec 10, 2020
behlendorf pushed a commit that referenced this pull request Dec 23, 2020
Canonicalization, the source of the trouble, was disabled in 9000a9f.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes #11295
@petrosagg
Copy link

petrosagg commented Jan 11, 2021

This turned out to be breaking change in the end. If you have a pool named foo that you want to mount at /foo this no longer works: mount.zfs /foo /foo.

This is more common than it seems because mount from util-linux will automatically prepend the slash if the dataset happens to have the same name as a directory in mount's cwd. In other words, this will guarantee that mount will call mount.zfs with a prepended slash, even though the user specified it correctly:

cd /
mkdir foo
mount -t zfs foo /foo # will run `mount.zfs /foo /foo` internally because /foo exists

Since this is a breaking change I think it should be reverted, what do you think?

@behlendorf
Copy link
Contributor

Thanks for reporting this. That sure does look like a case where we regressed. @sterlingjensen could you take a another look at this? This is something we'll want to extend the test case to cover.

@sterlingjensen
Copy link
Contributor Author

It seems to me that this is more a question of maintaining backwards compatibility for surprising broken behavior. If that is the objective than a back out may involve more than just this patch, because the only thing it does is remove an unnecessary check for unwanted behavior that has been fixed for years elsewhere in the code. I'm not even sure how far back we'd have to roll things to maintain compatibility with broken scripts... path canonicalization was disabled in 2017 in order to correct this very issue.

@petrosagg
Copy link

@sterlingjensen is mounting zfs volumes through the normal mount utility with mount -t zfs dataset /mountpoint considered unsupported? Because this is all you need to reproduce this issue, even passing correct arguments (see my example above).

This is also reproducible if you have the following systemd mount unit (this is how I found it actually):

[Mount]
Where=/foo
What=foo
Type=zfs

It is definitely not just broken scripts that will observe the breaking change.

@sterlingjensen
Copy link
Contributor Author

@petrosagg Yes mounting zfs volumes with mount(8) is unsupported UNLESS you do what the big failure message tells you to: set the mountpoint to legacy. You did do that right? You also made sure that the destination directory was in order, right? Because I just spent an hour trying to reproduce your problem, and the only way I can is if I ignore all the messages directing me to do the right thing. So here we go:

# dd if=/dev/zero of=/tmp/vdev0 bs=1M count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.055691 s, 1.2 GB/s

# zpool create foo /tmp/vdev0; stat -fc %T /foo
zfs

# zfs get type,mounted,mountpoint,canmount foo
NAME  PROPERTY    VALUE       SOURCE
foo   type        filesystem  -
foo   mounted     yes         -
foo   mountpoint  /foo        default
foo   canmount    on          default

# zfs unmount foo; stat -fc %T /foo
xfs

# ### BIG HINT INCOMING! ###
# cd ~; mount -t zfs foo /foo
filesystem 'foo' cannot be mounted using 'mount'.
Use 'zfs set mountpoint=legacy' or 'zfs mount foo'.
See zfs(8) for more information.
# ### BIG HINT OUTGOING! ###

# zfs set mountpoint=legacy foo; mkdir /foo; stat -fc %T /foo
xfs

# mount -t zfs foo /foo; stat -fc %T /foo
zfs

# zfs get type,mounted,mountpoint,canmount foo
NAME  PROPERTY    VALUE       SOURCE
foo   type        filesystem  -
foo   mounted     yes         -
foo   mountpoint  legacy      local
foo   canmount    on          default

See? Oh, and using device paths instead of datasets for the mount(8) (which is largely the whole point of this helper) continues to function as it always has in the same way that every other mount helper does.

# mount -t zfs /tmp/vdev0 /foo; stat -fc %T /foo
zfs

Or have I completely misunderstood your problem? I can't tell you how to fix your systemd configs, because I don't use systemd.

@petrosagg
Copy link

@sterlingjensen I think what you missed from the reproduction, which is also what caused me endless headaches before narrowing this bug down, is that you need to run the mount command while in the root directory. There is a cd ~ in your script that hides the problem. mount tries to be clever and if the first argument exists in its current directory it will prepend a slash before calling out to mount.zfs.

Yes mounting zfs volumes with mount(8) is unsupported UNLESS you do what the big failure message tells you to:

This doesn't sound right. You wouldn't say setting an encryption key is unsupported UNLESS you turn on encryption. Sure, it's named legacy but nowhere in the documentation says that it doesn't work or that there be dragons if you use it. It's one of the ZFS features.

To give a bit more context of my situation, I have a NixOS box with ZFS set up as per their official guides which failed to boot after an upgrade.

Or have I completely misunderstood your problem? I can't tell you how to fix your systemd configs, because I don't use systemd.

Of course, I'm not looking for a systemd solution or a NixOS solution or anything like that. My point that this PR breaks at least two reasonable ways of interacting with ZFS, not some quirky script I wrote myself. Plus, the PR claimed to be a non-breaking change which shipped in a patch release but turns out it breaks real systems.

Here is a complete script based on what you wrote above that reproduces the breakage:

#!/usr/bin/env bash

set -o errexit
set -x

dd if=/dev/zero of=/tmp/vdev0 bs=1M count=64
zpool create foo /tmp/vdev0
zfs unmount foo
zfs set mountpoint=legacy foo;

# This is what triggers the edge case
# systemd will also run all mount commands with its cwd set to /
cd /

mkdir /foo;
mount -t zfs foo /foo

If you run the above with version 2.0.0 it works as expected. I'm happy to PR a revert of the above plus the addition of this test case.

@sterlingjensen
Copy link
Contributor Author

Ah, I think I finally get it. I understood you to be asking for the very libmount path canonicalization that leads to this kind of unwanted path rewriting. Did you happen to spot from where in the stack this unwanted help was being inflicted, or did you stop at where I removed the otherwise redundant string manipulation? My memory is a little fuzzy, but I want to say that the entire point of the --no-canonicalize flag was to... prevent canonicalization? I'm inclined more to determining the root cause and applying the fix there, but a fix is certainly needed and a restoration of the prefix sanitizer is in order should the source of the problem be in libmount's path rewriting logic.

@behlendorf
Copy link
Contributor

The problematic code in mount(8) can be found here. For local filesystems there will be an attempt to canonicalize the path. You can see there are already exceptions to prevent canonicalization for network filesystem sources (line 1786), known source tags (line 1797) and even pseudo filesystems (line 1806). Unfortunately, ZFS doesn't fall under any of the existing exceptions. It'd be ideal if we could get included under one of these exceptions in mount(8).


This is somewhat beside the point, but mount(8) is expected to fail with the following error message for zfs datasets unless the mountpoint=legacy property is set.

filesystem 'foo' cannot be mounted using 'mount'.
Use 'zfs set mountpoint=legacy' or 'zfs mount foo'.
See zfs(8) for more information.

The reason for this is that the zfs utilities need to be aware of which mount points they're actively managing to support things like zfs mount -a and zfs unmount -a. The is mentioned at the top of the zfs-mount(8) man page. However, I see it doesn't give any reason why this is needed or what the consequences of not setting legacy might be. So there's room for improvement.

     zfs mount [-Oflv] [-o options] -a | filesystem
       Mount ZFS filesystem on a path described by its mountpoint property,
       if the path exists and is empty. If mountpoint is set to legacy, the
       filesystem should be instead mounted using mount(8).

@petrosagg
Copy link

@behlendorf thanks, I opened an issue upstream (util-linux/util-linux#1231) as well to see if we can fix it there

@sterlingjensen no, I haven't looked deeper into the ZFS code that triggers this

@sterlingjensen
Copy link
Contributor Author

@petrosagg The path is malformed before hitting the mount helper, so you correctly identified the problem. This is a very old problem with libmount, so I think your revert + test would be welcome. Or I can clean up my own mistake, dealers choice - let me know.

@behlendorf
Copy link
Contributor

@sterlingjensen thanks. Can you please open a new PR which adds back the needed prefix removal code with a test case. I think that'd be preferable to a straight revert.

@petrosagg
Copy link

petrosagg commented Jan 13, 2021

@sterlingjensen I think it's going to take me a fair amount of work to get the whole test suite working on my computer (NixOS is not great for ad-hoc things like that). I pushed a commit that I think is correct with the test case here petrosagg@83fb297 petrosagg@49501b6

@sterlingjensen
Copy link
Contributor Author

Sorry for the delay, one of these days I'll become competent with kornshell - but not today. #11462

@karelzak
Copy link

Oh guys, why do complain so late? ;-) This is something that is pretty simple to resolve in libmount.

If I good understand the source is always "dataset" and never device name or path, right? In this case, add an exception to libmount will be pretty simple. I'll do it and I can also backport it to util-linux 2.36.2.

karelzak added a commit to util-linux/util-linux that referenced this pull request Jan 14, 2021
@petrosagg
Copy link

petrosagg commented Jan 14, 2021 via email

@behlendorf
Copy link
Contributor

@karelzak that was fast, thanks! Yes, it's expected to be a dataset name.

@sterlingjensen
Copy link
Contributor Author

@behlendorf Wait... are we talking about the target string passed to the mount helper? Because that string can definitely be something other than a dataset name.

/*
* Opportunistically convert a target string into a pool name. If the
* string does not represent a block device with a valid zfs label
* then it is passed through without modification.
*/
static void
parse_dataset(const char *target, char **dataset)
{

@behlendorf
Copy link
Contributor

Correct. It's normally expected to be the dataset name, but we wanted to also support passing a vdev path as well. I think that's generally useful behavior. For example, and at the time this made it possible to adapt xfstests to run on ZFS which was handy. With the libmount change it seems like that should still be possible it's just not going to canonicalize the provided path for us.

@sterlingjensen
Copy link
Contributor Author

Well then that is great news, just wanted to make sure - given the prior 'always "dataset" and never device name or path' qualifier.

@karelzak
Copy link

The change in libmount means that for example symlinks and tags (LABEL=foo) will not be converted to the device name.

If this behavior is not expected then we can make it more sensitive to other scenarios. For example, source strings that start with "/dev" can be used as a path; we can use tags in the traditional way, etc.

It would be nice if you can define the source string more precisely for me. Thanks.

@sterlingjensen
Copy link
Contributor Author

@karelzak If the change to libmount means that it no longer modifies the path string, then I believe that is exactly what everybody wants. We've already got ZFS specific logic built into the mount helper, coincidentally this commit was for the function that performs device name conversion - specifically the part that undoes libmount's path canonicalization 😏

behlendorf pushed a commit that referenced this pull request Jan 19, 2021
Prior to util-linux 2.36.2, if a file or directory in the
current working directory was named 'dataset' then mount(8)
would prepend the current working directory to the dataset.

Eventually, we should be able to drop this workaround.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes #11295 
Closes #11462
@karelzak
Copy link

@sterlingjensen OK, libmount since v2.37 and v2.36.2 will not touch the mount source string for ZFS at all.

behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Jan 22, 2021
Prior to util-linux 2.36.2, if a file or directory in the
current working directory was named 'dataset' then mount(8)
would prepend the current working directory to the dataset.

Eventually, we should be able to drop this workaround.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes openzfs#11295 
Closes openzfs#11462
behlendorf pushed a commit that referenced this pull request Jan 23, 2021
Prior to util-linux 2.36.2, if a file or directory in the
current working directory was named 'dataset' then mount(8)
would prepend the current working directory to the dataset.

Eventually, we should be able to drop this workaround.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes #11295 
Closes #11462
karelzak added a commit to util-linux/util-linux that referenced this pull request Feb 10, 2021
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
Canonicalization, the source of the trouble, was disabled in 9000a9f.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes openzfs#11295
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
Prior to util-linux 2.36.2, if a file or directory in the
current working directory was named 'dataset' then mount(8)
would prepend the current working directory to the dataset.

Eventually, we should be able to drop this workaround.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes openzfs#11295 
Closes openzfs#11462
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
Canonicalization, the source of the trouble, was disabled in 9000a9f.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes openzfs#11295
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
Prior to util-linux 2.36.2, if a file or directory in the
current working directory was named 'dataset' then mount(8)
would prepend the current working directory to the dataset.

Eventually, we should be able to drop this workaround.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sterling Jensen <[email protected]>
Closes openzfs#11295 
Closes openzfs#11462
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants