oc cluster up doesn't work when Docker is running with user namespaces support #12643

Closed
php-coder opened this issue Jan 24, 2017 · 12 comments
Labels
component/composition, kind/bug, priority/P2

Comments

@php-coder
Contributor

oc cluster up doesn't start when Docker is running with the --userns-remap=default option.

In a remapped environment we should probably pass the --userns=host option for privileged containers (and in some other cases: https://github.com/kubernetes/kubernetes/pull/31169/files#diff-10055ae93a8699af13ceba0482fc43c3R1406).

Version

oc v1.5.0-alpha.2+c5868ac-143-dirty
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Steps To Reproduce
  1. sudoedit /usr/lib/systemd/system/docker.service
  2. append --userns-remap=default option to ExecStart parameter
  3. sudo systemctl daemon-reload
  4. sudo systemctl restart docker
  5. oc cluster up
Current Result

oc cluster up fails with error "Privileged mode is incompatible with user namespaces".

Expected Result

oc cluster up should run the cluster.

Additional Information
$ oc cluster up --version=latest --loglevel=5
-- Checking OpenShift client ... 
-- Checking Docker client ... 
I0124 13:56:46.658726    1502 up.go:510] No Docker environment variables found. Will attempt default socket.
I0124 13:56:46.659069    1502 up.go:515] No Docker host (DOCKER_HOST) configured. Will attempt default socket.
I0124 13:56:46.663255    1502 up.go:543] Docker ping succeeded
-- Checking Docker version ... 
I0124 13:56:46.663905    1502 helper.go:101] Retrieving Docker version
I0124 13:56:46.665231    1502 helper.go:107] Docker version results: &docker.Env{"ApiVersion=1.24", "GitCommit=7392c3b", "GoVersion=go1.6.4", "Os=linux", "Arch=amd64", "KernelVersion=4.2.3-300.fc23.x86_64", "BuildTime=2016-12-16T02:26:11.521107950+00:00", "Version=1.12.5"}
I0124 13:56:46.665426    1502 helper.go:112] Version: 1.12.5
I0124 13:56:46.665449    1502 up.go:610] Checking that docker version is at least 1.10.0
-- Checking for existing OpenShift container ... 
I0124 13:56:46.665805    1502 helper.go:168] Inspecting docker container "origin"
I0124 13:56:46.666771    1502 helper.go:172] Container "origin" was not found
-- Checking for openshift/origin:latest image ... 
I0124 13:56:46.667076    1502 helper.go:136] Inspecting Docker image "openshift/origin:latest"
I0124 13:56:46.668887    1502 helper.go:139] Image "openshift/origin:latest" found: &docker.Image{ID:"sha256:7f358fef58682535ac9ed134c3dc938e740c1bb85e90f696c38eec0a9e9d3b1d", RepoTags:[]string{"openshift/origin:latest"}, Parent:"", Comment:"", Created:time.Time{sec:63620295732, nsec:314264452, loc:(*time.Location)(0x3543a20)}, Container:"2e34a6c20544623e2283ca1cdf0bb0478b3a39e3bad46b59e859b6e62d8c113e", ContainerConfig:docker.Config{Hostname:"d6dcf178f680", Domainname:"", User:"", Memory:0, MemorySwap:0, MemoryReservation:0, KernelMemory:0, CPUShares:0, CPUSet:"", AttachStdin:false, AttachStdout:false, AttachStderr:false, PortSpecs:[]string(nil), ExposedPorts:map[docker.Port]struct {}{"53/tcp":struct {}{}, "8443/tcp":struct {}{}}, StopSignal:"", Tty:false, OpenStdin:false, StdinOnce:false, Env:[]string{"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "HOME=/root", "OPENSHIFT_CONTAINERIZED=true", "KUBECONFIG=/var/lib/origin/openshift.local.config/master/admin.kubeconfig"}, Cmd:[]string{"/bin/sh", "-c", "#(nop) ", "ENTRYPOINT [\"/usr/bin/openshift\"]"}, DNS:[]string(nil), Image:"sha256:d0631ddcc4f653b4c51abd79918ed5427f20b257d80a8cef385fa5d747d3d696", Volumes:map[string]struct {}(nil), VolumeDriver:"", VolumesFrom:"", WorkingDir:"/var/lib/origin", MacAddress:"", Entrypoint:[]string{"/usr/bin/openshift"}, NetworkDisabled:false, SecurityOpts:[]string(nil), OnBuild:[]string{}, Mounts:[]docker.Mount(nil), Labels:map[string]string{"license":"GPLv2", "name":"CentOS Base Image", "vendor":"CentOS", "build-date":"20161102", "io.k8s.description":"OpenShift Origin is a platform for developing, building, and deploying containerized applications. 
See https://docs.openshift.org/latest for more on running OpenShift Origin.", "io.k8s.display-name":"OpenShift Origin Application Platform"}}, DockerVersion:"1.12.2-rc2", Author:"", Config:(*docker.Config)(0xc4209641c0), Architecture:"amd64", Size:785414217, VirtualSize:785414217, RepoDigests:[]string{"openshift/origin@sha256:d00abcf169959e33dd0f55923b690fb7ac3c64a00fc3d81e14491f31c4d8c9db"}}
-- Checking Docker daemon configuration ... 
I0124 13:56:46.669151    1502 helper.go:67] Retrieving Docker daemon info
I0124 13:56:46.678746    1502 helper.go:75] Docker daemon info: types.Info{ID:"YHFD:22Z2:JM6H:AUT2:XA72:6POC:DFYQ:JTFZ:KE3Y:QPUM:ZY33:DOVO", Containers:2, ContainersRunning:0, ContainersPaused:0, ContainersStopped:2, Images:8, Driver:"devicemapper", DriverStatus:[][2]string{[2]string{"Pool Name", "docker-253:0-1877325-pool"}, [2]string{"Pool Blocksize", "65.54 kB"}, [2]string{"Base Device Size", "10.74 GB"}, [2]string{"Backing Filesystem", "xfs"}, [2]string{"Data file", "/dev/loop2"}, [2]string{"Metadata file", "/dev/loop3"}, [2]string{"Data Space Used", "2.24 GB"}, [2]string{"Data Space Total", "107.4 GB"}, [2]string{"Data Space Available", "14.98 GB"}, [2]string{"Metadata Space Used", "2.249 MB"}, [2]string{"Metadata Space Total", "2.147 GB"}, [2]string{"Metadata Space Available", "2.145 GB"}, [2]string{"Thin Pool Minimum Free Space", "10.74 GB"}, [2]string{"Udev Sync Supported", "true"}, [2]string{"Deferred Removal Enabled", "false"}, [2]string{"Deferred Deletion Enabled", "false"}, [2]string{"Deferred Deleted Device Count", "0"}, [2]string{"Data loop file", "/var/lib/docker/427680.427680/devicemapper/devicemapper/data"}, [2]string{"Metadata loop file", "/var/lib/docker/427680.427680/devicemapper/devicemapper/metadata"}, [2]string{"Library Version", "1.02.109 (2015-09-22)"}}, SystemStatus:[][2]string(nil), Plugins:types.PluginsInfo{Volume:[]string{"local"}, Network:[]string{"host", "null", "bridge", "overlay"}, Authorization:[]string(nil)}, MemoryLimit:true, SwapLimit:true, KernelMemory:true, CPUCfsPeriod:true, CPUCfsQuota:true, CPUShares:true, CPUSet:true, IPv4Forwarding:true, BridgeNfIptables:true, BridgeNfIP6tables:true, Debug:false, NFd:16, OomKillDisable:true, NGoroutines:24, SystemTime:"2017-01-24T13:56:46.677750986Z", ExecutionDriver:"", LoggingDriver:"journald", CgroupDriver:"cgroupfs", NEventsListener:0, KernelVersion:"4.2.3-300.fc23.x86_64", OperatingSystem:"Fedora 23 (Twenty Three)", OSType:"linux", Architecture:"x86_64", 
IndexServerAddress:"https://index.docker.io/v1/", RegistryConfig:(*registry.ServiceConfig)(0xc4209b87c0), NCPU:2, MemTotal:3616374784, DockerRootDir:"/var/lib/docker/427680.427680", HTTPProxy:"", HTTPSProxy:"", NoProxy:"", Name:"localhost.localdomain", Labels:[]string(nil), ExperimentalBuild:false, ServerVersion:"1.12.5", ClusterStore:"", ClusterAdvertise:"", SecurityOptions:[]string{"seccomp"}}
I0124 13:56:46.678867    1502 helper.go:48] Looking for "172.30.0.0/16" in []*registry.NetIPNet{(*registry.NetIPNet)(0xc420b2ed20), (*registry.NetIPNet)(0xc420b2ed80)}
I0124 13:56:46.678981    1502 helper.go:52] Found "172.30.0.0/16"
-- Checking for available ports ... 
I0124 13:56:46.679849    1502 run.go:155] Creating container named ""
config:
  image: openshift/origin:latest
  entry point:
    /bin/bash
  command:
    -c
    cat /proc/net/tcp && ( [ -e /proc/net/tcp6 ] && cat /proc/net/tcp6 || true)

host config:
  pid mode: host
  network mode: host

FAIL
   Error: a port needed by OpenShift is not available
   Caused By:
     Error: Cannot get TCP port information from Kubernetes host
     Caused By:
       Error: cannot create container using image openshift/origin:latest
       Caused By:
         Error: API error (500): {"message":"Privileged mode is incompatible with user namespaces"}
@pweil-

pweil- commented Jan 24, 2017

this is likely because it needs the experimental kubelet behavior that is behind the userns feature gate.

@pweil- added the component/composition, kind/bug, and priority/P2 labels Jan 24, 2017
@pweil-

pweil- commented Jan 24, 2017

@csrwng fyi. I think this is something @php-coder can help out with but wanted to loop you in.

@php-coder
Contributor Author

@pweil- This raises a question similar to the one we had with Kubernetes: should we detect the remap environment, or should the user activate it explicitly? Does OpenShift have the same feature gates, or should it be just an option (like oc cluster up --enable-user-ns)?

@pweil-

pweil- commented Jan 24, 2017

The feature gate should be enabled explicitly. This is possible through the extended args in the node config, but I don't think it's quite so simple for oc cluster up.

@csrwng
Contributor

csrwng commented Jan 24, 2017

@pweil- if it's possible to detect, then imho we should detect it and add the proper args to the node config in cluster up. Otherwise you can't start the cluster, right?

@pweil-

pweil- commented Jan 24, 2017

@csrwng I'd like to detect it in cluster up since we're already hedging on docker versions there anyway. However, it's a little bit tricky. In docker 1.13+ we have the ability to detect userns enablement through the security-opt settings of the /info endpoint available to the docker client.

Prior to that the only way of knowing (afaik) is looking at the daemon's arguments for the --userns-remap flag which starts to get nasty. Do I look for a systemd config file? What if it was started manually?
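
The two detection strategies mentioned in this comment could be sketched roughly as follows. This is only an illustration in Python (oc itself is written in Go), and both function names are hypothetical. It assumes the caller has already obtained the `SecurityOptions` list from the daemon's /info response and, for older daemons, the daemon's argument vector. The entry spelling is an assumption as well: the 1.12.5 daemon in the log above reports a bare `seccomp`, while newer daemons are assumed to report `name=...` entries, so both forms are accepted.

```python
from typing import Iterable, Optional, Sequence


def userns_in_security_options(options: Iterable[str]) -> bool:
    """Return True if one of the security option entries is named "userns".

    Accepts the bare form ("userns") and the key/value form
    ("name=userns", optionally followed by ",key=value" pairs).
    """
    for entry in options:
        first = entry.split(",", 1)[0].strip()
        name = first[len("name="):] if first.startswith("name=") else first
        if name == "userns":
            return True
    return False


def userns_remap_from_args(argv: Sequence[str]) -> Optional[str]:
    """Return the value of --userns-remap in a daemon argv, or None if absent."""
    for i, arg in enumerate(argv):
        if arg.startswith("--userns-remap="):
            return arg.split("=", 1)[1]
        if arg == "--userns-remap" and i + 1 < len(argv):
            return argv[i + 1]
    return None


if __name__ == "__main__":
    # Illustrative inputs only; real values would come from the daemon.
    print(userns_in_security_options(["name=seccomp,profile=default", "name=userns"]))
    print(userns_remap_from_args(["dockerd", "--userns-remap=default"]))
```

The argument-based fallback still leaves open the question raised above: where the argument vector comes from (a systemd unit file, a manually started process) is exactly the part that gets nasty.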

@csrwng
Contributor

csrwng commented Jan 24, 2017

@pweil- I see, yeah, that's tricky. Then I'd say: detect it if we can, and provide a flag to enable it explicitly for cases where it's needed but can't be detected automatically.

@php-coder
Contributor Author

> Prior to that the only way of knowing (afaik) is looking at the daemon's arguments for the --userns-remap flag which starts to get nasty.

Another option is to inspect the /proc/self/uid_map file inside a running container. I wouldn't suggest it in other circumstances, but oc cluster up is already doing similar things, so maybe this solution won't look so nasty.
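
A rough sketch of what the uid_map check could look like, in Python for illustration. Each line of the file has three fields: the first ID inside the namespace, the first ID outside it, and the length of the range. In the initial user namespace the file contains the single identity entry `0 0 4294967295`, so anything else suggests a remapped container. The function names are hypothetical, and the `0 427680 65536` sample is an assumed mapping (the 427680 value is taken from the DockerRootDir in the log above; the range length is a guess).

```python
from typing import List, Tuple

# Length of the identity mapping seen in the initial user namespace (2**32 - 1).
FULL_RANGE = 4294967295


def parse_id_map(text: str) -> List[Tuple[int, int, int]]:
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError("unexpected id map line: %r" % line)
        inside, outside, length = (int(f) for f in fields)
        entries.append((inside, outside, length))
    return entries


def is_remapped(text: str) -> bool:
    """True unless the map is exactly the identity entry of the initial namespace."""
    return parse_id_map(text) != [(0, 0, FULL_RANGE)]


def current_process_is_remapped(path: str = "/proc/self/uid_map") -> bool:
    """Read the map of the calling process (Linux only)."""
    with open(path) as f:
        return is_remapped(f.read())


if __name__ == "__main__":
    print(is_remapped("         0          0 4294967295\n"))  # identity map
    print(is_remapped("0 427680 65536\n"))                    # assumed remapped sample
```

To be useful for oc cluster up, `current_process_is_remapped` would have to run inside a throwaway container started without --userns=host, in the same way the port check above runs `cat /proc/net/tcp`.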

@php-coder
Contributor Author

php-coder commented Jan 25, 2017

> this is likely because it needs the experimental kubelet behavior that is behind the userns feature gate.

The exact check that is failing now doesn't use Kubernetes; it creates the container in Docker directly. To pass the --userns=host option we first need to update go-dockerclient (required for the UsernsMode field).
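
To make the UsernsMode point concrete, here is a minimal sketch of the JSON body such a container-create request might carry, written in Python for illustration. The `HostConfig` keys (`Privileged`, `PidMode`, `NetworkMode`, `UsernsMode`) follow the Docker Engine API; the helper name is hypothetical, and the assumption that the port check runs privileged is inferred from the error message rather than shown in the log.

```python
import json
from typing import Any, Dict, List


def create_container_body(image: str, entrypoint: List[str], cmd: List[str],
                          host_userns: bool) -> Dict[str, Any]:
    """Build a container-create request body for a privileged helper container.

    When host_userns is True, the container opts out of the daemon's
    user namespace remapping via HostConfig.UsernsMode = "host".
    """
    host_config: Dict[str, Any] = {
        "Privileged": True,      # assumed from the "Privileged mode" error
        "PidMode": "host",       # "pid mode: host" in the log
        "NetworkMode": "host",   # "network mode: host" in the log
    }
    if host_userns:
        host_config["UsernsMode"] = "host"
    return {
        "Image": image,
        "Entrypoint": entrypoint,
        "Cmd": cmd,
        "HostConfig": host_config,
    }


if __name__ == "__main__":
    body = create_container_body(
        "openshift/origin:latest",
        ["/bin/bash"],
        ["-c", "cat /proc/net/tcp"],
        host_userns=True,
    )
    print(json.dumps(body, indent=2))
```

A client library whose host config type lacks the corresponding field cannot emit the `UsernsMode` key at all, which is why the go-dockerclient update has to come first.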

@php-coder
Contributor Author

I'm curious how the command nsenter --mount=/rootfs/proc/1/ns/mnt mount -o bind /var/lib/origin/openshift.local.volumes /var/lib/origin/openshift.local.volumes will work if /rootfs/proc/1/ns/mnt isn't readable:

$ docker run --rm -it -v /:/rootfs --entrypoint /bin/bash --userns=host --ipc=host --pid=host --uts=host --net=host openshift/origin:v3.6.0-alpha.1
[root@localhost origin]# ls -l /rootfs/proc/1/ns/mnt
ls: cannot read symbolic link /rootfs/proc/1/ns/mnt: Permission denied
lrwxrwxrwx. 1 root root 0 Apr 20 13:34 /rootfs/proc/1/ns/mnt

@php-coder
Contributor Author

> I'm curious how the command nsenter --mount=/rootfs/proc/1/ns/mnt mount -o bind /var/lib/origin/openshift.local.volumes /var/lib/origin/openshift.local.volumes will work

@legionus told me that it can be solved by using capabilities: it works with --cap-add SYS_PTRACE.
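
For reference, a small sketch that assembles the earlier `docker run` invocation with the extra capability added, in Python for illustration. The helper name is hypothetical, and the exact flag combination that was tested is an assumption: the comment only states that adding --cap-add SYS_PTRACE makes it work.

```python
import shlex
from typing import List


def docker_run_argv(image: str, extra_caps: List[str]) -> List[str]:
    """Build the docker run argv from the earlier comment plus --cap-add options."""
    argv = [
        "docker", "run", "--rm", "-it",
        "-v", "/:/rootfs",
        "--entrypoint", "/bin/bash",
        "--userns=host", "--ipc=host", "--pid=host", "--uts=host", "--net=host",
    ]
    for cap in extra_caps:
        argv += ["--cap-add", cap]
    argv.append(image)  # the image name goes last, after all options
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(docker_run_argv("openshift/origin:v3.6.0-alpha.1", ["SYS_PTRACE"])))
```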

@php-coder
Contributor Author

I'm closing this issue because it was fixed by #14169.
oc cluster up still doesn't start when Docker is running with user namespaces, but it now produces a different error, which should be fixed in a separate issue.
