sdn: kill containers that fail to update on node restart

With the move to remote runtimes, we can no longer get the pod's
network namespace from kubelet (since we cannot insert ourselves
into the remote runtime's plugin list and intercept network plugin
calls). As kubelet does not call network plugins in any way on
startup if a container is already running, we have no way to ensure
the container is using the correct NetNamespace (as it may have
changed while openshift-node was down) at startup, unless we encode
the required information into OVS flows.

But if OVS was restarted around the same time OpenShift was,
those flows are lost, and we have no information with which to
recover the pod's networking on node startup. In this case, kill
the infra container underneath kubelet so that it will be restarted
and we can set its network up again.

NOTE: this is somewhat hacky and will not work with other remote
runtimes like CRI-O, but OpenShift 3.6 hardcodes dockershim so this
isn't a problem yet. The "correct" solution is to either checkpoint
our network configuration at container setup time and recover that
ourselves, or to add a GET/STATUS call to CNI and make kubelet call
that operation on startup when recovering running containers.
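
A minimal sketch of the startup recovery decision described above,
for illustration only. It is not the code in this commit: the names
runningPod, flowLookup, containerKiller and recoverPods are all
hypothetical, and the flow check and the container kill are passed in
as plain functions rather than real OVS or dockershim calls.

```go
package main

import "fmt"

// runningPod is a hypothetical record of a pod found running at node startup.
type runningPod struct {
	Name             string
	InfraContainerID string
}

// flowLookup is a hypothetical check that reports whether the OVS flows
// still encode the information needed to recover the pod's networking.
type flowLookup func(pod runningPod) (found bool, err error)

// containerKiller is a hypothetical stand-in for the runtime call that
// kills a container by ID.
type containerKiller func(containerID string) error

// recoverPods walks the running pods. A pod whose flows are still present
// is left alone; a pod whose flows are gone has its infra container killed
// so that kubelet restarts it and its network gets set up again.
// It returns the IDs of the infra containers that were killed.
func recoverPods(pods []runningPod, lookup flowLookup, kill containerKiller) ([]string, error) {
	var killed []string
	for _, pod := range pods {
		found, err := lookup(pod)
		if err != nil {
			return killed, fmt.Errorf("checking flows for pod %s: %v", pod.Name, err)
		}
		if found {
			// Flows survived; networking can be recovered from them.
			continue
		}
		if err := kill(pod.InfraContainerID); err != nil {
			return killed, fmt.Errorf("killing infra container %s: %v", pod.InfraContainerID, err)
		}
		killed = append(killed, pod.InfraContainerID)
	}
	return killed, nil
}
```

A caller would pass the pods reported as running at startup together
with its own flow check and kill function, then log the returned IDs.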
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1453113