-
Notifications
You must be signed in to change notification settings - Fork 4.7k
Router provides an invalid, expired certificate by default #10345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@knobunc Here is a pass at using an annotation to provide the default cert PTAL |
@@ -755,6 +756,10 @@ func RunCmdRouter(f *clientcmd.Factory, cmd *cobra.Command, out io.Writer, cfg * | |||
} | |||
} | |||
} | |||
if len(defaultCert) == 0 { | |||
// When a user does not provide the default cert, create one via a Service annotation | |||
t.Annotations = map[string]string{"service.alpha.openshift.io/serving-cert-secret-name": name + "-certs"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deads2k is there a constant for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deads2k is there a constant for this?
Down in a controller package. If its going to be used elsewhere, let's go ahead and promote it to a common package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f64dc4b
to
5ee3ca3
Compare
@@ -325,6 +325,9 @@ func generateSecretsConfig(cfg *RouterConfig, kClient *kclient.Client, | |||
mounts = append(mounts, mount) | |||
} | |||
|
|||
// default cert (which is a pem) is provided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the comment as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's describe this case on its own
// tls.crt is the cert and tls.key is the key | ||
// The crt and key are concatenated to form the needed pem | ||
// is /etc/pki/tls/private/tls.crt and tls.key present | ||
var fileCrtName = "/etc/pki/tls/private/tls.crt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... I really don't like having to hard-code these paths... Make a new env for this case? DEFAULT_CERTIFICATE_DIR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I added the env var. We are here because the router couldn't get the default route. The last chance is to look for a secret.
In your commit message and PR description, you need to have
|
@DirectXMan12 @knobunc -- I had resolves: 1349144, Ben wanted the link. The first line in the commit should be 50 or fewer characters. I will make more changes as suggested above. |
Yeah, I just had the body. You should have a brief title, like |
@DirectXMan12 @knobunc I made the suggested changes PTAL |
func (r *templateRouter) writeDefaultCert() error { | ||
// A path is provided so nothing needs to be done | ||
if len(r.defaultCertificatePath) != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a change of behavior... before, I think the DEFAULT_CERTIFICATE trumped the DEFAULT_CERTIFICATE_PATH. This switches that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this change the default was to use the default_certificate (if present) or secret (if present) first and if nothing happens in this function whatever is in r.defaultCertificatePath gets used. So it looks like I need to back out the last change.
[TEST] |
@knobunc @DirectXMan12 @liggitt @smarterclayton Does anyone know why this is failing? the go 1.6 build works but the go tip build fails and I can't figure out why it is failing. Thanks for the help. Phil |
@knobunc @DirectXMan12 more changes... [ PTAL] |
FAILURE: Generated completions out of date. Please run hack/update-generated-completions.sh |
@@ -185,7 +185,7 @@ backend be_no_sni | |||
|
|||
frontend fe_no_sni | |||
# terminate ssl on edge | |||
bind 127.0.0.1:{{env "ROUTER_SERVICE_NO_SNI_PORT" "10443"}} ssl no-sslv3 {{ if (len .DefaultCertificate) gt 0 }}crt {{.DefaultCertificate}}{{ else }}crt /var/lib/haproxy/conf/default_pub_keys.pem{{ end }} accept-proxy | |||
bind 127.0.0.1:{{env "ROUTER_SERVICE_NO_SNI_PORT" "10443"}} ssl no-sslv3 crt {{.DefaultCertificate}} accept-proxy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove that stray space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@DirectXMan12 @knobunc @smarterclayton Here we go again... PTAL |
[TEST] |
@knobunc is there anything left to do here? |
@pecameron as soon as testing completes, and we have an upgrade story, then we should be good to go. |
@knobunc I committed the original fix, a refreshed default cert. I changed the commit comment to reflect this. With this commit the update story is simple. Whatever router you have deployed before the update will continue to work after the update (and you get a non-expired default cert).The build/test is running. |
@pecameron Cool. As soon as the tests complete we should be good to go. |
@DirectXMan12 PTAL |
We previously shipped a default certificate to be used if the admin did not specify their own when deploying the router. Over time the certificate expired and we continued to ship the expired certificate. This fix updates the expired cert. The replacement cert permits routers deployed in v3.2 or earlier to continue to work after an upgrade. Because the default cert is in the router image, the same certificate could end up being used across multiple different deployments. In v3.3 and beyond, when a router is deployed it ends up with its own default certificate whether or not one is supplied by the admin. When the admin does not provide the default cert the automated certificate generation for services is used to generate a secret containing a default certificate. Unlike the user-passed default certificate (which must include the concatenated private key PEM format), this secret contains a key and a crt as separate files. The router combines them before use. Fixes bug 1349144 Signed-off-by: Phil Cameron <[email protected]>
env.Add(app.Environment{"DEFAULT_CERTIFICATE_PATH": defaultCertificatePath}) | ||
} | ||
} | ||
|
||
secrets, volumes, mounts, err := generateSecretsConfig(cfg, kClient, namespace, defaultCert) | ||
env.Add(app.Environment{"DEFAULT_CERTIFICATE_DIR": defaultCertificateDir}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can I specify both DIR and PATH?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're just refreshing the cert, and an admin can always provide a default certificate via _PATH, why do we need this new env var for _DIR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my question comes down to why wouldn't DEFAULT_CERTIFICATE_PATH be sufficient - why do we need the new env var in order to get this behavior (now that we are auto generating a cert)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are getting a secret created for us by the annotation on the service. BUT the secret comes in as a cert and key... not as a combined PEM. And haproxy needs a combined PEM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smarterclayton When --default-cert is used the admin passes a pem and _PATH points to the tls.crt file in the mounted secret. When a pem is not provided we use a service annotation to create a secret containing the default crt/key in two separate files (not a pem in a single file). In this case the files need to be combined into a pem. _DIR points to the root directory (/etc/pki/private).
You specify _DIR when you have a secret in crt/key (in two files not a pem in the crt) format.
In this case _PATH is empty.
[test] |
Evaluated for origin test up to 3d09763 |
@pecameron talked to @smarterclayton on IRC and he doesn't like adding another environment variable. His preference is to overload DEFAULT_CERTIFICATE_PATH so that if we are given a PEM in the path, we use it. But if the path points at only a certificate, then we substitute key for cert in the path and concatenate the two (if found). So say, DEFAULT_CERTIFICATE_PATH was /mnt/tls/file.cert then we look for /mnt/tls/file.key and concat the two to get the PEM. The behavior seems reasonable... can you please make that change? |
Argh. Didn't mean to close... trackpad malfunction |
The only question is whether this impacts old clusters, or could cause confusion to people trying to set this correctly. I.e., a router that today is broken (because it's pointing to a cert) would start working (if there was a key in the same dir). Could that be abused by an admin? If not, does having one less flag simplify or complicate an admin's life? |
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8186/) |
I don't think an admin could abuse the changed behavior. If they can change the ENV then can't they rsh in can get the key anyway? Not that using it in the proxy would expose the value anyway. So... if you prefer that interface, that's fine by me. I was expecting to have to have an automatic upgrade path, but since we obviated that with a fallback certificate, that's less of a concern. |
Is there a better defaulting that means users get the new generated
serving cert automatically? By removing the default cert and using
the new mount path for the generated cert? You added a generated cert
- is that mounted into the container automatically?
|
We are generating the cert using an annotation on the service to generate the secret. Then we have to have the pod mount the secret somewhere. Then we need to set an environment variable to tell the router where to find the cert. @pecameron's changes to 'oadm router' make it do that automatically when no cert is provided on the command line. This is analogous to what happens when we are given a cert (we turn it into a secret, mount it, and set _PATH to the right location). Today (and with this patch) the default cert in the image is only used when all else fails. We prefer the DEFAULT_CERTIFICATE environment variable (that has to point to a PEM). Then we try DEFAULT_CERTIFICATE_PATH (which again, needs to point to a PEM). Then we fall back to the cert in the image. @pecameron's patch just adds the DEFAULT_CERTIFICATE_DIR as the lowest priority match, and then makes use of that when no default cert is provided. My preference would be for the generated cert to generate a key, a cert, and a PEM... then we would not need to change the router code at all, and can just use the DEFAULT_CERTIFICATE_PATH, but I didn't realize we needed a PEM and didn't get one until close to feature freeze. I'll propose that as a feature for the future. Anyway, what do you want to do? If you don't like adding _DIR, we can make _PATH look for a key too... I see no security or usability concern there. BUT we are running desperately short of time on this. If you want something else, can we talk first thing tomorrow morning to get this straight? |
It's fine as is. We can go back and address this when we switch to ingress - I suspect we'll want to reevaluate the behavior at that point. |
[merge] |
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8186/) (Image: devenv-rhel7_4881) |
Evaluated for origin merge up to 3d09763 |
Router provides an invalid, expired certificate by default
We previously shipped a default certificate to be used if the admin
did not specify their own when deploying the router. Over time the
certificate expired and we continued to ship the expired certificate.
This fix updates the expired cert. The replacement cert permits
routers deployed in v3.2 or earlier to continue to work after an
upgrade.
Because the default cert is in the router image, the same certificate
could end up being used across multiple different deployments. In v3.3
and beyond, when a router is deployed it ends up with its own default
certificate whether or not one is supplied by the admin.
When the admin does not provide the default cert the automated certificate
generation for services is used to generate a secret containing a default
certificate. Unlike the user-passed default certificate (which must
include the concatenated private key PEM format), this secret contains
a key and a crt as separate files. The router combines them before use.
Fixes bug 1349144
Signed-off-by: Phil Cameron [email protected]