Skip to content

Router provides an invalid, expired certificate by default #10345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 19, 2016

Conversation

pecameron
Copy link
Contributor

@pecameron pecameron commented Aug 10, 2016

Router provides an invalid, expired certificate by default

We previously shipped a default certificate to be used if the admin
did not specify their own when deploying the router. Over time the
certificate expired and we continued to ship the expired certificate.
This fix updates the expired cert. The replacement cert permits
routers deployed in v3.2 or earlier to continue to work after an
upgrade.

Because the default cert is in the router image, the same certificate
could end up being used across multiple different deployments. In v3.3
and beyond, when a router is deployed it ends up with its own default
certificate whether or not one is supplied by the admin.

When the admin does not provide the default cert the automated certificate
generation for services is used to generate a secret containing a default
certificate. Unlike the user-passed default certificate (which must
include the concatenated private key PEM format), this secret contains
a key and a crt as separate files. The router combines them before use.

Fixes bug 1349144

Signed-off-by: Phil Cameron [email protected]

@pecameron
Copy link
Contributor Author

@knobunc Here is a pass at using an annotation to provide the default cert PTAL

@@ -755,6 +756,10 @@ func RunCmdRouter(f *clientcmd.Factory, cmd *cobra.Command, out io.Writer, cfg *
}
}
}
if len(defaultCert) == 0 {
// When a user does not provide the default cert, create one via a Service annotation
t.Annotations = map[string]string{"service.alpha.openshift.io/serving-cert-secret-name": name + "-certs"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k is there a constant for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k is there a constant for this?

Down in a controller package. If its going to be used elsewhere, let's go ahead and promote it to a common package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k @liggitt I have not done this. Could you point me to the proper place?

@pecameron pecameron force-pushed the bz1349144a branch 2 times, most recently from f64dc4b to 5ee3ca3 Compare August 12, 2016 18:03
@pecameron
Copy link
Contributor Author

pecameron commented Aug 12, 2016

@liggitt @knobunc I made the suggested changes (except the service.alpha.openshift.io/serving-cert-secret-name becoming common) Still don't know what to do here. [PTAL]

@@ -325,6 +325,9 @@ func generateSecretsConfig(cfg *RouterConfig, kClient *kclient.Client,
mounts = append(mounts, mount)
}

// default cert (which is a pem) is provided.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the comment as above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's describe this case on its own

// tls.crt is the cert and tls.key is the key
// The crt and key are concatenated to form the needed pem
// is /etc/pki/tls/private/tls.crt and tls.key present
var fileCrtName = "/etc/pki/tls/private/tls.crt"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... I really don't like having to hard-code these paths... Make a new env for this case? DEFAULT_CERTIFICATE_DIR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I added the env var. We are here because the router couldn't get the default route. The last chance is to look for a secret.

@DirectXMan12
Copy link
Contributor

DirectXMan12 commented Aug 15, 2016

In your commit message and PR description, you need to have Fixes bug 1349144 instead of Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1349144. Additionally it should specify what used to be the case/why you changed this. For example:

We used to ship a hard-coded default certificate to be used if the admin did not specify their own
when launching the router. This was suboptimal, since it could expire, and since the same certificate
could end up being used across multiple different deployments.

Instead, this commit removes the hard-coded default certificate, and instead used the automated
certificate generation for services to generate a secret containing a default certificate when the
admin does not supply their own. Unlike the user-passed default certificate (which must be in the
cert and the private key concatenated PEM format), this secret contains a key and a crt as separate
files, so the router will now check for that and concatenate them if need be.

Fixes bug 1349144

@pecameron
Copy link
Contributor Author

@DirectXMan12 @knobunc -- I had resolves: 1349144, Ben wanted the link. The first line in the commit should be 50 or fewer characters. I will make more changes as suggested above.

@DirectXMan12
Copy link
Contributor

The first line in the commit should be 50 or fewer characters

Yeah, I just had the body. You should have a brief title, like Router: generate a new default cert instead of vendoring it.

@pecameron
Copy link
Contributor Author

@DirectXMan12 @knobunc I made the suggested changes PTAL

func (r *templateRouter) writeDefaultCert() error {
// A path is provided so nothing needs to be done
if len(r.defaultCertificatePath) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change of behavior... before, I think the DEFAULT_CERTIFICATE trumped the DEFAULT_CERTIFICATE_PATH. This switches that.

Copy link
Contributor Author

@pecameron pecameron Aug 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this change the default was to use the default_certificate (if present) or secret (if present) first and if nothing happens in this function whatever is in r.defaultCertificatePath gets used. So it looks like I need to back out the last change.

@pecameron
Copy link
Contributor Author

[TEST]

@pecameron
Copy link
Contributor Author

@knobunc @DirectXMan12 @liggitt @smarterclayton Does anyone know why this is failing? the go 1.6 build works but the go tip build fails and I can't figure out why it is failing. Thanks for the help. Phil

@pecameron
Copy link
Contributor Author

@knobunc @DirectXMan12 more changes... [ PTAL]

@knobunc
Copy link
Contributor

knobunc commented Aug 17, 2016

FAILURE: Generated completions out of date. Please run hack/update-generated-completions.sh

@@ -185,7 +185,7 @@ backend be_no_sni

frontend fe_no_sni
# terminate ssl on edge
bind 127.0.0.1:{{env "ROUTER_SERVICE_NO_SNI_PORT" "10443"}} ssl no-sslv3 {{ if (len .DefaultCertificate) gt 0 }}crt {{.DefaultCertificate}}{{ else }}crt /var/lib/haproxy/conf/default_pub_keys.pem{{ end }} accept-proxy
bind 127.0.0.1:{{env "ROUTER_SERVICE_NO_SNI_PORT" "10443"}} ssl no-sslv3 crt {{.DefaultCertificate}} accept-proxy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove that stray space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@pecameron
Copy link
Contributor Author

@DirectXMan12 @knobunc @smarterclayton Here we go again... PTAL

@pecameron
Copy link
Contributor Author

[TEST]

@pecameron
Copy link
Contributor Author

@knobunc is there anything left to do here?

@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

@pecameron as soon as testing completes, and we have an upgrade story, then we should be good to go.

@pecameron
Copy link
Contributor Author

@knobunc I committed the original fix, a refreshed default cert. I changed the commit comment to reflect this. With this commit the update story is simple. Whatever router you have deployed before the update will continue to work after the update (and you get a non-expired default cert).The build/test is running.

@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

@pecameron Cool. As soon as the tests complete we should be good to go.

@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

@DirectXMan12 PTAL

We previously shipped a default certificate to be used if the admin
did not specify their own when deploying the router.  Over time the
certificate expired and we continued to ship the expired certificate.
This fix updates the expired cert. The replacement cert permits
routers deployed in v3.2 or earlier to continue to work after an
upgrade.

Because the default cert is in the router image, the same certificate
could end up being used across multiple different deployments. In v3.3
and beyond, when a router is deployed it ends up with its own default
certificate whether or not one is supplied by the admin.

When the admin does not provide the default cert the automated certificate
generation for services is used to generate a secret containing a default
certificate.  Unlike the user-passed default certificate (which must
include the concatenated private key PEM format), this secret contains
a key and a crt as separate files. The router combines them before use.

Fixes bug 1349144

Signed-off-by: Phil Cameron <[email protected]>
env.Add(app.Environment{"DEFAULT_CERTIFICATE_PATH": defaultCertificatePath})
}
}

secrets, volumes, mounts, err := generateSecretsConfig(cfg, kClient, namespace, defaultCert)
env.Add(app.Environment{"DEFAULT_CERTIFICATE_DIR": defaultCertificateDir})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can I specify both DIR and PATH?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're just refreshing the cert, and an admin can always provide a default certificate via _PATH, why do we need this new env var for _DIR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my question comes down to why wouldn't DEFAULT_CERTIFICATE_PATH be sufficient - why do we need the new env var in order to get this behavior (now that we are auto generating a cert)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are getting a secret created for us by the annotation on the service. BUT the secret comes in as a cert and key... not as a combined PEM. And haproxy needs a combined PEM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton When --default-cert is used the admin passes a pem and _PATH points to the tls.crt file in the mounted secret. When a pem is not provided we use a service annotation to create a secret containing the default crt/key in two separate files (not a pem in a single file). In this case the files need to be combined into a pem. _DIR points to the root directory (/etc/pki/private).

You specify _DIR when you have a secret in crt/key (in two files not a pem in the crt) format.
In this case _PATH is empty.

@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

[test]

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 3d09763

@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

@pecameron talked to @smarterclayton on IRC and he doesn't like adding another environment variable. His preference is to overload DEFAULT_CERTIFICATE_PATH so that if we are given a PEM in the path, we use it. But if the path points at only a certificate, then we substitute key for cert in the path and concatenate the two (if found). So say, DEFAULT_CERTIFICATE_PATH was /mnt/tls/file.cert then we look for /mnt/tls/file.key and concat the two to get the PEM. The behavior seems reasonable... can you please make that change?

@knobunc knobunc closed this Aug 18, 2016
@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

Argh. Didn't mean to close... trackpad malfunction

@knobunc knobunc reopened this Aug 18, 2016
@smarterclayton
Copy link
Contributor

The only question is whether this impacts old clusters, or could cause confusion to people trying to set this correctly. I.e., a router that today is broken (because it's pointing to a cert) would start working (if there was a key in the same dir). Could that be abused by an admin? If not, does having one less flag simplify or complicate an admin's life?

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8186/)

@knobunc
Copy link
Contributor

knobunc commented Aug 18, 2016

I don't think an admin could abuse the changed behavior. If they can change the ENV then can't they rsh in can get the key anyway? Not that using it in the proxy would expose the value anyway. So... if you prefer that interface, that's fine by me. I was expecting to have to have an automatic upgrade path, but since we obviated that with a fallback certificate, that's less of a concern.

@smarterclayton
Copy link
Contributor

smarterclayton commented Aug 18, 2016 via email

@knobunc
Copy link
Contributor

knobunc commented Aug 19, 2016

We are generating the cert using an annotation on the service to generate the secret. Then we have to have the pod mount the secret somewhere. Then we need to set an environment variable to tell the router where to find the cert.

@pecameron's changes to 'oadm router' make it do that automatically when no cert is provided on the command line. This is analogous to what happens when we are given a cert (we turn it into a secret, mount it, and set _PATH to the right location).

Today (and with this patch) the default cert in the image is only used when all else fails. We prefer the DEFAULT_CERTIFICATE environment variable (that has to point to a PEM). Then we try DEFAULT_CERTIFICATE_PATH (which again, needs to point to a PEM). Then we fall back to the cert in the image.

@pecameron's patch just adds the DEFAULT_CERTIFICATE_DIR as the lowest priority match, and then makes use of that when no default cert is provided. My preference would be for the generated cert to generate a key, a cert, and a PEM... then we would not need to change the router code at all, and can just use the DEFAULT_CERTIFICATE_PATH, but I didn't realize we needed a PEM and didn't get one until close to feature freeze. I'll propose that as a feature for the future.

Anyway, what do you want to do? If you don't like adding _DIR, we can make _PATH look for a key too... I see no security or usability concern there. BUT we are running desperately short of time on this. If you want something else, can we talk first thing tomorrow morning to get this straight?

@smarterclayton
Copy link
Contributor

It's fine as is. We can go back and address this when we switch to ingress - I suspect we'll want to reevaluate the behavior at that point.

@smarterclayton
Copy link
Contributor

[merge]

@openshift-bot
Copy link
Contributor

openshift-bot commented Aug 19, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/8186/) (Image: devenv-rhel7_4881)

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 3d09763

@openshift-bot openshift-bot merged commit 2a9dc2c into openshift:master Aug 19, 2016
@pecameron pecameron deleted the bz1349144a branch August 19, 2016 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants