Add basic validation for route TLS configuration - checks that input is "syntactically" valid. #8366

ramr · 2016-04-06T02:19:34Z

Partial Fix for https://bugzilla.redhat.com/show_bug.cgi?id=1312292

The other PR #8353 would be needed in addition to this for a complete fix but this one provides a partial fix to check input is valid PEM format.

# make release  || echo "run make, copy binary and build new docker image"  
oadm router --latest-images  --replicas=0  
oc env dc/router EXTENDED_VALIDATION=true  
oc scale  dc/router --replicas=1

Updated with testing instructions/usage.
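
For readers following along, a check that input is valid PEM format can be sketched roughly as below. This is a minimal illustration using only Go's standard library, not the code from this PR; the function name `checkPEM` is hypothetical. It only confirms PEM framing and block type, so a further step such as `x509.ParseCertificate` would be needed to confirm that a block really holds a certificate.

```go
package main

import (
	"bytes"
	"encoding/pem"
	"errors"
	"fmt"
)

// checkPEM (hypothetical helper) returns an error unless data contains at
// least one PEM block, every block has the wanted type, and nothing but
// whitespace follows the last block. Note that pem.Decode skips any text
// that precedes a BEGIN line, so leading junk is not detected here.
func checkPEM(data []byte, wantType string) error {
	rest := data
	count := 0
	for {
		var block *pem.Block
		block, rest = pem.Decode(rest)
		if block == nil {
			break
		}
		if block.Type != wantType {
			return fmt.Errorf("block %d has type %q, want %q", count, block.Type, wantType)
		}
		count++
	}
	if count == 0 {
		return errors.New("no PEM block found")
	}
	if len(bytes.TrimSpace(rest)) != 0 {
		return errors.New("unexpected data after the last PEM block")
	}
	return nil
}

func main() {
	// The payload is placeholder base64, not a real certificate.
	framed := []byte("-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n")
	fmt.Println("framed input:", checkPEM(framed, "CERTIFICATE"))
	fmt.Println("plain text:", checkPEM([]byte("not pem"), "CERTIFICATE"))
}
```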

smarterclayton · 2016-04-06T02:30:27Z

pkg/route/api/validation/validation.go

 	routeapi "github.com/openshift/origin/pkg/route/api"
 )

+// disableRouteTLSValidation returns whether or not to disable TLS config validation checks.
+func disableRouteTLSValidation() bool {


Why not just create a new function and invoke it from the router in the unique_host check (or in an optional pre check step, based on a flag)?

ramr · 2016-04-06T02:33:18Z

[test]

smarterclayton · 2016-04-06T02:35:53Z

Can we check the cert order by validation? We already do that in other
places in the code.

smarterclayton · 2016-04-06T02:38:05Z

We don't have to validate in the api server - for instance, rejecting a route because the cert is malformed from the router side is sufficient. My concern is letting haproxy do it (or having to do it for every single route) if the check is simple enough for us to do.

smarterclayton · 2016-04-06T02:38:27Z

We can be more strict than what haproxy allows (especially if we give a good error message), as long as we can be sure we're not less strict.

ramr · 2016-04-06T18:01:44Z

Ok - will rework this and get some bits in from the other PR.

smarterclayton · 2016-04-06T18:36:00Z

Do we think that will be sufficient? Do we know of anything else beyond
certs being invalid and cert order being wrong in the PEM? Do we need to
limit the types of certs?

ramr · 2016-04-07T06:44:39Z

Ok @smarterclayton PTAL I pushed some changes. I added a check for cert verification + check key mismatches. Don't think we can do a full blown verification on cacerts or destcacerts (aka only support well known certifying authorities) as I suspect there could well be self signed ones on the edge [my testing] + private CAs on the haproxy-to-pod traffic end (re-encrypt) - i think we should allow for that.
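
As an illustration of the key mismatch check mentioned above (a sketch, not the code pushed to this PR), Go's `tls.X509KeyPair` already rejects a private key that does not belong to the leaf certificate. The helper names `keyMatchesCert` and `selfSigned` are hypothetical, and the certificates are throwaway self-signed ones generated only for the demo.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// keyMatchesCert (hypothetical helper) returns nil when keyPEM is the private
// key for the first certificate in certPEM, and an error otherwise.
func keyMatchesCert(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	return err
}

// selfSigned generates a short-lived self-signed certificate and its key.
// It exists only to produce demo data.
func selfSigned(host string) (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	certA, keyA, err := selfSigned("a.example.test")
	if err != nil {
		panic(err)
	}
	_, keyB, err := selfSigned("b.example.test")
	if err != nil {
		panic(err)
	}
	fmt.Println("matching pair:", keyMatchesCert(certA, keyA))
	fmt.Println("mismatched pair:", keyMatchesCert(certA, keyB))
}
```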

ramr · 2016-04-07T06:45:21Z

[test]

ramr · 2016-04-07T09:28:40Z

[test]

smarterclayton · 2016-04-07T16:42:25Z

Define full blown verification of cacerts? What problems would openssl
have besides ordering? Does openssl have a check function we could call?
I'm asking for an enumeration of possible failure modes, whether
verification checks them, and unknowns in haproxy. If we don't know that,
we don't know whether the validation approach works.

On Thu, Apr 7, 2016 at 6:25 AM, OpenShift Bot [email protected]
wrote:

continuous-integration/openshift-jenkins/test FAILURE (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/2779/)

ramr · 2016-04-07T18:52:19Z

So what verification involves is - pem is valid, is a cert, checking the cert is valid (expiry + allowed domains), the certificate chain is valid (which would return an unknown authority for self-signed/generated CAs) and then optionally any extended key usages. We can do that for the edge certificates we have as we have the associated ca cert but not for the cacert/destination cacert themselves. imho, this seems a bit backward to do in golang code anyway when haproxy -c would just verify this for us - which is what the PR #8353 does. We are basically reverse engineering that and we cannot do any future-proofing guarantees anyway as haproxy code could change.
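
The verification steps listed above (expiry, allowed domains, chain to a CA, and the unknown-authority result for private or self-generated CAs) map onto `x509.Certificate.Verify` in Go's standard library. The sketch below is only an illustration under that assumption, not this PR's implementation; `verifyLeaf`, `poolOf` and `issue` are hypothetical helpers and the CA is generated on the fly.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// verifyLeaf checks validity period, hostname and chain against the given roots.
func verifyLeaf(leaf *x509.Certificate, roots *x509.CertPool, host string) error {
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:       roots,
		DNSName:     host,
		CurrentTime: time.Now(),
	})
	return err
}

// poolOf builds a CA pool from the supplied certificates (possibly none).
func poolOf(certs ...*x509.Certificate) *x509.CertPool {
	pool := x509.NewCertPool()
	for _, c := range certs {
		pool.AddCert(c)
	}
	return pool
}

// issue creates a demo certificate signed by parent, or self-signed when
// parent is nil.
func issue(cn string, isCA bool, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(time.Now().UnixNano()),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		BasicConstraintsValid: true,
		IsCA:                  isCA,
	}
	if isCA {
		tmpl.KeyUsage = x509.KeyUsageCertSign
	} else {
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.DNSNames = []string{cn}
	}
	if parent == nil {
		parent, parentKey = tmpl, key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

func main() {
	ca, caKey, err := issue("demo-ca", true, nil, nil)
	if err != nil {
		panic(err)
	}
	leaf, _, err := issue("www.example.test", false, ca, caKey)
	if err != nil {
		panic(err)
	}
	fmt.Println("with CA supplied:", verifyLeaf(leaf, poolOf(ca), "www.example.test"))
	fmt.Println("without the CA:", verifyLeaf(leaf, poolOf(), "www.example.test"))
	fmt.Println("wrong hostname:", verifyLeaf(leaf, poolOf(ca), "other.example.test"))
}
```

The second call shows why a cacert or destination cacert cannot be checked this way on its own: without a trusted parent in the pool, verification fails with an unknown authority error even though the certificate is well formed.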

smarterclayton · 2016-04-07T18:58:25Z

Can we test the certs in isolation rather than have to generate them into
the file? I.e. construct a simple haproxy -c that
verifies only those certs?

ramr · 2016-04-07T19:18:39Z

So that was my original intent but then there was a point made regarding testing in isolation vs with the rest of the config as there could be potential of failures when run in "tandem". Which was the reason for doing the validate-config on the generated config (just the new certs/keys/cacerts would be written but all the existing config + the config for the new service alias and its config/certs would be tested).
Edited new certs/keys/cacerts

smarterclayton · 2016-04-07T19:25:52Z

What would the tandem failure be?

ramr · 2016-04-07T20:00:10Z

Due to overlapping backend configs. Some error cases I can think of could be due to content switching with overlapping frontend/backend modes, duplicate keys (which we catch in the service aliases map), and duplicate ids and acls (would have to be a custom config though). Not a complete set by any stretch, but from a more defensive standpoint, checking in tandem vs isolation would address that and possibly be a bit more future-proof as it does catch more error cases.

smarterclayton · 2016-04-07T21:41:27Z

The first couple you mentioned seem like things we have to catch for almost
all frameworks anyway - things under our control.

My concerns with #8353 (at least, as of now) are as follows:

1. We have to check one at a time. That's going to break any high density case on restart.
2. We have to change the structure of our queue to where it isn't layered anymore (we're basically doing transactional commits, so each change becomes an ordered transaction).
3. Any break that is a user break needs to be communicated back to the user somehow eventually.

ramr · 2016-04-07T22:30:33Z

So as re: the concerns on #8353 :

1. We can be smarter on that. A flag on the route: check if it's set, skip otherwise. Set the flag if the route is modified (and set it on creation).
2. We don't really need a structure change to the queue because we just apply the transactions in order. They are batched up, yes, but the checks are always in order. If something fails a check, it's not added to the batch.
3. That we can do by updating the status on the route (which is similar to a duplicate route hostname).

Edited numbering

smarterclayton · 2016-04-08T00:01:41Z

Let's talk tomorrow.

ramr · 2016-04-08T01:27:48Z

sounds good.

ramr · 2016-04-08T18:00:25Z

[test]

smarterclayton · 2016-04-11T14:00:14Z

pkg/router/template/plugin.go

@@ -42,6 +42,7 @@ type TemplatePluginConfig struct {
 	StatsUsername          string
 	StatsPassword          string
 	IncludeUDP             bool
+	ValidateRouteTLSConfig bool


Maybe this should be ExtendedValidation which more stringently tests the route against a set of known good safe configuration.

liggitt · 2016-04-20T16:40:47Z

pkg/cmd/infra/router/template.go

+	if o.ExtendedValidation {
+		nextPlugin = controller.NewExtendedValidator(nextPlugin, statusPlugin)
+	}
+	plugin := controller.NewUniqueHost(nextPlugin, o.RouteSelectionFunc(), statusPlugin)


cast statusPlugin to RejectionRecorder here so we know we're not using its plugin functions
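
A rough sketch of the suggestion above, using simplified stand-in types rather than the real router interfaces: every type, method signature and helper here (`Plugin`, `RejectionRecorder`, `statusPlugin`, `acceptAll`, `buildChain`, `rejectHost`) is hypothetical and reduced for illustration. Narrowing the status plugin to the recorder interface before handing it to the validator means the validator can only record rejections and cannot call the status plugin's route-handling methods.

```go
package main

import "fmt"

// Plugin is a simplified stand-in for a router plugin.
type Plugin interface {
	HandleRoute(host string) error
}

// RejectionRecorder is a simplified stand-in for the rejection-recording role.
type RejectionRecorder interface {
	RecordRouteRejection(host, reason string)
}

// statusPlugin implements both interfaces, like a status-writing plugin might.
type statusPlugin struct{ rejected map[string]string }

func (s *statusPlugin) HandleRoute(host string) error            { return nil }
func (s *statusPlugin) RecordRouteRejection(host, reason string) { s.rejected[host] = reason }

// acceptAll is the end of the chain and remembers admitted hosts.
type acceptAll struct{ admitted []string }

func (a *acceptAll) HandleRoute(host string) error {
	a.admitted = append(a.admitted, host)
	return nil
}

// extendedValidator validates a route before passing it to the next plugin.
type extendedValidator struct {
	next     Plugin
	recorder RejectionRecorder
	validate func(host string) error
}

func (v *extendedValidator) HandleRoute(host string) error {
	if err := v.validate(host); err != nil {
		v.recorder.RecordRouteRejection(host, err.Error())
		return err
	}
	return v.next.HandleRoute(host)
}

// buildChain wraps next with the validator only when extended is true.
func buildChain(next Plugin, status *statusPlugin, extended bool, validate func(string) error) Plugin {
	// Narrow to the recorder interface so only RecordRouteRejection is reachable.
	var recorder RejectionRecorder = status
	if extended {
		next = &extendedValidator{next: next, recorder: recorder, validate: validate}
	}
	return next
}

// rejectHost returns a validator that fails for exactly one host.
func rejectHost(bad string) func(string) error {
	return func(host string) error {
		if host == bad {
			return fmt.Errorf("invalid TLS configuration for %s", host)
		}
		return nil
	}
}

func main() {
	status := &statusPlugin{rejected: map[string]string{}}
	sink := &acceptAll{}
	chain := buildChain(sink, status, true, rejectHost("bad.example.test"))
	fmt.Println(chain.HandleRoute("good.example.test"))
	fmt.Println(chain.HandleRoute("bad.example.test"))
	fmt.Println(sink.admitted, status.rejected)
}
```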

ramr · 2016-04-20T21:52:53Z

@liggitt made the fixes as per your recommendations, cleaned up remnants of the old implementation and added more tests. @smarterclayton @knobunc fyi.

knobunc · 2016-04-21T15:49:36Z

For Online Beta they are disabling custom hostnames, so that means that we don't need to allow custom certs. We need to work out how ops can disable certs (perhaps by deploying a custom router).

We also need to get this patch landed for GA.

ramr · 2016-05-06T21:19:12Z

@smarterclayton /bump anything else needed on this?

smarterclayton · 2016-05-07T06:08:15Z

Regarding sni lookup order, is a crt-list the solution? https://rafpe.ninja/2016/04/24/haproxy-ssl-domains-in-crt-list/

smarterclayton · 2016-05-07T06:09:41Z

Will do final follow up on Monday.

liggitt · 2016-05-07T12:46:14Z

is a crt-list the solution?

Looks like it. We probably need logic to decide which route's cert to use for a host when multiple routes have the same host (in scenarios where that is allowed)

smarterclayton · 2016-05-13T04:10:35Z

[test]

smarterclayton · 2016-05-13T15:39:12Z

[test]

smarterclayton · 2016-05-13T20:48:00Z

Test failure is real

spinolacastro · 2016-05-16T12:01:13Z

Do you believe it will land in time for v1.2.0?

smarterclayton · 2016-05-16T15:04:59Z

No, this is unfortunately too big for 1.2.0. However, you will be able to use a 1.3.0.alpha track router image for that.

@smarterclayton

input is "syntactically" valid.
o Checkpoint initial code.
o Add support for validating route tls config.
o Add option for validating route tls config.
o Validation fixes.
o Check private key + cert mismatches.
o Add tests.
o Record route rejection.
o Hook into add route processing + store invalid service alias configs in another place - easy to check prior errors on readmission.
o Remove entry from invalid service alias configs upon route removal.
o Add generated completions.
o Bug fixes.
o Recording rejecting routes is not working completely.
o Fix status update problem - we should set the status to admitted only if we had no errors handling a route.
o Rework to use a new controller - extended_validator as per @smarterclayton comments.
o Cleanup validation as per @liggitt comments.
o Update bash completions.
o Fixup older validation unit tests.
o Changes as per @liggitt review comments + cleanup tests.
o Fix failing test.

ramr · 2016-05-16T17:25:48Z

rebased and fixed tests

ramr · 2016-05-16T17:25:52Z

[test]

openshift-bot · 2016-05-16T17:26:24Z

Evaluated for origin test up to cdf242e

openshift-bot · 2016-05-16T18:20:29Z

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/3852/)

ramr · 2016-05-16T20:25:09Z

@smarterclayton PTAL - rebased and fixed tests + all tests pass now.

smarterclayton · 2016-05-17T04:39:32Z

LGTM [merge]

openshift-bot · 2016-05-17T04:40:12Z

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_origin/5924/) (Image: devenv-rhel7_4216)

openshift-bot · 2016-05-17T04:40:12Z

Evaluated for origin merge up to cdf242e

smarterclayton reviewed Apr 6, 2016

ramr force-pushed the route-tls-validation branch 3 times, most recently from 6f97439 to f60e526 Compare April 7, 2016 06:38

ramr force-pushed the route-tls-validation branch from f60e526 to e3e7d76 Compare April 7, 2016 09:26

ramr mentioned this pull request Apr 8, 2016

[WIP] Add support for route config validation #8353

Closed

smarterclayton reviewed Apr 11, 2016
View reviewed changes

liggitt reviewed Apr 20, 2016
View reviewed changes

ramr force-pushed the route-tls-validation branch from 1b2902b to 34b0242 Compare April 20, 2016 21:50

smarterclayton added the needs-api-review label Apr 26, 2016

ramr mentioned this pull request Apr 26, 2016

Add a kubectl create secret tls command kubernetes/kubernetes#24719

Merged

smarterclayton added api-approved and removed needs-api-review labels May 13, 2016

ramr force-pushed the route-tls-validation branch from 34b0242 to cdf242e Compare May 16, 2016 17:25

openshift-bot merged commit 14d77ab into openshift:master May 17, 2016

ramr deleted the route-tls-validation branch February 3, 2017 00:10
