
add a build completed notification so that we can chain builds together to make a Deployment Pipeline #1228


Closed
jstrachan opened this issue Mar 5, 2015 · 32 comments

Comments

@jstrachan
Contributor

TL;DR: add a new build trigger kind to BuildConfig, so a build B can be triggered if build A completes successfully. I.e. OpenShift would automatically fire the web hooks for build B when build A completes, without having to hard code the web hook of B into the build docker container of A.

Now the background...

To implement Continuous Delivery we need to setup a Deployment Pipeline between environments which includes integration/soak/UAT tests, manual approval steps and so forth.

OpenShift already has almost everything required for this right now (e.g. a namespace can be used for each environment, there's web hooks for triggering builds & deployments and builds can be fairly flexible things running docker containers).

The missing thing - from what I can see - is that we can't chain arbitrary builds together, i.e. I want to trigger build B if build A completes successfully. There doesn't seem to be a way to notify, via a web hook, that a build completed successfully.

e.g. let's imagine a relatively simple sequential deployment pipeline such as this, using 2 namespaces and a manual approval to production:

  1. git push hook
  2. STI image build updates
  3. deploy new image on staging namespace
  4. kick off a soak test and if that passes...
  5. fire an approval process (e.g. a PR on a git repo using gerrit or raising an issue or something)
  6. when approved this build is triggered to move the production tag on the image
  7. when the production tag is changed on an image, deploy it into the production namespace

So right now 1-3 can be done already with git push hooks & deploymentConfig triggered by image change. 4 can be done (I think) by triggering a soak test build via image change.

For the approval build (5), there's no way to fire it when (4) succeeds without hard coding build 4 to fire a web hook for build 5 inside the build itself.

To implement 5->6 we could use a PR which maybe does the docker tag as part of a build or something; or just a build which is fired by a web hook in the approval process or something.

These CD pipelines can get complex; they often involve fork/join builds, where when an image changes you may wish to kick off parallel builds (e.g. a soak test, load test and UAT test in different environments/namespaces). So it'd be nice to be able to join things together so that when all of those succeed, the next build step is kicked off.

What would be cooler is if we could leave builds, triggers and deployments as loosely coupled, plug-and-play things and just compose them together to make pipelines (so that each build isn't hard coded to know what the next build step is - so builds can be reusable). It feels like the docker-ish thing to do: make builds do one thing well, then compose them together. A build docker container (STI or otherwise, as builds may not generate new docker images - they may just test/approve/update tags/perform PRs) should not know or care about what's next in the pipeline.

So it'd be nice if a BuildConfig could specify as a trigger the other build (or even builds) which it depends on. e.g. we could configure the BuildConfig for B to have a trigger for when BuildConfig A completes successfully. Then we could configure build C to have a trigger on build B. Then whenever build A completes, OpenShift would look for all the other BuildConfigs which have a "build complete trigger" for A - which would result in B being triggered. If/when build B completes then build C would be triggered.
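To make the idea concrete, here is a purely hypothetical sketch of what such a trigger might look like on BuildConfig B - the BuildCompleted trigger type and its fields are invented for illustration and don't exist in the API:

```yaml
# Hypothetical sketch only: "BuildCompleted" is not an existing trigger type;
# the field names below illustrate the proposal, not a real OpenShift API.
kind: BuildConfig
apiVersion: v1
metadata:
  name: build-b
spec:
  triggers:
    - type: BuildCompleted
      buildCompleted:
        buildConfig: build-a   # fire a build of B whenever a build of A succeeds
```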

Since a BuildConfig can have many triggers, a build E could be triggered by C and D. If we need a join step in a pipeline (only do something if C and D are built successfully) then that could be a separate stateful build that figures out when it's complete (and so fires the next step). e.g. we could use something like Gerrit +1 voting for now as an implementation detail - but this would be something internal to the build docker image to figure out. Another approach would be a stateful docker container which keeps track of which builds have been completed for a given version/image, so that it fails the build until all the required things are done.

In many ways though the joining part is an implementation detail inside a build docker image. All OpenShift would need to do would be fire builds which have the 'build completed trigger' when other builds complete successfully (to avoid having to hard code firing of web hooks inside the build docker images).

Thoughts?

@soltysh
Contributor

soltysh commented Mar 5, 2015

Actually, this is partially possible with STI builds, since plain STI has an option to specify a CallbackURL (see here). The only thing needed for this to work would be to give the user an option to specify that callbackURL and to update the payload to match the one accepted by the generic webhook (at least partially).
@bparees thoughts?

@jstrachan
Contributor Author

@soltysh Ah great find! I guess I'm asking for the same CallbackURL support for other kinds of builds. For STI builds that generate new docker images, we can use the docker image trigger anyway. (But the CallbackURL is very handy for STI too!)

It's more for other kinds of builds that don't generate docker images (e.g. soak tests / regression tests) that we really need a CallbackURL on success

@bparees
Contributor

bparees commented Mar 5, 2015

@jstrachan STI builds and Docker builds both produce images in which case you should use the ImageChangeTrigger to kick off the next build in line (as you mention).

So that leaves CustomBuild which would be what i'd expect you to use for test type patterns...in that case your Custom builder image itself could invoke the generic webhook. You could specify the webhook via an ENV var defined on the BuildConfig (all env vars on the BuildConfig get given to the custom builder container)

So in theory at least you can implement this today, with admittedly a poor user experience.
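As a rough sketch of that workaround (the image name, URL and secret below are placeholders; the custom builder image itself would have to POST to the URL when it succeeds):

```yaml
# Workaround sketch: the custom builder container reads DOWNSTREAM_WEBHOOK_URL
# from its environment and invokes it itself on success. All names/URLs are
# placeholders, not a recommended or real configuration.
kind: BuildConfig
apiVersion: v1
metadata:
  name: soak-test
spec:
  strategy:
    type: Custom
    customStrategy:
      from:
        kind: DockerImage
        name: example/soak-test-builder:latest
      env:
        - name: DOWNSTREAM_WEBHOOK_URL
          value: https://<openshift-master>/.../buildconfigs/approval/webhooks/<secret>/generic
```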

I agree a "BuildCompletedTrigger" would be a useful feature to have, similar to what Jenkins offers today (trigger a downstream build). I guess the question becomes: how critical is it, given the existing way this can be implemented?

@smarterclayton
Contributor

We originally had this in geard - I agree that "non output" style builds could be useful.

Do we need to make builds serial and atomic before we do this? Otherwise your triggers may be out of order or conflict.

@jstrachan
Contributor Author

@bparees yeah, I'm talking about CustomBuild (sorry I should have been more clear). I agree it's technically possible to hack it right now by adding custom build triggering into each CustomBuild docker image; but it's quite an ugly hack really. Imagine Jenkins folks saying "yeah we could do that - or you could just go off and hack this into every known jenkins plugin instead....". These builds are gonna be varied, using all kinds of technologies; we don't wanna have to hack them all.

It also means that to add a trigger on build B when build A completes, I have to hack the config of build A (which may be in a different environment where I may not have karma), and it means we can only use CustomBuild images which have been hacked to perform callbacks with custom env vars.

Without this capability it's feeling cleaner to just ignore OpenShift CustomBuilds and use Jenkins for all the custom builds, since it has flexible triggering built in - so my big preference would be to just add the callback mechanism into CustomBuilds (it's a pretty simple code change really); then OpenShift has the flexibility to be a general purpose CD tool (rather than relying on Jenkins being the CD pipeline tool and OpenShift just doing the docker image trigger deployment part).

@smarterclayton ah cool thanks for the heads up.

If builds are parallel then notifications/callbacks would be out of order which is totally fine with me. Otherwise if order is important we could force the triggers to be in order; A triggers B, B triggers C etc. If B and C are triggered by A then order is unknown.

@jimmidyson
Contributor

Adding a new BuildCompleted build trigger that references another build config, to say a build should be triggered after that config builds successfully, should quite easily allow for a linear build pipeline, hooking in wherever is best (perhaps setting up a build watcher) to trigger the next build.

To get a more complex pipeline (e.g. fork/join, rescue, retries, etc.) we would probably need a BuildPipeline type that reuses the build triggers (webhooks, etc.) to kick off. This BuildPipeline could just be a slice of slices of BuildPipelines/Builds. Each build could be annotated with its build pipeline ID as it's run, so it can always be mapped back to a pipeline, & when it's complete the next build in the pipeline could be triggered. We could use the BuildCompleted trigger added above to handle the next step in the pipeline. This would of course require keeping state of the BuildPipeline somewhere, but it could just be a reference to the Builds that are created as part of this pipeline as they're triggered.
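For illustration only, that "slice of slices" shape might look something like this (an entirely hypothetical resource; nothing like it exists today):

```yaml
# Entirely hypothetical BuildPipeline shape: each inner list runs in parallel
# (fork), the outer list runs sequentially, so each outer step acts as a join.
kind: BuildPipeline
metadata:
  name: release
spec:
  stages:
    - [build-jar-a, build-jar-b]   # fork: both JAR builds run in parallel
    - [build-war]                  # join: runs only after both JAR builds succeed
    - [deploy-staging]
```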

@soltysh
Contributor

soltysh commented Mar 6, 2015

I think there are two use cases defined here:

  1. notify external system when build completes
  2. notify OpenShift when build completes

Although I'm aware of the overlap between the two - since OpenShift could be configured, using the generic webhook, to listen for the newly added build complete event - that feels rather hacky, since current webhooks expect information about source code being changed; that was their goal. Instead I'd rather see us going with no. 2, towards @jimmidyson's idea of pipelining builds, so it's easier for users to create the complex CD workflows @jstrachan mentioned; this should obviously include deployments as well IMHO. Admittedly that makes for quite a complicated scenario, but it's something I'd love to see possible in OpenShift.

@jstrachan
Contributor Author

@soltysh agreed with all that.

On the deployment part; I figure build pipelines would ultimately either create new image versions or add tags to images created at the start of the pipeline; so that the existing DeploymentConfig image trigger would be enough to trigger deployments in some namespace from some build pipeline.

e.g. a build pipeline could start by creating a new image, then perform test builds and approval steps which could then create/move the production tag on the image, which would then fire the production DeploymentConfig in the production namespace.
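For that last hop the existing DeploymentConfig image change trigger is already enough; a minimal sketch (names and image are examples only):

```yaml
# Minimal sketch: redeploy in the production namespace whenever the
# "production" tag of the image stream is moved. Names are examples only.
kind: DeploymentConfig
apiVersion: v1
metadata:
  name: myapp
  namespace: production
spec:
  triggers:
    - type: ImageChange
      imageChangeParams:
        automatic: true
        containerNames:
          - myapp
        from:
          kind: ImageStreamTag
          name: myapp:production
```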

@soltysh
Contributor

soltysh commented Mar 6, 2015

There's a WIP by @ironcladlou on the topic of deployment hooks, see #1224

@jimmidyson
Contributor

@soltysh I didn't really consider deployments. You're right that deployments need to be included, as there could be subsequent steps that require the deployment; take the example of a load or soak test: deploy, trigger a build to run tests against the deployment, pass, success, next build (which could be a production deployment).

For now we could build this iteratively and just provide a simple pipeline that can only include builds, adding in deployments later? I see value in this - say, in my Java world, building dependency JARs in parallel, pushing them to a maven repo, & the pipeline then joining to a WAR build that uses those JARs, outputting a docker image from the build to trigger deployment. This is a pretty common scenario in my experience & would be great to demo.

We could then add in deployment steps later?

@bparees
Contributor

bparees commented Mar 6, 2015

I spoke with @smarterclayton about this and the approach we agreed on is to implement a "webhook on build complete" feature where you will be able to specify webhooks to invoke when the build successfully completes (adding options to invoke it on failure also would be possible but i'd rather see if we can get away w/ just success for now).

for build chaining that would mean you'd add a webhook on complete that invoked a downstream build's webhook trigger. @soltysh i know you don't like this because it feels abusive of webhooks, but really only the github webhook should be assuming information about source changes is being provided. the generic hook is intended for this sort of use case, imho.

Doing it this way enables a wider swath of use cases than just the "trigger a build on build complete" feature would, without the expense of implementing both features (and confusing the user with additional options).

i've created a trello card to track this feature, but @jstrachan we'd happily accept a PR as well :)
https://trello.com/c/Sk3aSNxG/516-invoke-webhook-on-build-completion
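For reference, the receiving side of that chaining already exists: the downstream BuildConfig just needs a generic webhook trigger for the completion hook to call (a minimal sketch; the secret is a placeholder):

```yaml
# Inbound side of the chain: a generic webhook trigger whose secret forms part
# of the webhook URL that the upstream completion hook would invoke.
kind: BuildConfig
apiVersion: v1
metadata:
  name: downstream-build
spec:
  triggers:
    - type: Generic
      generic:
        secret: <placeholder-secret>
```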

@soltysh
Contributor

soltysh commented Mar 6, 2015

I'm OK with using the generic webhook as long as we make it really generic, either by expanding all the information it can hold, or rather by creating generic information about the source of the invocation; currently there's git info in it.

I'd vote for having just some generic info in there.

@jimmidyson
Contributor

@bparees So using the "webhook on build complete" would mean that the workflow for build pipelines would have to be managed externally to OpenShift? You could do linear pipelines A->B->C in OpenShift using the webhooks there, but anything more complex, e.g. fan out/fan in pipelines, would require an external engine to receive the webhook & process accordingly. If that is the case then we definitely need "webhook on failure" too, as the external system needs to know the build has finished, successfully or not.

Still I personally prefer adding pipelines into OpenShift itself.

@bparees
Contributor

bparees commented Mar 6, 2015

@soltysh the git info is optional. what other input information do you feel should be supplied on an input trigger?

@jimmidyson I'm definitely not looking to tackle fan in with the card i created. we need to be careful how far we go here as we're not trying to turn openshift into the ultimate build management tool. I'm ok with adding "triggerOnFailure:true/false" as part of the output trigger config instead :)

@soltysh
Contributor

soltysh commented Mar 6, 2015

Some info regarding the source of the invocation; if that was a build, then some info about that build.

@jimmidyson
Contributor

@bparees - boooo! As long as we can trigger on completion then we can build whatever we need on top of that I guess.

@soltysh
Contributor

soltysh commented Mar 6, 2015

@bparees but this might lead us to adding new fields to the generic webhook every now and then, thus I'd prefer to have some generic info there, rather than specific fields each for its own purpose.

@smarterclayton
Contributor

To clarify, outbound web hooks are different from inbound webhooks. Triggers are inbound, there should be something else (a different list) that defines outbound. Maybe notifications.


@smarterclayton
Contributor

Whatever we do here should be logically consistent with deployment configs and #1224. A deployment and a build have similar characteristics (a process that runs to completion, is triggered, and which has an outcome). While a deployment hook is not the same thing as a completion notification, the "triggers", "hooks", "notifications" triangle should be preserved. Builds may have externally defined hooks (things that run in a specific part of the build, like a post build content scan) and so the pattern should be consistent. The use case listed in #1224 for notifying a third party via a webhook is the same as this use case.

@bparees
Contributor

bparees commented Mar 8, 2015

@smarterclayton yes outbound vs inbound will be distinct sections, i think the subtlety is in the fact that the initial use case for outbound webhooks will be to call our inbound webhooks, so obviously we need to ensure that connection is logical.

i'm currently envisioning that the outbound webhook configuration will look something like:

  • url: (url to call, GET parameters can be specified as part of the url)
  • payload: (payload to deliver; if specified we POST instead of GET)
  • invokeOnFail: (default false)
  • invokeOnSuccess: (default true)

if we want to increase the sophistication and create a full buildHook mechanism compatible with the pending deploymentHook model, for which these webhook invocations are just one possible action the buildhook would take, we can I guess.... it will definitely delay the implementation though.
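Rendered as config, the url/payload/invokeOnFail/invokeOnSuccess fields above might look roughly like this (the "completionHooks" name and the exact shape are speculative; nothing here was ever implemented):

```yaml
# Speculative rendering of the proposed outbound webhook fields; the
# "completionHooks" name and structure are invented for illustration.
completionHooks:
  - url: https://ci.example.com/job/next-step/build   # GET unless payload is set
    payload: '{"triggeredBy": "soak-test"}'            # if present, POST instead of GET
    invokeOnFail: false
    invokeOnSuccess: true
```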

@smarterclayton
Contributor

Before we go any further with this, I'd like everyone involved to read through the api and feature design of:

And also take a look at what Jenkins build pipeline does:

Both offer deep pipelines of flow. The questions we have to answer every time we add something to builds is:

  • does this help or hurt a future integration with a platform like this?
  • does our incremental addition enable the same capabilities in the long term that these solutions offer users?

I don't think buildHook is on the table right now - but both deployment notifications and build notifications are relevant. We need to clearly define what a build notification is not so we don't back ourselves into a corner there.

What you've listed seems ok, but I also recommend we document some of the existing patterns out there (implementations in common use like github, Jenkins, DockerHub, other CI sites) so we can identify what we do and don't want to implement. Some questions:

  • is fail or success the only criteria? Do I want to have both?
  • will I ever need to PUT?
  • what limits will we place on how long we wait if the server doesn't respond?
  • should we retry a webhook?
  • what parts of the github webhook experience will we also be expected to offer (logging, retry, test)?


@jstrachan
Contributor Author

Agreed. I'm thinking having a way to chain sequential builds together seems too limiting right now; I'd prefer inbound triggers so we can support fan out. Ideally we'd need fan in (joining) and to configure build retries and so forth too.

Maybe it's better to try to develop a separate prototype "build pipeline engine" above OpenShift which takes a pipeline model of how builds are chained together (with the ability to deal with failures, kick off retries and cancel an entire pipeline instance if a new build comes along at a different stage).

Something along the lines of the jenkins workflow plugin (which is like the build pipeline plugin but each stage is persistent so it can recover from restarts of the jenkins master and pipelines carry on running) but where each build step is an OpenShift build & reuses the OpenShift REST API.

Given OpenShift has a REST API to watch builds and query build status, a separate pipeline engine could be developed and used with OpenShift (and ideally eventually integrated into the OpenShift REST API). But it could start out as a separate prototype service for now, until the idea is proven and we see how well it compares to something like the Jenkins workflow plugin?

@bparees
Contributor

bparees commented Mar 9, 2015

@jstrachan Was there a particular use case you had in mind that led to this, or were you just identifying a general gap? I'm asking mainly so we can put the right priority around this and also figure out if we need to do something tactical in the short term to address it.

@jstrachan
Contributor Author

There's a bit of background here: fabric8io/fabric8#3541.

Generally fabric8 users want an integrated Continuous Deployment pipeline tool. Once an image is created (e.g. by the initial CI flow or STI in OpenShift) they want the image to move through multiple environments (soak testing -> regression testing -> UAT -> staging -> production) where workflow pipelines are used to manage builds and human approval steps along the way. Each environment would map to a separate Kubernetes namespace.

For now we're doing the pipeline orchestration in Jenkins with its workflow plugin; but we'd prefer a native OpenShift solution really.

@soltysh
Contributor

soltysh commented Mar 11, 2015

After spending some time reading through the discussion once again, I came up with this idea: I'd propose creating an interface which is capable of returning info about the result in a generic way, so that interested parties are able to consume it. This way we'd create the basics for building quite complicated pipelines with different resource types. Currently I'm thinking about Builds and Deployments, but that can then easily be extended in the future just by implementing this interface for other resources. Obviously the consuming part will still have to be written as part of the resource controller to support full interaction, but I guess that's acceptable, and it aligns nicely with the traditional CD flow, giving us great flexibility.

@bparees
Contributor

bparees commented Mar 11, 2015

@soltysh isn't that interface the Build REST resource? The Build object contains the Status field that indicates the outcome.

@soltysh
Contributor

soltysh commented Mar 11, 2015

@bparees I was thinking about something more than just the Status field; besides, my idea was to generalize this output so we can put other resources underneath it, e.g. Deployments.

@smarterclayton
Contributor

Why would a build rest result contain arbitrary resources? In general if the types are specific we should use something strongly typed vs generic. Would like to see a proposal before we go too far here


@soltysh
Contributor

soltysh commented Mar 19, 2015

In the end we've decided that the Build resource will provide a webhook-like mechanism, where OpenShift will notify a configured endpoint by sending a POST request with the result of the Build. In the future Deployments will also be supported. So for now we're not planning any CD/CI flow inside of OpenShift, in favor of easily hooking into existing tools.

@jstrachan
Contributor Author

Incidentally, when I raised this issue I was hoping for a '100% pure OpenShift' solution. Since raising it we ended up just using Jenkins for the pipeline of CI/CD jobs, which turns out to be pretty simple and easy; plus it comes with lots of tooling which we've integrated into the hawtio-kubernetes UI plugin (which hopefully we can integrate into the OpenShift console soon).

Having a webhook when an OpenShift build completes will be handy; we can then use it to trigger a Jenkins build for the integration tests -> approval -> provision in a different environment pipelines etc.

@bparees
Contributor

bparees commented Apr 8, 2016

was just coming in here to make the same comment @jstrachan made 8 months ago, i think this should be closed in deference to the jenkins pipeline work we're doing. We already tried to implement this feature once (trigger builds when a build completes) and ended up deciding against it.

@smarterclayton @jstrachan any objection to closing this?

@jstrachan
Contributor Author

@bparees feel free to close it - I'm totally happy with the Jenkins Pipeline option

@bparees bparees closed this as completed Apr 11, 2016