|
| 1 | +--- |
| 2 | +status: proposed |
| 3 | +title: Retention Policy for Tekton Results |
| 4 | +creation-date: '2024-07-17' |
| 5 | +last-updated: '2024-07-17' |
| 6 | +authors: |
| 7 | +- '@khrm' |
| 8 | +collaborators: [] |
| 9 | +--- |
| 10 | + |
| 11 | +# TEP-0157: Tekton Results: Retention Policy for older Results and Records |
| 12 | + |
| 13 | +<!-- toc --> |
| 14 | +- [Summary](#summary) |
| 15 | +- [Motivation](#motivation) |
| 16 | + - [Goals](#goals) |
| 17 | + - [Non-Goals](#non-goals) |
| 18 | + - [Use Cases](#use-cases) |
| 19 | + - [Requirements](#requirements) |
| 20 | +- [Proposal](#proposal) |
| 21 | + - [Notes and Caveats](#notes-and-caveats) |
| 22 | +- [Design Details](#design-details) |
| 23 | +- [Design Evaluation](#design-evaluation) |
| 24 | + - [Reusability](#reusability) |
| 25 | + - [Simplicity](#simplicity) |
| 26 | + - [Flexibility](#flexibility) |
| 27 | + - [User Experience](#user-experience) |
| 28 | + - [Performance](#performance) |
| 29 | + - [Risks and Mitigations](#risks-and-mitigations) |
| 30 | + - [Drawbacks](#drawbacks) |
| 31 | +- [Alternatives](#alternatives) |
| 32 | +- [Implementation Plan](#implementation-plan) |
| 33 | + - [Test Plan](#test-plan) |
| 34 | + - [Infrastructure Needed](#infrastructure-needed) |
| 35 | + - [Upgrade and Migration Strategy](#upgrade-and-migration-strategy) |
| 36 | + - [Implementation Pull Requests](#implementation-pull-requests) |
| 37 | +- [References](#references) |
| 38 | +<!-- /toc --> |
| 39 | + |
| 40 | +## Summary |
| 41 | +Tekton Results stores PipelineRuns, TaskRuns, Events and Logs indefinitely. |
| 42 | +This TEP proposes adding a retention policy feature for removing older Results and their associated Records, |
| 43 | +along with a request to delete the associated logs. |
| 44 | + |
| 45 | +## Motivation |
| 46 | +Storing older Results and Records indefinitely wastes storage resources |
| 47 | +and degrades DB performance. At the same time, some pipelines may need to be kept in the |
| 48 | +archive rather than deleted. |
| 49 | + |
| 50 | +### Goals |
| 51 | +- Ability to define a retention period for Results at the cluster level. All Records and Results past that period should be deleted. |
| 52 | +- Ability to filter PipelineRuns when setting the retention policy. |
| 53 | +- A way to also delete the associated logs from S3 buckets, GCS buckets or PVCs. |
| 54 | + |
| 55 | +### Non-Goals |
| 56 | + |
| 57 | +<!-- |
| 58 | +Listing non-goals helps to focus discussion and make progress. |
| 59 | +- What is out of scope for this TEP? |
| 60 | +--> |
| 61 | + |
| 62 | +### Use Cases |
| 63 | +- Users can specify a global policy for all Results. All Records and logs belonging to Results that satisfy the pruning condition will be deleted. |
| 64 | +- Users can filter Results based on a CEL expression and a Result Summary expression. All associated Records will be deleted. |
| 65 | + |
| 66 | +### Requirements |
| 67 | +- For all Results satisfying the delete conditions, the following need to be deleted: |
| 68 | +  * Results |
| 69 | +  * Records for PipelineRun and TaskRun |
| 70 | +  * Records for EventLog |
| 71 | +  * Associated logs in the S3 bucket, GCS bucket or PVC. EventLog Records should also be deleted. |
| 72 | + |
| 73 | +## Proposal |
| 74 | +A pruner will run that spins up a job at the specified interval, based on the TTL and CEL expressions given in the ConfigMap `config-results-retention-policy`. |
| 75 | + |
| 76 | +### Notes and Caveats |
| 77 | + |
| 78 | + |
| 79 | +## Design Details |
| 80 | +The ConfigMap will have the following structure: |
| 81 | +```yaml |
| 82 | +runAt: 5 4 * * 7 # When to run the job, so that the DB can be put into maintenance mode if needed. |
| 83 | +max-retention: 2880h # Maximum retention duration of 120 days. |
| 84 | +filters: |
| 85 | +- expr: summary.status == FAILURE # When the PipelineRun has failed |
| 86 | +  ttl: 700h |
| 87 | +- expr: parent == dev # When the parent/namespace is dev |
| 88 | +  ttl: 300h |
| 89 | +``` |
| 90 | + |
| 91 | + |
| 92 | +## Design Evaluation |
| 93 | +<!-- |
| 94 | +How does this proposal affect the api conventions, reusability, simplicity, flexibility |
| 95 | +and conformance of Tekton, as described in [design principles](https://github.com/tektoncd/community/blob/master/design-principles.md) |
| 96 | +--> |
| 97 | + |
| 98 | +### Reusability |
| 99 | + |
| 100 | +<!-- |
| 101 | +https://github.com/tektoncd/community/blob/main/design-principles.md#reusability |
| 102 | +
|
| 103 | +- Are there existing features related to the proposed features? Were the existing features reused? |
| 104 | +- Is the problem being solved an authoring-time or runtime-concern? Is the proposed feature at the appropriate level |
| 105 | +authoring or runtime? |
| 106 | +--> |
| 107 | + |
| 108 | +### Simplicity |
| 109 | + |
| 110 | +<!-- |
| 111 | +https://github.com/tektoncd/community/blob/main/design-principles.md#simplicity |
| 112 | +
|
| 113 | +- How does this proposal affect the user experience? |
| 114 | +- What’s the current user experience without the feature and how challenging is it? |
| 115 | +- What will be the user experience with the feature? How would it have changed? |
| 116 | +- Does this proposal contain the bare minimum change needed to solve for the use cases? |
| 117 | +- Are there any implicit behaviors in the proposal? Would users expect these implicit behaviors or would they be |
| 118 | +surprising? Are there security implications for these implicit behaviors? |
| 119 | +--> |
| 120 | + |
| 121 | +### Flexibility |
| 122 | + |
| 123 | +<!-- |
| 124 | +https://github.com/tektoncd/community/blob/main/design-principles.md#flexibility |
| 125 | +
|
| 126 | +- Are there dependencies that need to be pulled in for this proposal to work? What support or maintenance would be |
| 127 | +required for these dependencies? |
| 128 | +- Are we coupling two or more Tekton projects in this proposal (e.g. coupling Pipelines to Chains)? |
| 129 | +- Are we coupling Tekton and other projects (e.g. Knative, Sigstore) in this proposal? |
| 130 | +- What is the impact of the coupling to operators e.g. maintenance & end-to-end testing? |
| 131 | +- Are there opinionated choices being made in this proposal? If so, are they necessary and can users extend it with |
| 132 | +their own choices? |
| 133 | +--> |
| 134 | + |
| 135 | +### Conformance |
| 136 | + |
| 137 | +<!-- |
| 138 | +https://github.com/tektoncd/community/blob/main/design-principles.md#conformance |
| 139 | +
|
| 140 | +- Does this proposal require the user to understand how the Tekton API is implemented? |
| 141 | +- Does this proposal introduce additional Kubernetes concepts into the API? If so, is this necessary? |
| 142 | +- If the API is changing as a result of this proposal, what updates are needed to the |
| 143 | +[API spec](https://github.com/tektoncd/pipeline/blob/main/docs/api-spec.md)? |
| 144 | +--> |
| 145 | + |
| 146 | +### User Experience |
| 147 | + |
| 148 | +<!-- |
| 149 | +(optional) |
| 150 | +
|
| 151 | +Consideration about the user experience. Depending on the area of change, |
| 152 | +users may be Task and Pipeline editors, they may trigger TaskRuns and |
| 153 | +PipelineRuns or they may be responsible for monitoring the execution of runs, |
| 154 | +via CLI, dashboard or a monitoring system. |
| 155 | +
|
| 156 | +Consider including folks that also work on CLI and dashboard. |
| 157 | +--> |
| 158 | + |
| 159 | +### Performance |
| 160 | +This improves DB performance by deleting superfluous Results and their associated data. |
| 161 | + |
| 162 | +### Risks and Mitigations |
| 163 | + |
| 164 | +<!-- |
| 165 | +What are the risks of this proposal and how do we mitigate? Think broadly. |
| 166 | +For example, consider both security and how this will impact the larger |
| 167 | +Tekton ecosystem. Consider including folks that also work outside the WGs |
| 168 | +or subproject. |
| 169 | +- How will security be reviewed and by whom? |
| 170 | +- How will UX be reviewed and by whom? |
| 171 | +--> |
| 172 | + |
| 173 | +### Drawbacks |
| 174 | + |
| 175 | +<!-- |
| 176 | +Why should this TEP _not_ be implemented? |
| 177 | +--> |
| 178 | + |
| 179 | +## Alternatives |
| 180 | + |
| 181 | + |
| 182 | +## Implementation Plan |
| 183 | + |
| 184 | +<!-- |
| 185 | +What are the implementation phases or milestones? Taking an incremental approach |
| 186 | +makes it easier to review and merge the implementation pull request. |
| 187 | +--> |
| 188 | + |
| 189 | + |
| 190 | +### Test Plan |
| 191 | + |
| 192 | +- We will add integration tests, like the ones we have for logging to GCS storage and other scenarios. |
| 193 | + |
| 194 | +### Infrastructure Needed |
| 195 | + |
| 196 | +<!-- |
| 197 | +(optional) |
| 198 | +
|
| 199 | +Use this section if you need things from the project or working group. |
| 200 | +Examples include a new subproject, repos requested, GitHub details. |
| 201 | +Listing these here allows a working group to get the process for these |
| 202 | +resources started right away. |
| 203 | +--> |
| 204 | + |
| 205 | +### Upgrade and Migration Strategy |
| 206 | + |
| 207 | +<!-- |
| 208 | +(optional) |
| 209 | +
|
| 210 | +Use this section to detail whether this feature needs an upgrade or |
| 211 | +migration strategy. This is especially useful when we modify a |
| 212 | +behavior or add a feature that may replace and deprecate a current one. |
| 213 | +--> |
| 214 | + |
| 215 | +### Implementation Pull Requests |
| 216 | + |
| 217 | + |
| 218 | +## References |
| 219 | + |
| 220 | +<!-- |
| 221 | +(optional) |
| 222 | +
|
| 223 | +Use this section to add links to GitHub issues, other TEPs, design docs in Tekton |
| 224 | +shared drive, examples, etc. This is useful to refer back to any other related links |
| 225 | +to get more details. |
| 226 | +--> |