merge db_main to release #13

Merged
merged 42 commits into from
Mar 17, 2025
@jnyi jnyi commented Mar 15, 2025

A number of changes have been in dev for a long time, and this is the new one: #12

johannaratliff and others added 30 commits April 4, 2024 12:54
* Update dependencies

* Correct PR number
…fana#146)

This addresses a bug in rollout-operator where:

1. Kubernetes receives a request to downscale a statefulset by `X` hosts.
2. The prepare-downscale admission webhook attempts to prepare `X` pods for shutdown by sending an HTTP `POST` to their handler identified by the `grafana.com/prepare-downscale-http-path` and `-port` annotations.
3. At least one of these requests fails. The admission webhook returns an error to Kubernetes, so the downscale is not approved.
4. 💥 But some hosts may have been prepared for downscale. 💥 

This PR adds cleanup logic to issue `DELETE` requests on all involved pods if any of the `POST`s failed. Notes:
* `DELETE` calls are attempted once.
* `DELETE` failures are logged but otherwise ignored.
* For simplicity, we'll invoke `DELETE` on all of the pods involved in the scaledown operation, not just ones that received a POST.

This doesn't fix the similar issue where the replica count changing from 10->9->10 leaves that one pod prepared for shutdown. (But that's in the works.)
Add a changelog entry for grafana#146, and prepare changelog for v0.16.0.


Co-authored-by: Patryk Prus
* Swap base image from alpine to distroless

* Remove user setup

* Use nonroot image

* Add different base image for boringcrypto

* Add changelog entry
For better debuggability when there are concurrent webhook calls.
* Include UserInfo.Username in 'handling request' log.

* Changelog.
* Add support for specifying percentage in rollout-max-unavailable annotation.

* CHANGELOG.md
Fix unbalanced key/value pairs in a log call, which led to a log message like this:
`level=error ts=2024-06-13T03:30:49.769575693Z pod=ingester-zone-a-16 url=http://ingester-zone-a-16.ingester-zone-a.mimir-dev.svc.cluster.local./ingester/prepare-partition-downscale errorsendingHTTPPOSTrequesttoendpoint=err`
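
To show how an unbalanced key/value list can produce a line shaped like the one above, here is a toy logfmt-style formatter. It is not the logging library used by rollout-operator: `formatPairs` is a hypothetical helper, and dropping spaces from keys is an assumption chosen to reproduce the shape of the garbled message.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatPairs is a toy formatter: it reads arguments as key, value, key, value...
// ASSUMPTION: spaces are dropped from keys; a trailing key gets "(MISSING)".
func formatPairs(keyvals ...any) string {
	var parts []string
	for i := 0; i < len(keyvals); i += 2 {
		key := strings.ReplaceAll(fmt.Sprint(keyvals[i]), " ", "")
		val := "(MISSING)"
		if i+1 < len(keyvals) {
			val = fmt.Sprint(keyvals[i+1])
			if strings.Contains(val, " ") {
				val = strconv.Quote(val)
			}
		}
		parts = append(parts, key+"="+val)
	}
	return strings.Join(parts, " ")
}

func main() {
	// The message is passed without a "msg" key, so it is consumed as a key
	// and the literal "err" becomes its value.
	fmt.Println(formatPairs("pod", "ingester-zone-a-16",
		"error sending HTTP POST request to endpoint", "err"))
	// prints: pod=ingester-zone-a-16 errorsendingHTTPPOSTrequesttoendpoint=err

	// Balanced pairs: the message has its own key, and so does the error.
	fmt.Println(formatPairs("pod", "ingester-zone-a-16",
		"msg", "error sending HTTP POST request to endpoint", "err", "timeout"))
}
```
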
* When checking the downscale delay in the statefulset, allow the downscale if some pods at the end of the statefulset are ready to be downscaled.

* CHANGELOG.md
…s to store (grafana#151)

Fix a snag found in grafana#146 where, if the "downscaled" annotation/configmap fails to persist, the scale operation is denied but the pods are not informed via DELETE that they should no longer shut down.
)

* Only scale up zone after leader zone replicas are ready

* Update CHANGELOG

* Change to only scaling once all replicas are ready

* Rename config annotation

* Add log line

* remove redundant test

* Update changelog
* Update dependencies

* Update CHANGELOG

* Fix build errors

* Upgrade docker and grpc for remaining CVEs
* Update Go to 1.23

* Add some nolint
* Added `grafana.com/rollout-mirror-replicas-from-resource-update-status-replicas` annotation to optionally disable patching of the reference resource when scaling is based on a reference resource.

* Review findings.

* CHANGELOG entry.
…-status-replicas` annotation (grafana#171)

* Renamed `grafana.com/rollout-mirror-replicas-from-resource-write-back-status-replicas` annotation to `grafana.com/rollout-mirror-replicas-from-resource-write-back`

* Fix changelog.
Merge remote-tracking branch 'upstream/main' into merge-upstream-v0.20.0
* fix: add support for delayed downscale port in the URL

* update changelog

* Update CHANGELOG.md

Co-authored-by: Marco Pracucci
@jnyi jnyi requested review from hczhu-db and yuchen-db March 15, 2025 02:06

@yuchen-db yuchen-db left a comment

lgtm

@jnyi jnyi merged commit 2b06b52 into release Mar 17, 2025
5 checks passed

9 participants