[serve] Log rejected requests at router side #51346
Signed-off-by: Cindy Zhang <[email protected]>
python/ray/serve/_private/replica.py
@@ -624,7 +624,7 @@ async def handle_request_with_rejection(
 limit = self._deployment_config.max_ongoing_requests
 num_ongoing_requests = self.get_num_ongoing_requests()
 if num_ongoing_requests >= limit:
- logger.warning(
+ logger.info(
is this supposed to be logger.debug?
oops yes thanks
## Why are these changes needed?

Router side logs (made less alarming, made clear that request will be retried):

```
INFO 2025-03-13 13:42:35,298 serve 40047 -- Replica(id='7mqhdb0d', deployment='Model', app='default') rejected request because it is at max capacity of 1 ongoing requests. Retrying request 4a843e03-e1c7-47a2-be9d-6c0224108f42.
INFO 2025-03-13 13:42:35,298 serve 40047 -- Replica(id='7mqhdb0d', deployment='Model', app='default') rejected request because it is at max capacity of 1 ongoing requests. Retrying request 57d94c8a-13b4-4ea2-a628-75d566ef29e5.
INFO 2025-03-13 13:42:35,301 serve 40047 -- Replica(id='7mqhdb0d', deployment='Model', app='default') rejected request because it is at max capacity of 1 ongoing requests. Retrying request 4a843e03-e1c7-47a2-be9d-6c0224108f42.
```

Replica side logs about rejected requests are now DEBUG logs only.

This is to make the logs appear less alarming for users who are not familiar with the request lifecycle. The way the logs are now, the user can get confused reading the replica-side logs and think requests got dropped.

https://anyscale1.atlassian.net/browse/SERVE-659

---------

Signed-off-by: Cindy Zhang <[email protected]>
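For readers unfamiliar with the request lifecycle, here is a rough sketch of the logging split described above: the replica notes a rejection at DEBUG, and the router logs it at INFO together with the fact that the request is being retried. This is not the Ray Serve implementation. `FakeReplica`, `route`, the `serve_sketch` logger name, and the round-robin retry order are all hypothetical and chosen only for illustration.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical logger name; the real Serve logger is configured elsewhere.
logger = logging.getLogger("serve_sketch")


@dataclass
class FakeReplica:
    """Hypothetical stand-in for a replica; not Ray Serve's real class."""

    replica_id: str
    max_ongoing_requests: int
    num_ongoing_requests: int = 0

    def try_accept(self, request_id: str) -> bool:
        """Accept the request unless the replica is already at its limit."""
        if self.num_ongoing_requests >= self.max_ongoing_requests:
            # Replica side: DEBUG only, so default log output stays quiet.
            logger.debug(
                "Replica %s is at capacity, rejecting request %s.",
                self.replica_id,
                request_id,
            )
            return False
        self.num_ongoing_requests += 1
        return True


def route(
    request_id: str, replicas: List[FakeReplica], max_attempts: int = 10
) -> Optional[FakeReplica]:
    """Try replicas in round-robin order (an assumption) until one accepts."""
    for attempt in range(max_attempts):
        replica = replicas[attempt % len(replicas)]
        if replica.try_accept(request_id):
            return replica
        # Router side: INFO, and the message says the request is retried.
        logger.info(
            "Replica(id=%r) rejected request because it is at max capacity "
            "of %d ongoing requests. Retrying request %s.",
            replica.replica_id,
            replica.max_ongoing_requests,
            request_id,
        )
    return None
```

With the logger at its usual INFO level, only the router-side message appears, which is the behavior the description aims for: a reader sees that the request was retried rather than dropped.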
…)" (ray-project#51698)

This reverts commit 65514ea.

## Why are these changes needed?

Logging the rejected requests is causing lower serve throughput. The regression was originally flagged from the microbenchmark test that runs in the nightly release tests.

Signed-off-by: akyang-anyscale <[email protected]>