fix: vllm missing logprobs #5279
Conversation
referencing mudler#3436, mudler#2930 - if I could test it, this might show that the output from the vLLM backend is processed and returned to the user

Signed-off-by: Wyatt Neal <[email protected]>
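For context on that commit, here is a pure-Python sketch of how per-token logprobs can be pulled out of a vLLM-style `CompletionOutput` before being returned to the user. The `Logprob` class below is a mocked stand-in (vLLM represents `output.logprobs` as one `{token_id: Logprob}` dict per generated token); nothing here depends on vLLM being installed:

```python
from dataclasses import dataclass

# Mocked stand-in for vLLM's Logprob entries; the real objects carry the
# same `logprob` and `decoded_token` attributes.
@dataclass
class Logprob:
    logprob: float
    decoded_token: str

def extract_logprobs(token_ids, logprobs):
    """Flatten per-token candidate dicts into (token, logprob) pairs
    for the chosen token at each position."""
    pairs = []
    for token_id, candidates in zip(token_ids, logprobs):
        chosen = candidates[token_id]
        pairs.append((chosen.decoded_token, chosen.logprob))
    return pairs
```

The backend's response-building path would run something like this over each generated sequence before serializing the result.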
yes, we should actually use the venv's grpcio to build the backends so we can better control the generated code, but that's not yet in place. Help is welcome; I'd be happy to help here in review
Thanks for the cleanup and adding tests!
right - the vLLM tests might not work, as the CI lacks the HW capabilities for them. Maybe it's just better to disable them for now
Is there any existing facility I'm not seeing that would force it to a CPU-based / lower-requirement model, to enable testing the same way diffusers works?
not really. diffusers just works in that case by offloading to CPU without much trouble, and the test is quite simple (it generates a 16x16 image), which is why it works well enough to pass in CI. For vLLM I remember having issues getting it working with uv on CPUs, but I haven't had a closer look lately at whether things have changed there.
Description
This PR fixes part of #3436 and #2930 by adding logprobs support to the backend interface. The new fields are exported via the protobuf definition and mapped onto the current set of vLLM SamplingParams. The test that is in place verifies that a value is returned, but not that the value is meaningful, since I didn't think this was the place to be testing vLLM itself; that said, I think there is room to improve the test.
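As a sketch of the protobuf-to-SamplingParams mapping this describes: the request field names `Logprobs` and `PromptLogprobs` below are illustrative, not the real generated protobuf attributes, while vLLM's `SamplingParams` does accept `logprobs` and `prompt_logprobs` as top-k counts:

```python
# Sketch: translate hypothetical protobuf request fields into the keyword
# arguments consumed by vLLM's SamplingParams. Only forward a field when the
# client actually set it, so vLLM's defaults (no logprobs) stay in effect.
def build_sampling_kwargs(request) -> dict:
    kwargs = {}
    if getattr(request, "Logprobs", 0) > 0:
        # number of top logprobs to return per generated token
        kwargs["logprobs"] = request.Logprobs
    if getattr(request, "PromptLogprobs", 0) > 0:
        kwargs["prompt_logprobs"] = request.PromptLogprobs
    return kwargs
```

The backend would then construct `SamplingParams(**build_sampling_kwargs(request), ...)` alongside the existing temperature / top_p mapping.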
Feedback is always appreciated, whether to improve this PR or to take a different approach than the one presented.
Notes for Reviewers
There appear to be some compatibility issues between the grpcio and protobuf libraries on the system, which will likely affect the ability to execute the tests. I've also noted that the main Makefile likely needs a fair amount of cleanup to make this backend's tests "match" the other backend tests; I'd be happy to open a PR for that as well, but I didn't want to create additional confusion here.
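On the Makefile note, a sketch of the kind of per-backend test target the other backends follow; the target name and path are assumptions about the repository layout, not the actual Makefile contents:

```makefile
# Hypothetical: delegate to the vllm backend's own test target, mirroring
# how the other Python backends are wired up.
.PHONY: test-vllm
test-vllm:
	$(MAKE) -C backend/python/vllm test
```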
Signed commits