Update vLLM version to v0.9.1 #2061
base: main
Conversation
Signed-off-by: CICD-at-OPEA <[email protected]>
Dependency Review: ✅ No vulnerabilities or license issues found. Scanned files: none.
This is because core binding requires privileged mode in order to make the numa_migrate_pages call. This PR solved the issue: vllm-project/vllm#19241. Two options here:
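The upstream options themselves are not quoted in this thread. As a hedged illustration only (the service name and exact compose keys below are assumptions, not taken from this PR), granting the needed privilege in a docker-compose file could look like:

```yaml
# Hypothetical docker-compose fragment; "vllm-service" is an assumed name.
services:
  vllm-service:
    image: vllm/vllm-openai:v0.9.1
    privileged: true          # broad option: full privileged mode
    # cap_add:                # assumed narrower alternative:
    #   - SYS_NICE            # capability typically involved in page migration
    environment:
      VLLM_CPU_OMP_THREADS_BIND: all
```

Privileged mode is the heavier hammer; if a single capability suffices, `cap_add` keeps the container closer to least privilege.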
Pull Request Overview
This PR updates the vLLM version from v0.9.0.1 to v0.9.1 across multiple test scripts, docker-compose files, and environment configuration files.
- Updated vLLM version in test scripts for WorkflowExecAgent, VisualQnA, HybridRAG, DocSum, CodeTrans, CodeGen, ChatQnA, AudioQnA, etc.
- Added a new environment variable (VLLM_CPU_OMP_THREADS_BIND) in docker-compose files.
- Updated the build script environment variable in .github/env/_build_image.sh.
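The two environment changes above can be sketched as plain exports (values are taken from the PR summary; variable names match the files it touches):

```shell
# Version pin updated by this PR (previously v0.9.0.1).
export VLLM_VER=v0.9.1

# New CPU thread-binding setting added to the docker-compose files;
# "all" binds the OpenMP worker threads across all available CPU cores.
export VLLM_CPU_OMP_THREADS_BIND=all
```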
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.
File | Description |
---|---|
WorkflowExecAgent/tests/2_start_vllm_service.sh | Updated vLLM version. |
VisualQnA/tests/test_compose_on_xeon.sh | Updated vLLM version. |
HybridRAG/tests/test_compose_on_gaudi.sh | Updated vLLM version. |
DocSum/tests/test_compose_on_xeon.sh | Updated vLLM version. |
CodeTrans/tests/test_compose_on_xeon.sh | Updated vLLM version. |
CodeGen/tests/test_compose_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_without_rerank_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_qdrant_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_pinecone_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_milvus_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_mariadb_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_faqgen_tgi_on_xeon.sh | Updated vLLM version. |
ChatQnA/tests/test_compose_faqgen_on_xeon.sh | Updated vLLM version. |
AudioQnA/tests/test_compose_on_xeon.sh | Updated vLLM version. |
AudioQnA/tests/test_compose_multilang_on_xeon.sh | Updated vLLM version. |
AudioQnA/docker_compose/intel/cpu/xeon/compose_multilang.yaml | Added VLLM_CPU_OMP_THREADS_BIND variable. |
AudioQnA/docker_compose/intel/cpu/xeon/compose.yaml | Added VLLM_CPU_OMP_THREADS_BIND variable. |
.github/env/_build_image.sh | Updated vLLM version in the environment export. |
Comments suppressed due to low confidence (3)
AudioQnA/docker_compose/intel/cpu/xeon/compose_multilang.yaml:47
- [nitpick] Consider adding a comment or updating related documentation to clarify the purpose of the new VLLM_CPU_OMP_THREADS_BIND variable, ensuring maintainability of the configuration.
VLLM_CPU_OMP_THREADS_BIND: all
AudioQnA/docker_compose/intel/cpu/xeon/compose.yaml:43
- [nitpick] Consider adding a comment or updating related documentation to clarify the purpose of adding VLLM_CPU_OMP_THREADS_BIND to the service configuration for easier future maintenance.
VLLM_CPU_OMP_THREADS_BIND: all
.github/env/_build_image.sh:5
- Ensure that VLLM_FORK_VER (currently set to v0.6.6.post1+Gaudi-1.20.0) remains compatible with the updated vLLM version v0.9.1 to prevent any integration issues.
export VLLM_VER=v0.9.1
@louie-tsai thank you very much for the solution. Both options are workable.