
Update vLLM version to v0.9.1 #2061


Open · wants to merge 5 commits into main


Conversation

CICD-at-OPEA (Collaborator)

Update vLLM version to v0.9.1

Signed-off-by: CICD-at-OPEA <[email protected]>

github-actions bot commented Jun 10, 2025

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None

louie-tsai (Collaborator) commented Jun 13, 2025

This happens because core binding requires privileged mode to run the numa_migrate_pages call; vllm-project/vllm#19241 resolved the issue. There are two options here (see the sketch after this list):

  1. Run vLLM in privileged mode.
  2. Disable core binding in vLLM by setting VLLM_CPU_OMP_THREADS_BIND=all and keeping TP/PP=1.
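
For illustration, a minimal docker-compose sketch of the two alternatives; the service name, image tag, and model variable below are assumptions for the example, not values from this PR:

```yaml
services:
  vllm-service:
    image: opea/vllm:latest            # illustrative image tag, not from this PR
    # Option 1: run privileged so core binding can issue numa_migrate_pages
    privileged: true
    # Option 2 (instead of option 1): disable core binding via the env var
    # and keep tensor/pipeline parallelism at 1
    environment:
      VLLM_CPU_OMP_THREADS_BIND: all
    command: --model ${LLM_MODEL_ID} --tensor-parallel-size 1 --pipeline-parallel-size 1
```

In practice you would pick one of the two options, not both; they are shown in a single service block here only for compactness.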

yinghu5 requested a review from louie-tsai on June 16, 2025 at 01:02
yinghu5 requested reviews from yinghu5 and Copilot on June 16, 2025 at 02:01
Copilot AI (Contributor) left a comment


Pull Request Overview

This PR updates the vLLM version from v0.9.0.1 to v0.9.1 across multiple test scripts, docker-compose files, and environment configuration files.

  • Updated vLLM version in test scripts for WorkflowExecAgent, VisualQnA, HybridRAG, DocSum, CodeTrans, CodeGen, ChatQnA, AudioQnA, etc.
  • Added a new environment variable (VLLM_CPU_OMP_THREADS_BIND) in docker-compose files.
  • Updated the build script environment variable in .github/env/_build_image.sh.

Reviewed Changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| WorkflowExecAgent/tests/2_start_vllm_service.sh | Updated vLLM version. |
| VisualQnA/tests/test_compose_on_xeon.sh | Updated vLLM version. |
| HybridRAG/tests/test_compose_on_gaudi.sh | Updated vLLM version. |
| DocSum/tests/test_compose_on_xeon.sh | Updated vLLM version. |
| CodeTrans/tests/test_compose_on_xeon.sh | Updated vLLM version. |
| CodeGen/tests/test_compose_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_without_rerank_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_qdrant_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_pinecone_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_milvus_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_mariadb_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_faqgen_tgi_on_xeon.sh | Updated vLLM version. |
| ChatQnA/tests/test_compose_faqgen_on_xeon.sh | Updated vLLM version. |
| AudioQnA/tests/test_compose_on_xeon.sh | Updated vLLM version. |
| AudioQnA/tests/test_compose_multilang_on_xeon.sh | Updated vLLM version. |
| AudioQnA/docker_compose/intel/cpu/xeon/compose_multilang.yaml | Added VLLM_CPU_OMP_THREADS_BIND variable. |
| AudioQnA/docker_compose/intel/cpu/xeon/compose.yaml | Added VLLM_CPU_OMP_THREADS_BIND variable. |
| .github/env/_build_image.sh | Updated vLLM version in the environment export. |
Comments suppressed due to low confidence (3)

AudioQnA/docker_compose/intel/cpu/xeon/compose_multilang.yaml:47

  • [nitpick] Consider adding a comment or updating related documentation to clarify the purpose of the new VLLM_CPU_OMP_THREADS_BIND variable, ensuring maintainability of the configuration.
VLLM_CPU_OMP_THREADS_BIND: all

AudioQnA/docker_compose/intel/cpu/xeon/compose.yaml:43

  • [nitpick] Consider adding a comment or updating related documentation to clarify the purpose of adding VLLM_CPU_OMP_THREADS_BIND to the service configuration for easier future maintenance.
VLLM_CPU_OMP_THREADS_BIND: all
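
For illustration only, the inline comment these two nitpicks ask for could look like the following sketch (the wording is hypothetical, not part of the PR):

```yaml
environment:
  # Bind OpenMP threads across all cores, disabling vLLM's per-core binding,
  # which would otherwise require privileged mode for the numa_migrate_pages
  # call (see vllm-project/vllm#19241).
  VLLM_CPU_OMP_THREADS_BIND: all
```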

.github/env/_build_image.sh:5

  • Ensure that VLLM_FORK_VER (currently set to v0.6.6.post1+Gaudi-1.20.0) remains compatible with the updated vLLM version v0.9.1 to prevent any integration issues.
export VLLM_VER=v0.9.1

yinghu5 (Collaborator) commented Jun 16, 2025

@louie-tsai Thank you very much for the solution; both options are workable. Considering several related factors, though, we may upgrade the vLLM Docker image in the next release.
