[Core][Bugfix] Fix Online MM Beam Search #19688

alex-jw-brooks · 2025-06-16T13:25:43Z

Purpose

Fixes #19687

Test Plan

Running the sample script from the issue should produce output that correctly reflects the multimodal (mm) data. To be safe, we should also update an async multimodal beam search test so that it checks the output content.

Test Result

Sample response for the repro case provided in the issue, which now correctly passes the mm data through.

------------
Result 1: This image displays a gradient of colors ranging from red to violet.<|end|>
------------
Result 2: This image displays a gradient of colors ranging from red to violet, arranged in a checkered pattern.<|end|>

One of the online beam search tests with vision has also been updated to check the content to validate the fix.

@DarkLight1337 can you please take a look?

gemini-code-assist

Summary of Changes

Hello @alex-jw-brooks, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've addressed a bug affecting online (async) multimodal beam search where the multimodal data was not being correctly propagated. This involved adjusting the data retrieval mechanism in the engine protocol and updating a relevant test case to validate the fix with specific expected outputs. A minor refactor to the beam search sorting key function was also included.

Highlights

Multimodal Beam Search Fix: I've corrected how multimodal data and prompt tokens are accessed within the beam search logic to resolve issues where this data was not being passed correctly, specifically addressing the problems outlined in issue [Bug]: Async Beam Search Doesn't Pass Multimodal Data Correctly #19687.
Test Update: I've modified an existing vision beam search test (test_single_chat_session_image_base64encoded_beamsearch) to specifically assert against expected output for different images, ensuring that multimodal data is correctly processed during beam search.
Code Refactoring: I've extracted the logic for creating the beam search sorting key into a dedicated helper function (create_sort_beams_key_function) for better code organization.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

github-actions · 2025-06-16T13:26:14Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

gemini-code-assist

Code Review

This pull request aims to fix an issue with multimodal data not being correctly passed through in online (async) beam search. The core change in vllm/engine/protocol.py modifies how prompt_token_ids and multi_modal_data are sourced for beam search, using the raw prompt input instead of processed_inputs. This is acknowledged as a workaround in a TODO comment, which is good. The refactoring in vllm/entrypoints/llm.py to use create_sort_beams_key_function improves code structure.
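As a rough illustration of the change described above (not the actual code in vllm/engine/protocol.py), the sketch below reads the token IDs and multimodal data from the raw prompt and carries the multimodal data along each time a beam is extended. The `RawPrompt` shape, the `Beam` dataclass, and the `initial_beam` / `expand_beam` helpers are hypothetical names assumed for this example only.

```python
from dataclasses import dataclass
from typing import Any, Optional, TypedDict


class RawPrompt(TypedDict, total=False):
    # Hypothetical shape of a raw, already-tokenized prompt (an assumption).
    prompt_token_ids: list[int]
    multi_modal_data: dict[str, Any]


@dataclass
class Beam:
    # Hypothetical beam record; field names are illustrative only.
    tokens: list[int]
    cum_logprob: float = 0.0
    multi_modal_data: Optional[dict[str, Any]] = None


def initial_beam(prompt: RawPrompt) -> Beam:
    """Build the first beam directly from the raw prompt."""
    return Beam(
        tokens=list(prompt["prompt_token_ids"]),
        # Missing multimodal data is allowed (text-only prompts).
        multi_modal_data=prompt.get("multi_modal_data"),
    )


def expand_beam(beam: Beam, token_id: int, logprob: float) -> Beam:
    """Extend a beam by one token, keeping the multimodal data attached."""
    return Beam(
        tokens=beam.tokens + [token_id],
        cum_logprob=beam.cum_logprob + logprob,
        multi_modal_data=beam.multi_modal_data,
    )
```

The point of the sketch is only that every expanded beam still references the same multimodal payload as the original request, which is the property the fix is meant to restore.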

The main concern is with the updated test in tests/entrypoints/openai/test_vision.py. While adding temperature=0.0 and strengthening the assertion logic are positive changes, there's a potential mismatch between the max_completion_tokens=10 setting, the content/length of the newly added EXPECTED_MM_BEAM_SEARCH_RES strings, and the sample output provided in the PR description. This could affect the test's correctness and reliability.
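To make the concern about assertion strength concrete, here is a small sketch of a content check for beam search results. It is not taken from tests/entrypoints/openai/test_vision.py; the helper name `assert_beam_contents` and the choice to compare whitespace-stripped strings in rank order are assumptions. Any expected strings passed to it would need to be exactly what the model emits within the configured completion token limit, since a truncated response will not match a longer expected string.

```python
def assert_beam_contents(actual: list[str], expected: list[str]) -> None:
    """Check every returned beam against its expected text, in rank order."""
    assert len(actual) == len(expected), (
        f"expected {len(expected)} beams, got {len(actual)}"
    )
    for rank, (got, want) in enumerate(zip(actual, expected)):
        # Strip surrounding whitespace so leading spaces do not cause failures.
        assert got.strip() == want.strip(), (
            f"beam {rank}: got {got!r}, expected {want!r}"
        )
```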

tests/entrypoints/openai/test_vision.py

alex-jw-brooks · 2025-06-16T13:31:03Z

vllm/entrypoints/llm.py

-                                         tokenizer.eos_token_id,
-                                         length_penalty)
+        tokenizer = self.get_tokenizer()
+        sort_beams_key = create_sort_beams_key_function(


Small refactor to share this between sync / async because it's identical
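For readers unfamiliar with the pattern, a shared sort key for beams can be built as a closure over the EOS token ID and the length penalty, so that the sync and async paths rank beams identically. The sketch below is an assumption-laden illustration, not the helper added in this PR: `BeamCandidate`, `length_normalized_score`, and `make_sort_beams_key` are hypothetical names, and the length-normalized log-probability score is a common choice assumed here rather than confirmed from the diff.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BeamCandidate:
    # Hypothetical stand-in for a beam search sequence.
    tokens: list[int]
    cum_logprob: float


def length_normalized_score(
    tokens: list[int],
    cum_logprob: float,
    eos_token_id: int,
    length_penalty: float = 1.0,
) -> float:
    """Cumulative log-probability divided by length ** length_penalty."""
    seq_len = len(tokens)
    # Do not count a trailing EOS token toward the sequence length.
    if tokens and tokens[-1] == eos_token_id:
        seq_len -= 1
    return cum_logprob / (max(seq_len, 1) ** length_penalty)


def make_sort_beams_key(
    eos_token_id: int, length_penalty: float
) -> Callable[[BeamCandidate], float]:
    """Return a key function usable with sorted(..., key=..., reverse=True)."""

    def sort_beams_key(beam: BeamCandidate) -> float:
        return length_normalized_score(
            beam.tokens, beam.cum_logprob, eos_token_id, length_penalty
        )

    return sort_beams_key
```

Building the key once and passing it to both code paths avoids two copies of the scoring logic drifting apart.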

DarkLight1337

LGTM, thanks for the fix!

Signed-off-by: Alex-Brooks <[email protected]>

alex-jw-brooks · 2025-06-19T15:30:46Z

Hey @DarkLight1337, thanks for the quick review! I think the test failure is unrelated.

alex-jw-brooks requested review from DarkLight1337, robertgshaw2-redhat, simon-mo and aarnphm as code owners June 16, 2025 13:25

gemini-code-assist bot reviewed Jun 16, 2025


mergify bot added the frontend label Jun 16, 2025

gemini-code-assist bot reviewed Jun 16, 2025


tests/entrypoints/openai/test_vision.py (review thread resolved)

alex-jw-brooks changed the title ~~[Core][Bugfix] Fix Offline MM Beam Search~~ [Core][Bugfix] Fix Online MM Beam Search Jun 16, 2025

alex-jw-brooks commented Jun 16, 2025


DarkLight1337 approved these changes Jun 16, 2025


DarkLight1337 enabled auto-merge (squash) June 16, 2025 15:47

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jun 16, 2025

auto-merge was automatically disabled June 16, 2025 16:28
Head branch was pushed to by a user without write access

alex-jw-brooks added 4 commits June 19, 2025 07:53

Use common beam search scoring

4bd5b1e

Signed-off-by: Alex-Brooks <[email protected]>

Pull prompt token ids and mm data off of raw prompt

7f6066f

Signed-off-by: Alex-Brooks <[email protected]>

Check responses in vision async beam search test

b886374

Signed-off-by: Alex-Brooks <[email protected]>

Fix beam search test output in ci

2cf7fd8

Signed-off-by: Alex-Brooks <[email protected]>

alex-jw-brooks force-pushed the async_beam_search branch from 0c54650 to 2cf7fd8 Compare June 19, 2025 07:53

DarkLight1337 enabled auto-merge (squash) June 19, 2025 08:06

DarkLight1337 merged commit ead2110 into vllm-project:main Jun 19, 2025
72 checks passed
