
add vllm support for token ids as input #3280

Open
wybryan wants to merge 2 commits into main

Conversation

wybryan

@wybryan wybryan commented Apr 11, 2025

What does this PR do?

This PR adds support for the vLLM server & client to accept token ids as input.

Under the hood, the vLLM engine already supports token ids as input instead of text; this PR makes that feature available to end users. The rationale is that in certain training use cases the user wants precise control over the exact input tokens passed to vLLM, so this capability lets the user fully control the input tokens before they are sent to vLLM.
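
For reference, here is a minimal sketch of the underlying vLLM capability this PR exposes, using `TokensPrompt` to pass pre-tokenized input to the engine (the model name is only an example, and the TRL-side API added by this PR may look different):

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.inputs import TokensPrompt

# Tokenize in user code so we control the exact input ids (example model).
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
prompt_ids = tokenizer("Hello, world!", add_special_tokens=False)["input_ids"]

llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
sampling_params = SamplingParams(max_tokens=32)

# The engine accepts token ids directly instead of a text prompt.
outputs = llm.generate(TokensPrompt(prompt_token_ids=prompt_ids), sampling_params)
print(outputs[0].outputs[0].text)
```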

Who can review?

Anyone in the community is free to review the PR once the tests have passed.

@wybryan wybryan marked this pull request as draft April 12, 2025 04:37
@wybryan wybryan marked this pull request as ready for review April 12, 2025 04:37
@wybryan
Author

wybryan commented Apr 12, 2025

Hi @qgallouedec, is it possible for you to review this PR please?

@qgallouedec
Member

Hi, so sorry for the late review. We are trying to mimic the vLLM server as closely as possible. Do you know if it supports this? If so, how?

@wybryan
Author

wybryan commented May 24, 2025

> Hi, so sorry for the late review. We are trying to mimic the vLLM server as closely as possible. Do you know if it supports this? If so, how?

The vLLM engine natively supports input token ids instead of input strings. My PR just exposes this feature through the wrapper in TRL.

@wybryan
Author

wybryan commented May 24, 2025

The rationale is that sometimes we want the training code to take care of tokenization, i.e., we may manipulate the token ids directly and want the vLLM rollout generation to take those manipulated token ids as-is, rather than taking the original input string and doing standard tokenization inside vLLM, which would cause an inconsistency between training and rollout generation.

That's what this PR is about: feeding raw token ids directly to vLLM (the vLLM engine already supports this, but it is not accessible without this PR).
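
To make the consistency concern concrete, here is a small illustration (model name chosen only as an example) of how decoding manipulated ids back to text and letting vLLM re-tokenize them can produce different ids than the ones the training code built:

```python
from transformers import AutoTokenizer

# Example model; any BPE-style tokenizer shows the same effect.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# Suppose the training code builds the prompt at the token-id level,
# e.g. by concatenating ids from separately tokenized pieces.
ids = (
    tokenizer("Hel", add_special_tokens=False)["input_ids"]
    + tokenizer("lo world", add_special_tokens=False)["input_ids"]
)

# If we instead send the decoded text, vLLM re-tokenizes it and will
# generally not reproduce the exact ids the training code used.
text = tokenizer.decode(ids)
retokenized = tokenizer(text, add_special_tokens=False)["input_ids"]

print(ids)
print(retokenized)  # typically differs, e.g. "Hello" becomes a single token
```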
