server: continue to update other slots on embedding concurrent request #5699


Merged: 3 commits merged into master on Feb 24, 2024

Conversation

@phymbert (Collaborator) commented Feb 24, 2024

Context

If multiple slots are computing embeddings concurrently, only the first one is updated.

Changes

Continue updating the remaining slots in update_slots in the main loop when handling embedding tasks.
The test scenario was moved to the parallel feature.

Closes #5655
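The shape of the fix can be illustrated with a simplified, hypothetical sketch (the struct and function names below are stand-ins, not the actual server.cpp types): before the change, the slot loop effectively stopped after handling the first embedding slot, so concurrent embedding requests in other slots were never updated; the fix keeps iterating so every embedding slot is processed in the same pass.

```cpp
#include <vector>

// Hypothetical, simplified model of a server slot (not the real
// llama.cpp server.cpp types).
struct Slot {
    bool is_embedding = false;  // slot is running an embedding task
    bool updated      = false;  // embedding result was written back
};

// Sketch of the fixed behavior: process *all* embedding slots in one
// pass instead of exiting after the first one. Returns the number of
// slots updated.
inline int update_slots(std::vector<Slot> & slots) {
    int n_updated = 0;
    for (auto & slot : slots) {
        if (!slot.is_embedding) {
            continue;
        }
        // Stand-in for computing the embedding and sending the result.
        slot.updated = true;
        ++n_updated;
        // The bug was equivalent to an early exit here after the first
        // embedding slot; removing it lets the remaining slots with
        // concurrent embedding requests be updated as well.
    }
    return n_updated;
}
```

With three slots where slots 0 and 2 hold embedding requests, both are updated in a single pass rather than only slot 0.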

Commits:
…t request.
server: tests: add multi users embeddings as fixed
@phymbert phymbert requested review from ggerganov and ngxson February 24, 2024 12:05
@phymbert phymbert added bug Something isn't working server/webui labels Feb 24, 2024
@ggerganov (Member) left a comment:

Let's go 🚀

@phymbert (Collaborator, Author) commented:

I will take advantage of this PR to add an OAI-compatible concurrent embeddings scenario.

@ngxson (Collaborator) left a comment:

LGTM. Thanks!

@phymbert phymbert merged commit 9e359a4 into master Feb 24, 2024
@phymbert phymbert deleted the hotfix/server-issue-5655-concurrent-embedding-final branch February 24, 2024 18:16
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

server: continue to update other slots on embedding concurrent request (ggml-org#5699)

* server: ggml-org#5655 - continue to update other slots on embedding concurrent request.

* server: tests: add multi users embeddings as fixed

* server: tests: adding OAI compatible embedding concurrent endpoint

* server: tests: adding OAI compatible embedding with multiple inputs
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

server: continue to update other slots on embedding concurrent request (ggml-org#5699)

* server: ggml-org#5655 - continue to update other slots on embedding concurrent request.

* server: tests: add multi users embeddings as fixed

* server: tests: adding OAI compatible embedding concurrent endpoint

* server: tests: adding OAI compatible embedding with multiple inputs
Labels
bug Something isn't working server/webui
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segmentation fault
3 participants