batch inference #347
Unanswered
liuxiaohao-xn
asked this question in Q&A
Replies: 2 comments 3 replies
-
Same question here; it seems llama.cpp already supports batched inference. @abetlen this use case seems to be very popular, what do you think?
-
What kind of work would need to be done, either here or upstream in llama.cpp, to get batch inference fully working? Is there a roadmap anywhere already? I've got some code from work that I need to get running against Vicuna, and the only real obstacle left is getting batch processing working.
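Until batched decoding is exposed through the bindings, one stopgap (not true batching, just concurrent requests) is to run the OpenAI-compatible server bundled with llama-cpp-python and fan prompts out from a thread pool. A minimal sketch, assuming the server was started with `python -m llama_cpp.server --model <path>` and is listening on the default `localhost:8000`; the prompts and request parameters below are illustrative:

```python
# Sketch: fan several prompts out to a locally running llama-cpp-python
# OpenAI-compatible server. This overlaps requests at the HTTP level; it
# does not by itself use llama.cpp's in-process batched decoding.
from concurrent.futures import ThreadPoolExecutor

import requests

API_URL = "http://localhost:8000/v1/completions"  # assumed default address

prompts = [
    "Explain what batch inference is in one sentence.",
    "List three uses of llama.cpp.",
    "What is quantization in the context of LLMs?",
]

def complete(prompt: str) -> str:
    """Send one completion request and return the generated text."""
    resp = requests.post(
        API_URL,
        json={"prompt": prompt, "max_tokens": 64, "temperature": 0.7},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]

if __name__ == "__main__":
    # Issue the requests concurrently; map() returns results in prompt order.
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        for prompt, text in zip(prompts, pool.map(complete, prompts)):
            print(f"{prompt!r} -> {text!r}")
```

Note that each request is still decoded on its own inside the server, so this is a throughput workaround rather than real batch inference.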
-
I have multiple prompts and I want to feed them all to the model at once to generate outputs. Can you tell me how to achieve this?
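For the multiple-prompts question above, the simplest thing that works today is to loop over the prompts with the high-level `Llama` API; since the bindings don't expose llama.cpp's batched decoding, the prompts are evaluated one after another. A minimal sketch, assuming a local GGML model file (the path and sampling parameters are illustrative):

```python
# Sketch: run a list of prompts through llama-cpp-python's high-level API.
# The prompts are processed sequentially, not as a single batched forward pass.
from llama_cpp import Llama

llm = Llama(model_path="./models/ggml-model-q4_0.bin", n_ctx=2048)

prompts = [
    "Q: What is batch inference? A:",
    "Q: Name three sizes of LLaMA models. A:",
]

outputs = []
for prompt in prompts:
    result = llm.create_completion(prompt, max_tokens=64, stop=["Q:"])
    outputs.append(result["choices"][0]["text"])

for prompt, text in zip(prompts, outputs):
    print(prompt, "->", text.strip())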