Open
@slaren @ggerganov
Would you consider a PR that adds an option to build ggml-cuda with the header-only CUDA kernels from FlashInfer and take advantage of them?
https://docs.flashinfer.ai/installation.html#c-api
https://github.com/flashinfer-ai/flashinfer