Description
The latest convert.py fails to convert the newly released internlm2 model and exits with the following error:
KeyError: 'model.tok_embeddings.weight'
The official internlm2 response to the issue is:
"Unlike other GQA models, it packed q, k, v weights into one tensor."
It would be great to have this case properly handled in llama.cpp, so that we could better utilize the model and the available computing power. See the issue logged in the internlm2 community below for more details.
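For illustration, here is a minimal sketch of how a packed GQA qkv tensor could be unpacked into separate q, k, v matrices during conversion. It assumes a hypothetical row layout where, for each kv head, the query slices come first followed by one key slice and one value slice, each spanning head_dim rows; the function name and this layout are assumptions and would need to be verified against the actual internlm2 checkpoint.

```python
import numpy as np

def split_packed_qkv(wqkv, n_heads, n_kv_heads, head_dim):
    """Split a packed GQA qkv weight into separate q, k, v matrices.

    Assumed (hypothetical) layout: rows are grouped per kv head as
    [q_per_kv query slices, 1 key slice, 1 value slice], each slice
    being head_dim rows. Verify against the real checkpoint layout.
    """
    q_per_kv = n_heads // n_kv_heads
    hidden = wqkv.shape[-1]
    # View as (n_kv_heads, q_per_kv + 2, head_dim, hidden) so the
    # per-group q/k/v slices can be addressed directly.
    w = wqkv.reshape(n_kv_heads, q_per_kv + 2, head_dim, hidden)
    wq = w[:, :q_per_kv].reshape(-1, hidden)  # all query slices
    wk = w[:, -2].reshape(-1, hidden)         # key slice per group
    wv = w[:, -1].reshape(-1, hidden)         # value slice per group
    return wq, wk, wv

# Toy example: 4 query heads, 2 kv heads, head_dim 3, hidden size 5.
n_heads, n_kv, hd, hidden = 4, 2, 3, 5
rows = n_kv * (n_heads // n_kv + 2) * hd
wqkv = np.arange(rows * hidden, dtype=np.float32).reshape(rows, hidden)
wq, wk, wv = split_packed_qkv(wqkv, n_heads, n_kv, hd)
print(wq.shape, wk.shape, wv.shape)  # → (12, 5) (6, 5) (6, 5)
```

The key point is that with grouped-query attention the packed tensor cannot simply be split into three equal parts: the query block is q_per_kv times larger than the key and value blocks, so the converter has to know the grouping before slicing.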