Description
It would be great to support Cohere's new Command A model! https://cohere.com/blog/command-a
It uses a fairly unusual architecture: some layers use sliding window attention, while others use global attention with no position embeddings. Even though I read through the documentation on how to add a model, I'm a little lost on how to do this myself.
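To make the layer layout concrete, here is a rough sketch of how I understand it, in plain Python with no framework. The names (`layer_kinds`, `causal_mask`, `layer_plan`) are made up for illustration. The ratio of sliding window layers to global layers and the window size are placeholder assumptions, not values taken from the blog post.

```python
from typing import Dict, List, Optional


def layer_kinds(num_layers: int, global_every: int = 4) -> List[str]:
    """Label each layer; every `global_every`-th layer is global (assumed ratio)."""
    return [
        "global" if (i + 1) % global_every == 0 else "sliding"
        for i in range(num_layers)
    ]


def causal_mask(seq_len: int, window: Optional[int] = None) -> List[List[bool]]:
    """mask[q][k] is True when query position q may attend to key position k.

    With window=None this is an ordinary causal (global) mask. With a window,
    each query only sees itself and the previous window - 1 positions.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            visible = k <= q
            if window is not None:
                visible = visible and (q - k) < window
            row.append(visible)
        mask.append(row)
    return mask


def layer_plan(num_layers: int, window: int, global_every: int = 4) -> List[Dict]:
    """Per-layer settings: sliding layers get a window and position embeddings,
    global layers get neither."""
    plan = []
    for kind in layer_kinds(num_layers, global_every):
        is_sliding = kind == "sliding"
        plan.append({
            "kind": kind,
            "window": window if is_sliding else None,
            "use_position_embeddings": is_sliding,
        })
    return plan


if __name__ == "__main__":
    # Placeholder sizes chosen only to keep the printout small.
    for i, layer in enumerate(layer_plan(num_layers=8, window=2)):
        print(i, layer)
```

The part I'm unsure about is where this per-layer switch belongs in an existing model implementation: both the attention mask and the position embedding step have to change depending on the layer index.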