Support for Jina Code model #292
Conversation
That's ultra cool, thanks for the hard work! :)
I only have very small comments.
Regarding how to correctly select the model type, I'm not sure yet. It's very annoying that they didn't create a new one AGAIN.
let query_layer = self.query_linear.forward(hidden_states)?;
let query_layer = self.layer_norm_q.forward(&query_layer, None)?;

let key_layer = self.key_linear.forward(hidden_states)?;
let key_layer = self.layer_norm_k.forward(&key_layer, None)?;

let value_layer = self.value_linear.forward(hidden_states)?;
You should still merge qkv and do one fused linear.
It's faster to do one big linear instead of 3 smaller ones.
You can then split the result and apply the layer norm for q and k.
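For reference, a minimal sketch of what that fusion could look like, reusing the Linear and layer-norm signatures already used in this PR. The concatenation axis, the hidden_size field, and the D::Minus1 split are assumptions about the checkpoint layout, not code from this branch:

// Loading: concatenate the three projections along the output dimension
// so a single matmul produces q, k and v at once (assumed [out, in] layout).
let qkv_weight = Tensor::cat(&[&query_weight, &key_weight, &value_weight], 0)?;
let qkv_bias = Tensor::cat(&[&query_bias, &key_bias, &value_bias], 0)?;
let qkv_linear = Linear::new(qkv_weight, Some(qkv_bias), None);

// Forward: one big linear, then split the result back into q, k and v.
let qkv = self.qkv_linear.forward(hidden_states)?;
let query_layer = qkv.narrow(D::Minus1, 0, self.hidden_size)?;
let key_layer = qkv.narrow(D::Minus1, self.hidden_size, self.hidden_size)?;
let value_layer = qkv.narrow(D::Minus1, 2 * self.hidden_size, self.hidden_size)?;

// As in the unfused version, layer norm is only applied to q and k.
let query_layer = self.layer_norm_q.forward(&query_layer, None)?;
let key_layer = self.layer_norm_k.forward(&key_layer, None)?;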
let query_linear = Linear::new(query_weight, Some(query_bias), None);
let key_linear = Linear::new(key_weight, Some(key_bias), None);
let value_linear = Linear::new(value_weight, Some(value_bias), None);
See other comment on merging
let query_layer = query_layer.reshape(new_qkv_shape.as_slice())?.transpose(1, 2)?;
let key_layer = key_layer.reshape(new_qkv_shape.as_slice())?.transpose(1, 2)?;
let value_layer = value_layer.reshape(new_qkv_shape.as_slice())?.transpose(1, 2)?;
Why are there transposes here?
let non_gated = hidden_states.i((.., .., 0..self.intermediate_size))?;
let gated = hidden_states.i((.., .., self.intermediate_size..))?;
You can use narrow instead:
let non_gated = hidden_states.narrow(2, 0, self.intermediate_size)?;
let gated = hidden_states.narrow(2, self.intermediate_size, self.intermediate_size)?;
I'm a bit confused, why is hidden_states dim3 here? Shouldn't it be dim2 and narrow on 1?
let query_layer = self.query_linear.forward(hidden_states)?;
let query_layer = self.layer_norm_q.forward(&query_layer, None)?;

let key_layer = self.key_linear.forward(hidden_states)?;
let key_layer = self.layer_norm_k.forward(&key_layer, None)?;

let value_layer = self.value_linear.forward(hidden_states)?;
Same.
let non_gated = hidden_states.i((.., .., 0..self.intermediate_size))?;
let gated = hidden_states.i((.., .., self.intermediate_size..))?;
Same.
I'm not sure why we are dim3 in the flash implementation though. We should always be dim2 (concatenated ids, hidden_dim). Can you check if there is an added dim somewhere for some reason?
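For illustration, if hidden_states really is 2-dimensional in the flash path (the [total_tokens, 2 * intermediate_size] layout is an assumption here, following the comment above), the narrow calls would simply move to dim 1:

// Assumed 2D layout: all sequences concatenated, no batch dimension.
let non_gated = hidden_states.narrow(1, 0, self.intermediate_size)?;
let gated = hidden_states.narrow(1, self.intermediate_size, self.intermediate_size)?;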
I will merge this branch into a dev branch and modify the few nitpicks above to make this PR part of today's release.
Not a problem at all. Thanks for helping!
What does this PR do?
We implement the modified architecture for Jina models found here. This allows us to run inference for jinaai/jina-embeddings-v2-base-code, a long context code embeddings model.

IMPORTANT CAVEAT
The way models are currently loaded in the repo uses the model_type key found in the config.json for a given model. The way the Jina models are currently loaded is a bit hackish, as we use the Alibi position encoding to route to the JinaBertModel architecture rather than the BertModel architecture. Whenever different models share the same model_type key (e.g. "model_type": "bert") and have "position_embedding_type": "alibi", they all get routed to the JinaBertModel architecture at startup.

To get around this, I locally changed the model_type of the normal Jina models (e.g. jina-embeddings-v2-base-en) to jina_bert and the model_type of the Jina code models to jina_code_bert. This lets me load the models properly, but it is obviously not a clean solution that would let someone plug in the embedding models without changing their config files. A more robust way of loading the different model architectures would be necessary in order to properly support this new implementation.
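To make the caveat concrete, here is a rough sketch of the kind of dispatch this routing amounts to. It is illustrative only, not the repo's actual loading code, and the Model trait, enum variants, and load signatures are approximations:

// Any BERT-style config that declares Alibi position embeddings is routed
// to JinaBertModel, whether or not it is actually a Jina checkpoint.
let model: Box<dyn Model> = match (config.model_type.as_str(), &config.position_embedding_type) {
    ("bert", PositionEmbeddingType::Alibi) => Box::new(JinaBertModel::load(vb, &config)?),
    ("bert", _) => Box::new(BertModel::load(vb, &config)?),
    (other, _) => candle::bail!("unsupported model_type: {other}"),
};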
Fixes #270
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@OlivierDehaene