Model update resets document auto-generated embeddings #76

Open
@ingria

Description

In Typesense 0.25 it's possible to define an embedding field. Because this field is auto-generated by Typesense and is not stored locally on the model, any update to the model causes the embedding field to reset.

Steps to reproduce

Schema:

    {
      "name": "embedding",
      "type": "float[]",
      "facet": false,
      "optional": false,
      "index": true,
      "sort": false,
      "infix": false,
      "locale": "",
      "embed": {
        "from": [
          "some_field"
        ],
        "model_config": {
          "model_name": "ts/paraphrase-multilingual-mpnet-base-v2"
        }
      },
      "num_dim": 768
    }

After I import the model, Typesense takes some time to generate embeddings. Once that process finishes, every document has the embedding field populated with an array of 768 floats.

Then, if I call the searchable() method on the model, the embedding field becomes empty.

Expected Behavior

The embedding field should either be regenerated if any of the embed.from fields changed, or be left unchanged.
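To make the expected behavior concrete, here is a minimal Python sketch of the rule I have in mind. It is not Typesense or driver code: `apply_update`, the `embed` callback, and the way the source fields are joined into one string are all hypothetical and only illustrate the "regenerate on change, otherwise keep" logic.

```python
from typing import Any, Callable, Dict, List, Optional

Document = Dict[str, Any]


def apply_update(
    stored: Optional[Document],
    incoming: Document,
    embed_from: List[str],
    embedding_field: str,
    embed: Callable[[str], List[float]],
) -> Document:
    """Merge an update into a stored document without losing its embedding.

    Hypothetical helper: `embed` stands in for whatever generates the vector.
    """
    previous = stored or {}
    # Fields missing from the update keep their stored values.
    merged = {**previous, **incoming}

    # The embedding is stale only if a source field actually changed.
    source_changed = any(merged.get(f) != previous.get(f) for f in embed_from)
    has_embedding = bool(previous.get(embedding_field))

    if source_changed or not has_embedding:
        # Assumption: source fields are joined with a space before embedding.
        text = " ".join(str(merged.get(f, "")) for f in embed_from)
        merged[embedding_field] = embed(text)
    else:
        # Nothing relevant changed: keep the existing vector as-is.
        merged[embedding_field] = previous[embedding_field]
    return merged
```

With this rule, an update that only touches unrelated fields (and does not send the embedding at all, as a locally stored model would) leaves the existing vector in place, while a change to `some_field` triggers a fresh embedding.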

Actual Behavior

The embedding field becomes empty.

Metadata

Typesense Version: 0.25.0

OS: Ubuntu 20.04
