Consider the distributed case #637

Open
@iliya-malecki

I think there would be no need for any other vector store implementation if pgvecto.rs handled sharding with some level of smarts. My ideas include:

  • expose cluster IDs or hierarchy from within the indexes
  • have some decision making process to partition the space in the least disruptive way
  • either use Citus or reimplement partitioning (I haven't looked at the pgvecto.rs source at all, so I'm clueless about which is easier)
  • have a mechanism that implements the logic of `select cluster_id from cluster_management order by centroids <#> $input`
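
To make the last idea concrete, here is a minimal sketch of the routing step in Python, outside the database. It assumes each shard is summarized by a single centroid and that `<#>` orders by negative inner product, so the smallest value is the best match. The names `route_to_clusters`, `centroids`, and `k` are hypothetical and do not come from pgvecto.rs or Citus.

```python
from typing import Dict, List, Sequence


def neg_inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Negative inner product; smaller means more similar (assumed <#> semantics)."""
    return -sum(x * y for x, y in zip(a, b))


def route_to_clusters(
    centroids: Dict[int, Sequence[float]],
    query: Sequence[float],
    k: int = 1,
) -> List[int]:
    """Return the ids of the k clusters whose centroids best match the query.

    Mirrors the idea of
    `select cluster_id from cluster_management order by centroids <#> $input`
    followed by a limit of k (the limit is an assumption added here).
    """
    ranked = sorted(
        centroids,
        key=lambda cluster_id: neg_inner_product(centroids[cluster_id], query),
    )
    return ranked[:k]


# Hypothetical cluster table: cluster_id -> centroid vector.
example_centroids = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [-1.0, 0.0]}
nearest = route_to_clusters(example_centroids, [0.9, 0.1], k=2)
```

A query would then be sent only to the shards holding the returned clusters. Probing more than one cluster (`k > 1`) trades extra fan-out for better recall near cluster boundaries.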

So essentially I'm asking for a distributed index that supports sharding with Citus or something similar. I'm not sure how much help I can be, but I can certainly try implementing it!
