
Commit 32b82a4

feat: Updating documents to highlight v2 api for Vector Similarity Search (#5000)
* feat: Updating documents to highlight v2 api for Vector Similarity Search
* updated to add another column about v2 support

Signed-off-by: Francisco Javier Arceo <[email protected]>
1 parent 92dde13 commit 32b82a4

2 files changed: +149, -68 lines changed


docs/reference/alpha-vector-database.md

Lines changed: 146 additions & 66 deletions
@@ -7,20 +7,35 @@ Vector database allows user to store and retrieve embeddings. Feast provides gen
## Integration
Below are supported vector databases and implemented features:

-| Vector Database | Retrieval | Indexing |
-|-----------------|-----------|----------|
-| Pgvector | [x] | [ ] |
-| Elasticsearch | [x] | [x] |
-| Milvus | [ ] | [ ] |
-| Faiss | [ ] | [ ] |
-| SQLite | [x] | [ ] |
-| Qdrant | [x] | [x] |
+| Vector Database | Retrieval | Indexing | V2 Support* |
+|-----------------|-----------|----------|-------------|
+| Pgvector | [x] | [ ] | [ ] |
+| Elasticsearch | [x] | [x] | [ ] |
+| Milvus | [x] | [x] | [x] |
+| Faiss | [ ] | [ ] | [ ] |
+| SQLite | [x] | [ ] | [ ] |
+| Qdrant | [x] | [x] | [ ] |
+
+*Note: V2 Support means the SDK supports retrieval of features along with vector embeddings from vector similarity search.

Note: SQLite is in limited access and only working on Python 3.10. It will be updated as [sqlite_vec](https://github.com/asg017/sqlite-vec/) progresses.

-## Example
+{% hint style="danger" %}
+We will be deprecating the `retrieve_online_documents` method in the SDK in the future.
+We recommend using the `retrieve_online_documents_v2` method instead, which offers easier vector index configuration
+directly in the Feature View and the ability to retrieve standard features alongside your vector embeddings for richer context injection.
+
+Long term we will collapse the two methods into one, but for now, we recommend using the `retrieve_online_documents_v2` method.
+Beyond that, we will have `retrieve_online_documents` and `retrieve_online_documents_v2` simply point to `get_online_features` for
+backwards compatibility and adopt industry-standard naming conventions.
+{% endhint %}
+
+**Note**: Milvus implements the v2 `retrieve_online_documents_v2` method in the SDK. This will be the longer-term solution so that Data Scientists can easily enable vector similarity search by just flipping a flag.

-See [https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag](https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag) for an example on how to use vector database.
+## Examples
+
+- See the v0 [RAG Demo](https://github.com/feast-dev/feast-workshop/blob/rag/module_4_rag) for an example of how to use a vector database with the `retrieve_online_documents` method (migration and deprecation are planned).
+- See the v1 [Milvus Quickstart](../../examples/rag/milvus-quickstart.ipynb) for a quickstart guide on how to use Feast with Milvus using the `retrieve_online_documents_v2` method.

### **Prepare offline embedding dataset**
Run the following commands to prepare the embedding dataset:
@@ -34,25 +49,23 @@ The output will be stored in `data/city_wikipedia_summaries.csv.`
Use the feature_store.yaml file to initialize the feature store. This will use the data as offline store, and Pgvector as online store.

```yaml
-project: feast_demo_local
+project: local_rag
provider: local
-registry:
-  registry_type: sql
-  path: postgresql://@localhost:5432/feast
+registry: data/registry.db
online_store:
-  type: postgres
+  type: milvus
+  path: data/online_store.db
  vector_enabled: true
-  vector_len: 384
-  host: 127.0.0.1
-  port: 5432
-  database: feast
-  user: ""
-  password: ""
+  embedding_dim: 384
+  index_type: "IVF_FLAT"


offline_store:
  type: file
-entity_key_serialization_version: 2
+entity_key_serialization_version: 3
+# By default, no_auth for authentication and authorization; other possible values are kubernetes and oidc. Refer to the documentation for more details.
+auth:
+  type: no_auth
```
Run the following command in terminal to apply the feature store configuration:
@@ -63,75 +76,128 @@ feast apply
Note that when you run `feast apply` you are going to apply the following Feature View that we will use for retrieval later:

```python
-city_embeddings_feature_view = FeatureView(
-    name="city_embeddings",
-    entities=[item],
+document_embeddings = FeatureView(
+    name="embedded_documents",
+    entities=[item, author],
    schema=[
-        Field(name="Embeddings", dtype=Array(Float32)),
+        Field(
+            name="vector",
+            dtype=Array(Float32),
+            # Look how easy it is to enable RAG!
+            vector_index=True,
+            vector_search_metric="COSINE",
+        ),
+        Field(name="item_id", dtype=Int64),
+        Field(name="author_id", dtype=String),
+        Field(name="created_timestamp", dtype=UnixTimestamp),
+        Field(name="sentence_chunks", dtype=String),
+        Field(name="event_timestamp", dtype=UnixTimestamp),
    ],
-    source=source,
-    ttl=timedelta(hours=2),
+    source=rag_documents_source,
+    ttl=timedelta(hours=24),
)
```
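The Feature View above references `item`, `author`, and `rag_documents_source`, which are defined elsewhere in the example repository. As a rough sketch only (the entity names and file path below are assumptions, not part of this commit), those definitions could look like:

```python
from feast import Entity, FileSource

# Hypothetical entity definitions matching the join keys used in the Feature View above.
item = Entity(name="item_id", join_keys=["item_id"])
author = Entity(name="author_id", join_keys=["author_id"])

# Hypothetical batch source holding the pre-computed embeddings and document metadata.
rag_documents_source = FileSource(
    name="rag_documents_source",
    path="data/city_wikipedia_summaries_with_embeddings.parquet",  # assumed path
    timestamp_field="event_timestamp",
    created_timestamp_column="created_timestamp",
)
```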

-Then run the following command in the terminal to materialize the data to the online store:
-
-```shell
-CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%S")
-feast materialize-incremental $CURRENT_TIME
+Let's use the SDK to write a data frame of embeddings to the online store:
+```python
+store.write_to_online_store(feature_view_name='city_embeddings', df=df)
```
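Here `store` is a `FeatureStore` instance and `df` is a pandas DataFrame whose columns line up with the Feature View schema. A minimal sketch of that setup, with purely illustrative ids, chunk text, and embedding values (not part of this commit), might look like:

```python
from datetime import datetime, timezone

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Illustrative rows: one 384-dimensional embedding per document chunk,
# matching embedding_dim in feature_store.yaml and the Feature View schema.
now = datetime.now(timezone.utc)
df = pd.DataFrame(
    {
        "item_id": [0, 1],
        "author_id": ["wikipedia", "wikipedia"],
        "sentence_chunks": ["New York City is the most populous city ...", "Houston is the most populous city in Texas ..."],
        "vector": [[0.1] * 384, [0.2] * 384],
        "event_timestamp": [now, now],
        "created_timestamp": [now, now],
    }
)
```

In practice the embeddings would be produced by the same Sentence Transformer used for the query below rather than hard-coded values.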

### **Prepare a query embedding**
+During inference (e.g., when a user submits a chat message) we need to embed the input text. This can be thought of as a feature transformation of the input data. In this example, we'll do this with a small Sentence Transformer from Hugging Face.
+
```python
-from batch_score_documents import run_model, TOKENIZER, MODEL
+import torch
+import torch.nn.functional as F
+from feast import FeatureStore
+from pymilvus import MilvusClient, DataType, FieldSchema
from transformers import AutoTokenizer, AutoModel
-
-question = "the most populous city in the U.S. state of Texas?"
+from example_repo import city_embeddings_feature_view, item
+
+TOKENIZER = "sentence-transformers/all-MiniLM-L6-v2"
+MODEL = "sentence-transformers/all-MiniLM-L6-v2"
+
+def mean_pooling(model_output, attention_mask):
+    token_embeddings = model_output[
+        0
+    ]  # First element of model_output contains all token embeddings
+    input_mask_expanded = (
+        attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+    )
+    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
+        input_mask_expanded.sum(1), min=1e-9
+    )
+
+def run_model(sentences, tokenizer, model):
+    encoded_input = tokenizer(
+        sentences, padding=True, truncation=True, return_tensors="pt"
+    )
+    # Compute token embeddings
+    with torch.no_grad():
+        model_output = model(**encoded_input)
+
+    sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
+    sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
+    return sentence_embeddings
+
+question = "Which city has the largest population in New York?"

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
model = AutoModel.from_pretrained(MODEL)
-query_embedding = run_model(question, tokenizer, model)
-query = query_embedding.detach().cpu().numpy().tolist()[0]
+query_embedding = run_model(question, tokenizer, model).detach().cpu().numpy().tolist()[0]
```
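As a quick sanity check (not part of the commit), the query embedding should have the same dimensionality as the `embedding_dim` configured in `feature_store.yaml`; all-MiniLM-L6-v2 produces 384-dimensional sentence embeddings:

```python
# Must match online_store.embedding_dim (384) or the index cannot serve the query vector.
assert len(query_embedding) == 384
```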

-### **Retrieve the top 5 similar documents**
-First create a feature store instance, and use the `retrieve_online_documents` API to retrieve the top 5 similar documents to the specified query.
+### **Retrieve the top K similar documents**
+First create a feature store instance, and use the `retrieve_online_documents_v2` API to retrieve the top K documents most similar to the specified query.

```python
-from feast import FeatureStore
-store = FeatureStore(repo_path=".")
-features = store.retrieve_online_documents(
-    feature="city_embeddings:Embeddings",
-    query=query,
-    top_k=5
-).to_dict()
-
-def print_online_features(features):
-    for key, value in sorted(features.items()):
-        print(key, " : ", value)
-
-print_online_features(features)
+context_data = store.retrieve_online_documents_v2(
+    features=[
+        "city_embeddings:vector",
+        "city_embeddings:item_id",
+        "city_embeddings:state",
+        "city_embeddings:sentence_chunks",
+        "city_embeddings:wiki_summary",
+    ],
+    query=query_embedding,
+    top_k=3,
+    distance_metric='COSINE',
+).to_df()
```
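The call should return a pandas DataFrame with one row per retrieved chunk and the requested features as columns. A quick way to inspect what was retrieved before building the prompt (a minimal usage sketch, not part of the commit):

```python
# Peek at the retrieved context and the column names that came back.
print(context_data.head())
print(context_data.columns.tolist())
```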
+### **Generate the Response**
+Let's assume we have a base prompt and a function called `format_documents` that formats the retrieved documents, which we
+can then use to generate the response with OpenAI's chat completion API.
+```python
+FULL_PROMPT = format_documents(context_data, BASE_PROMPT)

-### Configuration
+import os
+from openai import OpenAI

-We offer [PGVector](https://github.com/pgvector/pgvector), [SQLite](https://github.com/asg017/sqlite-vec), [Elasticsearch](https://www.elastic.co) and [Qdrant](https://qdrant.tech/) as Online Store options for Vector Databases.
-
-#### Installation with SQLite
+client = OpenAI(
+    api_key=os.environ.get("OPENAI_API_KEY"),
+)
+response = client.chat.completions.create(
+    model="gpt-4o-mini",
+    messages=[
+        {"role": "system", "content": FULL_PROMPT},
+        {"role": "user", "content": question}
+    ],
+)

-If you are using `pyenv` to manage your Python versions, you can install the SQLite extension with the following command:
-```bash
-PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
-LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
-CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
-pyenv install 3.10.14
+# This prints the generated answer. See examples/rag/milvus-quickstart.ipynb for an end-to-end example.
+print('\n'.join([c.message.content for c in response.choices]))
```
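Neither `BASE_PROMPT` nor `format_documents` is defined in this commit; they are left to the reader. A rough sketch of what they could look like (the prompt wording and the reliance on a `sentence_chunks` column are assumptions):

```python
BASE_PROMPT = (
    "You are a helpful assistant. Answer the user's question using only the "
    "documents provided below. If the answer is not in the documents, say so.\n\n"
    "Documents:\n"
)

def format_documents(context_df, base_prompt):
    # Concatenate the retrieved chunks into a single numbered context block
    # appended to the base system prompt.
    chunks = context_df["sentence_chunks"].astype(str).tolist()
    numbered = "\n".join(f"{i + 1}. {chunk}" for i, chunk in enumerate(chunks))
    return base_prompt + numbered
```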
-And you can the Feast install package via:
+
+### Configuration and Installation
+
+We offer [Milvus](https://milvus.io/), [PGVector](https://github.com/pgvector/pgvector), [SQLite](https://github.com/asg017/sqlite-vec), [Elasticsearch](https://www.elastic.co) and [Qdrant](https://qdrant.tech/) as Online Store options for Vector Databases.
+
+Milvus offers a convenient local implementation for vector similarity search. To use Milvus, you can install the Feast package with the Milvus extra.
+
+#### Installation with Milvus

```bash
-pip install feast[sqlite_vec]
+pip install feast[milvus]
```
-
#### Installation with Elasticsearch

```bash
@@ -143,3 +209,17 @@ pip install feast[elasticsearch]
```bash
pip install feast[qdrant]
```
+#### Installation with SQLite
+
+If you are using `pyenv` to manage your Python versions, you can install the SQLite extension with the following command:
+```bash
+PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" \
+LDFLAGS="-L/opt/homebrew/opt/sqlite/lib" \
+CPPFLAGS="-I/opt/homebrew/opt/sqlite/include" \
+pyenv install 3.10.14
+```
+
+And you can install the Feast package via:
+```bash
+pip install feast[sqlite_vec]
+```

docs/reference/online-stores/milvus.md

Lines changed: 3 additions & 2 deletions
@@ -43,7 +43,7 @@ The set of functionality supported by online stores is described in detail [here
Below is a matrix indicating which functionality is supported by the Milvus online store.

| | Milvus |
-| :-------------------------------------------------------- |:-------|
+|:----------------------------------------------------------|:-------|
| write feature values to the online store | yes |
| read feature values from the online store | yes |
| update infrastructure (e.g. tables) in the online store | yes |
@@ -59,6 +59,7 @@ Below is a matrix indicating which functionality is supported by the Milvus onli
| support for deleting expired data | yes |
| collocated by feature view | no |
| collocated by feature service | no |
-| collocated by entity key | yes |
+| collocated by entity key | no |
+| vector similarity search | yes |

To compare this set of functionality against other online stores, please see the full [functionality matrix](overview.md#functionality-matrix).
