A small and extensible library for creating RAG (Retrieval Augmented Generation) workflows in React Native.
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [API Reference](#api-reference)
- [Using Custom Models](#using-custom-models)
- [Example App](#example-app)
- [Packages](#packages)
- [Contributing](#contributing)
- [License](#license)
## Features

- **Modular**: Use only the components you need. Choose from `LLM`, `Embeddings`, `VectorStore`, and `TextSplitter`.
- **Extensible**: Create your own components by implementing the `LLM`, `Embeddings`, `VectorStore`, and `TextSplitter` interfaces.
- **Multiple Integration Options**: Whether you prefer a simple hook (`useRAG`), a powerful class (`RAG`), or direct component interaction, the library adapts to your needs.
- **On-device Inference**: Powered by `@react-native-rag/executorch`, allowing for private and efficient model execution directly on the user's device.
- **Vector Store Persistence**: Includes support for SQLite with `@react-native-rag/op-sqlite` to save and manage vector stores locally.
- **Semantic Search Ready**: Easily implement powerful semantic search in your app by using the `VectorStore` and `Embeddings` components directly.
## Installation

```sh
npm install react-native-rag
```

You will also need an embeddings model and a large language model. We recommend using `@react-native-rag/executorch` for on-device inference. To use it, install the following packages:

```sh
npm install @react-native-rag/executorch react-native-executorch
```

For persisting vector stores, you can use `@react-native-rag/op-sqlite`:

```sh
npm install @react-native-rag/op-sqlite
```
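For orientation, here is a minimal sketch of swapping the in-memory store for a persistent one. It assumes `@react-native-rag/op-sqlite` exports an `OPSQLiteVectorStore` class that accepts an `embeddings` instance and a database `name`; check that package's documentation for the exact API.

```tsx
import { OPSQLiteVectorStore } from '@react-native-rag/op-sqlite';
import { ExecuTorchEmbeddings } from '@react-native-rag/executorch';
import {
  ALL_MINILM_L6_V2,
  ALL_MINILM_L6_V2_TOKENIZER,
} from 'react-native-executorch';

// Hypothetical usage: a SQLite-backed store that persists documents
// and their embeddings across app restarts.
const vectorStore = new OPSQLiteVectorStore({
  name: 'rag-store', // assumed option; consult the package docs
  embeddings: new ExecuTorchEmbeddings({
    modelSource: ALL_MINILM_L6_V2,
    tokenizerSource: ALL_MINILM_L6_V2_TOKENIZER,
  }),
});
```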
## Usage

We offer three ways to integrate RAG, depending on your needs.

### Using the `useRAG` hook

The easiest way to get started. Good for simple use cases where you want to quickly set up RAG.
```tsx
import React from 'react';
import { Text } from 'react-native';
import { useRAG, MemoryVectorStore } from 'react-native-rag';
import {
  ALL_MINILM_L6_V2,
  ALL_MINILM_L6_V2_TOKENIZER,
  LLAMA3_2_1B_QLORA,
  LLAMA3_2_1B_TOKENIZER,
  LLAMA3_2_TOKENIZER_CONFIG,
} from 'react-native-executorch';
import {
  ExecuTorchEmbeddings,
  ExecuTorchLLM,
} from '@react-native-rag/executorch';

const vectorStore = new MemoryVectorStore({
  embeddings: new ExecuTorchEmbeddings({
    modelSource: ALL_MINILM_L6_V2,
    tokenizerSource: ALL_MINILM_L6_V2_TOKENIZER,
  }),
});

const llm = new ExecuTorchLLM({
  modelSource: LLAMA3_2_1B_QLORA,
  tokenizerSource: LLAMA3_2_1B_TOKENIZER,
  tokenizerConfigSource: LLAMA3_2_TOKENIZER_CONFIG,
});

const App = () => {
  const rag = useRAG({ vectorStore, llm });

  return <Text>{rag.response}</Text>;
};

export default App;
```
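Once `rag.isReady` is true, you can index documents and ask questions through the functions the hook returns (documented in the API Reference below). A minimal sketch:

```tsx
// Inside the component above, once `rag.isReady` is true:
const askAboutDocs = async () => {
  // Split, embed, and store a document in the vector store.
  await rag.splitAddDocument(
    'React Native RAG runs retrieval augmented generation fully on-device.'
  );

  // Ask a question; the most relevant chunks are retrieved and used to
  // augment the prompt. The streamed answer is reflected in `rag.response`.
  await rag.generate('What does React Native RAG do?');
};
```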
### Using the `RAG` class

For more control over components and configuration.
```tsx
import React, { useEffect, useState } from 'react';
import { Text } from 'react-native';
import { RAG, MemoryVectorStore } from 'react-native-rag';
import {
  ExecuTorchEmbeddings,
  ExecuTorchLLM,
} from '@react-native-rag/executorch';
import {
  ALL_MINILM_L6_V2,
  ALL_MINILM_L6_V2_TOKENIZER,
  LLAMA3_2_1B_QLORA,
  LLAMA3_2_1B_TOKENIZER,
  LLAMA3_2_TOKENIZER_CONFIG,
} from 'react-native-executorch';

const App = () => {
  const [rag, setRag] = useState<RAG | null>(null);
  const [response, setResponse] = useState<string | null>(null);

  useEffect(() => {
    const initializeRAG = async () => {
      const embeddings = new ExecuTorchEmbeddings({
        modelSource: ALL_MINILM_L6_V2,
        tokenizerSource: ALL_MINILM_L6_V2_TOKENIZER,
      });

      const llm = new ExecuTorchLLM({
        modelSource: LLAMA3_2_1B_QLORA,
        tokenizerSource: LLAMA3_2_1B_TOKENIZER,
        tokenizerConfigSource: LLAMA3_2_TOKENIZER_CONFIG,
        responseCallback: setResponse,
      });

      const vectorStore = new MemoryVectorStore({ embeddings });

      const ragInstance = new RAG({ llm, vectorStore });

      await ragInstance.load();
      setRag(ragInstance);
    };

    initializeRAG();
  }, []);

  return <Text>{response}</Text>;
};

export default App;
```
### Using components directly

For advanced use cases requiring fine-grained control. This is also the recommended approach if you want to implement semantic search in your app: use the `VectorStore` and `Embeddings` classes directly.
```tsx
import React, { useEffect, useState } from 'react';
import { Text } from 'react-native';
import { MemoryVectorStore } from 'react-native-rag';
import {
  ExecuTorchEmbeddings,
  ExecuTorchLLM,
} from '@react-native-rag/executorch';
import {
  ALL_MINILM_L6_V2,
  ALL_MINILM_L6_V2_TOKENIZER,
  LLAMA3_2_1B_QLORA,
  LLAMA3_2_1B_TOKENIZER,
  LLAMA3_2_TOKENIZER_CONFIG,
} from 'react-native-executorch';

const App = () => {
  const [embeddings, setEmbeddings] = useState<ExecuTorchEmbeddings | null>(null);
  const [llm, setLLM] = useState<ExecuTorchLLM | null>(null);
  const [vectorStore, setVectorStore] = useState<MemoryVectorStore | null>(null);
  const [response, setResponse] = useState<string | null>(null);

  useEffect(() => {
    const initializeRAG = async () => {
      // Instantiate and load the embeddings model.
      // NOTE: Calling load() on the vector store loads its embeddings model
      // automatically, so loading it separately is not necessary in this case.
      const embeddings = await new ExecuTorchEmbeddings({
        modelSource: ALL_MINILM_L6_V2,
        tokenizerSource: ALL_MINILM_L6_V2_TOKENIZER,
      }).load();

      // Instantiate and load the large language model.
      const llm = await new ExecuTorchLLM({
        modelSource: LLAMA3_2_1B_QLORA,
        tokenizerSource: LLAMA3_2_1B_TOKENIZER,
        tokenizerConfigSource: LLAMA3_2_TOKENIZER_CONFIG,
        responseCallback: setResponse,
      }).load();

      // Instantiate and initialize the vector store.
      const vectorStore = await new MemoryVectorStore({ embeddings }).load();

      setEmbeddings(embeddings);
      setLLM(llm);
      setVectorStore(vectorStore);
    };

    initializeRAG();
  }, []);

  return <Text>{response}</Text>;
};

export default App;
```
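With the components in hand, semantic search is a `similaritySearch` call away. A minimal sketch, assuming the store above has been loaded and populated:

```tsx
// Hypothetical helper: index a few documents, then search them.
const searchExample = async (vectorStore: MemoryVectorStore) => {
  await vectorStore.add('The capital of France is Paris.');
  await vectorStore.add('React Native lets you build mobile apps with JavaScript.');

  // Returns the k most similar documents with their similarity scores.
  const results = await vectorStore.similaritySearch(
    'What is the capital of France?',
    1
  );
  console.log(results[0]?.content, results[0]?.similarity);
};
```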
## API Reference

### `useRAG`

A React hook for Retrieval Augmented Generation (RAG). Manages the RAG system lifecycle: loading, unloading, generation, and document storage.
Parameters:

- `params`: An object containing:
  - `vectorStore`: An instance of a class that implements the `VectorStore` interface.
  - `llm`: An instance of a class that implements the `LLM` interface.
  - `preventLoad` (optional): A boolean to defer loading the RAG system.

Returns: An object with the following properties:

- `response` (`string`): The current generated text from the LLM.
- `isReady` (`boolean`): True if the RAG system (vector store and LLM) is loaded.
- `isGenerating` (`boolean`): True if the LLM is currently generating a response.
- `isStoring` (`boolean`): True if a document operation (add, update, delete) is in progress.
- `error` (`string | null`): The last error message, if any.
- `generate`: A function to generate text. See `RAG.generate()` for details.
- `interrupt`: A function to stop the current generation. See `RAG.interrupt()` for details.
- `splitAddDocument`: A function to split and add a document. See `RAG.splitAddDocument()` for details.
- `addDocument`: Adds a document. See `RAG.addDocument()` for details.
- `updateDocument`: Updates a document. See `RAG.updateDocument()` for details.
- `deleteDocument`: Deletes a document. See `RAG.deleteDocument()` for details.
### `RAG`

The core class for managing the RAG workflow.

`constructor(params: RAGParams)`

- `params`: An object containing:
  - `vectorStore`: An instance that implements the `VectorStore` interface.
  - `llm`: An instance that implements the `LLM` interface.
Methods:

- `async load(): Promise<this>`: Initializes the vector store and loads the LLM.
- `async unload(): Promise<void>`: Unloads the vector store and LLM.
- `async generate(input: Message[] | string, options?: { augmentedGeneration?: boolean; k?: number; questionGenerator?: Function; promptGenerator?: Function; callback?: (token: string) => void }): Promise<string>`: Generates a response (see the usage sketch after this list).
  - `input` (`Message[] | string`): A string or an array of `Message` objects.
  - `options` (object, optional): Generation options.
    - `augmentedGeneration` (`boolean`, optional): If `true` (default), retrieves context from the vector store to augment the prompt.
    - `k` (`number`, optional): Number of documents to retrieve (default: `3`).
    - `questionGenerator` (`function`, optional): Custom question generator.
    - `promptGenerator` (`function`, optional): Custom prompt generator.
    - `callback` (`function`, optional): A function that receives tokens as they are generated.
- `async splitAddDocument(document: string, metadataGenerator?: (chunks: string[]) => Record<string, any>[], textSplitter?: TextSplitter): Promise<string[]>`: Splits a document into chunks and adds them to the vector store.
- `async addDocument(document: string, metadata?: Record<string, any>): Promise<string>`: Adds a single document to the vector store.
- `async updateDocument(id: string, document?: string, metadata?: Record<string, any>): Promise<void>`: Updates a document in the vector store.
- `async deleteDocument(id: string): Promise<void>`: Deletes a document from the vector store.
- `async interrupt(): Promise<void>`: Interrupts the ongoing LLM generation.
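A minimal sketch of a typical call sequence, assuming `rag` is a loaded `RAG` instance like the one constructed in the Usage section:

```ts
// Index a document, then run an augmented generation with streaming output.
const run = async (rag: RAG) => {
  const ids = await rag.splitAddDocument('Long document text to index...');
  console.log(`Stored ${ids.length} chunks`);

  const answer = await rag.generate('Summarize the document.', {
    k: 3, // retrieve the 3 most relevant chunks
    callback: (token) => console.log(token), // stream tokens as they arrive
  });
  console.log(answer);
};
```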
### `MemoryVectorStore`

An in-memory implementation of the `VectorStore` interface. Useful for development and testing without persistent storage, or when you don't need to save documents across app restarts.

`constructor(params: { embeddings: Embeddings })`

- `params`: Requires an `embeddings` instance to generate vectors for documents.

Methods:

- `async load(): Promise<this>`: Loads the embeddings model.
- `async unload(): Promise<void>`: Unloads the embeddings model.
### Interfaces

These interfaces define the contracts for creating your own custom components.

`Embeddings`:

- `load: () => Promise<this>`: Loads the embedding model.
- `unload: () => Promise<void>`: Unloads the model.
- `embed: (text: string) => Promise<number[]>`: Generates an embedding for a given text.

`LLM`:

- `load: () => Promise<this>`: Loads the language model.
- `interrupt: () => Promise<void>`: Stops the current text generation.
- `unload: () => Promise<void>`: Unloads the model.
- `generate: (messages: Message[], callback: (token: string) => void) => Promise<string>`: Generates a response from a list of messages, streaming tokens to the callback.

`VectorStore`:

- `load: () => Promise<this>`: Initializes the vector store.
- `unload: () => Promise<void>`: Unloads the vector store and releases resources.
- `add(document: string, metadata?: Record<string, any>): Promise<string>`: Adds a document.
- `update(id: string, document?: string, metadata?: Record<string, any>): Promise<void>`: Updates a document.
- `delete(id: string): Promise<void>`: Deletes a document.
- `similaritySearch(query: string, k?: number): Promise<{ id: string; content: string; ... }[]>`: Searches for the `k` most similar documents.

`TextSplitter`:

- `splitText: (text: string) => Promise<string[]>`: Splits text into an array of chunks.
### Text Splitters

The library provides wrappers around common `langchain` text splitters. All splitters are initialized with `{ chunkSize: number, chunkOverlap: number }` (see the sketch after this list).

- `RecursiveCharacterTextSplitter`: Splits text recursively by different characters. (Default in the `RAG` class.)
- `CharacterTextSplitter`: Splits text by a fixed character count.
- `TokenTextSplitter`: Splits text by token count.
- `MarkdownTextSplitter`: Splits text while preserving Markdown structure.
- `LatexTextSplitter`: Splits text while preserving LaTeX structure.
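A minimal sketch of splitting a Markdown document. The import path is an assumption (the splitters may be exported from `react-native-rag`; check the package exports):

```ts
import { MarkdownTextSplitter } from 'react-native-rag';

const splitDocument = async (markdown: string): Promise<string[]> => {
  // ~500-character chunks with 50 characters of overlap between neighbors,
  // preserving Markdown structure (headings, lists) where possible.
  const splitter = new MarkdownTextSplitter({ chunkSize: 500, chunkOverlap: 50 });
  return splitter.splitText(markdown);
};
```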
### Utilities

Utility functions (see the sketch after this list):

- `uuidv4(): string`: Generates a compliant version 4 UUID. Not cryptographically secure.
- `cosine(a: number[], b: number[]): number`: Calculates the cosine similarity between two vectors.
- `dotProduct(a: number[], b: number[]): number`: Calculates the dot product of two vectors.
- `magnitude(a: number[]): number`: Calculates the Euclidean magnitude of a vector.
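Cosine similarity is the dot product of two vectors divided by the product of their magnitudes. A tiny sketch, assuming these helpers are exported from `react-native-rag`:

```ts
import { cosine, dotProduct, magnitude } from 'react-native-rag';

const a = [1, 0, 1];
const b = [1, 1, 0];

// cosine(a, b) === dotProduct(a, b) / (magnitude(a) * magnitude(b))
console.log(dotProduct(a, b)); // 1
console.log(magnitude(a));     // ~1.414 (the square root of 2)
console.log(cosine(a, b));     // 0.5, i.e. 1 / (√2 · √2)
```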
## Using Custom Models

Bring your own models by creating classes that implement the `LLM`, `Embeddings`, `VectorStore`, and `TextSplitter` interfaces. This allows you to use any model or service that fits your needs. A sketch of a custom implementation follows the interface definitions below.
```ts
interface Embeddings {
  load: () => Promise<this>;
  unload: () => Promise<void>;
  embed: (text: string) => Promise<number[]>;
}

interface LLM {
  load: () => Promise<this>;
  interrupt: () => Promise<void>;
  unload: () => Promise<void>;
  generate: (
    messages: Message[],
    callback: (token: string) => void
  ) => Promise<string>;
}

interface TextSplitter {
  splitText: (text: string) => Promise<string[]>;
}

interface VectorStore {
  load: () => Promise<this>;
  unload: () => Promise<void>;
  add(document: string, metadata?: Record<string, any>): Promise<string>;
  update(
    id: string,
    document?: string,
    metadata?: Record<string, any>
  ): Promise<void>;
  delete(id: string): Promise<void>;
  similaritySearch(
    query: string,
    k?: number
  ): Promise<
    {
      id: string;
      content: string;
      metadata?: Record<string, any>;
      similarity: number;
    }[]
  >;
}
```
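As a concrete illustration, here is a minimal sketch of a custom `Embeddings` implementation backed by a remote API. The endpoint URL and response shape are hypothetical; any service that returns a vector of numbers would work:

```ts
class RemoteEmbeddings implements Embeddings {
  // Nothing to load for a stateless HTTP client; return this for chaining.
  async load(): Promise<this> {
    return this;
  }

  async unload(): Promise<void> {
    // No resources to release.
  }

  async embed(text: string): Promise<number[]> {
    // Hypothetical endpoint: POST the text, receive { embedding: number[] }.
    const res = await fetch('https://example.com/api/embed', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text }),
    });
    const { embedding } = await res.json();
    return embedding;
  }
}

// Works anywhere an Embeddings instance is expected, e.g.:
// const vectorStore = new MemoryVectorStore({ embeddings: new RemoteEmbeddings() });
```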
## Example App

For a complete example that demonstrates how to use the library, check out the example app in the repository.
## Packages

- `@react-native-rag/executorch`: On-device inference with `react-native-executorch`.
- `@react-native-rag/op-sqlite`: Persisting vector stores using SQLite.
## Contributing

Contributions are welcome! See the contributing guide to learn about the development workflow.

## License

MIT