# RAG-ChatBot

RAG-ChatBot is a Retrieval-Augmented Generation (RAG) system that combines document ingestion, vector-based retrieval, and large language models (LLMs) to answer questions grounded in the ingested documents. The system supports multiple LLM providers, including OpenAI, Anthropic, and Google Gemini, and exposes a React-based frontend for querying.

## Features
- Document Ingestion: Supports ingestion of various file formats, including PDFs, DOCX, CSVs, images, and videos.
- Vector-Based Retrieval: Uses FAISS for efficient similarity search on document embeddings.
- LLM Integration: Supports multiple LLM providers (OpenAI, Anthropic, Google Gemini, and custom APIs).
- React Frontend: Interactive chat interface with settings and source display.
- REST API: Exposes endpoints for querying, ingestion, and system status.
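
As a rough illustration of the retrieval step, here is a minimal sketch using FAISS and Sentence Transformers (the libraries named in the tech stack below). The model name and example chunks are placeholder assumptions, not the repository's actual configuration:

```python
# Minimal sketch of vector-based retrieval; the model name and chunks
# are illustrative assumptions, not the repo's actual configuration.
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

chunks = [
    "FAISS is a library for efficient similarity search.",
    "RAG systems retrieve relevant context before calling an LLM.",
]

# Embed the chunks and index them in a flat L2 index.
embeddings = model.encode(chunks)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Embed the question and fetch the nearest chunk as context.
query_vec = model.encode(["How does retrieval work?"])
distances, ids = index.search(query_vec, 1)
print(chunks[ids[0][0]])
```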
## Project Structure

```
.
├── .gitignore
├── README.md
├── requirements.txt
├── test.py
├── data/
│   ├── csv/
│   ├── docx/
│   ├── images/
│   └── pdfs/
├── frontend/
│   ├── package.json
│   ├── public/
│   └── src/
└── src/
    ├── __init__.py
    ├── api.py
    ├── document_processor.py
    ├── embeddings.py
    ├── gemini_interface.py
    ├── llm_interface.py
    ├── main_application.py
    ├── rag_system.py
    ├── text_processor.py
    └── vector_store.py
```
## Backend Setup

1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/RAG-ChatBot.git
   cd RAG-ChatBot
   ```

2. Install the Python dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set the environment variables for the providers you plan to use:

   - `OPENAI_API_KEY`: your OpenAI API key.
   - `ANTHROPIC_API_KEY`: your Anthropic API key.
   - `GEMINI_API_KEY`: your Google Gemini API key.
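
   Before starting the backend, you can optionally confirm the keys are visible to Python; this check is a hypothetical convenience, not part of the repository:

   ```python
   # Hypothetical sanity check: confirm the provider API keys are set.
   import os

   for var in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY"):
       print(f"{var}: {'set' if os.environ.get(var) else 'MISSING'}")
   ```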
4. Run the backend server:

   ```bash
   python src/api.py
   ```
## Frontend Setup

1. Navigate to the `frontend` directory:

   ```bash
   cd frontend
   ```

2. Install the Node.js dependencies:

   ```bash
   npm install
   ```

3. Start the development server:

   ```bash
   npm start
   ```
The frontend will be available at http://localhost:3000.
## Usage

### Ingesting Documents

- Enter a file or directory path in the "Ingest Documents" panel on the frontend.
- Click the "Ingest" button to process the documents and store them in the vector store.
### Querying

- Type a question in the input box on the frontend.
- Click "Send" to query the system. The response appears in the chat interface.
- Click the "Settings" button in the header to configure:
- LLM provider (OpenAI, Anthropic, Gemini, or custom API).
- API key and model name.
- Whether to include sources in responses.
## API Endpoints

- `POST /api/query`: query the system with a question.
- `POST /api/ingest`: ingest a file or directory into the vector store.
- `GET /api/status`: report system status.
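
A quick way to confirm the backend is reachable is to poll the status endpoint; the port is again an assumption:

```python
# Hypothetical status check; port 5000 is an assumption.
import requests

print(requests.get("http://localhost:5000/api/status").json())
```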
## Supported Formats

- Text documents: PDF, DOCX, CSV
- Images: JPG, PNG, BMP, TIFF
- Videos: MP4, AVI, MKV, and other common containers
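
How these formats are routed is internal to `document_processor.py`; purely as an illustration of the idea, a hypothetical dispatch by file extension might look like this (the handler names are placeholders, not functions from the repository):

```python
# Hypothetical extension-based dispatch; the handler names are
# placeholders and do not correspond to real functions in the repo.
from pathlib import Path

HANDLERS = {
    ".pdf": "extract_pdf_text",
    ".docx": "extract_docx_text",
    ".csv": "extract_csv_rows",
    ".png": "extract_image_text",
    ".mp4": "extract_video_frames",
}

def pick_handler(path: str) -> str:
    suffix = Path(path).suffix.lower()
    if suffix not in HANDLERS:
        raise ValueError(f"Unsupported format: {suffix}")
    return HANDLERS[suffix]

print(pick_handler("data/pdfs/report.pdf"))  # -> extract_pdf_text
```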
## Tech Stack

- Backend: Python, Flask, FAISS, Sentence Transformers
- Frontend: React (Create React App)
- LLM providers: OpenAI, Anthropic, Google Gemini
- Vector search: FAISS
## Contributing

1. Fork the repository.
2. Create a new branch:

   ```bash
   git checkout -b feature-name
   ```

3. Commit your changes:

   ```bash
   git commit -m "Add feature-name"
   ```

4. Push to your branch:

   ```bash
   git push origin feature-name
   ```

5. Open a pull request.
## License

This project is licensed under the MIT License. See the LICENSE file for details.