The project implements a Document Chat API that allows users to extract, store, and interact with document content using FastAPI, Redis, and LiteLLM. It provides PDF data extraction, summarization, and question answering based on document content.
- Extracts content from PDF files and web pages.
- Stores extracted data in AWS S3 for accessibility.
- Users can select a document and request a summary generated by an LLM.
- Users can ask questions about a document, and the system provides relevant answers.
- Uses Redis Streams for request-response handling.
- Implements a Redis worker that processes user requests asynchronously.
- Utilizes AWS S3 for file storage and retrieval.
- Supports multiple AI models (OpenAI, Gemini) via LiteLLM.
- Vedant Mane
- Abhinav Gangurde
- Yohan Markose
WE ATTEST THAT WE HAVEN’T USED ANY OTHER STUDENTS’ WORK IN OUR ASSIGNMENT AND ABIDE BY THE POLICIES LISTED IN THE STUDENT HANDBOOK
Application: Streamlit Deployment
Backend FastAPI: Google Cloud Run
Redis Endpoint: Redis Streams
Google Codelab: Codelab
Google Docs: Project Document
Video Walkthrough: Video
- Streamlit: Frontend Framework
- FastAPI: API Framework
- Docling: PDF Document Data Extraction Tool
- AWS S3: External Cloud Storage
- Redis Streams: Efficient queue-based request processing
- LiteLLM: Model-agnostic interface for LLMs
- Google Cloud Run: Backend Deployment
- User uploads a PDF or provides a URL.
- Content is extracted and stored in AWS S3.
- User selects a document and requests a summary or asks a question.
- The request is sent to Redis, where a worker processes it using LiteLLM.
- The response is pushed back to Redis and retrieved by FastAPI.
- Finally, FastAPI pushes the response to the Streamlit UI.
Users interact with the system through the Streamlit frontend or API calls, providing input in the form of:
- PDF content: Users can upload a PDF document for content extraction and processing.
- Summary: Users can request a summary of the uploaded document.
- Query: Users can ask questions related to the uploaded document.
The frontend, built using Streamlit, provides a user-friendly interface for:
- Uploading PDFs for content extraction.
- Selecting PDFs for data processing.
- Displaying summaries of the uploaded PDF content.
- Allowing users to ask questions based on the content of the uploaded document.
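As a rough illustration, the snippet below shows how a Streamlit page might collect these inputs and forward them to the backend; the backend URL, request shapes, and response handling are assumptions, not the project's exact code.

import requests
import streamlit as st

BACKEND_URL = "http://localhost:8080"  # assumed; point this at your deployed backend

st.title("Document Chat")

uploaded = st.file_uploader("Upload a PDF", type="pdf")
if uploaded is not None:
    # Forward the raw PDF bytes to the FastAPI backend for extraction
    resp = requests.post(
        f"{BACKEND_URL}/upload_pdf",
        files={"file": (uploaded.name, uploaded.getvalue(), "application/pdf")},
    )
    st.success(resp.json())

question = st.text_input("Ask a question about the document")
if st.button("Ask") and question:
    resp = requests.post(f"{BACKEND_URL}/ask-question", json={"question": question})
    st.write(resp.json())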
The FastAPI backend receives the user inputs and processes them:
- It handles PDF uploads, including content extraction and storage in S3.
- It manages document interactions with Redis Streams for asynchronous task processing.
- FastAPI routes include:
- /upload_pdf: To upload and process PDFs.
- /select_pdf: To select the PDF file for processing.
- /summarize: To summarize document content.
- /ask-question: To answer user queries based on the document content.
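For illustration, a route like /summarize might enqueue the task onto the Redis request stream rather than answering inline; the stream name, field names, and connection details below are assumptions, not the project's exact implementation.

import redis
from fastapi import FastAPI

app = FastAPI()
# Connection details come from the Redis Cloud setup described later
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

@app.post("/summarize")
def summarize(document_id: str):
    # Enqueue the task; the Redis worker consumes it from request_stream
    msg_id = r.xadd("request_stream", {"type": "summarize", "document_id": document_id})
    return {"status": "queued", "request_id": msg_id}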
Redis Streams are used for managing asynchronous task processing:
- When a request is received by FastAPI, it is added to the request stream.
- Requests are categorized and processed through the consumer group by the Redis worker threads.
- Once a request is processed, the response is added to the response stream.
This approach decouples the request handling from immediate response generation, enabling scalable and efficient task management.
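A minimal sketch of the retrieval side, assuming the stream names above: FastAPI can block on the response stream until the worker's reply arrives.

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Block for up to 5 seconds waiting for new entries on the response stream
entries = r.xread({"response_stream": "$"}, count=10, block=5000)
for stream_name, messages in entries:
    for msg_id, fields in messages:
        print(msg_id, fields)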
The Redis Worker listens to the request stream for new tasks. Once a task is detected:
- The worker processes the request, interacts with the LiteLLM model, and generates the appropriate response.
- After processing, the worker pushes the response to the response stream.
- The worker is designed to handle error logging and ensures any issues in task processing are managed appropriately.
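A sketch of such a worker loop, assuming a consumer group named "workers" and a hypothetical handle_request helper that wraps the LiteLLM call:

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

GROUP, CONSUMER = "workers", "worker-1"  # assumed names
try:
    r.xgroup_create("request_stream", GROUP, id="0", mkstream=True)
except redis.exceptions.ResponseError:
    pass  # consumer group already exists

while True:
    # Read one pending task for this consumer, blocking up to 5 seconds
    tasks = r.xreadgroup(GROUP, CONSUMER, {"request_stream": ">"}, count=1, block=5000)
    for _stream, messages in tasks or []:
        for msg_id, fields in messages:
            try:
                answer = handle_request(fields)  # hypothetical helper around litellm.completion
                r.xadd("response_stream", {"request_id": msg_id, "answer": answer})
            except Exception as exc:
                # Error-logging path mentioned above: report the failure instead of crashing
                r.xadd("response_stream", {"request_id": msg_id, "error": str(exc)})
            finally:
                r.xack("request_stream", GROUP, msg_id)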
The LiteLLM model is utilized to:
- Generate summaries of the document content.
- Provide answers to user questions based on the content of the uploaded PDF.
- The model consumes the request prompt and produces a response, which is returned via the Redis worker to the frontend or user.
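For example, a summarization call might look like the sketch below; the prompt wording and helper name are illustrative, not the project's exact code.

import litellm

def summarize_document(text: str, model: str = "gpt-3.5-turbo") -> str:
    # Prompt wording is illustrative, not the project's exact prompt
    response = litellm.completion(
        model=model,
        messages=[
            {"role": "system", "content": "You summarize documents concisely."},
            {"role": "user", "content": f"Summarize the following document:\n\n{text}"},
        ],
    )
    return response.choices[0].message.content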
The backend utilizes various components for processing PDFs:
- PDF Extraction: Using the pdf_docling_converter to extract and process data from uploaded PDFs (see the sketch after this list).
- S3 File Storage: PDFs and extracted content are stored in AWS S3 for long-term storage and retrieval.
- Redis: Redis handles message queuing and asynchronous task execution, connecting the FastAPI backend and the Redis worker for processing requests.
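A minimal sketch of what a docling-based converter such as pdf_docling_converter might look like; the function body is an assumption based on docling's public API, not the project's exact implementation.

from docling.document_converter import DocumentConverter

def pdf_docling_converter(source: str) -> str:
    # source can be a local file path or a URL to a PDF
    converter = DocumentConverter()
    result = converter.convert(source)
    # Export the parsed document as Markdown for storage in S3
    return result.document.export_to_markdown()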
- The FastAPI backend is deployed on Google Cloud Run.
- Redis is deployed on a cloud service to ensure high availability and scalability.
- The Streamlit frontend is hosted on Streamlit Cloud, providing users with easy access to interact with the system.
- AWS S3 is used for storing uploaded PDF files and extracted content, offering scalable storage solutions.
- The system is monitored continuously to ensure smooth operation, with regular updates and deployments as required.
Required Python Version 3.12.*
git clone https://github.com/BigDataIA-Spring2025-4/DAMG7245_Assignment04_Part01.git
cd DAMG7245_Assignment04_Part01
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Step 1: Create an AWS Account
- Go to AWS Signup and click Create an AWS Account.
- Follow the instructions to enter your email, password, and billing details.
- Verify your identity and choose a support plan.
Step 2: Log in to AWS Management Console
- Visit AWS Console and log in with your credentials.
- Search for S3 in the AWS services search bar and open it.
Step 3: Create an S3 Bucket
- Click Create bucket.
- Enter a unique Bucket name.
- Select a region closest to your users.
- Configure settings as needed (e.g., versioning, encryption).
- Click Create bucket to finalize.
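Once the bucket exists, the application can store and retrieve files with boto3; the bucket and key names below are examples.

import boto3

# Credentials are read from environment variables or ~/.aws/credentials
s3 = boto3.client("s3")

# Upload a PDF and fetch previously extracted content (example names)
s3.upload_file("document.pdf", "your-bucket-name", "pdfs/document.pdf")
s3.download_file("your-bucket-name", "extracted/document.md", "document.md")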
pip install litellm
LiteLLM supports multiple LLM providers like OpenAI, Azure, and Gemini. Ensure you have API keys for the respective services.
Create a .env file and add the following:
OPENAI_API_KEY=your_openai_api_key
GEMINI_API_KEY=your_gemini_api_key
XAI_API_KEY=your_grok_api_key
HUGGINGFACE_API_KEY=your_hugging_face_api_key
Run the following snippet to ensure LiteLLM is installed correctly:
import litellm

# LiteLLM reads provider API keys (e.g., OPENAI_API_KEY) from the environment
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response)
Before starting the setup, ensure the following prerequisites are met:
- Redis Cloud Account: Create a free or paid account on Redis Cloud.
- Redis Cloud Database: Create a Redis database on Redis Cloud.
- Python 3.7+: Ensure Python is installed on your system.
- Redis Client Library: Install redis-py and other necessary libraries for the backend and worker.
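For example (python-dotenv is assumed here for loading the .env file created below):

pip install redis python-dotenv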
- Sign In to Redis Cloud: Go to Redis Cloud and sign in or create a new account.
- Create a New Database:
  - Once signed in, navigate to Databases and click Create a Database.
  - Select the region closest to your application.
  - Choose the Redis Essentials plan (or a paid plan based on your usage).
  - Click Create Database.
- Note the Connection Details:
  - After the database is created, Redis Cloud will provide the connection details (host, port, username, password). You will need these details to connect your application to the Redis Cloud instance.
Set the following environment variables for connecting to Redis Cloud in your application. Create a .env file in your project directory and add the account configuration details.
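For example (the variable names are illustrative; use whichever names your application reads):

REDIS_HOST=your_redis_cloud_host
REDIS_PORT=your_redis_cloud_port
REDIS_USERNAME=default
REDIS_PASSWORD=your_redis_cloud_password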
You can monitor the Redis Streams by using the XREAD command in the Redis CLI to view the contents of the request and response streams.
To view the request stream:
redis-cli XREAD COUNT 10 STREAMS request_stream 0
To view the response stream:
redis-cli XREAD COUNT 10 STREAMS response_stream 0
Step 1: Download and Install Google Cloud SDK
- Visit the Google Cloud SDK documentation for platform-specific installation instructions.
- Download the installer for your operating system (Windows, macOS, or Linux).
- Follow the installation steps provided for your system.
Step 2: Initialize Google Cloud SDK
- Open a terminal or command prompt.
- Run gcloud init to begin the setup process.
- Follow the prompts to log in with your Google account and select a project.
Step 3: Verify Installation
- Run gcloud --version to confirm installation.
- Use gcloud config list to check the active configuration.
- Build the Docker Image
# Build and tag your image (make sure you're in the project directory)
docker build --platform=linux/amd64 --no-cache -t gcr.io/<YOUR_PROJECT_ID>/fastapi-app .
- Test Locally (Optional but Recommended)
# Run the container locally
docker run -p 8080:8080 gcr.io/<YOUR_PROJECT_ID>/fastapi-app
# For Managing Environment Variables
docker run --env-file .env -p 8080:8080 gcr.io/<YOUR_PROJECT_ID>/fastapi-app
Visit http://localhost:8080/docs to verify the API works.
- Push to Google Container Registry
# Push the image
docker push gcr.io/<YOUR_PROJECT_ID>/fastapi-app
- Deploy to Cloud Run
gcloud run deploy fastapi-service \
--image gcr.io/<YOUR_PROJECT_ID>/fastapi-app \
--platform managed \
--region us-central1 \
--allow-unauthenticated
- Get your Service URL
gcloud run services describe fastapi-service \
--platform managed \
--region <REGION> \
--format 'value(status.url)'
- Check Application Logs
gcloud run services logs read fastapi-service --region <REGION>