Chat with Multiple PDFs: chat with multiple PDFs by uploading files, extracting their text, and asking questions to get intelligent, context-aware answers.

vishwajitvm/RAG-ChatBot

📄 Chat with Multiple PDFs

This Streamlit application lets you chat with multiple PDFs: upload the files, let the app extract their text, and ask questions to get context-aware answers.


🚀 How it works

1️⃣ PDF Upload
Upload one or more PDF files through the sidebar.

2️⃣ Extract Text
All text is extracted from the uploaded PDFs using PyPDF2.
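The extraction step can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function name `extract_text_from_pdfs` and the `reader_factory` parameter are hypothetical, and the sketch assumes PyPDF2's `PdfReader` with its `pages` attribute and `extract_text()` method.

```python
from typing import Callable, Iterable, Optional


def extract_text_from_pdfs(
    pdf_files: Iterable, reader_factory: Optional[Callable] = None
) -> str:
    """Concatenate the text of every page of every uploaded PDF."""
    if reader_factory is None:
        # Imported lazily so the function can be exercised without PyPDF2.
        from PyPDF2 import PdfReader

        reader_factory = PdfReader

    parts = []
    for pdf_file in pdf_files:
        reader = reader_factory(pdf_file)
        for page in reader.pages:
            # extract_text() may yield None or "" for image-only pages.
            parts.append(page.extract_text() or "")
    return "\n".join(parts)
```

Scanned PDFs without a text layer would produce empty strings here; handling them would need OCR, which is outside what this README describes.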

3️⃣ Chunk Text
The extracted text is split into smaller overlapping chunks to maintain context and prepare for embedding.
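To make the idea of overlapping chunks concrete, here is a plain-Python sliding-window sketch. The app itself uses LangChain for text splitting, so this is a simplified stand-in; the function name `chunk_text` and the default sizes (1000 characters with 200 of overlap) are assumptions chosen for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Because each window repeats the tail of the previous one, a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.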

4️⃣ Create Embeddings & Save Vector DB
Each chunk is converted into an embedding vector using Google Generative AI Embeddings and saved into a FAISS vector database for fast semantic search.

5️⃣ Ask Your Question
When you ask a question, it is also converted into an embedding vector.

6️⃣ Semantic Search
The app finds the most relevant chunks from your PDFs using similarity search in FAISS.
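Conceptually, the search compares the question vector against every stored chunk vector and keeps the closest matches. The sketch below does this by brute force with cosine similarity in plain Python. It is a swapped-in illustration rather than FAISS itself, whose index may rank by a different metric (for example L2 distance) and is far faster on large collections. The names `cosine_similarity` and `top_k_chunks` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    chunks: Sequence[str],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk)
        for vec, chunk in zip(chunk_vecs, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```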

7️⃣ Generate Answer with LLM
The selected chunks and your question are passed to the Gemini language model (Google AI) to generate a detailed answer.
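One way to picture what "passed to the model" means is a single prompt string that places the retrieved chunks ahead of the question. The sketch below is an assumption about the general shape only: the function name `build_prompt` and the prompt wording are hypothetical, and the actual app builds its QA chain with LangChain.

```python
from typing import List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Combine retrieved chunks and the user's question into one prompt string."""
    # Skip empty chunks and separate the rest with blank lines.
    context = "\n\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

Instructing the model to rely only on the supplied context is a common way to reduce answers that are not grounded in the uploaded PDFs.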

8️⃣ Display Answer & Save Conversation
The answer is shown in a clean, chat-style UI. All interactions are saved and can be downloaded as a CSV file.
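The CSV export can be sketched with the standard library alone. The app uses Pandas for this, so the `csv` module here is a stand-in, and the column names (`question`, `answer`, `timestamp`) and the function name `conversation_to_csv` are hypothetical.

```python
import csv
import io
from typing import Dict, List

# Hypothetical column layout for one row of conversation history.
FIELDNAMES = ["question", "answer", "timestamp"]


def conversation_to_csv(history: List[Dict[str, str]]) -> str:
    """Serialize the conversation history to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(history)
    return buffer.getvalue()
```

In a Streamlit app, the returned string could be handed to `st.download_button` as its `data` argument to offer the file for download.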


🗺️ Workflow Summary

PDF Upload → Extract Text → Chunk Text → Create Embeddings → Save Vector DB (FAISS)
↓
User Question → Embed Question → Search Similar Chunks → Feed to LLM → Generate Answer
↓
Display Answer & Save Conversation


⚙️ Technologies Used

  • Streamlit: For the interactive web UI.
  • PyPDF2: To extract text from PDFs.
  • LangChain: For text splitting, embeddings, and building QA chains.
  • Google Generative AI Embeddings: To create semantic vectors for text and questions.
  • FAISS: For fast vector-based similarity search.
  • Pandas: To handle conversation history and export CSV.

💡 Author

Developed by Vishwajit VM
📧 [email protected]


🟒 Start chatting with your PDFs today!
