This Streamlit application lets you chat with multiple PDFs by uploading files, extracting their text, and asking questions to get intelligent, context-aware answers.
1️⃣ PDF Upload
Upload one or more PDF files through the sidebar.
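A minimal sketch of the upload step, assuming Streamlit's built-in `st.sidebar.file_uploader` widget (the label text is illustrative):

```python
import streamlit as st

# Sidebar widget that accepts one or more PDF files in a single upload.
pdf_docs = st.sidebar.file_uploader(
    "Upload your PDF files",
    type=["pdf"],
    accept_multiple_files=True,
)
```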
2️⃣ Extract Text
All text is extracted from the uploaded PDFs using PyPDF2.
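A sketch of the extraction step using PyPDF2's `PdfReader`; the helper name `get_pdf_text` is illustrative:

```python
from PyPDF2 import PdfReader

def get_pdf_text(pdf_docs):
    """Concatenate the text of every page of every uploaded PDF."""
    text = ""
    for pdf in pdf_docs:
        reader = PdfReader(pdf)
        for page in reader.pages:
            page_text = page.extract_text()
            if page_text:  # skip pages with no extractable text (e.g. scanned images)
                text += page_text
    return text
```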
3️⃣ Chunk Text
The extracted text is split into smaller overlapping chunks to maintain context and prepare for embedding.
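A sketch of the chunking step using LangChain's `RecursiveCharacterTextSplitter`; the chunk size and overlap values are illustrative, not the app's exact settings:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

def get_text_chunks(text):
    # Overlapping chunks preserve context that would otherwise be lost
    # at chunk boundaries.
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=10000,    # characters per chunk (illustrative)
        chunk_overlap=1000,  # characters shared between consecutive chunks (illustrative)
    )
    return splitter.split_text(text)
```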
4️⃣ Create Embeddings & Save Vector DB
Each chunk is converted into an embedding vector using Google Generative AI Embeddings and saved into a FAISS vector database for fast semantic search.
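A sketch of the indexing step; the embedding model name `models/embedding-001` and the local index folder `faiss_index` are assumptions:

```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

def build_vector_store(text_chunks):
    # Embed every chunk, index the vectors, and persist the index so
    # questions can be answered without re-processing the PDFs.
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # assumed model
    vector_store = FAISS.from_texts(text_chunks, embedding=embeddings)
    vector_store.save_local("faiss_index")  # assumed folder name
    return vector_store
```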
5️⃣ Ask Your Question
When you ask a question, it is also converted into an embedding vector.
6️⃣ Semantic Search
The app finds the most relevant chunks from your PDFs using similarity search in FAISS.
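A sketch covering steps 5 and 6: the question is embedded with the same embedding model and FAISS returns the most similar chunks. The folder name and `k` value are assumptions; `similarity_search` embeds the query internally, and recent langchain_community versions require the `allow_dangerous_deserialization` flag when reloading a pickled local index:

```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

def search_relevant_chunks(question, k=4):
    # Reload the saved index with the same embedding model, then let
    # similarity_search embed the question and return the k closest chunks.
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # assumed model
    db = FAISS.load_local(
        "faiss_index", embeddings, allow_dangerous_deserialization=True
    )
    return db.similarity_search(question, k=k)
```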
7️⃣ Generate Answer with LLM
The selected chunks and your question are passed to the Gemini language model (Google AI) to generate a detailed answer.
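A sketch of the answer-generation step using LangChain's `load_qa_chain` with a "stuff" chain and a Gemini chat model; the model name, prompt wording, and temperature are assumptions:

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains.question_answering import load_qa_chain
from langchain.prompts import PromptTemplate

def generate_answer(docs, question):
    # "Stuff" the retrieved chunks into one prompt and let Gemini answer.
    prompt = PromptTemplate(
        template=(
            "Answer the question as thoroughly as possible using the context.\n\n"
            "Context:\n{context}\n\nQuestion:\n{question}\n\nAnswer:"
        ),
        input_variables=["context", "question"],
    )
    model = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.3)  # assumed model name
    chain = load_qa_chain(model, chain_type="stuff", prompt=prompt)
    response = chain(
        {"input_documents": docs, "question": question},
        return_only_outputs=True,
    )
    return response["output_text"]
```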
8️⃣ Display Answer & Save Conversation
The answer is shown in a clean, chat-style UI. All interactions are saved and can be downloaded as a CSV file.
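A sketch of the display-and-export step, assuming Streamlit's `st.chat_message` UI, a session-state list for history, and pandas for the CSV export (names like `history` and `conversation.csv` are illustrative):

```python
import pandas as pd
import streamlit as st

# Keep the running conversation in session state so it survives reruns.
if "history" not in st.session_state:
    st.session_state.history = []

def show_and_store(question, answer):
    st.session_state.history.append({"question": question, "answer": answer})
    with st.chat_message("user"):
        st.write(question)
    with st.chat_message("assistant"):
        st.write(answer)
    # Offer the whole conversation as a downloadable CSV.
    csv = pd.DataFrame(st.session_state.history).to_csv(index=False)
    st.download_button(
        "Download conversation (CSV)",
        data=csv,
        file_name="conversation.csv",
        mime="text/csv",
    )
```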
PDF Upload → Extract Text → Chunk Text → Create Embeddings → Save Vector DB (FAISS) → User Question → Embed Question → Search Similar Chunks → Feed to LLM → Generate Answer → Display Answer & Save Conversation
- Streamlit: For the interactive web UI.
- PyPDF2: To extract text from PDFs.
- LangChain: For text splitting, embeddings, and building QA chains.
- Google Generative AI Embeddings: To create semantic vectors for text and questions.
- FAISS: For fast vector-based similarity search.
- Pandas: To handle conversation history and export CSV.
Developed by Vishwajit VM
📧 [email protected]