
Chat with PDF using LangChain, Streamlit, Ollama (for LLM inference), and PDFPlumber: an example of a Retrieval-Augmented Generation (RAG) system built on the Deepseek R1 model.


hasan-py/chat-with-pdf-RAG


Chat with Your PDFs using RAG

This project allows you to upload a PDF and ask questions about its content using Deepseek R1 via Ollama. The application processes PDFs, extracts text, indexes them into a vector store, and retrieves relevant context to generate concise answers.

Features

  • 📂 Upload a PDF: Select a PDF file to process.
  • 🔍 Text Extraction & Indexing: Extracts content and indexes it for efficient search.
  • 💡 Question-Answering: Ask questions related to the PDF content and get relevant answers.
  • 🚀 Powered by Ollama & LangChain: Uses Deepseek R1 for embeddings and responses.

Installation

Prerequisites

  • Python 3.8+
  • Ollama installed
  • Dependencies installed via pip

Setup

  1. Clone this repository:

    git clone https://github.com/hasan-py/chat-with-pdf-RAG.git
    cd chat-with-pdf-RAG

    (Optional) Create and activate a Python virtual environment before installing dependencies.

  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the Streamlit app:

    streamlit run pdf_rag.py

How It Works

  1. Upload a PDF: Use the UI to upload a document.
  2. Processing: The app extracts text and chunks it for indexing.
  3. Ask Questions: Enter a question in the chat box.
  4. Get Answers: The system retrieves relevant text and responds concisely.
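The chunk-and-retrieve flow above can be sketched in plain Python. This is a toy illustration only: the app itself uses LangChain splitters and a vector store with Deepseek R1 embeddings, while the word-overlap scoring below merely stands in for real semantic similarity.

```python
# Toy sketch of "chunk, index, retrieve" with no external dependencies.
# The real app delegates chunking and similarity search to LangChain.

def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character chunks for indexing."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

def retrieve(question, chunks, k=1):
    """Rank chunks by shared words with the question (toy similarity)."""
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

doc = ("Ollama runs large language models locally. "
       "Streamlit builds the UI. "
       "PDFPlumber extracts text from PDF files.")
chunks = chunk_text(doc, size=60, overlap=20)
best = retrieve("Which library extracts text from PDFs?", chunks)
print(best[0])
```

In the real pipeline, the toy `retrieve` is replaced by a nearest-neighbor search over embedding vectors, so answers do not depend on exact word overlap.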

Changing the Model

To change the model used for inference, modify the LLM variable in the pdf_rag.py file. By default it is initialized with the deepseek-r1:8b model; you can replace it with any other model supported by Ollama (the model must first be pulled locally, e.g. with ollama pull).
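As a sketch, with LangChain's Ollama integration the swap might look like the following. The import and variable name here are assumptions; check pdf_rag.py for the actual names used by the app.

```python
from langchain_ollama import OllamaLLM  # assumes the langchain-ollama package

# Default model used by the app.
llm = OllamaLLM(model="deepseek-r1:8b")

# To switch models, change the tag to any model available in your local
# Ollama installation, for example:
# llm = OllamaLLM(model="llama3.1:8b")
```

Whatever tag you choose must already be available locally (ollama list shows what is pulled), since this is a configuration fragment rather than something Ollama downloads on demand.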

File Structure

chat-with-pdf/
│── pdfs/                   # Directory for uploaded PDFs
│── pdf_rag.py              # Main Streamlit app
│── requirements.txt        # Dependencies
│── README.md               # Documentation
│── test_pdf_rag.py         # Unit tests

Technologies Used

  • Python
  • Streamlit (for UI)
  • LangChain (for text processing)
  • Ollama (for LLM inference)
  • PDFPlumber (for PDF extraction)

Contributing

Feel free to submit issues and PRs to improve the project! Please follow these steps:

  • Before submitting PRs, please update the corresponding test cases.
  • Please attach a screen recording to the PR description showing that all functionality works properly.

Acknowledgments

Special thanks to the creators of LangChain, Ollama, and Streamlit, and to the community, for enabling this functionality.
