Talk2PDFs - Website Chatbot

Talk2PDFs is a web application that lets users interact with PDF documents through a chatbot. Users can upload PDFs or provide URLs, and the chatbot will use the extracted content to answer questions.

Technologies Used

Python
Streamlit - For the web interface.
Ollama - Using the llama3.2 model for language processing.
Visual Studio Build Tools - Required on Windows for compiling dependencies such as ChromaDB.
1. Download and install Visual Studio Build Tools.
2. During installation, select the C++ build tools workload.
3. After installation, use pip to install ChromaDB.

Key Features

PDF Upload/URL Input: Upload PDFs or provide URLs to process and extract text.
Text Extraction: Extracts and processes text from PDFs for interaction.
Chatbot Interaction: Ask questions related to the uploaded PDFs and get responses.
Text Preview: View a snippet of the extracted text before asking questions.
Real-Time Responses: Quickly get answers based on the content of the documents.
Integration with Vector Database: Uses ChromaDB for efficient document retrieval.
Session Memory: The chatbot retains previous interactions during a session for continuity.
Chat History: Keeps a log of the session’s conversations.
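
The Text Preview, Session Memory, and Chat History features above can be illustrated with a small sketch. This is not the project's code: the `ChatSession` class, the `preview` function, and the prompt layout are hypothetical names chosen for illustration, and the real application may rely on Streamlit or LangChain facilities instead.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Hypothetical in-memory chat log; not the project's actual class."""

    history: list = field(default_factory=list)  # list of (role, text) tuples

    def add(self, role: str, text: str) -> None:
        """Append one message to the session's chat history."""
        self.history.append((role, text))

    def build_prompt(self, context: str, question: str, max_turns: int = 3) -> str:
        """Combine document context, recent turns, and the new question."""
        # Keep only the most recent messages so the prompt stays short.
        recent = self.history[-2 * max_turns:]
        lines = [f"{role}: {text}" for role, text in recent]
        return (
            "Context:\n" + context + "\n\n"
            "Conversation so far:\n" + "\n".join(lines) + "\n\n"
            "user: " + question
        )


def preview(text: str, max_chars: int = 200) -> str:
    """Return a short, whitespace-normalized snippet of extracted text."""
    snippet = " ".join(text.split())
    if len(snippet) <= max_chars:
        return snippet
    return snippet[:max_chars].rstrip() + "..."
```

Keeping the history as plain tuples makes it easy to display as a chat log and to trim before each model call.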

Setup

Using setup.py

You can set up the project using the provided setup.py file. This will automatically install the required dependencies listed in requirements.txt.
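
As a rough illustration of how a setup.py can pick up dependencies from requirements.txt, here is a minimal sketch. The helper name `read_requirements` is an assumption and the project's actual setup.py may differ; the returned list is what would typically be passed to the `install_requires` argument of setuptools' `setup()`.

```python
from pathlib import Path


def read_requirements(path: str = "requirements.txt") -> list[str]:
    """Return requirement specifiers, skipping blank lines and comments."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


# In a setup.py, the result would typically be used like this
# (the package name here is a placeholder, not taken from the project):
#
#     from setuptools import setup
#     setup(name="talk2pdfs", install_requires=read_requirements())
```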

Make sure requirements.txt lists the necessary packages.
Then install the package from the repository root, for example by running `pip install .` in the directory that contains setup.py.

Manual Setup

If you prefer to set up manually:

Install Streamlit, LangChain, LangChain Community, and ChromaDB:
```
pip install streamlit langchain langchain_community chromadb
```
Run the application:
```
streamlit run application.py
```
Run the Ollama model (LLM):
```
ollama run llama3.2
```
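
After a manual install, it can help to confirm that the packages are importable before launching the app. The following is a small sketch using only the standard library; the function name `missing_packages` is hypothetical, and the module names are assumed to match the pip package names listed above.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed to correspond to the packages installed above.
    required = ["streamlit", "langchain", "langchain_community", "chromadb"]
    missing = missing_packages(required)
    print("Missing:", ", ".join(missing) if missing else "none")
```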

For Developers

Virtual Environment Setup:

Create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment:

On Windows:

venv\Scripts\activate

On macOS/Linux:

source venv/bin/activate
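
To check that activation worked, one option is to compare Python's prefix values, which differ inside an environment created with `python -m venv`. This is a generic sketch, not part of the project; the function name `in_virtualenv` is an assumption.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
```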

