This project is a lightweight, modular pipeline for extracting and processing data from various sources like PDFs, text files, directories, and web pages using LangChain and Groq's LLMs.
- ✅ PDF data extraction with `PyPDFLoader`
- ✅ Directory-wise PDF processing using `DirectoryLoader`
- ✅ Raw `.txt` file summarization
- ✅ Web scraping + LLM-based question answering
- ✅ Uses `ChatGroq` with DeepSeek or LLaMA-3 models
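All of the loaders funnel into the same `ChatGroq` call. A minimal sketch of that wiring is below; the `build_prompt`/`ask_groq` helper names and the model name are assumptions for illustration, not the project's actual code:

```python
def build_prompt(question: str, context: str) -> str:
    """Hypothetical helper: pair extracted text with a user question."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def ask_groq(question: str, context: str, model: str = "llama-3.1-8b-instant") -> str:
    """Send the prompt to a Groq-hosted model via langchain-groq.

    Requires GROQ_API_KEY in the environment; the model name is an assumption.
    """
    from langchain_groq import ChatGroq  # imported lazily so build_prompt stays testable

    llm = ChatGroq(model=model, temperature=0)
    return llm.invoke(build_prompt(question, context)).content
```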
```
├── dataloader/
│   ├── directory_loader.py   # Load multiple PDFs from a folder
│   ├── pypdf_loader.py       # Load and query a single PDF
│   ├── text_loader.py        # Summarize .txt files
│   ├── webbase_loader.py     # Extract info from websites
│   ├── extra.py              # (Optional utility file)
│   ├── text.txt              # Sample text file
│   ├── data.pdf              # Sample PDF
│   └── .env                  # Stores your GROQ_API_KEY
```
Install dependencies via:

```shell
pip install -r requirements.txt
```

`requirements.txt` contains:

```
langchain-groq
groq
python-dotenv
langchain_community
pypdf
bs4
```
- Create a `.env` file:

  ```
  GROQ_API_KEY=your_groq_api_key_here
  ```
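Each script can then pick up the key via `python-dotenv`. A minimal sketch, assuming the `.env` file sits in the working directory (`require_groq_key` is a hypothetical helper name; the scripts likely just call `load_dotenv()` directly):

```python
import os


def require_groq_key() -> str:
    """Load .env if python-dotenv is available, then return GROQ_API_KEY."""
    try:
        from dotenv import load_dotenv
        load_dotenv()  # reads .env from the current working directory
    except ImportError:
        pass  # fall back to the plain process environment
    key = os.getenv("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
    return key
```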
- Run any of the scripts as needed:

  ```shell
  python pypdf_loader.py
  python webbase_loader.py
  python text_loader.py
  python directory_loader.py
  ```
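For reference, `pypdf_loader.py` presumably does something along these lines with `PyPDFLoader`; the helper names and the character budget are assumptions, not the script's actual contents:

```python
def load_pdf_text(path: str) -> str:
    """Load a PDF page-by-page with PyPDFLoader and join the text."""
    from langchain_community.document_loaders import PyPDFLoader  # needs langchain_community + pypdf

    pages = PyPDFLoader(path).load()  # returns one Document per page
    return "\n".join(doc.page_content for doc in pages)


def truncate_for_prompt(text: str, limit: int = 4000) -> str:
    """Hypothetical helper: keep the extracted text within a rough context budget."""
    return text[:limit]
```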
Example prompts used by the scripts:

- PDF: *Tell me all the education institute names of the person*
- Web: *Name of the darkest coffee*
- Text: *Summarize the following text*
This project is licensed under the MIT License.
Author: Nitesh Kumar Singh
Built with ❤️ using LangChain, Groq, and Python