# LLM Question Generator for User-Uploaded Content
An AI-powered web application that automatically generates questions and answers from PDF documents using Large Language Models (LLMs) and vector embeddings. Built with FastAPI and Groq's Llama 3.1 model for fast, accurate Q&A generation.
## Features
- PDF Upload Interface: Drag-and-drop or click-to-browse file upload
- AI Question Generation: Automatically generates relevant questions from PDF content
- Smart Answer Generation: Uses RAG (Retrieval Augmented Generation) with FAISS vector store
- CSV Export: Download generated Q&A pairs in CSV format
- Real-time Progress: Loading indicators and status messages
- Fast Processing: Optimized for speed with configurable question limits (default: 10 questions)
## Tech Stack
- Backend: FastAPI (Python 3.10+)
- LLM: Groq API (Llama 3.1-8b-instant)
- Embeddings: HuggingFace sentence-transformers (all-MiniLM-L6-v2)
- Vector Store: FAISS
- Framework: LangChain
- Frontend: HTML, CSS, Vanilla JavaScript
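The RAG flow implied by this stack splits the extracted PDF text into overlapping chunks before embedding them into FAISS, so that sentences cut at a chunk boundary still appear whole in at least one chunk. A minimal, self-contained illustration of the chunking idea (this is a sketch, not the actual `src/helper.py` code; the sizes are placeholders):

```python
def split_into_chunks(text: str, chunk_size: int = 100, overlap: int = 20):
    """Slide a fixed-size window over the text with some overlap,
    so context spanning a boundary survives in a neighboring chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = split_into_chunks("a" * 250, chunk_size=100, overlap=20)
# Each chunk shares its last `overlap` characters with the next chunk's start.
```

In the real pipeline, LangChain's text splitters play this role with token-aware sizes.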
## Installation

Clone the repository:

```bash
git clone git@github.com:asmshaon/question-generator.git
cd question-generator
```

Create and activate a virtual environment.

Using venv:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

Using conda:

```bash
conda create -n question-generator python=3.10 -y
conda activate question-generator
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

Create a .env file in the project root:

```
GROQ_API_KEY=your_groq_api_key_here
```

Get your Groq API key from: https://console.groq.com/

Run the application:

```bash
python app.py
```

The server will start at http://localhost:8080.
## Usage

- Open your browser and navigate to http://localhost:8080
- Upload a PDF file (drag & drop or click to browse)
- Click the "Upload PDF" button
- Once uploaded, click "Generate Q&A"
- Wait for processing (typically 30-60 seconds for 10 questions)
- Download the generated CSV file
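The downloaded file is plain CSV, so it can be post-processed with the standard library. Note the column names used here (`question`, `answer`) are an assumption about the export format, not confirmed by the source:

```python
import csv
import io

# Hypothetical sample mimicking the assumed export format.
sample = "question,answer\nWhat is FAISS?,A vector similarity search library.\n"

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    print(row["question"], "->", row["answer"])
```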
## Configuration

To change the number of generated questions, edit app.py line 46:

```python
# Generate 10 questions (default)
ques_list = ques_list[:10]

# Generate 20 questions
ques_list = ques_list[:20]

# Generate all questions: comment out or delete the line
```

Other settings live in src/helper.py:

- Temperature: lines 66 and 90 (default: 0.3)
- Chunk sizes: lines 36 and 46
- Model: lines 65 and 89
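For orientation, the model and temperature settings above plug into LangChain's `ChatGroq` constructor roughly like this. This is a sketch of the shape of the call, not the exact code at those lines in src/helper.py, and it requires GROQ_API_KEY to be set:

```python
from langchain_groq import ChatGroq

# Sketch only: parameter names follow langchain-groq's public API;
# the real call sites are src/helper.py lines 65-66 and 89-90.
llm = ChatGroq(
    model="llama-3.1-8b-instant",  # the model named in the tech stack
    temperature=0.3,               # lower = more deterministic output
)
```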
## Project Structure

```
question-generator/
├── app.py              # FastAPI application
├── requirements.txt    # Python dependencies
├── setup.py            # Package setup
├── .env                # Environment variables (create this)
├── .gitignore          # Git ignore rules
├── LICENSE             # MIT License
├── README.md           # This file
├── src/
│   ├── helper.py       # LLM pipeline and processing
│   └── prompt.py       # Prompt templates
├── templates/
│   └── index.html      # Web interface
├── static/
│   ├── docs/           # Uploaded PDFs (gitignored)
│   └── output/         # Generated CSVs (gitignored)
└── data/               # Sample data (optional)
```
## API Endpoints

**Serve the web interface**

Serves the main web interface.

**Upload a PDF**

- Request: FormData with `pdf_file` and `filename`
- Response: `{"msg": "success", "pdf_filename": "path/to/file"}`

**Generate Q&A**

Generates Q&A from the uploaded PDF.

- Request: FormData with `pdf_filename`
- Response: `{"output_file": "path/to/csv"}` or `{"error": "message"}`
## Performance

The application includes several optimizations:
- Uses "stuff" chain type instead of "refine" (3-5x faster)
- Reduced chunk sizes (5000/800 tokens)
- Lower temperature (0.3) for faster responses
- Limited to 10 questions by default
- Progress indicators in console
Processing time varies based on document size and number of questions.
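The "stuff" vs. "refine" speedup is mostly about call count: "stuff" concatenates all retrieved chunks into a single prompt, while "refine" makes one LLM call per chunk, threading a draft answer through each call. A toy model of the difference, where `ask` stands in for a real LLM call:

```python
def stuff_answer(chunks, ask):
    # "stuff": a single call over all chunks joined together
    return ask("\n".join(chunks))

def refine_answer(chunks, ask):
    # "refine": one call per chunk, carrying the draft forward
    draft = ask(chunks[0])
    for chunk in chunks[1:]:
        draft = ask(f"{chunk}\n\nRefine this draft: {draft}")
    return draft

calls = []
def fake_llm(prompt):
    calls.append(prompt)
    return "answer"

stuff_answer(["c1", "c2", "c3"], fake_llm)
print(len(calls))  # 1 call

calls.clear()
refine_answer(["c1", "c2", "c3"], fake_llm)
print(len(calls))  # 3 calls
```

With real network latency per call, fewer calls translates directly into the faster wall-clock times noted above.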
## Troubleshooting

**Port 8080 already in use**

```bash
lsof -ti:8080 | xargs kill -9
```

**Missing dependencies**

```bash
pip install -r requirements.txt
```

**GROQ_API_KEY errors**

Ensure your .env file exists in the project root with a valid GROQ_API_KEY.

**Slow processing**

Reduce the number of questions in app.py line 46 or decrease chunk sizes in src/helper.py.
## Dependencies

Key dependencies include:

- `fastapi` - Web framework
- `uvicorn` - ASGI server
- `langchain` - LLM orchestration
- `langchain-community` - LangChain integrations
- `langchain-groq` - Groq LLM provider
- `langchain-huggingface` - HuggingFace embeddings
- `sentence-transformers` - Embedding models
- `faiss-cpu` - Vector similarity search
- `pypdf` - PDF processing
- `python-dotenv` - Environment management

See requirements.txt for the complete list.
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Author

Abu Saleh Muhammad Shaon
- Email: srabon.php@gmail.com
- GitHub: @asmshaon
## Acknowledgments

- Groq - Fast LLM inference
- LangChain - LLM orchestration framework
- HuggingFace - Embedding models
- FAISS - Vector similarity search
- FastAPI - Modern web framework
## Roadmap

- Support for multiple file formats (DOCX, TXT, etc.)
- Batch processing of multiple files
- Custom prompt templates
- Question difficulty levels
- Export to multiple formats (JSON, Excel, etc.)
- User authentication and history
- API rate limiting and caching