question-generator

An LLM-powered question generator for user-uploaded content.

Overview

An AI-powered web application that automatically generates questions and answers from PDF documents using Large Language Models (LLMs) and vector embeddings. Built with FastAPI and Groq's Llama 3.1 model for fast, accurate Q&A generation.

Features

  • PDF Upload Interface: Drag-and-drop or click-to-browse file upload
  • AI Question Generation: Automatically generates relevant questions from PDF content
  • Smart Answer Generation: Uses RAG (Retrieval Augmented Generation) with FAISS vector store
  • CSV Export: Download generated Q&A pairs in CSV format
  • Real-time Progress: Loading indicators and status messages
  • Fast Processing: Optimized for speed with configurable question limits (default: 10 questions)

Tech Stack

  • Backend: FastAPI (Python 3.10+)
  • LLM: Groq API (Llama 3.1-8b-instant)
  • Embeddings: HuggingFace sentence-transformers (all-MiniLM-L6-v2)
  • Vector Store: FAISS
  • Framework: LangChain
  • Frontend: HTML, CSS, Vanilla JavaScript
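
These components fit together as a fairly standard LangChain RAG pipeline. Below is a minimal sketch of the assumed wiring; the real code lives in src/helper.py, and the file path and chunk parameters here are illustrative:

# Sketch only: the actual wiring is in src/helper.py and may differ.
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_groq import ChatGroq

# Load and chunk the uploaded PDF (800 matches the smaller of the
# 5000/800 chunk sizes mentioned under Performance Optimization).
docs = PyPDFLoader("static/docs/sample.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed the chunks and index them in FAISS for retrieval.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = FAISS.from_documents(chunks, embeddings)

# Groq-hosted Llama 3.1; reads GROQ_API_KEY from the environment.
llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0.3)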

Installation

1. Clone the repository

git clone git@github.com:asmshaon/question-generator.git
cd question-generator

2. Create and activate virtual environment

Using venv:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Using conda:

conda create -n question-generator python=3.10 -y
conda activate question-generator

3. Install dependencies

pip install -r requirements.txt

4. Set up environment variables

Create a .env file in the project root:

GROQ_API_KEY=your_groq_api_key_here

Get your Groq API key from: https://console.groq.com/
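
With python-dotenv (listed under Dependencies), loading the key at startup typically looks like this sketch; the project's actual loading code may differ:

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set; create .env as described above")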

Usage

Start the application

python app.py

The server will start at http://localhost:8080
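
For reference, a typical FastAPI entry point that binds to this port looks like the sketch below; the project's app.py may differ in details:

# Illustrative entry point only; see app.py for the real application.
import uvicorn
from fastapi import FastAPI

app = FastAPI()

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8080)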

Generate Q&A from PDF

  1. Open your browser and navigate to http://localhost:8080
  2. Upload a PDF file (drag & drop or click to browse)
  3. Click "Upload PDF" button
  4. Once uploaded, click "Generate Q&A"
  5. Wait for processing (typically 30-60 seconds for 10 questions)
  6. Download the generated CSV file

Configuration

Adjust Number of Questions

Edit app.py line 46:

# Generate 10 questions (default)
ques_list = ques_list[:10]

# Generate 20 questions
ques_list = ques_list[:20]

# Generate all questions: comment out or delete the line entirely

Modify LLM Settings

Edit src/helper.py:

  • Temperature: Lines 66, 90 (default: 0.3)
  • Chunk sizes: Lines 36, 46
  • Model: Lines 65, 89
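
For orientation, the calls at those lines presumably resemble this sketch (illustrative values only; check src/helper.py for the real ones):

from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama-3.1-8b-instant",  # model name (lines 65, 89)
    temperature=0.3,               # sampling temperature (lines 66, 90)
)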

Project Structure

question-generator/
├── app.py                  # FastAPI application
├── requirements.txt        # Python dependencies
├── setup.py               # Package setup
├── .env                   # Environment variables (create this)
├── .gitignore            # Git ignore rules
├── LICENSE               # MIT License
├── README.md             # This file
├── src/
│   ├── helper.py         # LLM pipeline and processing
│   └── prompt.py         # Prompt templates
├── templates/
│   └── index.html        # Web interface
├── static/
│   ├── docs/             # Uploaded PDFs (gitignored)
│   └── output/           # Generated CSVs (gitignored)
└── data/                 # Sample data (optional)

API Endpoints

GET /

Serves the main web interface

POST /upload

Upload a PDF file

  • Request: FormData with pdf_file and filename
  • Response: {"msg": "success", "pdf_filename": "path/to/file"}

POST /analyze

Generate Q&A from uploaded PDF

  • Request: FormData with pdf_filename
  • Response: {"output_file": "path/to/csv"} or {"error": "message"}
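
The endpoints can also be called programmatically. Here is a minimal client sketch using the requests library (not part of this project's requirements); the form-field names follow the descriptions above, and error handling is omitted:

import requests

BASE = "http://localhost:8080"

# Upload a PDF via multipart form data.
with open("sample.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE}/upload",
        files={"pdf_file": f},
        data={"filename": "sample.pdf"},
    )
pdf_filename = resp.json()["pdf_filename"]

# Trigger Q&A generation and print the CSV path (or the error message).
resp = requests.post(f"{BASE}/analyze", data={"pdf_filename": pdf_filename})
result = resp.json()
print(result.get("output_file") or result.get("error"))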

Performance Optimization

The application includes several optimizations:

  • Uses the "stuff" chain type instead of "refine" (3-5x faster; see the sketch below)
  • Reduced chunk sizes (5000/800 tokens)
  • Lower temperature (0.3) for more focused, consistent output
  • Limited to 10 questions by default
  • Progress indicators in console

Processing time varies based on document size and number of questions.
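
Why "stuff" is faster: "refine" makes one LLM call per chunk and iteratively rewrites its output, while "stuff" concatenates all chunks into a single prompt and makes one call. The sketch below assumes the question-generation step uses LangChain's classic load_summarize_chain helper (an assumption; llm and chunks come from the pipeline sketch under Tech Stack, and QUES_PROMPT would come from src/prompt.py):

# Sketch only: assumes load_summarize_chain drives question generation.
from langchain.chains.summarize import load_summarize_chain

# "stuff": one call over all chunks (fast, but bounded by the context window)
ques_gen_chain = load_summarize_chain(llm=llm, chain_type="stuff", prompt=QUES_PROMPT)

# "refine" alternative: one call per chunk, refining the previous output
# ques_gen_chain = load_summarize_chain(
#     llm=llm, chain_type="refine",
#     question_prompt=QUES_PROMPT, refine_prompt=REFINE_PROMPT,
# )

questions = ques_gen_chain.run(chunks)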

Troubleshooting

Port 8080 already in use

lsof -ti:8080 | xargs kill -9

Missing dependencies

pip install -r requirements.txt

GROQ API Key error

Ensure a .env file exists in the project root and contains a valid GROQ_API_KEY.

Slow processing

Reduce the number of questions in app.py line 46 or decrease chunk sizes in src/helper.py

Dependencies

Key dependencies include:

  • fastapi - Web framework
  • uvicorn - ASGI server
  • langchain - LLM orchestration
  • langchain-community - LangChain integrations
  • langchain-groq - Groq LLM provider
  • langchain-huggingface - HuggingFace embeddings
  • sentence-transformers - Embedding models
  • faiss-cpu - Vector similarity search
  • pypdf - PDF processing
  • python-dotenv - Environment management

See requirements.txt for complete list.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Abu Saleh Muhammad Shaon

Future Enhancements

  • Support for multiple file formats (DOCX, TXT, etc.)
  • Batch processing of multiple files
  • Custom prompt templates
  • Question difficulty levels
  • Export to multiple formats (JSON, Excel, etc.)
  • User authentication and history
  • API rate limiting and caching
