A FastAPI backend service that enables real-time question-answering over uploaded PDF documents using WebSockets. Uses Gemini for text generation and ChromaDB for vector storage.
https://ai-planet-backend-h3cw.onrender.com/
Render's free tier is very slow to boot up.
Both the file upload endpoint and the WebSocket endpoint are hosted on this server. The link above opens a basic HTML page with an upload button. For full functionality, the frontend (frontend/index.html) must be hosted on another server or locally.
To serve the frontend locally:
cd frontend
python -m http.server 9000
This frontend is currently configured to work with a locally hosted backend. It can be switched to the deployed backend by changing the URL in the fetch requests.
https://github.com/codeblech/AI_Planet_backend/raw/refs/heads/main/screenshots/vid.mp4
- PDF upload endpoint with file validation and metadata storage
- WebSocket endpoint for real-time Q&A (see the sketch after this list)
- Session-based document management
- Rate limiting for both HTTP and WebSocket endpoints
- Automatic cleanup of uploaded files after WebSocket disconnection
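A minimal sketch of the Q&A flow over the WebSocket, assuming an in-memory ChromaDB collection and the google-generativeai client; the endpoint path, collection name, model name, and metadata filter are illustrative, not necessarily what the repo uses.

```python
# Illustrative sketch only: collection name, endpoint path, and model name are assumptions.
# genai.configure(api_key=...) is assumed to have been called with GEMINI_API_KEY.
import chromadb
import google.generativeai as genai
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()
chroma = chromadb.Client()  # in-memory client for the sketch; the repo may persist to disk
collection = chroma.get_or_create_collection("documents")
model = genai.GenerativeModel("gemini-1.5-flash")

@app.websocket("/ws/{session_id}")
async def qa_socket(websocket: WebSocket, session_id: str):
    await websocket.accept()
    try:
        while True:
            question = await websocket.receive_text()
            # Retrieve the chunks most relevant to this session's documents.
            hits = collection.query(
                query_texts=[question], n_results=3, where={"session_id": session_id}
            )
            context = "\n\n".join(hits["documents"][0])
            prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
            answer = model.generate_content(prompt)
            await websocket.send_text(answer.text)
    except WebSocketDisconnect:
        pass  # uploaded files are cleaned up on disconnect (see the notes below)
```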
- Framework: FastAPI
- Database: SQLite (for document metadata) + ChromaDB (vector store)
- File Storage: Local filesystem
- LLM: Google Gemini 1.5
- Rate Limiting: Redis
- Testing: pytest with async support
Handles file validation, storage, and session initialization.
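A minimal sketch of that validation and storage, assuming a /upload route, a local uploads/ directory, and a UUID session id; the field names and checks are illustrative.

```python
# Sketch of PDF upload validation and storage; paths and the session-id scheme are assumptions.
import uuid
from pathlib import Path

from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

@app.post("/upload")
async def upload_pdf(file: UploadFile):
    if file.content_type != "application/pdf" or not file.filename.lower().endswith(".pdf"):
        raise HTTPException(status_code=400, detail="Only PDF files are accepted")
    session_id = str(uuid.uuid4())
    destination = UPLOAD_DIR / f"{session_id}.pdf"
    destination.write_bytes(await file.read())
    # Document metadata (original filename, stored path, session id) would go into SQLite here.
    return {"session_id": session_id, "filename": file.filename}
```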
Comprehensive test suite covering:
- File upload validation (one example test is sketched after this list)
- WebSocket lifecycle
- Rate limiting
- PDF processing
- Session cleanup
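For illustration, one such upload-validation test could look like the sketch below, assuming pytest-asyncio and httpx; the endpoint path, expected status code, and test name are assumptions rather than the repo's actual tests.

```python
# Hypothetical test sketch; assumes /upload rejects non-PDF files with a 400 response.
import pytest
from httpx import ASGITransport, AsyncClient

from app.main import app

@pytest.mark.asyncio
async def test_upload_rejects_non_pdf():
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        files = {"file": ("notes.txt", b"not a pdf", "text/plain")}
        response = await client.post("/upload", files=files)
    assert response.status_code == 400
```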
- Install dependencies:
poetry install
- Start Redis (Assuming Docker is installed):
docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
- Set environment variables:
GEMINI_API_KEY=your_api_key
- Run server:
poetry run fastapi run app/main.py
- Run tests:
pytest app/test_main.py -v
- Docs available at:
http://localhost:8000/docs
and
http://localhost:8000/redoc
not required as per requirements.
https://github.com/long2ice/fastapi-limiter
last updated: 11 months ago -> unmaintained?, but supports WebSockets -> chosen (usage sketched below)
https://github.com/laurents/slowapi is more active and used by many popular projects, but doesn't support WebSockets -> not chosen
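A rough sketch of wiring fastapi-limiter into both an HTTP route and a WebSocket, based on its README; the limits, paths, and Redis URL are placeholders.

```python
# Sketch based on the fastapi-limiter README; limits and paths are illustrative.
from contextlib import asynccontextmanager

import redis.asyncio as redis
from fastapi import Depends, FastAPI, WebSocket
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter, WebSocketRateLimiter

@asynccontextmanager
async def lifespan(app: FastAPI):
    connection = redis.from_url("redis://localhost:6379", encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(connection)
    yield
    await FastAPILimiter.close()

app = FastAPI(lifespan=lifespan)

@app.post("/upload", dependencies=[Depends(RateLimiter(times=5, seconds=60))])
async def upload():
    return {"ok": True}

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    limiter = WebSocketRateLimiter(times=1, seconds=2)
    while True:
        message = await websocket.receive_text()
        await limiter(websocket)  # raises an exception once the limit is exceeded
        await websocket.send_text(f"echo: {message}")
```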
https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/docker/
redis/redis-stack contains both Redis Stack server and Redis Insight. This container is best for local development because you can use the embedded Redis Insight to visualize your data.
docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
redis/redis-stack-server provides Redis Stack server only. This container is best for production deployment.
docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest
hence, we'll use redis/redis-stack for local development.
only extracting text from PDFs, as the requirement specifies.
This is my third time trying to use LangChain. Now I've come to the conclusion that it is not worth the hassle. It is much simpler to implement the AI stuff without it. I tried to use LangChain for the pdf processing, but this library somehow manages to break every single thing it aims to optimize. Further, it has a lot of dependencies and bloat which makes it totally unsuitable for production.
The docs are horrible, and the library suffers from the same problem as LangChain: too much abstraction.
I tried to come up with the best solution within the limited time constraint. If some infrastructure were already set up with LangChain/LlamaIndex, I would've used it.
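For example, plain text extraction without any framework only takes a few lines; this sketch assumes pypdf, though the repo may use a different parser.

```python
# Framework-free text extraction sketch; pypdf is an assumption, any PDF parser would do.
from pypdf import PdfReader

def extract_text(pdf_path: str) -> str:
    reader = PdfReader(pdf_path)
    # extract_text() can return None for pages with no extractable text (e.g. scanned pages)
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```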
once some kind of user auth is implemented, we can make the document storage persistent. But since the current requirements do not mention user auth, we'll just delete the files after the user disconnects.
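A sketch of that cleanup, extending the WebSocket handler sketched above; the uploads/ layout and per-session file naming are assumptions.

```python
# Sketch: delete a session's uploaded files once the WebSocket closes, whatever the reason.
from pathlib import Path

from fastapi import WebSocket, WebSocketDisconnect

async def qa_socket(websocket: WebSocket, session_id: str):
    await websocket.accept()
    try:
        while True:
            question = await websocket.receive_text()
            ...  # retrieve context and answer the question
    except WebSocketDisconnect:
        pass
    finally:
        # Assumed naming scheme: each session's files are stored as uploads/<session_id>*.pdf
        for path in Path("uploads").glob(f"{session_id}*.pdf"):
            path.unlink(missing_ok=True)
```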
converting PDFs to text and storing them in the vector database can be done in the background. This is because the user is not waiting for the PDFs to be converted and stored, and the conversion and storage are not the main functionality of the app.
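A sketch with FastAPI's BackgroundTasks; process_pdf is a hypothetical helper that would do the extraction and the ChromaDB insert.

```python
# Sketch: schedule PDF processing after the upload response has been sent.
from fastapi import BackgroundTasks, FastAPI, UploadFile

app = FastAPI()

def process_pdf(path: str, session_id: str) -> None:
    ...  # hypothetical helper: extract text, chunk, embed, and store in ChromaDB

@app.post("/upload")
async def upload_pdf(file: UploadFile, background_tasks: BackgroundTasks, session_id: str = "demo"):
    path = f"uploads/{file.filename}"
    with open(path, "wb") as out:
        out.write(await file.read())
    # The task runs after the response is returned, so the client is not kept waiting.
    background_tasks.add_task(process_pdf, path, session_id)
    return {"status": "processing"}
```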
https://github.com/fastapi/full-stack-fastapi-template/tree/master
might be useful for future full-stack projects.
in case the client uploads files but doesn't establish the WebSocket connection, the uploaded documents remain saved. These can be deleted later by a periodic cleanup task, which can easily be implemented with a cron job.
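A sketch of such a cleanup script; the uploads/ directory and the one-hour threshold are illustrative.

```python
# cleanup_uploads.py -- sketch of a cron-driven cleanup of stale uploads.
import time
from pathlib import Path

MAX_AGE_SECONDS = 60 * 60  # assumed threshold: one hour

def cleanup(upload_dir: str = "uploads") -> None:
    now = time.time()
    for path in Path(upload_dir).glob("*.pdf"):
        if now - path.stat().st_mtime > MAX_AGE_SECONDS:
            path.unlink(missing_ok=True)

if __name__ == "__main__":
    cleanup()
```

It could then be scheduled hourly with a crontab entry such as `0 * * * * python /path/to/cleanup_uploads.py`.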
show in the UI that PDFs and questions are being processed. But that's not part of the requirement.
if the frontend is served from a live-reload server like the one in VS Code, make sure the upload folder is not served by that server. Otherwise, the creation of a new file in the upload folder will trigger a reload of the frontend, which will break the WebSocket connection.
A note on multimodal models
Many modern LLMs support inference over multimodal inputs (e.g., images). In some applications -- such as question-answering over PDFs with complex layouts, diagrams, or scans -- it may be advantageous to skip the PDF parsing, instead casting a PDF page to an image and passing it to a model directly.
https://python.langchain.com/docs/how_to/document_loader_pdf/
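That approach is not used in this repo, but as a sketch: render a page to an image with pdf2image (which needs poppler installed) and pass it to Gemini directly; the model name and file names are placeholders.

```python
# Sketch of the image-based alternative (not used in this repo).
import google.generativeai as genai
from pdf2image import convert_from_path

genai.configure(api_key="...")  # GEMINI_API_KEY
model = genai.GenerativeModel("gemini-1.5-flash")

pages = convert_from_path("document.pdf", dpi=150)  # list of PIL images, one per page
response = model.generate_content(["Summarize this page.", pages[0]])
print(response.text)
```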