NLQ Data Platform

A production-grade Natural Language Query (NLQ) platform that combines a FastAPI backend for RAG-powered queries with a PySpark ETL pipeline for incremental data ingestion from MySQL to S3.

Project Structure

nlq-data-platform/
├── backend/          # FastAPI server (NLQ + RAG pipeline)
├── etl/              # PySpark ETL pipeline (MySQL → S3)
├── docker-compose.yml
└── pyproject.toml    # Unified dependencies

Quick Start

Prerequisites

  • Python 3.14+
  • uv for dependency management

Installation

uv sync
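
If uv is not already installed, the standalone installer described in the uv documentation is one common way to get it:

curl -LsSf https://astral.sh/uv/install.sh | sh

uv sync then resolves and installs the dependencies declared in pyproject.toml into a local virtual environment.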

Running Services

Backend API Server:

python backend/run.py
# API docs: http://localhost:8000/docs
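
Once the server is running, queries can be sent over plain HTTP. The route and payload below are illustrative only; the actual endpoint and request schema are documented in backend/README.md and on the /docs page:

curl -X POST http://localhost:8000/api/query \
  -H "Content-Type: application/json" \
  -d '{"question": "How many orders were placed last month?"}'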

ETL Pipeline:

python -m etl.main
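
Both run commands assume the environment created by uv sync is active; alternatively, either one can be launched through uv itself, for example:

uv run python -m etl.main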

Docker (all services):

docker-compose up -d
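
Standard Docker Compose commands apply for inspecting and stopping the stack (the service names themselves are defined in docker-compose.yml):

docker-compose logs -f    # tail logs from all services
docker-compose down       # stop and remove the containers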

Services

  • Backend: FastAPI server with LangChain RAG pipeline for natural language queries (docs: backend/README.md)
  • ETL: PySpark pipeline for incremental data ingestion from MySQL to S3 (docs: etl/README.md)

Environment Variables

Each service has its own .env file (see the sketch after this list for example keys):

  • backend/.env — API keys, database URLs, secrets
  • etl/.env — MySQL credentials, S3 bucket config
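
A minimal sketch of the kind of values each file holds; the key names below are illustrative, not the exact variables the services read:

# backend/.env (hypothetical keys)
OPENAI_API_KEY=...
DATABASE_URL=mysql://user:password@host:3306/dbname

# etl/.env (hypothetical keys)
MYSQL_HOST=...
MYSQL_USER=...
MYSQL_PASSWORD=...
S3_BUCKET=...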
