A web-based chat interface for interacting with Vidhi.ai, built with a FastAPI backend and a React frontend. Vidhi.ai is a RAG-tuned AI model that answers questions about the Indian Constitution. The project supports streaming responses from Ollama and can be hosted on a personal Windows machine, accessible publicly.
- Stream responses from Ollama in real-time.
- Modern chat UI built with React.
- Fully CORS-enabled for frontend-backend communication.
- Can be deployed on a public IP without ngrok.
- Simple setup for Windows with FastAPI and Python.
```
.
├── ollama_api.py       # FastAPI backend serving the Ollama API & React frontend
├── frontend/           # React frontend
│   ├── public/
│   ├── src/
│   └── build/          # Production build created by `npm run build`
├── requirements.txt    # Python dependencies
└── README.md
```
- Windows machine with Ollama installed and running locally.
- Python 3.10+
- Node.js & npm (for React frontend)
- Internet access if hosting publicly.
- Create and activate a Python virtual environment:

  ```shell
  python -m venv venv
  .\venv\Scripts\activate
  ```

- Install dependencies:

  ```shell
  pip install fastapi uvicorn requests aiofiles
  ```

- Make sure Ollama is running locally (default port `11434`).
- Navigate to the frontend folder:

  ```shell
  cd frontend
  ```

- Install dependencies:

  ```shell
  npm install
  ```

- Build the frontend for production:

  ```shell
  npm run build
  ```

  This generates the `frontend/build/` folder served by FastAPI.
Modify your FastAPI app to serve the frontend:

```python
from fastapi.staticfiles import StaticFiles

app.mount("/", StaticFiles(directory="frontend/build", html=True), name="frontend")
```

Run FastAPI on port 8888:

```shell
uvicorn ollama_api:app --host 0.0.0.0 --port 8888 --reload
```
- Open Windows Firewall:
  - Allow inbound TCP traffic on port `8888`.
- Set up router port forwarding:
  - Forward external port `8888` to your PC's local IP (e.g., `192.168.1.5`).
- Access your frontend at:

  `http://<PUBLIC_IP>:8888/`
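Before testing from outside your network, it can help to confirm the firewall and port-forwarding rules actually accept connections. A small helper like this (the function name is ours, not part of the project) probes whether a TCP port is reachable; run it from a machine outside your LAN against your public IP:

```python
# Quick TCP reachability probe for checking firewall/port-forwarding rules.
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replace with your public IP when testing from outside the network.
    print(port_open("127.0.0.1", 8888))
```

Note that testing your public IP from inside your own LAN can give misleading results if your router does not support NAT loopback.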
In `frontend/src/api/ollama.js`, point the frontend at your backend:

```javascript
const API_BASE = "http://<PUBLIC_IP>:8888"; // replace with your public IP
```

- Rebuild React:

  ```shell
  npm run build
  ```

- The frontend now communicates with the backend over your public IP.
- Visit your public URL: `http://<PUBLIC_IP>:8888/`
- Type a message and send it.
- Responses are streamed in real time from the Ollama model (`llama2-uncensored` by default).
- Anyone with access to the public IP can send requests to your backend.
- Recommended: add API key authentication for `/generate_stream`.
- Use HTTPS (via Nginx/Caddy) if exposing the server to the internet.
- Keep your Windows Firewall configuration and router credentials secure.
Python:
- fastapi
- uvicorn
- requests
- aiofiles
React:
- react
- react-dom
- react-scripts
- Add multiple Ollama models in the frontend.
- Implement streaming updates in React (character-by-character).
- Add user authentication and API keys.
- Use Docker for easier deployment.
- Enable HTTPS via reverse proxy (recommended for public servers).
Open-source project for personal or educational use.