AI-powered local documentation generator for developers. Analyze your code, generate structured technical docs, export to PDF or Word — all running entirely on your own machine.
Read before use.
USE AT YOUR OWN RISK.
This software is provided "as is", without warranty of any kind, express or implied.
The authors and contributors accept no responsibility for any damage, data loss,
security incidents, or other consequences arising from the use of this software.
Important security points:
- This tool is designed for local use only. Do not expose the FastAPI backend (port 8000) or Ollama (port 11434) to the public internet or an untrusted network.
- The backend does not implement authentication or API key protection. Anyone on your local network who can reach port 8000 can submit code and trigger LLM generation.
- Uploaded code files are read into memory and passed to your local LLM. They are not sent to any external server or third-party API, but they are stored in a local SQLite database (`codedoc.db`) as part of the history feature. Delete this file if you handle sensitive code.
- The `OLLAMA_ORIGINS=*` flag opens Ollama's CORS policy completely. This is required for the tool to function but should only be done in a trusted local development environment.
- Do not process code containing secrets, API keys, credentials, or personally identifiable information (PII) unless you fully understand and accept the local storage implications described above.
CodeDoc Builder is a developer productivity tool that automates the most tedious part of software development: writing documentation. Developers can upload any source file or paste code directly, and the tool produces a complete, structured technical document without any manual writing.
The project is intentionally built around local, offline LLMs via Ollama. This means:
- No internet connection required after setup
- No subscription fees or API costs
- No code ever transmitted to a third party
- Works on air-gapped machines and private codebases
The initial scope is single-file documentation. The architecture is designed to scale toward full codebase analysis in future versions.
Primary aim: Give individual developers a zero-cost, privacy-first way to generate professional documentation for their code in seconds, not hours.
Core focus areas:
| Focus | Description |
|---|---|
| Privacy first | Your code stays on your machine. No external APIs, no telemetry, no cloud calls. |
| 8B model compatibility | Prompts and chunking are specifically engineered for smaller 8B-parameter models that run on consumer hardware (8–16 GB VRAM). |
| Structured output | Documentation follows a consistent 7-section format on every generation — not freeform AI rambling. |
| Developer experience | Real-time streaming output, syntax-highlighted preview, one-click export. Minimal friction. |
| Extensibility | Clean separation between backend logic and frontend. Built to grow toward codebase-wide analysis. |
Backend (FastAPI):

- `main.py` — REST API with SSE (Server-Sent Events) streaming endpoint for real-time token delivery to the frontend. Handles file uploads via `multipart/form-data`.
- `ollama_client.py` — Async Ollama client using `httpx`. Contains 7 individually tuned prompts, one per documentation section. Streams tokens directly from the Ollama `/api/generate` endpoint.
- `chunker.py` — Splits code into ~2000-character chunks at newline boundaries. Sends only the first 3 chunks as context to stay within 8B model token limits (~6000 chars per prompt). Includes file extension → language auto-detection.
- `exporter.py` — Server-side document generation. PDF via `reportlab` (dark themed, paginated, page-break-safe). Word via `python-docx` with full markdown parsing (`**bold**`, `## headings`, `### subheadings`) rendered as proper Word styles with black readable text.
- `storage.py` — SQLite-based persistence using the standard library `sqlite3`. Stores generated documents with metadata (filename, language, model, timestamp) and full section content. Supports list, load, and delete operations.
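The chunking strategy is simple enough to sketch in a few lines (a minimal sketch with hypothetical names; see chunker.py for the real implementation):

```python
def chunk_code(source: str, chunk_size: int = 2000, max_chunks: int = 3) -> list[str]:
    """Split source into ~chunk_size-character pieces at newline boundaries,
    keeping only the first max_chunks pieces (~6000 chars of prompt context)."""
    chunks: list[str] = []
    current: list[str] = []
    length = 0
    for line in source.splitlines(keepends=True):
        # Close the current chunk once adding this line would overshoot.
        if length + len(line) > chunk_size and current:
            chunks.append("".join(current))
            current, length = [], 0
        current.append(line)
        length += len(line)
    if current:
        chunks.append("".join(current))
    return chunks[:max_chunks]
```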
Frontend (React + Vite):

- `Header.jsx` — Live Ollama connectivity check on load and every 20 seconds. Auto-populates the model dropdown by calling `/api/health` — shows only your actually installed models. Retry button for manual re-check.
- `CodeInput.jsx` — Dual input mode: drag-and-drop file upload or direct code paste. Syntax-highlighted preview using `react-syntax-highlighter` (vscDarkPlus theme) with 30+ language support. Live line/character/chunk counter. Language hint selector.
- `DocOutput.jsx` — Renders all 7 sections as they stream in, section by section. Blinking cursor on the active section. Smooth scroll as content grows.
- `HistoryPanel.jsx` — Sidebar list of all previously generated documents from SQLite. Click to reload any past doc. Delete individual entries. Shows filename, language, model, and timestamp.
- `ExportBar.jsx` — Downloads from FastAPI export endpoints: PDF (`/api/export/pdf/{id}`), Word DOCX (`/api/export/word/{id}`), Plain Text (`/api/export/text/{id}`), and clipboard copy.
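The streaming contract between the backend and DocOutput.jsx is plain SSE. A minimal FastAPI sketch of the idea (route name and payload shape are illustrative, not the actual main.py):

```python
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/api/generate")  # illustrative route; see main.py for the real one
async def generate():
    async def event_stream():
        # In the real app each section's tokens come from ollama_client;
        # here one fake token per section just shows the SSE framing.
        for section in ["File Description", "Data Flow Diagram", "Summary"]:
            payload = {"section": section, "token": "..."}
            yield f"data: {json.dumps(payload)}\n\n"
            await asyncio.sleep(0)  # yield control between events
        yield "data: [DONE]\n\n"

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```

Every generation follows the same fixed 7-section structure: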
| # | Section | Target Length |
|---|---|---|
| 01 | File Description | ~50 words — what this file does |
| 02 | Data Flow Diagram | ASCII art showing data movement |
| 03 | Modules & Functions | Annotated list of all functions, classes, variables |
| 04 | Trigger / Entry Point | Where and how execution starts |
| 05 | Detailed Explanations | ~50 words per function/class |
| 06 | Final Outcome | What the code ultimately produces |
| 07 | Summary | ~100 word technical overview |
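For illustration, the per-section prompts in ollama_client.py might be organized as a simple template table (a hypothetical sketch; the real prompts are individually tuned):

```python
# Hypothetical layout; the actual ollama_client.py tunes each prompt separately.
SECTION_PROMPTS: dict[str, str] = {
    "file_description": "In about 50 words, describe what this file does.\n\n{code}",
    "data_flow":        "Draw an ASCII diagram of how data moves through this code.\n\n{code}",
    "summary":          "Write a ~100 word technical overview of this file.\n\n{code}",
    # ...one focused prompt for each of the remaining sections
}

def build_prompt(section: str, code: str) -> str:
    """Fill the per-section template with the (chunked) code."""
    return SECTION_PROMPTS[section].format(code=code)
```

Planned roadmap: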
- Multi-file upload queue — Select and document multiple files in a single session, with per-file status tracking
- Codebase / folder scanning — Point at a directory and auto-discover all source files by extension
- File tree explorer — Visual sidebar showing the directory structure of a scanned project
- Cross-file dependency mapping — Identify and document imports, function calls, and data flows between files
- Configurable document structure — Let users enable/disable sections, reorder them, or add custom sections
- Incremental re-generation — Regenerate only the sections that changed, not the entire document
- Model parameter tuning UI — Expose temperature, context length, and system prompt from the interface
- Markdown export — Direct `.md` output for README generation and wiki publishing
- Git integration — Auto-detect changed files since the last commit and offer targeted re-documentation
- Dark / light theme toggle — Currently dark only; add a light mode for daytime use
- RAG-based codebase Q&A — Ask questions about your codebase using a local vector store + Ollama
- Automated README generator — Generate a full project-level README from multiple file docs
- CI/CD hook — CLI mode for documentation generation in a pipeline (
codedoc generate ./src) - Multi-language project support — Polyglot codebases with mixed language detection per file
- Plugin system — Allow custom prompt templates and section definitions via config files
Project structure:

```
codedoc/
├── backend/
│ ├── main.py ← FastAPI app (routes, SSE stream, export endpoints)
│ ├── ollama_client.py ← Async Ollama client + 7 focused prompts
│ ├── chunker.py ← Code splitter + language auto-detection
│ ├── exporter.py ← PDF (reportlab) + Word (python-docx) generation
│ ├── storage.py ← SQLite history (save / load / delete)
│ ├── codedoc.db ← Auto-created on first run (gitignored)
│ └── requirements.txt
├── frontend/
│ ├── src/
│ │ ├── App.jsx ← Root component, SSE stream handler
│ │ ├── main.jsx ← React entry point
│ │ ├── index.css ← Global CSS variables + animations
│ │ └── components/
│ │ ├── Header.jsx ← Ollama status + model selector
│ │ ├── CodeInput.jsx ← Upload / paste + syntax preview
│ │ ├── DocOutput.jsx ← Streaming section renderer
│ │ ├── HistoryPanel.jsx ← Saved docs sidebar
│ │ └── ExportBar.jsx ← PDF / Word / Text / Copy export
│ ├── index.html
│ ├── package.json
│ └── vite.config.js ← Dev proxy: /api → localhost:8000
└── README.md
```

Prerequisites:
- Python 3.10+
- Node.js 18+
- Ollama installed with at least one supported model
```
ollama pull llama3.1:8b   # Best general-purpose choice
ollama pull mistral:7b    # Fast, strong at structured output
ollama pull codellama:7b  # Best accuracy for code-heavy files
ollama pull gemma:7b      # Good alternative if others are slow
```

Minimum hardware: 8 GB RAM + 8 GB VRAM (GPU). CPU-only mode is supported but significantly slower.
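Once Ollama is running, you can sanity-check which models are installed by querying Ollama's standard `/api/tags` endpoint (plausibly what the backend's `/api/health` wraps; this snippet is just a quick check, not part of the tool):

```python
import httpx

# List locally installed models straight from Ollama's /api/tags endpoint.
resp = httpx.get("http://localhost:11434/api/tags", timeout=5.0)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])
```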
Set `OLLAMA_ORIGINS` and start Ollama.

Windows (run in a new terminal as Admin — close the system tray instance of Ollama first):

```
set OLLAMA_ORIGINS=*
ollama serve
```

macOS / Linux:

```
OLLAMA_ORIGINS=* ollama serve
```

Then install dependencies and start the backend:

```
cd backend
pip install -r requirements.txt
python main.py
```

Backend runs at: http://localhost:8000
API docs (auto-generated): http://localhost:8000/docs
In a second terminal, start the frontend:

```
cd frontend
npm install
npm run dev
```

Frontend runs at: http://localhost:5173

Vite proxies all `/api/*` requests to the FastAPI backend automatically — no CORS configuration needed.

To generate documentation:
- Open http://localhost:5173 in your browser
- The model dropdown auto-populates from your installed Ollama models
- Upload a code file (drag & drop) or paste code in the text tab
- Optionally click Syntax Preview to review with highlighting
- Select a language hint if auto-detection is off
- Click ⚡ Generate Documentation and watch sections stream in live
- When complete, export as PDF, Word (.docx), Plain Text, or Copy
- All generated docs are auto-saved — reload any past doc from the History sidebar
To serve the frontend directly from FastAPI (single server, no Vite needed):
```
cd frontend
npm run build          # Outputs built files to frontend/dist/
cd ../backend
python main.py         # Serves frontend/dist/ at http://localhost:8000
```
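For reference, serving the built frontend from FastAPI usually comes down to a static mount along these lines (a minimal sketch assuming the frontend/dist path; the real main.py may wire it differently):

```python
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

app = FastAPI()

# API routes are registered before the mount so /api/* is matched
# ahead of the catch-all static handler.

# html=True makes index.html the default document for the mount.
app.mount("/", StaticFiles(directory="../frontend/dist", html=True), name="frontend")
```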
Local 8B models are capable but have real constraints. This tool works around them:

- Code is chunked at ~2000 characters; only the first 3 chunks (~6000 chars) are used as prompt context
- Each of the 7 sections uses a separate, focused prompt — this avoids asking a small model to do too much at once
- Section prompts include explicit word-count targets to prevent runaway output
- Output quality varies significantly by model and code complexity — `codellama:7b` consistently performs best on code analysis
- Very large files (> 10,000 lines) will have their tail truncated. Full codebase support is a planned feature.
MIT License — see LICENSE for details.
This project is not affiliated with Ollama, Meta, Mistral AI, or any model provider. Model outputs are generated by third-party open-source models and may be inaccurate, incomplete, or misleading. Always review generated documentation before publishing or distributing it.