📄 CodeDoc Builder

AI-powered local documentation generator for developers. Analyze your code, generate structured technical docs, export to PDF or Word — all running entirely on your own machine.

Python · FastAPI · React · Ollama · MIT License


⚠️ Security Notice & Disclaimer

Read before use.

USE AT YOUR OWN RISK.

This software is provided "as is", without warranty of any kind, express or implied. The authors and contributors accept no responsibility for any damage, data loss, security incidents, or other consequences arising from the use of this software.

Important security points:

  • This tool is designed for local use only. Do not expose the FastAPI backend (port 8000) or Ollama (port 11434) to the public internet or an untrusted network.
  • The backend does not implement authentication or API key protection. Anyone on your local network who can reach port 8000 can submit code and trigger LLM generation.
  • Uploaded code files are read into memory and passed to your local LLM. They are not sent to any external server or third-party API, but they are stored in a local SQLite database (codedoc.db) as part of the history feature. Delete this file if you handle sensitive code.
  • The OLLAMA_ORIGINS=* environment variable disables Ollama's CORS restrictions entirely. This is required for the tool to function, but it should only be set in a trusted local development environment.
  • Do not process code containing secrets, API keys, credentials, or personally identifiable information (PII) unless you fully understand and accept the local storage implications described above.

🎯 Project Description

CodeDoc Builder is a developer productivity tool that automates the most tedious part of software development: writing documentation. Developers can upload any source file or paste code directly, and the tool produces a complete, structured technical document without any manual writing.

The project is intentionally built around local, offline LLMs via Ollama. This means:

  • No internet connection required after setup
  • No subscription fees or API costs
  • No code ever transmitted to a third party
  • Works on air-gapped machines and private codebases

The initial scope is single-file documentation. The architecture is designed to scale toward full codebase analysis in future versions.


🏹 Aim & Focus

Primary aim: Give individual developers a zero-cost, privacy-first way to generate professional documentation for their code in seconds, not hours.

Core focus areas:

| Focus | Description |
| --- | --- |
| Privacy first | Your code stays on your machine. No external APIs, no telemetry, no cloud calls. |
| 8B model compatibility | Prompts and chunking are engineered specifically for smaller 8B-parameter models that run on consumer hardware (8–16 GB VRAM). |
| Structured output | Documentation follows a consistent 7-section format on every generation — not freeform AI rambling. |
| Developer experience | Real-time streaming output, syntax-highlighted preview, one-click export. Minimal friction. |
| Extensibility | Clean separation between backend logic and frontend. Built to grow toward codebase-wide analysis. |

✅ Current Implementation

Backend (FastAPI + Python)

  • main.py — REST API with SSE (Server-Sent Events) streaming endpoint for real-time token delivery to the frontend. Handles file uploads via multipart/form-data.
  • ollama_client.py — Async Ollama client using httpx. Contains 7 individually tuned prompts, one per documentation section. Streams tokens directly from the Ollama /api/generate endpoint.
  • chunker.py — Splits code into ~2000-character chunks at newline boundaries and sends only the first 3 chunks as context to stay within 8B model token limits (~6000 chars per prompt). Includes file-extension → language auto-detection. A minimal sketch of this chunking strategy follows this list.
  • exporter.py — Server-side document generation. PDF via reportlab (dark-themed, paginated, page-break-safe). Word via python-docx with full markdown parsing (**bold**, ## headings, ### subheadings) rendered as proper Word styles in readable black text.
  • storage.py — SQLite-based persistence using the standard library sqlite3. Stores generated documents with metadata (filename, language, model, timestamp) and full section content. Supports list, load, and delete operations.
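
As a rough illustration of that chunking strategy, here is a minimal sketch. The function and constant names are hypothetical, not the actual chunker.py API:

# Illustrative sketch of the ~2000-character, newline-boundary chunking
# described above. Names are hypothetical, not the real chunker.py API.
CHUNK_SIZE = 2000   # target chunk size in characters
MAX_CHUNKS = 3      # only the first chunks become model context (~6000 chars)

def split_code(source: str, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split source into ~chunk_size pieces, breaking only at newline boundaries."""
    chunks, current, length = [], [], 0
    for line in source.splitlines(keepends=True):
        if length + len(line) > chunk_size and current:
            chunks.append("".join(current))
            current, length = [], 0
        current.append(line)
        length += len(line)
    if current:
        chunks.append("".join(current))
    return chunks

def build_context(source: str) -> str:
    """Keep only the first MAX_CHUNKS chunks to fit an 8B model's context window."""
    return "".join(split_code(source)[:MAX_CHUNKS])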

Frontend (React + Vite)

  • Header.jsx — Live Ollama connectivity check on load and every 20 seconds. Auto-populates the model dropdown by calling /api/health, showing only the models you actually have installed. Retry button for manual re-check.
  • CodeInput.jsx — Dual input mode: drag-and-drop file upload or direct code paste. Syntax-highlighted preview using react-syntax-highlighter (vscDarkPlus theme) with 30+ language support. Live line/character/chunk counter. Language hint selector.
  • DocOutput.jsx — Renders all 7 sections as they stream in, section by section. Blinking cursor on the active section. Smooth scroll as content grows.
  • HistoryPanel.jsx — Sidebar list of all previously generated documents from SQLite. Click to reload any past doc. Delete individual entries. Shows filename, language, model, and timestamp.
  • ExportBar.jsx — Downloads from FastAPI export endpoints: PDF (/api/export/pdf/{id}), Word DOCX (/api/export/word/{id}), Plain Text (/api/export/text/{id}), and clipboard copy. These endpoints can also be called directly; see the example below this list.
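
Because the export routes are plain HTTP GET endpoints, they also work outside the browser. A minimal sketch, assuming the backend is running locally; the document id is illustrative and would come from the History sidebar:

# Download a saved document as PDF directly from the export endpoint.
# The document id here is hypothetical; use a real one from your history.
import requests

doc_id = 1
resp = requests.get(f"http://localhost:8000/api/export/pdf/{doc_id}", timeout=60)
resp.raise_for_status()
with open("documentation.pdf", "wb") as f:
    f.write(resp.content)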

Generated Document Structure

| # | Section | Target Length |
| --- | --- | --- |
| 01 | File Description | ~50 words — what this file does |
| 02 | Data Flow Diagram | ASCII art showing data movement |
| 03 | Modules & Functions | Annotated list of all functions, classes, variables |
| 04 | Trigger / Entry Point | Where and how execution starts |
| 05 | Detailed Explanations | ~50 words per function/class |
| 06 | Final Outcome | What the code ultimately produces |
| 07 | Summary | ~100-word technical overview |

🚀 Future Enhancements

Near-term (v2.0)

  • Multi-file upload queue — Select and document multiple files in a single session, with per-file status tracking
  • Codebase / folder scanning — Point at a directory and auto-discover all source files by extension
  • File tree explorer — Visual sidebar showing the directory structure of a scanned project
  • Cross-file dependency mapping — Identify and document imports, function calls, and data flows between files
  • Configurable document structure — Let users enable/disable sections, reorder them, or add custom sections

Medium-term (v3.0)

  • Incremental re-generation — Regenerate only the sections that changed, not the entire document
  • Model parameter tuning UI — Expose temperature, context length, and system prompt from the interface
  • Markdown export — Direct .md output for README generation and wiki publishing
  • Git integration — Auto-detect changed files since last commit and offer targeted re-documentation
  • Dark / light theme toggle — Currently dark only; add a light mode for daytime use

Long-term (v4.0+)

  • RAG-based codebase Q&A — Ask questions about your codebase using a local vector store + Ollama
  • Automated README generator — Generate a full project-level README from multiple file docs
  • CI/CD hook — CLI mode for documentation generation in a pipeline (codedoc generate ./src)
  • Multi-language project support — Polyglot codebases with mixed language detection per file
  • Plugin system — Allow custom prompt templates and section definitions via config files

🗂️ Project Structure

codedoc/
├── backend/
│   ├── main.py            ← FastAPI app (routes, SSE stream, export endpoints)
│   ├── ollama_client.py   ← Async Ollama client + 7 focused prompts
│   ├── chunker.py         ← Code splitter + language auto-detection
│   ├── exporter.py        ← PDF (reportlab) + Word (python-docx) generation
│   ├── storage.py         ← SQLite history (save / load / delete)
│   ├── codedoc.db         ← Auto-created on first run (gitignored)
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── App.jsx                    ← Root component, SSE stream handler
│   │   ├── main.jsx                   ← React entry point
│   │   ├── index.css                  ← Global CSS variables + animations
│   │   └── components/
│   │       ├── Header.jsx             ← Ollama status + model selector
│   │       ├── CodeInput.jsx          ← Upload / paste + syntax preview
│   │       ├── DocOutput.jsx          ← Streaming section renderer
│   │       ├── HistoryPanel.jsx       ← Saved docs sidebar
│   │       └── ExportBar.jsx          ← PDF / Word / Text / Copy export
│   ├── index.html
│   ├── package.json
│   └── vite.config.js                 ← Dev proxy: /api → localhost:8000
└── README.md

🔧 Prerequisites

  • Python 3.10+
  • Node.js 18+
  • Ollama installed with at least one supported model

Recommended models

ollama pull llama3.1:8b       # Best general-purpose choice
ollama pull mistral:7b        # Fast, strong at structured output
ollama pull codellama:7b      # Best accuracy for code-heavy files
ollama pull gemma:7b          # Good alternative if others are slow

Minimum hardware: 8 GB RAM + 8 GB VRAM (GPU). CPU-only mode is supported but significantly slower.
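
Before starting the app, you can confirm Ollama is reachable and see which models you have pulled by querying Ollama's standard /api/tags endpoint, for example:

# Verify Ollama is running and list installed models via its standard
# /api/tags endpoint (Ollama's own API, separate from CodeDoc's /api/health).
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Installed models:", models or "none found - try `ollama pull llama3.1:8b`")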


⚙️ Setup & Running

Step 1 — Start Ollama with CORS open

Windows (run in a new terminal as Administrator — close the system-tray Ollama instance first):

set OLLAMA_ORIGINS=*
ollama serve
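
Note: the set syntax above is for cmd. In PowerShell, use $env:OLLAMA_ORIGINS="*" and then run ollama serve.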

macOS / Linux:

OLLAMA_ORIGINS=* ollama serve

Step 2 — Backend

cd backend
pip install -r requirements.txt
python main.py

Backend runs at: http://localhost:8000
API docs (auto-generated): http://localhost:8000/docs


Step 3 — Frontend

cd frontend
npm install
npm run dev

Frontend runs at: http://localhost:5173

Vite proxies all /api/* requests to the FastAPI backend automatically — no CORS configuration needed.


📖 Usage

  1. Open http://localhost:5173 in your browser
  2. The model dropdown auto-populates from your installed Ollama models
  3. Upload a code file (drag & drop) or paste code in the text tab
  4. Optionally click Syntax Preview to review with highlighting
  5. Select a language hint if auto-detection is off
  6. Click ⚡ Generate Documentation and watch sections stream in live
  7. When complete, export as PDF, Word (.docx), Plain Text, or Copy
  8. All generated docs are auto-saved — reload any past doc from the History sidebar

🏗️ Production Build

To serve the frontend directly from FastAPI (single server, no Vite needed):

cd frontend
npm run build             # Outputs built files to frontend/dist/

cd ../backend
python main.py            # Serves frontend/dist/ at http://localhost:8000
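
For reference, a common FastAPI pattern for serving a built SPA looks roughly like the sketch below. This illustrates the approach, not necessarily how the actual main.py is wired:

# One common way to serve a built SPA from FastAPI (illustrative only;
# the real main.py may differ). Register API routes before the mount,
# otherwise the catch-all static mount at "/" would shadow /api/* routes.
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

app = FastAPI()
# ... /api routes go here, before the catch-all static mount ...
app.mount("/", StaticFiles(directory="../frontend/dist", html=True), name="spa")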

🧠 Notes on 8B Model Accuracy

Local 8B models are capable but have real constraints. This tool works around them:

  • Code is chunked at ~2000 characters; only the first 3 chunks (~6000 chars) are used as prompt context
  • Each of the 7 sections uses a separate, focused prompt — this avoids asking a small model to do too much at once (a sketch of this pattern follows this list)
  • Section prompts include explicit word-count targets to prevent runaway output
  • Output quality varies significantly by model and code complexity — codellama:7b consistently performs best on code analysis
  • Very large files (> 10,000 lines) will have their tail truncated. Full codebase support is a planned feature.
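
To make the per-section pattern concrete, here is a minimal sketch of streaming one focused prompt from Ollama's /api/generate endpoint with httpx. The function name and prompt assembly are hypothetical; the actual ollama_client.py defines seven tuned prompts, one per section:

# Stream tokens for a single documentation section from Ollama.
# The prompt wording and function name are illustrative, not the
# real ollama_client.py code.
import json
import httpx

async def stream_section(model: str, section_prompt: str, code_context: str):
    """Yield tokens for one documentation section as Ollama generates them."""
    payload = {
        "model": model,
        "prompt": f"{section_prompt}\n\nCODE:\n{code_context}",
        "stream": True,
    }
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("POST", "http://localhost:11434/api/generate", json=payload) as resp:
            resp.raise_for_status()
            async for line in resp.aiter_lines():
                if not line:
                    continue
                data = json.loads(line)
                if data.get("done"):
                    break
                yield data.get("response", "")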

📄 License

MIT License — see LICENSE for details.

This project is not affiliated with Ollama, Meta, Mistral AI, or any model provider. Model outputs are generated by third-party open-source models and may be inaccurate, incomplete, or misleading. Always review generated documentation before publishing or distributing it.
