ChatMine

A powerful tool for importing, analyzing, and searching your AI chat conversations.

🚀 Built in 1 week as an experiment in AI-assisted development. Fully functional with 90% test coverage.

ChatMine allows you to import chat exports from Claude AI and ChatGPT, store them in a local SQLite database, and perform advanced text and semantic searches across your conversation history.

About This Project

I built ChatMine after realizing I had hundreds of AI conversations with valuable code snippets and solutions that I couldn't easily search through. What started as an interesting resume project (with AI enthusiastically encouraging me!) turned into a genuinely useful tool. This project was developed in 1 week with extensive AI assistance (Claude Code) as an experiment in rapid AI-driven development.

What I learned:

  • AI can dramatically accelerate development (1 week vs months)
  • Test coverage is crucial when using AI assistance (90% coverage)
  • Understanding your dependencies matters for long-term maintenance
  • Working software beats perfect understanding

Current Status:

  • ✅ Fully functional and tested
  • ✅ Solves a real problem: searching through months of AI conversations
  • ⚠️ I'm still learning some of the ML libraries it depends on (FAISS, sentence-transformers)
  • 🤝 Seeking contributors who know these libraries well!

Features

  • 🔍 Advanced Search: Full-text search and AI-powered semantic search
  • 💻 Code Extraction: Automatically extract, analyze, and search code blocks
  • 📁 Markdown Export: Export conversations to organized markdown files for Unix tools
  • 📊 Analytics Dashboard: Web interface with conversation statistics and insights
  • 💬 Multi-Platform Support: Import from Claude AI and ChatGPT exports
  • Optimized Search: FAISS vector indexing for semantic search scalability
  • 🌐 Web Interface: Browse conversations with a modern web UI
  • 📱 CLI Tools: Command-line interface for automation and scripting

Quick Start

Installation

# Clone the repository
git clone https://github.com/wizzardx/chatmine
cd chatmine

# Install with Rye (recommended - faster)
rye sync

# Or install with pip (development mode, ~1GB download for AI models)
pip install -e .

Import Your First Conversations

Note: The database will be automatically created and set up when you run your first command.

From Claude AI

  1. Go to claude.ai → Settings → Export Data
  2. Download your conversations as a ZIP file
  3. Import into ChatMine:
chatmine import-claude path/to/claude-export.zip

From ChatGPT

  1. Go to ChatGPT → Settings → Data Export
  2. Download your conversations as a ZIP file
  3. Import into ChatMine:
chatmine import-chatgpt path/to/chatgpt-export.zip
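
If an import fails, it can help to look inside the export yourself. The following is a minimal sketch, not ChatMine's importer: it assumes the ZIP contains a JSON file named conversations.json (a hypothetical default here, so check your own archive's listing), and load_conversations is an illustrative helper name.

```python
import json
import zipfile


def load_conversations(zip_path, member="conversations.json"):
    """Return the parsed JSON from one member of an export ZIP.

    The member name is an assumption; list the archive to confirm it.
    """
    with zipfile.ZipFile(zip_path) as archive:
        if member not in archive.namelist():
            raise FileNotFoundError(f"{member} not found in {zip_path}")
        with archive.open(member) as handle:
            # json.load accepts the binary file object returned by archive.open
            return json.load(handle)
```

Calling load_conversations("path/to/claude-export.zip") and checking the length of the result is a quick way to confirm the file is a readable export before importing it.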

Search Your Conversations

# Basic text search
chatmine search "python programming"

# Semantic search (AI-powered)
chatmine semantic-search "machine learning concepts"

# Search code blocks
chatmine code-search "function"

# Export conversations to markdown
chatmine export-conversations -o my_conversations

# View recent conversations
chatmine recent

Web Interface

Start the web server to browse conversations in your browser:

chatmine serve

Then open http://localhost:8000 in your browser.

CLI Commands

Command                   Description
import-claude <file>      Import Claude AI conversation export
import-chatgpt <file>     Import ChatGPT conversation export
search <query>            Full-text search across conversations
semantic-search <query>   AI-powered semantic search
recent                    Show recent conversations
stats                     Display database statistics
serve                     Start web interface
generate-embeddings       Generate embeddings for semantic search (required)
rebuild-index             Rebuild FAISS search index from existing embeddings (rarely needed)
code-search <query>       Search extracted code blocks
code-stats                Show statistics about extracted code blocks
export-code               Export code blocks to individual files
export-conversations      Export conversations to markdown files

Advanced Usage

Semantic Search Setup

For AI-powered semantic search, generate embeddings for your conversations:

# Generate embeddings (required for semantic search)
chatmine generate-embeddings

This downloads a small AI model (~90MB) and processes your conversations to enable semantic search. The command automatically builds the FAISS search index.

Only use rebuild-index if the search index gets corrupted:

# Rebuild search index from existing embeddings (rarely needed)
chatmine rebuild-index

Note: rebuild-index only rebuilds the search index from embeddings already stored in the database - it doesn't generate new embeddings. Most users only need generate-embeddings.
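
To make the idea behind embeddings and a search index concrete, here is a conceptual sketch that uses only the Python standard library. It is not how ChatMine is implemented and it does not call FAISS or sentence-transformers: the vectors are tiny hand-written placeholders (real embeddings have many more dimensions), and cosine_similarity and top_k are illustrative names. A vector index serves the same purpose as this brute-force loop, only at larger scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=3):
    """Rank stored vectors (dict of id -> vector) against the query, best first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Placeholder "embeddings" for three conversations.
stored_vectors = {"conv-a": [1.0, 0.0], "conv-b": [0.0, 1.0], "conv-c": [1.0, 1.0]}
results = top_k([1.0, 0.0], stored_vectors, k=2)
```

In this picture, generate-embeddings corresponds to producing the stored vectors, and rebuild-index to rebuilding the lookup structure over vectors that already exist.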

Code Block Extraction and Analysis

ChatMine automatically extracts code blocks during import and provides powerful search and analysis tools:

# Code extraction happens automatically during import!

# Search for specific code
chatmine code-search "function"
chatmine code-search --language python --code-type function

# View code statistics
chatmine code-stats

# Export code blocks to files
chatmine export-code --language python -o extracted_code/

Features:

  • Automatic extraction: Code blocks are extracted during import-claude and import-chatgpt
  • Auto-detection: Identifies 20+ programming languages
  • Smart classification: Categorizes code as functions, classes, snippets, etc.
  • Metadata extraction: Function names, class names, imports, comments
  • Export options: Save code blocks as individual files organized by language
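
As a rough illustration of what extraction involves, the sketch below pulls fenced code blocks and their language tags out of a message with a regular expression. It is an assumption about the general technique, not ChatMine's extractor, which also classifies code and collects metadata; extract_code_blocks is a hypothetical name.

```python
import re

# Build the three-backtick fence indirectly so this example can itself
# be shown inside a fenced block.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"([\w+#-]*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text):
    """Return a list of {'language', 'code'} dicts, one per fenced block."""
    blocks = []
    for match in CODE_BLOCK.finditer(text):
        # An empty language tag is reported as "unknown".
        language = match.group(1).lower() or "unknown"
        blocks.append({"language": language, "code": match.group(2)})
    return blocks
```

A block with no language tag comes back as "unknown" here; detecting the language from the code itself would need additional heuristics.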

Conversation Export to Markdown

Export your conversations to organized markdown files for use with Unix tools like rg, grep, and find:

# Preview what will be exported
chatmine export-conversations --preview

# Export all conversations
chatmine export-conversations -o my_conversations

# Export only Claude conversations from January 2024 onward
chatmine export-conversations --platform claude --date-from 2024-01

# Export in simple format (without metadata)
chatmine export-conversations --format simple

Directory Structure:

my_conversations/
├── claude/
│   ├── 2024-01/
│   │   ├── python-debugging-help.md
│   │   └── api-design-patterns.md
│   └── 2024-02/
│       └── machine-learning-concepts.md
├── chatgpt/
│   └── 2024-01/
│       └── react-component-design.md
└── chatmine_export_info.json
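
The layout above follows a platform/year-month/title pattern. A small sketch of how such a path could be derived is shown below; slugify and export_path are hypothetical helpers, and the exact slug rules ChatMine applies may differ.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(title):
    """Lowercase the title and replace runs of other characters with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def export_path(root, platform, created_at, title):
    """Build root/platform/YYYY-MM/slug.md for one conversation."""
    month_dir = created_at.strftime("%Y-%m")
    return Path(root) / platform / month_dir / f"{slugify(title)}.md"


example = export_path(
    "my_conversations", "claude", datetime(2024, 1, 15), "Python Debugging Help!"
)
```

Grouping by month keeps each directory small, which is convenient when narrowing a search with find or rg to a date range.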

Search with Unix Tools:

# Search across all conversations
rg "async.*await" my_conversations/

# Find conversations about debugging
rg -l "debug|bug|error" my_conversations/

# Count conversations by platform
find my_conversations/ -name "*.md" | cut -d/ -f2 | sort | uniq -c

# Search for specific code patterns
grep -r "useState" my_conversations/chatgpt/

Features:

  • Rich metadata: YAML frontmatter with topics, code blocks, timestamps
  • Platform URLs: Direct links to original conversations
  • Topic extraction: Automatic keyword detection
  • Unix-friendly: Works perfectly with rg, grep, find, awk, etc.
  • Flexible filtering: By platform, date range, or format type
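
If you want to read the frontmatter from a script, a sketch like the following separates it from the conversation body. It handles only flat key: value lines, so it is not a real YAML parser, and the field names in the usage note are hypothetical since the exact keys ChatMine writes are not listed here.

```python
def split_frontmatter(text):
    """Return (metadata, body) for a document that starts with a --- block.

    Only flat `key: value` lines are understood; nested YAML is ignored.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        # No closing delimiter, so treat the whole text as body.
        return {}, text
    metadata = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return metadata, body
```

For example, a file beginning with a block containing "platform: claude" would yield {"platform": "claude"} as metadata. For list-valued fields such as topics, a full YAML library is the better choice.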

Performance Characteristics

  • Text search: Faster for single queries, especially with smaller datasets
  • Semantic search: Better for finding conceptually related content, but has ~4s startup overhead for model loading
  • For repeated searches: Consider using the web interface (chatmine serve) where the semantic search model stays loaded in memory
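
The reason a long-running server avoids the startup cost is that the model is loaded once and then reused. The pattern can be sketched as below; this is a generic illustration with a placeholder object standing in for the real model, not ChatMine's code, and get_model and search are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model():
    """Load the model on first use only; later calls return the cached object."""
    # Placeholder for an expensive load, such as reading model weights from disk.
    return {"name": "placeholder-model"}


def search(query):
    """Run a search using the cached model."""
    model = get_model()
    return f"searched {query!r} with {model['name']}"
```

Each CLI invocation is a new process, so it pays the load cost every time, whereas a server process pays it once for its first query.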

Web Interface Features

The web interface provides:

  • Dashboard: Overview of your conversation statistics
  • Search: Both text and semantic search with highlighting
  • Conversations: Browse all conversations with pagination
  • Details: View individual conversations with formatted messages

Configuration

ChatMine stores data in your current directory:

  • chatmine.db - SQLite database with conversations, messages, and extracted code blocks
  • embeddings.npy - Vector embeddings for semantic search
  • faiss_index.pkl - FAISS search index

Export Outputs:

  • exported_conversations/ - Markdown files organized by platform and date
  • exported_code/ - Code blocks saved as individual files by language
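
Because chatmine.db is an ordinary SQLite file, it can also be queried with Python's built-in sqlite3 module. The sketch below runs against an in-memory database with a hypothetical messages table; the real table and column names in chatmine.db are not documented here, so inspect the schema before adapting it.

```python
import sqlite3


def search_messages(conn, term):
    """Return (conversation_id, content) rows whose content contains the term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT conversation_id, content FROM messages "
        "WHERE content LIKE ? ORDER BY id",
        (pattern,),
    )
    return rows.fetchall()


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY, conversation_id INTEGER, content TEXT)"
)
conn.executemany(
    "INSERT INTO messages (conversation_id, content) VALUES (?, ?)",
    [
        (1, "How do I use async and await in Python?"),
        (2, "Explain React hooks"),
    ],
)
matches = search_messages(conn, "python")
```

SQLite's LIKE is case-insensitive for ASCII characters by default, so the lowercase term matches "Python" in the sample row.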

Architecture & Technologies Used

ChatMine uses modern Python tools and libraries:

  • SQLAlchemy + Alembic: Database ORM and migrations
  • Click: CLI framework
  • FastAPI + Uvicorn: Web interface
  • Sentence Transformers: AI embeddings for semantic search (learning in progress)
  • FAISS: Facebook's similarity search library (learning in progress)
  • Rich: Beautiful terminal output
  • 90% Test Coverage: Comprehensive test suite with pytest

What This Demonstrates

  • Rapid Development: Built a complex application in 1 week
  • Modern Python Stack: Using current best practices and tools
  • Test-Driven: High test coverage despite rapid development
  • Full-Stack: CLI, web interface, and data processing
  • AI Tool Proficiency: Effective use of AI assistance for development

Who This Is For

ChatMine is perfect for:

  • Developers who want to build a searchable library from their AI conversations
  • Researchers tracking their AI-assisted research over time
  • Anyone who exports their AI chats for privacy but never looks at them again
  • Data enthusiasts curious about their AI usage patterns

If you've ever thought "I know I asked ChatGPT about this before..." then ChatMine is for you!

Contributing

I'm actively seeking contributors, especially those familiar with:

  • FAISS optimization
  • Sentence transformers and embeddings
  • FastAPI best practices
  • Frontend improvements

This is a great opportunity to contribute to a real, working tool!

Development

Requirements

  • Python 3.8+
  • Rye (recommended) or pip

Setup Development Environment

# Install dependencies
rye sync

# Run tests
./scripts/test.sh

# Format code
rye fmt

# Type checking
mypy --strict src

Running Tests

# Run all tests with coverage
rye test

# Run specific test file
rye test src/chatmine/test_cli.py

Troubleshooting

Import Issues

  • Ensure ZIP files are valid exports from Claude AI or ChatGPT
  • Check file permissions and disk space
  • Run with the --verbose flag for detailed error messages

Search Not Working

  • For semantic search: Run chatmine generate-embeddings first (this generates embeddings and builds the search index)
  • Check that your database contains conversations with chatmine stats
  • If the search index is corrupted: Run chatmine rebuild-index to rebuild it from existing embeddings

Web Interface Issues

  • Ensure port 8000 is available
  • Check firewall settings for local connections
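
One way to check whether the port is already taken is to try binding to it. This is a small standalone sketch; port_is_free is an illustrative helper and not part of ChatMine.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

If port_is_free(8000) returns False, another process is already bound to the port and needs to be stopped before running chatmine serve.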

My Journey

This project started when I realized I had hundreds of valuable AI conversations with no way to search through them effectively. While initially conceived as a resume project (with some encouragement from AI!), it solves a real problem that many AI users face:

  • Lost Knowledge: Solutions and code snippets buried in conversation history
  • No Analytics: No way to see patterns in how we use AI over time
  • Export Limbo: Platforms provide exports, but no tools to analyze them

While I leveraged AI extensively to build it quickly, I ensured quality through:

  • Comprehensive testing (90% coverage)
  • Clear documentation
  • Modular architecture
  • Solving a genuine problem

I'm continuing to deepen my understanding of the ML libraries used, and I welcome questions, suggestions, and contributions.

Future Plans

  • Add support for more AI platforms (Gemini, Perplexity)
  • Improve semantic search performance
  • Add conversation summarization
  • Create browser extension for automatic exports
  • Build conversation analytics features

License

MIT License - see the LICENSE file for details.

Contact

  • GitHub Issues: For bugs and feature requests
  • Discussions: For questions and ideas
  • LinkedIn: [Your LinkedIn] (currently seeking opportunities)

Built with ❤️ and AI assistance. If you find this useful, please star ⭐ the repository!