🪞 Mirror Web CLI v1.1.3

Professional Website Mirroring with Intelligent Framework Preservation & Enhanced Asset Processing

A powerful, universal website mirroring tool that intelligently detects and preserves framework structures while creating offline-ready websites. Works seamlessly with React, Next.js, Vue, Angular, Svelte, WordPress, and static sites.

✨ Key Features

🧠 Intelligent Framework Detection

Automatically detects 14+ frameworks (React, Vue, Angular, Next.js, Nuxt, Gatsby, Svelte, etc.)
Comprehensive pattern matching with confidence scoring
Framework-specific optimization strategies

🎨 Beautiful Terminal Experience

Modern UI with gradient effects and smooth animations
Professional progress tracking with step-by-step indicators
Color-coded status messages and comprehensive feedback

⚡ Advanced Asset Processing

Complete asset extraction and optimization (images, CSS, JS, fonts, icons, videos)
Smart URL rewriting for offline functionality
Framework-preserving structure generation
Comprehensive video support with 14+ video formats (.mp4, .webm, .ogg, etc.)

🧹 Clean Code Generation

Optional tracking script removal (analytics, GTM, Facebook Pixel)
Professional project structure ready for development
Offline-ready websites with localized resources
Next.js/React error handling for graceful offline operation

🆕 Auto-Differentiated Output Directories

Standard mirroring: Creates ./domain-standard/ directories
AI-enhanced mirroring: Creates ./domain-ai-enhanced/ directories
Easy comparison: Side-by-side analysis of different approaches
Organized workflow: Never overwrite previous results

🛠️ Recent Improvements (v1.1.3)

✅ Enhanced Environment Variable System

Priority-based .env loading with shell environment preservation
Improved OpenAI API key handling with multiple configuration sources
Better development workflow with .env.local support
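
The priority rule can be pictured with a small sketch. This is not the tool's actual loader: `parseEnv` and `mergeEnv` are hypothetical names, and the precedence shown (shell environment first, then .env.local, then .env) is an assumption based on the description above.

```javascript
// Hypothetical sketch of priority-based env loading; not the CLI's real code.

// Parse KEY=VALUE lines, skipping blank lines and # comments.
function parseEnv(text) {
  const out = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (!line || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, '$2');
    out[key] = value;
  }
  return out;
}

// Merge sources from highest to lowest priority. The first definition wins,
// so variables already set in the shell are never overwritten.
function mergeEnv(shellEnv, ...fileContents) {
  const merged = { ...shellEnv };
  for (const text of fileContents) {
    for (const [key, value] of Object.entries(parseEnv(text))) {
      if (!(key in merged)) merged[key] = value;
    }
  }
  return merged;
}

const env = mergeEnv(
  { OPENAI_API_KEY: 'sk-from-shell' },        // shell environment
  'OPENAI_API_KEY="sk-from-local"\nDEBUG=1',  // assumed .env.local contents
  'DEBUG=0\nTIMEOUT=120000'                   // assumed .env contents
);
console.log(env.OPENAI_API_KEY, env.DEBUG, env.TIMEOUT);
// sk-from-shell 1 120000
```

Under these assumptions a key exported in the terminal always beats a key in a file, which matches the "shell environment preservation" wording.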

✅ Next.js Image Optimizer Support

Robust handling of /_next/image endpoints with HTTP 402 avoidance
Original image extraction from optimizer URLs
Runtime asset rewriting with DOM mutation observer
Enhanced offline compatibility for Next.js applications

✅ Advanced Asset Processing

Microlink integration for screenshot services
Comprehensive hover/popover content capture
Responsive image support with srcset rewriting
Enhanced video and audio processing with extended timeouts

✅ Smart Output Organization

Auto-differentiated directories prevent accidental overwrites
Easy comparison between standard and AI-enhanced results
Professional project organization

🎥 Demo Video

Watch the tool in action: YouTube Demo.

🚀 Quick Start

Installation Options

Option 1: Global Installation (Recommended)

# Install globally from npm registry
npm install -g mirror-web-cli

# Verify installation
mirror-web-cli --version

Option 2: One-time Usage with npx

# Run directly without installation
npx mirror-web-cli https://example.com

Option 3: Development Installation

# Clone repository for development/customization
git clone https://github.com/SanjeevSaniel/mirror-web-cli.git
cd mirror-web-cli
npm install

# Run from source
node src/cli.js https://example.com

OpenAI API Key Setup (Required for AI Features)

🚨 IMPORTANT: AI features need your own OpenAI API key, supplied through the OPENAI_API_KEY environment variable or the --openai-key option. The package does NOT include pre-configured API keys.

Step 1: Get OpenAI API Key

Visit OpenAI Platform
Create account and generate API key
Copy your API key (starts with sk-)

Step 2: Set Environment Variable (REQUIRED)

Windows PowerShell:

# Temporary (current session only)
$env:OPENAI_API_KEY="sk-your-api-key-here"

# Permanent (recommended)
[System.Environment]::SetEnvironmentVariable('OPENAI_API_KEY', 'sk-your-api-key-here', 'User')

# Verify setup
echo $env:OPENAI_API_KEY

Windows Command Prompt:

REM Temporary (current session only)
set OPENAI_API_KEY=sk-your-api-key-here

REM Verify setup
echo %OPENAI_API_KEY%

macOS/Linux (Bash/Zsh):

# Temporary (current session only)
export OPENAI_API_KEY="sk-your-api-key-here"

# Permanent (add to ~/.bashrc or ~/.zshrc)
echo 'export OPENAI_API_KEY="sk-your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

# Verify setup
echo $OPENAI_API_KEY

Alternative: Command Line Parameter

# Pass API key directly (not recommended: the key is saved in shell history and visible in process listings)
mirror-web-cli https://example.com --ai --openai-key "sk-your-key-here"

Step 3: Verify Setup

# Test basic functionality (should work without API key)
mirror-web-cli https://example.com --debug

# Test AI functionality (requires API key)
mirror-web-cli https://example.com --ai --debug

Requirements:

✅ OpenAI API keys only (must start with sk-)
✅ GPT-4o model for intelligent analysis
✅ Active OpenAI account with billing setup
✅ Terminal environment setup (no pre-configured keys)

Basic Usage (After Installation)

Standard Website Mirroring

# Basic mirroring (no AI, works immediately after install)
mirror-web-cli https://example.com
# → Creates: ./example.com-standard/

# Clean mirroring (removes tracking scripts)
mirror-web-cli https://react-site.com --clean
# → Creates: ./react-site.com-standard/

# Custom output directory
mirror-web-cli https://vue-app.com -o ./my-project
# → Creates: ./my-project/

# Debug mode with detailed logging
mirror-web-cli https://complex-site.com --debug
# → Shows detailed processing information

AI-Enhanced Mirroring (Requires API Key Setup)

# FIRST: Set up API key (see above section)
export OPENAI_API_KEY="sk-your-api-key-here"  # Linux/macOS
# or
$env:OPENAI_API_KEY="sk-your-api-key-here"    # Windows PowerShell

# THEN: Use AI features
mirror-web-cli https://example.com --ai
# → Creates: ./example.com-ai-enhanced/

# AI + Clean mirroring
mirror-web-cli https://complex-app.com --ai --clean
# → Creates: ./complex-app.com-ai-enhanced/

Error Handling (No API Key)

# If you try AI features without API key setup:
mirror-web-cli https://example.com --ai

# You'll see:
# ⚠️ AI features requested but no OPENAI_API_KEY found
# Add OPENAI_API_KEY to your environment...
# Continuing without AI features...

📁 Auto-Differentiated Output Directories

Mirror Web CLI automatically creates different output directories based on the analysis method:

Standard: ./domain-standard (e.g., ./example.com-standard)
AI-Enhanced: ./domain-ai-enhanced (e.g., ./example.com-ai-enhanced)
Custom: Uses your specified path with -o flag

This allows easy comparison between different analysis approaches and organized project management.
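
The naming rule above can be sketched as a small function. `resolveOutputDir` is a hypothetical helper, not part of the CLI, and it assumes the hostname is used as-is (for example, no stripping of `www.`).

```javascript
// Hypothetical helper mirroring the naming rules described above.
function resolveOutputDir(targetUrl, { ai = false, output } = {}) {
  if (output) return output;                 // an explicit -o path always wins
  const host = new URL(targetUrl).hostname;  // e.g. "example.com"
  return `./${host}-${ai ? 'ai-enhanced' : 'standard'}`;
}

console.log(resolveOutputDir('https://example.com'));
// ./example.com-standard
console.log(resolveOutputDir('https://example.com', { ai: true }));
// ./example.com-ai-enhanced
console.log(resolveOutputDir('https://vue-app.com', { output: './my-project' }));
// ./my-project
```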

Serving the Output

# The tool generates a complete project structure
cd ./example.com-standard  # or ./example.com-ai-enhanced

# Use any static server to serve the mirrored site
python -m http.server 8000
# Open http://localhost:8000

# Or use Node.js static server
npx serve .

🎯 How It Works

1. Intelligent Page Loading

Launches headless browser with optimized settings
Waits for framework-specific elements (#__next, #root, #app)
Performs scroll-to-bottom for lazy-loaded content
Waits for images and network idle state

2. Framework Analysis Engine

📊 Detection Methods:
├── Script Source Analysis    → Framework bundles & runtime files
├── DOM Element Inspection   → Framework-specific containers  
├── Meta Tag Analysis        → Generator tags & signatures
├── Content Pattern Matching → Component structures
├── CSS Class Analysis       → Framework styling patterns
├── JSON Data Detection      → State management structures
└── Link Href Analysis       → Framework asset paths
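
As a rough illustration of pattern matching with confidence scoring, the sketch below covers three of the detection methods (script sources, DOM elements, meta tags). The weights, the `page` summary shape, and the function names are assumptions made for this example; the real analyzer may score differently.

```javascript
// Hypothetical scoring sketch; weights and the page summary shape are assumptions.
const frameworks = {
  nextjs: {
    name: 'Next.js',
    patterns: [
      { type: 'script', pattern: /\/_next\/static\//, weight: 3 },
      { type: 'element', selector: '#__next', weight: 2 },
    ],
  },
  wordpress: {
    name: 'WordPress',
    patterns: [
      { type: 'meta', name: 'generator', pattern: /wordpress/i, weight: 3 },
      { type: 'script', pattern: /\/wp-(content|includes)\//, weight: 2 },
    ],
  },
};

// `page` is a pre-extracted summary: script URLs, selectors known to exist, meta tags.
function matchesPattern(p, page) {
  if (p.type === 'script') return page.scripts.some((src) => p.pattern.test(src));
  if (p.type === 'element') return page.selectors.includes(p.selector);
  if (p.type === 'meta') return p.pattern.test(page.meta[p.name] || '');
  return false;
}

// Confidence = matched weight / total weight, as a percentage, highest first.
function detectFrameworks(page) {
  return Object.values(frameworks)
    .map(({ name, patterns }) => {
      const total = patterns.reduce((sum, p) => sum + p.weight, 0);
      const hit = patterns
        .filter((p) => matchesPattern(p, page))
        .reduce((sum, p) => sum + p.weight, 0);
      return { name, confidence: Math.round((hit / total) * 100) };
    })
    .sort((a, b) => b.confidence - a.confidence);
}

// Only the script signal is present here, so Next.js scores 3 of 5.
const samplePage = {
  scripts: ['/_next/static/chunks/main-abc123.js'],
  selectors: [],
  meta: {},
};
console.log(detectFrameworks(samplePage)[0]);
// { name: 'Next.js', confidence: 60 }
```

Weighting strong signals (bundle paths) above weak ones (generic container ids) keeps a page with a stray `#app` element from being reported with high confidence.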

3. Comprehensive Asset Extraction

🎯 Asset Categories:
├── 🖼️  Images     → src, srcset, lazy attributes, backgrounds
├── 🎨 Stylesheets → External CSS + inline styles with url() rewriting
├── ⚙️  Scripts    → External JS + inline scripts (with optional cleaning)
├── 🔠 Fonts      → Web fonts and icon fonts
├── 🎭 Icons      → Favicons and app icons
└── 🎥 Media      → Videos (.mp4, .webm, .ogg, .avi, .mov, etc.), audio files
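
The url() rewriting mentioned for stylesheets can be sketched as follows. `rewriteCssUrls` and the URL-to-local-path map are hypothetical, and the regular expression is deliberately simple: it does not handle data URIs or paths that contain quotes or parentheses beyond skipping plain `data:` references.

```javascript
// Hypothetical sketch: rewrite url(...) references in CSS using a URL -> local path map.
function rewriteCssUrls(css, stylesheetUrl, localPaths) {
  return css.replace(/url\(\s*(['"]?)([^'")]+)\1\s*\)/g, (match, quote, ref) => {
    if (ref.startsWith('data:')) return match;           // leave inline data URIs alone
    const absolute = new URL(ref, stylesheetUrl).href;   // resolve relative references
    const local = localPaths.get(absolute);
    return local ? `url(${quote}${local}${quote})` : match;
  });
}

const localPaths = new Map([
  ['https://example.com/static/fonts/inter.woff2', '../fonts/asset_font.woff2'],
  ['https://example.com/img/hero.png', '../images/asset_hero.png'],
]);

const sampleCss = [
  '@font-face { src: url("../fonts/inter.woff2"); }',
  '.hero { background: url(/img/hero.png); }',
  '.dot { background: url(data:image/gif;base64,R0lGOD); }',
].join('\n');

const rewrittenCss = rewriteCssUrls(
  sampleCss,
  'https://example.com/static/css/site.css',
  localPaths
);
console.log(rewrittenCss);
```

References are resolved against the stylesheet's own URL rather than the page URL, because relative paths in CSS are relative to the stylesheet.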

4. Smart URL Rewriting

Converts all absolute URLs to relative paths
Creates organized asset directory structure
Generates short, stable, hashed filenames
Maintains proper file extensions and MIME types
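
One way to get short, stable, hashed filenames is to hash the asset URL and keep the original extension. In this sketch the `asset_` prefix and the 10-hex-digit length follow the debug log example later in this README; the function name, the choice of SHA-1, and the `.bin` fallback are assumptions, and MIME-type sniffing is left out.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Hypothetical sketch: derive a short, stable local filename from an asset URL.
function hashedFilename(assetUrl, fallbackExt = '.bin') {
  const { pathname } = new URL(assetUrl);
  // Take the extension from the path only, so query strings do not leak into it.
  const ext = path.posix.extname(pathname).toLowerCase() || fallbackExt;
  const digest = crypto
    .createHash('sha1')
    .update(assetUrl)
    .digest('hex')
    .slice(0, 10);
  return `asset_${digest}${ext}`;
}

// The same URL always maps to the same name; different URLs get different names.
console.log(hashedFilename('https://example.com/img/logo.PNG?v=2'));
console.log(hashedFilename('https://example.com/api/avatar'));
```

Because the name depends only on the URL, re-running the mirror produces the same paths, which keeps rewritten references valid across runs.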

5. Framework-Preserving Output

📁 Output Structure:
website.com/
├── index.html           # Main page with framework intact
├── package.json         # Project metadata & serve scripts
├── README.md            # Usage instructions
├── server.js            # Optional Node.js static server
└── assets/
    ├── images/          # All images with optimized names
    ├── css/             # Stylesheets with localized assets
    ├── js/              # JavaScript files (cleaned if --clean)
    ├── fonts/           # Web fonts and typography
    ├── icons/           # Favicons and app icons
    └── media/           # Videos (.mp4, .webm, .ogg), audio files, and other media

Next.js + Microlink offline support (v1.0.2)

Modern sites often use:

Next.js Image Optimizer: /_next/image?url=<original>&w=<size>&q=<quality>
Microlink-based previews: https://api.microlink.io/?url=... returning either JSON or direct images

This tool:

Skips downloading /_next/image directly (avoids 402s)
Extracts the original image URL from the url= param and downloads that
Aliases /_next/image?... to the same local file as the original
Injects a runtime MutationObserver rewriter that:
- Rewrites src, href, poster, inline style background-image
- Rewrites srcset and imagesrcset (browsers prefer srcset over src)
- Handles dynamically added DOM (hover cards, popovers, etc.)
Captures Microlink responses; if JSON, follows to the actual screenshot URL and downloads bytes
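
Extracting the original image from an optimizer URL comes down to reading the `url` query parameter. A minimal sketch, assuming a hypothetical helper name and using only the standard `URL` API:

```javascript
// Hypothetical sketch: recover the original image URL from a Next.js optimizer URL.
function originalFromNextImage(src, pageUrl) {
  const parsed = new URL(src, pageUrl);
  // endsWith also covers sites served under a base path, e.g. /docs/_next/image
  if (!parsed.pathname.endsWith('/_next/image')) return null;
  const original = parsed.searchParams.get('url');  // already percent-decoded
  return original ? new URL(original, pageUrl).href : null;
}

const pageUrl = 'https://example.com/blog';

console.log(originalFromNextImage('/_next/image?url=%2Fimages%2Fhero.png&w=1080&q=75', pageUrl));
// https://example.com/images/hero.png
console.log(originalFromNextImage('/_next/image?url=https%3A%2F%2Fcdn.example.org%2Fa.jpg&w=640&q=75', pageUrl));
// https://cdn.example.org/a.jpg
console.log(originalFromNextImage('/logo.svg', pageUrl));
// null
```

Every width and quality variant of the same image resolves to one original, which is why all /_next/image?... variants can be aliased to a single local file.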

Verification

Run with --debug and open DevTools Console
Interact with the page (e.g., hover “Preview” links)

Look for lines like:

[MW rewrite] imagesrcset: /_next/image?url=... -> ./assets/images/asset_dc814d3448.png 1x, ...

Open the local asset path (e.g., http://localhost:8000/assets/images/asset_dc814d3448.png)
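
The imagesrcset log line above reflects a per-candidate rewrite: each URL in the list is replaced while its descriptor (1x, 2x, 640w) is kept. A minimal sketch, assuming a hypothetical `rewriteSrcset` helper and a caller-supplied lookup function; splitting on commas is a simplification that breaks for URLs containing commas.

```javascript
// Hypothetical sketch: rewrite each candidate in a srcset string, keeping its descriptor.
function rewriteSrcset(srcset, toLocal) {
  return srcset
    .split(',')
    .map((candidate) => candidate.trim())
    .filter(Boolean)
    .map((candidate) => {
      const [url, ...descriptor] = candidate.split(/\s+/);
      // Fall back to the original URL when no local copy is known.
      return [toLocal(url) || url, ...descriptor].join(' ');
    })
    .join(', ');
}

const aliases = new Map([
  ['/_next/image?url=%2Fa.png&w=640&q=75', './assets/images/asset_a.png'],
  ['/_next/image?url=%2Fa.png&w=1280&q=75', './assets/images/asset_a.png'],
]);

const rewrittenSrcset = rewriteSrcset(
  '/_next/image?url=%2Fa.png&w=640&q=75 1x, /_next/image?url=%2Fa.png&w=1280&q=75 2x',
  (url) => aliases.get(url)
);
console.log(rewrittenSrcset);
// ./assets/images/asset_a.png 1x, ./assets/images/asset_a.png 2x
```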

Troubleshooting (quick)

Blank hover/popover preview
- Serve over HTTP (not file://)
- Ensure srcset/imagesrcset are being rewritten (use --debug)
- Open the local asset URL from logs; if 404, rebuild the mirror
HTTP 402 from Next.js /_next/image
- Expected; the tool avoids these endpoints and downloads the original target from url=

Helpful snippet to locate candidates:

document.querySelectorAll('img, [style]').forEach(n => {
  const src = n.currentSrc || n.getAttribute('src') || '';
  const styleAttr = n.getAttribute('style') || '';
  const bg = getComputedStyle(n).backgroundImage || '';
  const hay = [src, styleAttr, bg].join(' ');
  if (/(microlink|_next\/image|og|twitter|card)/i.test(hay)) {
    console.log('el:', n, { src, styleAttr, bg });
  }
});

🔧 CLI Reference

Usage: mirror-web-cli <url> [options]

Arguments:
  url                     Target website URL to mirror

Options:
  -o, --output <dir>      Custom output directory (default: ./<domain>-standard, or ./<domain>-ai-enhanced with --ai)
  --clean                 Remove tracking scripts and analytics
  --ai                    Enable AI-powered analysis (requires OpenAI API key)
  --openai-key <key>      OpenAI API key for AI features (or set OPENAI_API_KEY env var)
  --debug                 Enable detailed debug logging
  --timeout <ms>          Page load timeout in milliseconds (default: 120000)
  --headless <bool>       Run browser in headless mode (default: true)
  -h, --help              Show help information
  -V, --version           Show version number

OpenAI API Key Priority

The tool checks for OpenAI API keys in this order:

--openai-key command line parameter
OPENAI_API_KEY environment variable
If neither is found, AI features are disabled with a helpful message
Keys must start with sk- (validated automatically)
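
The lookup order can be sketched in a few lines. `resolveOpenAiKey` is a hypothetical name, and the sketch assumes the first non-empty source is the one that gets validated (it does not fall through to the environment when the command-line key is malformed).

```javascript
// Hypothetical sketch of the lookup order described above.
function resolveOpenAiKey(cliKey, env = process.env) {
  const key = cliKey || env.OPENAI_API_KEY || '';
  if (!key) return { key: null, reason: 'no OPENAI_API_KEY found' };
  if (!key.startsWith('sk-')) return { key: null, reason: 'key must start with sk-' };
  return { key, reason: null };
}

// The command-line value takes priority over the environment.
console.log(resolveOpenAiKey('sk-from-cli', { OPENAI_API_KEY: 'sk-from-env' }).key);
// sk-from-cli
console.log(resolveOpenAiKey(undefined, { OPENAI_API_KEY: 'sk-from-env' }).key);
// sk-from-env
console.log(resolveOpenAiKey(undefined, {}).reason);
// no OPENAI_API_KEY found
```

Returning a reason instead of throwing lets the caller print a message and continue without AI features, as the CLI is described as doing.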

🏗️ Framework Support

Framework	Detection	Preservation	Output Quality
React	✅ High confidence	✅ Component structure	⭐⭐⭐⭐⭐
Next.js	✅ Advanced patterns	✅ SSR/SSG structure	⭐⭐⭐⭐⭐
Vue.js	✅ Reactive patterns	✅ Template structure	⭐⭐⭐⭐⭐
Nuxt	✅ SSR detection	✅ Module organization	⭐⭐⭐⭐⭐
Angular	✅ Component analysis	✅ Module structure	⭐⭐⭐⭐⭐
Svelte	✅ Store patterns	✅ Component logic	⭐⭐⭐⭐⭐
Gatsby	✅ GraphQL detection	✅ Static generation	⭐⭐⭐⭐⭐
WordPress	✅ Theme detection	✅ Content structure	⭐⭐⭐⭐
Static Sites	✅ Always works	✅ Clean HTML/CSS/JS	⭐⭐⭐⭐⭐

🧪 Usage Examples

Basic Website Mirroring

# Simple static site
mirror-web-cli https://example.com
# → Creates: ./example.com-standard/ with complete offline functionality

React Application

# React SPA with complex routing
mirror-web-cli https://react-app.com --clean
# → Creates: ./react-app.com-standard/ preserves React structure, removes tracking, offline-ready

Next.js Website

# Next.js with image optimization and error handling
mirror-web-cli https://nextjs-site.com --clean
# → Creates: ./nextjs-site.com-standard/ with enhanced Next.js compatibility
# → Handles /_next/image URLs, fixes hydration issues, preserves SSR structure

E-commerce Site

# Complex site with lots of assets
mirror-web-cli https://shop.example.com --debug --clean
# → Creates: ./shop.example.com-standard/ with detailed logging, removes analytics

AI-Powered Analysis (OpenAI)

Windows PowerShell:

# Set environment variable first
$env:OPENAI_API_KEY="sk-proj-your-openai-key-here"
mirror-web-cli https://complex-app.com --ai --clean
# → Creates: ./complex-app.com-ai-enhanced/ with OpenAI GPT-4o framework analysis

macOS/Linux:

# Set environment variable first
export OPENAI_API_KEY="sk-proj-your-openai-key-here"
mirror-web-cli https://complex-app.com --ai --clean
# → Creates: ./complex-app.com-ai-enhanced/ with OpenAI GPT-4o framework analysis

Cross-platform (using CLI parameter):

# Compare standard vs AI-enhanced outputs; --openai-key works the same in any shell
mirror-web-cli https://react-app.com --clean                                        # → ./react-app.com-standard/
mirror-web-cli https://react-app.com --ai --clean --openai-key "sk-your-key-here"   # → ./react-app.com-ai-enhanced/

Development Workflow

# Mirror for development reference
mirror-web-cli https://design-system.com -o ./reference
cd ./reference
npm start  # Built-in development server

Video-Rich Websites

# Websites with hero videos (like VS Code, Apple, etc.)
mirror-web-cli https://code.visualstudio.com --clean
# → Downloads all video formats (.mp4, .webm), preserves video posters
# → Handles responsive video sources with media queries
# → Supports autoplay, muted, and poster attributes

# Complex video embedding
mirror-web-cli https://video-heavy-site.com --timeout 180000
# → Extended timeout for large video downloads
# → Maintains video element structure and JavaScript controls

🎨 Terminal UI Showcase

════════════════════════════════════════════════════════════════════════════════
                      🪞 Mirror Web CLI v1.1.3
                     Professional Website Mirroring
════════════════════════════════════════════════════════════════════════════════

✨ Features:
   • Intelligent framework detection (React, Vue, Angular, Next.js, etc.)
   • Framework-preserving output with professional structure
   • Comprehensive asset extraction and optimization
   • Clean code generation with tracking script removal

🚀 Quick Start:
   mirror-web-cli https://example.com
   mirror-web-cli https://react-app.com --clean -o ./my-project

Progress Tracking

╭──────────────────────────────────────────────────────────────────────────────╮
  ● Step 3/7  •  Framework Analysis
  Detecting technology stack and framework patterns...
╰──────────────────────────────────────────────────────────────────────────────╯

╭──────────────────────────────────────────────────────────────────────────────╮
  📦 Framework Analysis
  Framework:    Next.js
  Confidence:   95% ████████████████████░
  Complexity:   HIGH
  Strategy:     Preserve DOM; localize assets for exact Next.js look
╰──────────────────────────────────────────────────────────────────────────────╯

🛡️ Privacy & Security

Tracking Removal (--clean flag)

Google Analytics (gtag, ga, analytics.js)
Google Tag Manager (gtm, dataLayer)
Facebook Pixel (fbevents, facebook.com/tr)
Service Workers (registration scripts)
Third-party trackers (extensive database)
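
A simplified picture of how such a filter could work: each script is checked against a list of known tracker signatures, by source URL or by inline content. The pattern list below covers only the trackers named above and is an assumption for illustration, not the tool's actual database; `isTracker` and `removeTrackers` are hypothetical names.

```javascript
// Hypothetical sketch: classify scripts as trackers by src URL or inline content.
const TRACKER_PATTERNS = [
  /googletagmanager\.com\/(gtag\/js|gtm\.js)/i,  // gtag.js and GTM loaders
  /google-analytics\.com\/(analytics|ga)\.js/i,  // classic Google Analytics
  /connect\.facebook\.net\/.*\/fbevents\.js/i,   // Facebook Pixel loader
  /facebook\.com\/tr\b/i,                        // Facebook Pixel beacon
  /\bdataLayer\.push\(/,                         // inline GTM/gtag bootstrap
  /navigator\.serviceWorker\.register\(/,        // service worker registration
];

function isTracker({ src = '', inline = '' }) {
  return TRACKER_PATTERNS.some((re) => re.test(src) || re.test(inline));
}

function removeTrackers(scripts) {
  return scripts.filter((script) => !isTracker(script));
}

const pageScripts = [
  { src: 'https://www.googletagmanager.com/gtag/js?id=G-XXXX' },
  { inline: 'window.dataLayer=window.dataLayer||[];function gtag(){dataLayer.push(arguments);}' },
  { src: 'https://connect.facebook.net/en_US/fbevents.js' },
  { inline: "navigator.serviceWorker.register('/sw.js')" },
  { src: '/_next/static/chunks/main.js' },
];

console.log(removeTrackers(pageScripts));
// [ { src: '/_next/static/chunks/main.js' } ]
```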

Safety Considerations

Always respect robots.txt and terms of service
Ensure you have permission to mirror content
Use responsibly and ethically
Consider rate limiting for large sites

🏗️ Architecture Overview

src/
├── cli.js                    # Command-line interface & argument parsing
├── core/                     # Core functionality modules
│   ├── mirror-cloner.js      # Main orchestrator class
│   ├── browser-engine.js     # Puppeteer browser management
│   ├── framework-analyzer.js # Intelligent framework detection
│   ├── asset-manager.js      # Comprehensive asset extraction
│   ├── framework-writer.js   # Output generation & structure
│   ├── display.js            # Beautiful terminal UI system
│   ├── logger.js             # Logging & warning management
│   ├── file-writer.js        # File system operations
│   ├── filename-utils.js     # Smart filename generation
│   └── server.js             # Optional static server
└── ai/                       # AI-powered analysis (optional)
    └── ai-analyzer.js        # OpenAI integration for analysis

🧩 Extending the Tool

Adding New Framework Detection

// In src/core/framework-analyzer.js
this.frameworks.myframework = {
  name: 'My Framework',
  patterns: [
    { type: 'script', pattern: /myframework\.js/ },
    { type: 'element', selector: '#my-app' },
    { type: 'meta', name: 'generator', pattern: /myframework/i }
  ]
};

Custom Asset Processing

// In src/core/asset-manager.js
async extractCustomAssets() {
  // Add your custom asset extraction logic
}

🤝 Contributing

We welcome contributions! Here's how to get started:

# Development setup
git clone https://github.com/SanjeevSaniel/mirror-web-cli.git
cd mirror-web-cli
npm install

# Run tests
npm test

# Development with debugging
npm run dev -- https://example.com --debug

Key Areas for Contribution

Framework Detection: Add support for new frameworks
Asset Processing: Improve extraction algorithms
Output Optimization: Enhance generated code quality
Terminal UI: Improve user experience
Documentation: Help others understand the tool

🐛 Troubleshooting

Common Issues

"Cannot read properties of undefined" Error

Fixed in v1.0 - update to latest version
Use --debug flag for detailed error information

Incomplete Asset Loading

Increase timeout: --timeout 180000 (3 minutes)
Check network connectivity
Some dynamic content may require JavaScript enabled

Framework Not Detected

Use --debug to see detection process
Framework patterns may need updating for newer versions
Manual inspection may be needed for custom frameworks

Environment Variable Issues

Windows PowerShell "export command not found":

# ❌ Wrong (Bash syntax)
export OPENAI_API_KEY="sk-..."

# ✅ Correct (PowerShell syntax)
$env:OPENAI_API_KEY="sk-..."

Windows Command Prompt:

REM ✅ Correct (CMD syntax)
set OPENAI_API_KEY=sk-your-key-here

Verify environment variable is set:

# PowerShell
echo $env:OPENAI_API_KEY

# Command Prompt  
echo %OPENAI_API_KEY%

# Bash/Zsh
echo $OPENAI_API_KEY

AI Features Not Working

Verify OpenAI API key is set correctly (see above)
Check API key format: Must start with sk-
Ensure sufficient OpenAI credits/quota
Use --debug to see AI analysis process

Blank Screen or Empty Content

Iframe-based sites (like hitesh.ai):

Some sites are just iframe wrappers pointing to external URLs
Example: hitesh.ai loads hiteshchoudhary.com in an iframe

Solution: Mirror the actual content site directly:

# Instead of the wrapper
mirror-web-cli https://hitesh.ai

# Mirror the actual content
mirror-web-cli https://hiteshchoudhary.com --clean

Sites with heavy JavaScript dependencies:

Some React/Next.js sites may need additional processing
Try AI-enhanced mode for better framework handling:
```
mirror-web-cli https://your-site.com --ai --clean
```

Getting Help

Check the GitHub Issues
Use --debug flag for detailed logging
Include error output when reporting bugs

📊 Performance Stats

Average Processing Time: 15-45 seconds per site
Asset Extraction Rate: 95%+ success rate
Framework Detection Accuracy: 90%+ for supported frameworks
Memory Usage: Optimized for large sites (>1000 assets)

🙏 Acknowledgments

Special thanks to the amazing open-source community:

Puppeteer - Headless browser automation
Cheerio - Server-side HTML parsing
Chalk - Terminal styling
Commander - CLI framework
Sharp - Image processing

📄 License

MIT License - see LICENSE file for details.

Made with ❤️ by Sanjeev Saniel Kujur

Convert any website to universal HTML/CSS/JS with intelligent framework preservation!
