Enhanced Virtual Tables

Last Updated: September 23, 2025 1:48 PM EST

The SQLite MCP Server provides intelligent data import capabilities with automatic schema detection, type inference, and seamless conversion of CSV and JSON files into queryable SQLite tables.

πŸ“š See Real-World Use Cases for complete CSV/JSON import workflows, schema inference examples, and data processing pipelines.


πŸ”§ Available Enhanced Virtual Table Tools

Tool                           Description
create_enhanced_csv_table      Create CSV tables with automatic data type inference
create_json_collection_table   Create virtual tables for JSON file collections (JSONL, arrays)
analyze_csv_schema             Analyze CSV files and infer data types without creating tables
analyze_json_schema            Analyze JSON file collections and infer schema

πŸ“Š Smart CSV Import

Automatic Schema Detection

// 1. Analyze CSV structure
analyze_csv_schema({
  "csv_file_path": "./data/sales.csv"
})

// 2. Create enhanced virtual table
create_enhanced_csv_table({
  "table_name": "sales_data",
  "csv_file_path": "./data/sales.csv",
  "has_header": true
})

// 3. Query the data
read_query({
  "query": "SELECT * FROM sales_data WHERE amount > 1000"
})

Advanced CSV Processing

// Enhanced CSV table with custom options
create_enhanced_csv_table({
  "table_name": "customer_data",
  "csv_file_path": "./data/customers.csv",
  "has_header": true,
  "delimiter": ",",
  "auto_detect_types": true,
  "sample_size": 1000,  // Rows to sample for type inference
  "null_values": ["", "NULL", "N/A", "null"],
  "date_formats": ["YYYY-MM-DD", "MM/DD/YYYY", "DD-MM-YYYY"],
  "numeric_separators": {
    "decimal": ".",
    "thousands": ","
  }
})
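
Once the table exists, you can confirm which column types the inference settled on. A quick sketch, assuming the server's read_query accepts SELECTs over SQLite's built-in pragma table-valued functions:

// Inspect the inferred column types (pragma_table_info is a
// standard SQLite table-valued function, queried as a plain SELECT)
read_query({
  "query": "SELECT name, type FROM pragma_table_info('customer_data')"
})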

Schema Analysis Without Import

// Analyze CSV without creating table
analyze_csv_schema({
  "csv_file_path": "./data/large_dataset.csv",
  "sample_size": 5000,
  "detect_encoding": true,
  "analyze_quality": true
})

Returns:

  • Inferred column types (INTEGER, REAL, TEXT, DATE, BOOLEAN)
  • Data quality metrics (null percentage, unique values)
  • Encoding detection
  • Delimiter detection
  • Header detection
  • Sample data preview
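
As a sketch of what this looks like in practice (the field names below are illustrative assumptions, not a guaranteed response shape):

// Hypothetical analysis result
{
  "delimiter": ",",
  "encoding": "utf-8",
  "has_header": true,
  "columns": [
    { "name": "order_id", "type": "INTEGER", "null_pct": 0.0, "unique_values": 9842 },
    { "name": "amount",   "type": "REAL",    "null_pct": 1.2, "unique_values": 3120 },
    { "name": "region",   "type": "TEXT",    "null_pct": 0.0, "unique_values": 4 }
  ]
}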

πŸ“„ JSON Collection Processing

JSON Lines (JSONL) Files

// Create table from JSONL file
create_json_collection_table({
  "table_name": "api_logs",
  "json_file_path": "./logs/api_requests.jsonl",
  "format": "jsonl",
  "auto_flatten": true,
  "max_depth": 3
})

// Query flattened JSON data
read_query({
  "query": `
    SELECT 
      timestamp,
      method,
      url,
      status_code,
      response_time_ms
    FROM api_logs 
    WHERE status_code >= 400
  `
})
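
To make auto_flatten concrete: nested objects are expanded into columns up to max_depth. A sketch, assuming underscore-joined column names (the record below is hypothetical):

// Hypothetical JSONL record
// {"timestamp": "2025-09-23T14:30:00Z", "user": {"id": 42, "plan": "pro"}, "status_code": 200}
//
// With auto_flatten and max_depth 3, this yields columns such as:
// timestamp, user_id, user_plan, status_code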

JSON Array Files

// Import JSON array file
create_json_collection_table({
  "table_name": "product_catalog",
  "json_file_path": "./data/products.json",
  "format": "array",
  "schema_inference": true,
  "nested_objects": "flatten",  // flatten, json_column, separate_tables
  "arrays": "json_column"       // json_column, separate_rows, ignore
})
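
To see what the nested_objects and arrays options imply, consider a hypothetical product record and how each mode would represent it (column naming is illustrative):

// Hypothetical record: {"sku": "A1", "dims": {"w": 10, "h": 4}, "tags": ["sale", "new"]}
//
// nested_objects: "flatten"     → columns like dims_w, dims_h
// nested_objects: "json_column" → one dims column holding the raw JSON object
// arrays: "json_column"         → one tags column holding '["sale","new"]'
// arrays: "separate_rows"       → one output row per tags element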

Complex JSON Schema Analysis

// Analyze complex JSON structure
analyze_json_schema({
  "json_file_path": "./data/complex_data.json",
  "format": "array",
  "max_sample": 1000,
  "analyze_nested": true,
  "suggest_normalization": true
})

Returns:

  • Inferred schema with data types
  • Nested object structure analysis
  • Array handling suggestions
  • Normalization recommendations
  • Sample data extraction
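
A sketch of what the analysis might return (field names are illustrative assumptions):

// Hypothetical analysis result
{
  "records_sampled": 1000,
  "schema": {
    "id":   "INTEGER",
    "name": "TEXT",
    "attributes": { "color": "TEXT", "size": "TEXT" }   // nested object
  },
  "array_fields": ["tags"],
  "normalization_hint": "attributes repeat per record; consider a separate table"
}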

🎯 Intelligent Type Inference

Data Type Detection

The enhanced virtual tables automatically detect and convert:

Numeric Types:

// Automatically detected as INTEGER or REAL
"123"        β†’ INTEGER
"123.45"     β†’ REAL
"1,234.56"   β†’ REAL (handles thousands separators)
"$1,234.56"  β†’ REAL (removes currency symbols)

Date/Time Types:

// Various date formats automatically detected
"2025-09-23"           β†’ DATE
"09/23/2025"           β†’ DATE
"23-Sep-2025"          β†’ DATE
"2025-09-23 14:30:00"  β†’ DATETIME

Boolean Types:

// Boolean value detection
"true", "false"  → BOOLEAN
"yes", "no"      → BOOLEAN
"1", "0"         → BOOLEAN (contextual: typically only when a column holds solely these values)
"Y", "N"         → BOOLEAN

Text Types:

// Default fallback for complex data
"Mixed content 123"    β†’ TEXT
"Complex string data"  β†’ TEXT

Custom Type Mapping

// Override automatic type detection
create_enhanced_csv_table({
  "table_name": "custom_types",
  "csv_file_path": "./data/data.csv",
  "has_header": true,
  "type_overrides": {
    "id": "TEXT",           // Force ID as text
    "amount": "REAL",       // Ensure decimal precision
    "date": "DATE",         // Force date parsing
    "active": "BOOLEAN"     // Convert to boolean
  },
  "custom_parsers": {
    "amount": "CAST(REPLACE(REPLACE($1, '$', ''), ',', '') AS REAL)",
    "date": "date($1)"
  }
})
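
A quick sanity query after import can confirm the custom parsers yielded numeric values (column and table names taken from the example above):

// Spot-check that amount parsed to REAL rather than falling back to TEXT
read_query({
  "query": "SELECT MIN(amount) AS min_amount, MAX(amount) AS max_amount, COUNT(*) AS row_count FROM custom_types"
})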

πŸ” Data Quality Analysis

Comprehensive Data Profiling

// Analyze data quality during import
create_enhanced_csv_table({
  "table_name": "quality_analyzed",
  "csv_file_path": "./data/messy_data.csv",
  "has_header": true,
  "quality_analysis": true,
  "generate_report": true
})

Quality Metrics Provided:

  • Null value percentages per column
  • Unique value counts
  • Data type consistency
  • Outlier detection
  • Pattern analysis
  • Completeness scores
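
As an illustration only (the report shape is an assumption, not a documented format), a per-column entry might look like:

// Hypothetical per-column quality entry
{
  "column": "email",
  "null_pct": 4.7,            // percentage of null/empty values
  "unique_values": 9520,
  "type_consistency": 0.99,   // fraction of values matching the inferred type
  "completeness": 0.953
}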

Data Cleaning Suggestions

// Get cleaning recommendations
analyze_csv_schema({
  "csv_file_path": "./data/dirty_data.csv",
  "suggest_cleaning": true,
  "detect_duplicates": true,
  "validate_constraints": true
})

Cleaning Suggestions:

  • Duplicate row identification
  • Inconsistent formatting detection
  • Missing value patterns
  • Constraint violations
  • Normalization opportunities
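
Illustrative output (shape and wording are assumptions):

// Hypothetical cleaning suggestions
[
  { "kind": "duplicates",  "detail": "38 exact duplicate rows found" },
  { "kind": "formatting",  "detail": "phone mixes '(555) 123-4567' and '555-123-4567'" },
  { "kind": "constraints", "detail": "12 rows violate age >= 0" }
]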

πŸ’‘ Real-World Use Cases

E-commerce Data Import

// 1. Analyze product data structure
const schema = await analyze_csv_schema({
  "csv_file_path": "./import/products.csv",
  "analyze_quality": true,
  "suggest_normalization": true
});

// 2. Create enhanced table with optimizations
create_enhanced_csv_table({
  "table_name": "products_import",
  "csv_file_path": "./import/products.csv",
  "has_header": true,
  "auto_detect_types": true,
  "quality_analysis": true,
  "type_overrides": {
    "sku": "TEXT",
    "price": "REAL",
    "in_stock": "BOOLEAN"
  }
});

// 3. Validate and clean data
read_query({
  "query": `
    SELECT 
      COUNT(*) as total_rows,
      COUNT(DISTINCT sku) as unique_skus,
      COUNT(CASE WHEN price > 0 THEN 1 END) as valid_prices,
      COUNT(CASE WHEN name IS NOT NULL AND name != '' THEN 1 END) as named_products
    FROM products_import
  `
});

// 4. Import clean data to permanent table
write_query({
  "query": `
    INSERT INTO products (sku, name, price, category, in_stock)
    SELECT sku, name, price, category, in_stock
    FROM products_import
    WHERE sku IS NOT NULL 
    AND price > 0 
    AND name IS NOT NULL
  `
});

Log File Analysis

// 1. Analyze log file structure
analyze_json_schema({
  "json_file_path": "./logs/application.jsonl",
  "format": "jsonl",
  "max_sample": 10000,
  "analyze_nested": true
});

// 2. Create structured log table
create_json_collection_table({
  "table_name": "app_logs",
  "json_file_path": "./logs/application.jsonl",
  "format": "jsonl",
  "auto_flatten": true,
  "nested_objects": "flatten",
  "timestamp_columns": ["timestamp", "created_at"],
  "index_columns": ["level", "component", "timestamp"]
});

// 3. Analyze error patterns
read_query({
  "query": `
    SELECT 
      component,
      level,
      COUNT(*) as count,
      MIN(timestamp) as first_occurrence,
      MAX(timestamp) as last_occurrence
    FROM app_logs
    WHERE level IN ('ERROR', 'WARN')
    GROUP BY component, level
    ORDER BY count DESC
  `
});

Survey Data Processing

// 1. Import survey responses from JSON
create_json_collection_table({
  "table_name": "survey_responses",
  "json_file_path": "./data/survey_results.json",
  "format": "array",
  "auto_flatten": true,
  "nested_objects": "separate_tables",
  "arrays": "separate_rows"
});

// 2. Analyze response patterns
read_query({
  "query": `
    SELECT 
      question_id,
      response_value,
      COUNT(*) as frequency,
      COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY question_id) as percentage
    FROM survey_responses
    WHERE question_type = 'multiple_choice'
    GROUP BY question_id, response_value
    ORDER BY question_id, frequency DESC
  `
});

🎯 Best Practices

1. Always Analyze Before Import

// Analyze first to understand your data
const analysis = await analyze_csv_schema({
  "csv_file_path": "./data/unknown_data.csv",
  "analyze_quality": true,
  "suggest_cleaning": true
});

console.log(`Detected ${analysis.columns.length} columns`);
console.log(`Data quality score: ${analysis.quality_score}`);

2. Use Appropriate Sample Sizes

// Large files: use sampling for performance
create_enhanced_csv_table({
  "table_name": "large_dataset",
  "csv_file_path": "./data/huge_file.csv",
  "sample_size": 10000,  // Sample for type inference
  "streaming_mode": true  // Process in chunks
});

// Small files: analyze everything
create_enhanced_csv_table({
  "table_name": "small_dataset",
  "csv_file_path": "./data/small_file.csv",
  "sample_size": -1  // Analyze all rows
});

3. Handle Nested JSON Appropriately

// Simple flattening for shallow objects
create_json_collection_table({
  "table_name": "simple_json",
  "json_file_path": "./data/simple.json",
  "format": "array",
  "auto_flatten": true,
  "max_depth": 2
});

// Preserve structure for complex objects
create_json_collection_table({
  "table_name": "complex_json",
  "json_file_path": "./data/complex.json",
  "format": "array",
  "nested_objects": "json_column",
  "preserve_structure": true
});

4. Validate After Import

// Always validate imported data
read_query({
  "query": `
    SELECT 
      'Data Import Validation' as check_type,
      COUNT(*) as total_rows,
      COUNT(CASE WHEN primary_key IS NULL THEN 1 END) as missing_keys,
      COUNT(DISTINCT primary_key) as unique_keys,
      CASE 
        WHEN COUNT(*) = COUNT(DISTINCT primary_key) THEN 'PASS'
        ELSE 'FAIL'
      END as uniqueness_check
    FROM imported_table
  `
});

πŸ“š Related Pages


πŸ“Š Smart Import Tip: Enhanced virtual tables combine the flexibility of virtual tables with intelligent schema detection. Use them to quickly explore and import external data sources with minimal manual configuration.
