A powerful, intelligent library for generating JSON Schema from multiple JSON instances with smart merging, advanced inference, and modular refinements.
- 🎯 Intelligent Merging – Combines multiple JSON instances into a single schema
- 🔗 Configurable Combinators – Use `anyOf` or `oneOf` for conflicting types/properties
- 🧠 Advanced Inference – Automatic format detection (email, uuid, date-time, etc.)
- 📍 Required & Empty Handling – Smart inference of `required`, `minProperties`, `minItems`, etc.
- 🔍 Pseudo-Array Detection – Treats inhomogeneous arrays as object-like structures when needed
- ⚡ Modular Pipeline – Chain of configurable comparators for full control
- 🛠️ CLI & Python API – Flexible usage from command line or code
- 📝 Rich Output – Colored console feedback with timing and instance count
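
As a rough illustration of the merging, combinator, and required-property ideas listed above, here is a minimal standalone sketch. It does not use genschema and is not its implementation: the helper names `json_type` and `merge_objects` are hypothetical, only flat objects are handled, and "required" is assumed to mean "present in every instance", which is the simplest possible rule.

```python
from __future__ import annotations

from typing import Any

# Hypothetical helpers for illustration only; not part of genschema.
JSON_TYPES = [
    (bool, "boolean"),  # bool must come before int: bool is a subclass of int
    (int, "integer"),
    (float, "number"),
    (str, "string"),
    (list, "array"),
    (dict, "object"),
]


def json_type(value: Any) -> str:
    """Map a Python value to its JSON Schema type name."""
    if value is None:
        return "null"
    for py_type, name in JSON_TYPES:
        if isinstance(value, py_type):
            return name
    raise TypeError(f"not a JSON value: {value!r}")


def merge_objects(instances: list[dict], base_of: str = "anyOf") -> dict:
    """Build one object schema from several flat JSON objects."""
    seen: dict[str, list[str]] = {}
    for inst in instances:
        for key, value in inst.items():
            types = seen.setdefault(key, [])
            t = json_type(value)
            if t not in types:
                types.append(t)

    properties = {}
    for key, types in seen.items():
        if len(types) == 1:
            properties[key] = {"type": types[0]}
        else:
            # Conflicting types are wrapped in the chosen combinator.
            properties[key] = {base_of: [{"type": t} for t in types]}

    # Assumption: a key is required only if every instance contains it.
    required = sorted(k for k in seen if all(k in inst for inst in instances))
    return {"type": "object", "properties": properties, "required": required}


schema = merge_objects([
    {"id": 1, "name": "Alice"},
    {"id": "a-17", "name": "Bob", "email": "bob@example.com"},
])
print(schema)
```

Here `id` appears as both an integer and a string, so it ends up under `anyOf`, while `email` is missing from the first instance and is therefore left out of `required`. Passing `base_of="oneOf"` switches the combinator keyword.
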

```bash
pip install genschema
```

```python
from genschema import Converter, PseudoArrayHandler
from genschema.comparators import (
    FormatComparator,
    RequiredComparator,
    EmptyComparator,
    DeleteElement,
)

conv = Converter(
    pseudo_handler=PseudoArrayHandler(),
    base_of="anyOf",  # or "oneOf"
)

# Add JSON data (files, dicts, or existing schemas)
conv.add_json("example1.json")
conv.add_json("example2.json")
conv.add_json({"name": "Alice", "email": "alice@example.com"})

# Register optional refinements
conv.register(FormatComparator())
conv.register(RequiredComparator())
conv.register(EmptyComparator())
conv.register(DeleteElement())
conv.register(DeleteElement("isPseudoArray"))

# Generate schema
result = conv.run()
print(result)  # Pretty-printed JSON Schema
```

```bash
# Basic: single or multiple files
genschema input1.json input2.json -o schema.json

# Use oneOf instead of anyOf
genschema *.json --base-of oneOf -o schema.json

# Disable refinements
genschema data.json --no-format --no-required --no-pseudo-array

# Read from stdin
cat data.json | genschema - -o schema.json
```

| Feature | genschema | GenSON |
|---|---|---|
| Multiple Instance Merging | Yes | Yes |
| Variant Type Handling | Configurable `anyOf` or `oneOf` | `anyOf` only |
| Format Inference | Yes (email, date-time, uuid, uri, etc.) | No |
| Required Properties | Configurable inference | Yes (present in all objects) |
| Empty/Min-Max Handling | Yes (`minProperties`, `minItems`, etc.) | Limited |
| Pseudo-Array Detection | Yes | No |
| Modular Extensions | Comparator pipeline (easy to add/remove) | SchemaStrategy subclasses |
| CLI Support | Full-featured with rich output | Basic (genson) |
| Performance (avg. benchmark) | ~2.1× slower | Faster |
Note: Performance measured on static datasets of varying complexity. genschema prioritizes richer inference and flexibility over raw speed.
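
To make the Format Inference row more concrete, the following sketch shows one way string formats could be detected with the Python standard library alone. It is an assumption-laden illustration, not genschema's `FormatComparator`: the names `detect_format` and `infer_string_schema` are hypothetical, the email pattern is deliberately loose, and only `http`/`https` URLs count as `uri`.

```python
from __future__ import annotations

import re
import uuid
from datetime import datetime
from typing import Optional
from urllib.parse import urlparse

# Deliberately loose pattern; real email validation is far stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def detect_format(value: str) -> Optional[str]:
    """Return a JSON Schema format name for value, or None if nothing matches."""
    # uuid: accept only the canonical hyphenated form.
    try:
        if str(uuid.UUID(value)) == value.lower():
            return "uuid"
    except ValueError:
        pass

    # date-time: require the "T" separator; map a trailing "Z" to an offset
    # so that older Python versions can parse it too.
    if "T" in value:
        candidate = value[:-1] + "+00:00" if value.endswith("Z") else value
        try:
            datetime.fromisoformat(candidate)
            return "date-time"
        except ValueError:
            pass

    # uri: checked before email so URLs with user info are not misread.
    try:
        parsed = urlparse(value)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return "uri"
    except ValueError:
        pass

    if EMAIL_RE.match(value):
        return "email"
    return None


def infer_string_schema(samples: list[str]) -> dict:
    """Attach a format only when every sample agrees on the same one."""
    formats = {detect_format(s) for s in samples}
    schema = {"type": "string"}
    if len(formats) == 1:
        (fmt,) = formats
        if fmt is not None:
            schema["format"] = fmt
    return schema


print(infer_string_schema(["alice@example.com", "bob@example.org"]))
print(infer_string_schema(["alice@example.com", "not an email"]))
```

The design choice worth noting is in `infer_string_schema`: a format is emitted only if all observed samples share it, so a single non-conforming value falls back to a plain string schema.
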
Modular pipeline design for clean, extensible code:

```
┌─────────────────┐     ┌─────────────────┐
│   Input JSONs   │     │  Input Schemas  │
└─────────────────┘     └─────────────────┘
         │                       │
         └───────────┬───────────┘
                     ▼
             ┌───────────────┐
             │ Pipeline Run  │
             └───────────────┘
                     ▼
           ┌───────────────────┐
           │   Process Layer   │◀─────┐
           └───────────────────┘      │
                     │                │
                     ▼                │
          ┌─────────────────────┐     │
          │  Comparators Chain  │─────┘
          └─────────────────────┘
                     │
                     ▼
             ┌───────────────┐
             │    Result     │
             └───────────────┘
```
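
The loop between the process layer and the comparator chain can be sketched in a few lines of plain Python. This is only a toy model of the diagram, assuming that each comparator is a function that looks at the samples for one schema node and returns keywords to merge into it; `type_comparator`, `required_comparator`, and `process_layer` are hypothetical names, not genschema's classes, and only objects and strings are handled.

```python
from __future__ import annotations

from typing import Callable

# A comparator receives the node built so far plus the sample values seen at
# that position, and returns schema keywords to merge into the node.
Comparator = Callable[[dict, list], dict]


def type_comparator(node: dict, samples: list) -> dict:
    """Infer a type when all samples agree (objects and strings only)."""
    if all(isinstance(s, dict) for s in samples):
        return {"type": "object"}
    if all(isinstance(s, str) for s in samples):
        return {"type": "string"}
    return {}


def required_comparator(node: dict, samples: list) -> dict:
    """Mark keys present in every object sample as required."""
    if node.get("type") != "object":
        return {}
    common = set.intersection(*(set(s) for s in samples))
    return {"required": sorted(common)}


def process_layer(samples: list, chain: list[Comparator]) -> dict:
    """Run the chain on one layer, then recurse into object properties."""
    if not samples:
        return {}
    node: dict = {}
    for comparator in chain:
        node.update(comparator(node, samples))
    if node.get("type") == "object":
        keys = sorted({k for s in samples for k in s})
        # This recursion is the arrow looping back to "Process Layer".
        node["properties"] = {
            k: process_layer([s[k] for s in samples if k in s], chain)
            for k in keys
        }
    return node


chain = [type_comparator, required_comparator]
schema = process_layer([{"name": "Alice", "tag": "x"}, {"name": "Bob"}], chain)
print(schema)
```

Because each refinement is an independent entry in `chain`, dropping `required_comparator` from the list removes the `required` keyword without touching the rest of the traversal, which is the kind of control a modular pipeline is meant to give.
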

```bash
git clone https://github.com/Miskler/genschema.git
cd genschema
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"     # or make install-dev if Makefile exists
```

```bash
make test        # Run tests with coverage
make lint        # Lint code
make type-check  # mypy checking
make format      # Format with black
make docs        # Build documentation
```

Fork the repository, create a feature branch, and submit a pull request.
Ensure tests pass and code follows black/mypy style.

```bash
make test
make lint
make type-check
```

AGPL-3.0 License – see LICENSE file for details.

Made with ❤️ for developers working with evolving JSON data