An ontology linter for labeled property graphs.
Define your graph schema in SHACL. Validate it against any LPG database.
```mermaid
flowchart LR
    subgraph input [" "]
        SHACL["movies.shacl.ttl<br/>(SHACL schema)"]
    end
    subgraph shacl_parser ["shacl_parser.py"]
        RDFLib["rdflib<br/>Turtle → RDF graph"]
        SHWalk["Walk SHACL shapes<br/>extract constraints"]
        IR["Validation Plan<br/>(Check objects)"]
        RDFLib --> SHWalk --> IR
    end
    subgraph mapping ["Mapping"]
        direction TB
        M1["URI → node label<br/>movies#Movie → :Movie"]
        M2["URI → property<br/>movies#title → title"]
        M3["URI → relationship<br/>movies#hasActor → :HAS_ACTOR"]
    end
    subgraph backends ["backends/"]
        Cypher["cypher.py<br/>Neo4j, Memgraph"]
        GQL["gql.py<br/>ISO GQL"]
    end
    subgraph runner ["runner.py"]
        Compile["compile_plan()<br/>Check → query string"]
        Execute["execute_plan()<br/>run against live DB"]
        DryRun["dry_run()<br/>print queries only"]
    end
    subgraph output [" "]
        Report["Validation Report<br/>✓ pass / ✗ violation<br/>per node, per check"]
    end
    SHACL --> RDFLib
    SHWalk -.-> mapping
    IR --> Compile
    Compile --> Cypher & GQL
    Cypher & GQL --> Execute & DryRun
    Execute --> Report
```
- Write shapes in SHACL — human-readable, formally grounded schema language
- Parser compiles shapes into a vendor-neutral validation plan (a list of `Check` objects)
- Mapping converts RDF URIs to LPG names (labels, properties, relationship types) using conventions or explicit overrides
- Backends translate each check into an executable query (Cypher or GQL)
- Runner executes queries against your database; violations are collected into a report
```bash
uv run main.py
```

```python
from graphlint.parser import parse_schema
from graphlint.backends.cypher import CypherBackend
from graphlint.runner import dry_run, execute_plan

with open("examples/movies.shacl.ttl") as f:
    schema = f.read()

plan = parse_schema(schema, source="movies.shacl.ttl")

# Or with strict mode for closed-world coverage checks:
# plan = parse_schema(schema, source="movies.shacl.ttl", strict=True)

# Dry run — see the generated queries without a database
print(dry_run(plan, CypherBackend()))

# Or execute against a live Neo4j instance
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))
report = execute_plan(plan, CypherBackend(), driver, target_uri="neo4j://localhost:7687")
print(report.print_table())
```

```turtle
@prefix ex: <http://example.org/movies#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:MovieShape
    a sh:NodeShape ;
    sh:targetClass ex:Movie ;
    sh:property [ sh:path ex:title ; sh:datatype xsd:string ; sh:minCount 1 ] ;
    sh:property [ sh:path ex:released ; sh:datatype xsd:integer ; sh:minCount 1 ] ;
    sh:property [ sh:path ex:tagline ; sh:datatype xsd:string ] ;
    sh:property [ sh:path ex:hasActor ; sh:nodeKind sh:IRI ; sh:node ex:PersonShape ; sh:minCount 1 ] ;
    sh:property [ sh:path ex:hasDirector ; sh:nodeKind sh:IRI ; sh:node ex:PersonShape ; sh:minCount 1 ; sh:maxCount 1 ] .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [ sh:path ex:name ; sh:datatype xsd:string ; sh:minCount 1 ] ;
    sh:property [ sh:path ex:born ; sh:datatype xsd:integer ] .
```

This compiles into validation checks covering property existence, type constraints, allowed values, and relationship cardinality.
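
To make the compilation step concrete, here is a minimal sketch of how a single constraint, such as `sh:minCount 1` on `ex:title`, could turn into a Cypher query that returns violating nodes. The `RequiredPropertyCheck` class and `to_cypher` function are hypothetical stand-ins; the queries graphlint actually generates may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequiredPropertyCheck:
    """Hypothetical IR entry: nodes with `label` must carry property `prop`."""
    label: str
    prop: str


def to_cypher(check: RequiredPropertyCheck) -> str:
    # Every returned node is a violation: it has the label but lacks the property.
    # Backticks quote the identifiers so unusual names stay valid Cypher.
    return (
        f"MATCH (n:`{check.label}`) "
        f"WHERE n.`{check.prop}` IS NULL "
        "RETURN n"
    )


query = to_cypher(RequiredPropertyCheck(label="Movie", prop="title"))
print(query)
```

Writing each check as a query that selects violations means an empty result set is a pass, which keeps report generation simple.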
By default, graphlint only validates what your schema declares (open-world assumption). Enable strict mode to also check for things your schema doesn't mention:
| Check | Severity | What it catches |
|---|---|---|
| Undeclared labels | warning | Node labels in the database not declared as shapes |
| Undeclared relationship types | warning | Relationship types not referenced by any shape |
| Undeclared properties | warning | Properties on declared node types not mentioned in the schema |
| Empty shapes | warning | Shapes declared in the schema with zero matching nodes |
```python
plan = parse_schema(schema, source="movies.shacl.ttl", strict=True)
```

In the playground, toggle the strict checkbox in the connection bar.
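
Conceptually, the "undeclared" checks in the table are set differences between what the database contains and what the schema declares. The sketch below shows that idea for node labels only; the function name and sample data are hypothetical, and the observed labels are assumed to have been fetched already (in Neo4j, for example, via `CALL db.labels()`).

```python
def undeclared(declared: set[str], observed: set[str]) -> list[str]:
    """Names present in the database but absent from the schema, sorted."""
    return sorted(observed - declared)


# Labels declared by the shapes in the example schema.
declared_labels = {"Movie", "Person"}
# Labels assumed to exist in the database (illustrative data only).
observed_labels = {"Movie", "Person", "Genre"}

warnings = [
    f"warning: undeclared label :{label}"
    for label in undeclared(declared_labels, observed_labels)
]
print(warnings)
```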
Interactive web UI for testing schemas against a live database:
```bash
uv run python playground.py
# Open http://127.0.0.1:8420
```

Features:
- Live editing — SHACL editor with auto-compile on keystroke
- Database connection — Connect to any Neo4j/Memgraph instance via Bolt
- Strict mode toggle — Enable closed-world coverage checks
- Four output tabs:
  - Checks — validation plan grouped by shape, color-coded by severity
  - Cypher — generated queries with syntax highlighting
  - Results — validation report with pass/fail/warning cards, violating node details, and vacuous check detection (skips checks when no data exists for a shape)
  - JSON — raw validation plan
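
Vacuous check detection, as described for the Results tab, can be pictured as a partition of the plan by node count: a check on a shape with no matching nodes would trivially pass, so it is reported as skipped instead. The sketch below assumes per-label node counts are already known; `split_vacuous` and the tuple-based check representation are hypothetical.

```python
def split_vacuous(checks, node_counts):
    """Partition (label, description) checks into runnable and vacuous ones."""
    runnable, vacuous = [], []
    for label, description in checks:
        # A label with zero nodes cannot produce violations, so skip its checks.
        if node_counts.get(label, 0) == 0:
            vacuous.append((label, description))
        else:
            runnable.append((label, description))
    return runnable, vacuous


checks = [("Movie", "title required"), ("Person", "name required")]
node_counts = {"Movie": 38, "Person": 0}  # illustrative counts only
runnable, vacuous = split_vacuous(checks, node_counts)
```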
| Backend | Status | Target databases |
|---|---|---|
| Cypher | ✓ | Neo4j, Memgraph |
| GQL | ✓ | ISO GQL-compliant databases |
| Gremlin | planned | Amazon Neptune, JanusGraph |
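
Because the validation plan is vendor-neutral, adding a backend amounts to implementing one translation interface. The following sketch shows the general shape using `typing.Protocol`; the protocol, its single method, and both toy backends are assumptions made for illustration and do not mirror the real contents of `backends/__init__.py`.

```python
from typing import Protocol


class Backend(Protocol):
    """Hypothetical backend interface: one method per kind of check."""

    def required_property(self, label: str, prop: str) -> str: ...


class ToyCypherBackend:
    def required_property(self, label: str, prop: str) -> str:
        return f"MATCH (n:{label}) WHERE n.{prop} IS NULL RETURN n"


class ToyGremlinBackend:
    def required_property(self, label: str, prop: str) -> str:
        # hasNot() keeps vertices that lack the given property key.
        return f"g.V().hasLabel('{label}').hasNot('{prop}')"


def compile_all(backend: Backend, required: list[tuple[str, str]]) -> list[str]:
    """Translate every (label, property) requirement with the chosen backend."""
    return [backend.required_property(label, prop) for label, prop in required]


plan = [("Movie", "title"), ("Person", "name")]
cypher_queries = compile_all(ToyCypherBackend(), plan)
gremlin_queries = compile_all(ToyGremlinBackend(), plan)
```

Structural typing means a new backend only has to provide the right methods; it does not need to inherit from a shared base class.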
```
graphlint/
├── graphlint/
│   ├── __init__.py          # Package metadata
│   ├── parser.py            # Shared types, unified entry point
│   ├── shacl_parser.py      # SHACL/Turtle → Validation Plan (IR)
│   ├── runner.py            # Execute plan, produce reports
│   └── backends/
│       ├── __init__.py      # Backend protocol
│       ├── cypher.py        # Cypher query generation
│       └── gql.py           # GQL query generation
├── examples/
│   └── movies.shacl.ttl     # Example schema (SHACL)
├── templates/
│   └── playground.html      # Playground UI template
├── playground.py            # Interactive web playground
└── tests/
    └── test_shacl_pipeline.py   # SHACL pipeline tests
```
- `rdflib` — RDF graph library (SHACL parser)
- `neo4j` — Neo4j driver (optional, only needed for execution)
- `fastapi`, `uvicorn`, `jinja2` — playground web UI (optional)
Neosemantics is a Neo4j plugin that bridges RDF and property graphs — importing/exporting RDF, loading ontologies, inferencing, and validating against SHACL. Validation is roughly 15% of its surface area. Graphlint is 100% focused on schema validation.
| Facet | neosemantics (n10s) | graphlint |
|---|---|---|
| Schema language | SHACL | SHACL |
| Deployment | Neo4j server plugin (Java JAR) | External Python tool |
| Database support | Neo4j only | Neo4j, Memgraph, ISO GQL (Gremlin planned) |
| RDF import/export | Full (Turtle, N-Triples, RDF/XML) | None |
| Ontology/inferencing | OWL, RDFS, SKOS with class/property hierarchy reasoning | None |
| Transactional enforcement | Yes — can roll back writes that violate constraints | No — read-only audit |
| Dry-run / CI mode | No — requires running Neo4j | Yes — generates queries without a database |
| Interactive tooling | No | Web playground for live schema exploration |
| Target audience | Semantic Web practitioners adopting Neo4j | Graph DB developers who want schema linting |
n10s assumes you're coming from the RDF world into Neo4j. Graphlint assumes you're already in the LPG world and want to borrow SHACL's rigor without adopting the full Semantic Web stack.
Graphlint validates your labeled property graph against schema constraints. It answers: "does my graph data conform to these shapes?"
It does not validate that the schema you provide is itself well-formed or idiomatic. If you hand it a SHACL document with misspelled predicates or unusual patterns, graphlint will silently produce fewer checks rather than reject the input. This is a deliberate tradeoff for a POC — schema authoring validation is a solved problem (pySHACL for SHACL), and graphlint assumes your schema has already been validated through those tools or your own review.
In short:
- In scope: LPG data ↔ schema constraint checking
- Out of scope: schema document validation, SHACL meta-validation
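
Since a misspelled predicate silently yields fewer checks, a crude pre-flight scan can catch the most obvious typos before the schema reaches graphlint. The sketch below is not part of graphlint and is no substitute for pySHACL: it assumes the SHACL namespace is bound to the prefix `sh:`, and `KNOWN_SH_TERMS` is a deliberately partial list that you would extend for your own schemas.

```python
import re

# Partial list of SHACL vocabulary terms; extend as needed.
KNOWN_SH_TERMS = {
    "NodeShape", "PropertyShape", "targetClass", "targetNode", "property",
    "path", "datatype", "minCount", "maxCount", "nodeKind", "node", "class",
    "IRI", "Literal", "BlankNode", "in", "pattern", "hasValue",
    "minLength", "maxLength", "minInclusive", "maxInclusive",
    "closed", "ignoredProperties", "severity", "message", "name",
    "description", "uniqueLang", "alternativePath", "inversePath",
    "zeroOrMorePath", "oneOrMorePath", "Violation", "Warning", "Info",
}


def unknown_shacl_terms(turtle_text: str) -> set[str]:
    """Return sh:-prefixed names in the text that are not in the known list."""
    # "@prefix sh: <...>" is not matched because a letter must follow the colon.
    used = set(re.findall(r"\bsh:([A-Za-z]+)", turtle_text))
    return used - KNOWN_SH_TERMS


sample = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
ex:MovieShape a sh:NodeShape ;
    sh:targetClass ex:Movie ;
    sh:property [ sh:path ex:title ; sh:minCuont 1 ] .
"""
suspicious = unknown_shacl_terms(sample)
print(suspicious)
```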
graphlint aims to be a practical bridge between formal graph schemas and real-world graph databases. Some features are not yet implemented:
Planned
- Gremlin backend (Amazon Neptune, JanusGraph)
- SPARQL backend for RDF stores
- Schema-level validation (meta-SHACL, ShExC syntax checking)
- Complex SHACL paths (`sh:alternativePath`, sequence paths, `sh:zeroOrMorePath`, `sh:oneOrMorePath`)
LPG–RDF gap

Some SHACL/ShEx features assume RDF semantics that don't exist natively in labeled property graphs:

| Feature | RDF Concept | LPG Status |
|---|---|---|
| `sh:uniqueLang` | Language-tagged literals | No equivalent in LPG — acknowledged but not enforced |
| `rdfs:subClassOf` traversal | Class hierarchies | Supported when declared in the SHACL file; no runtime inference |
| Blank node shapes | Anonymous resources | LPG nodes always have identity |
| Named graphs / `sh:shapesGraph` | RDF datasets | Single-graph validation only |
These gaps reflect fundamental modeling differences between RDF and LPG, not missing features. graphlint documents them transparently so users from either community can make informed decisions.
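
As an illustration of "supported when declared, no runtime inference" for class hierarchies: if the subclass relationships are stated in the schema file, the set of labels that satisfy a target class can be computed once, up front, without any reasoning in the database. The sketch below shows that expansion; `labels_for` and the sample hierarchy are hypothetical and say nothing about how graphlint implements it internally.

```python
def labels_for(target: str, subclass_pairs: list[tuple[str, str]]) -> set[str]:
    """Return `target` plus every class declared (transitively) as its subclass.

    `subclass_pairs` holds (child, parent) pairs as they would appear in
    rdfs:subClassOf statements inside the schema file.
    """
    result = {target}
    changed = True
    while changed:
        changed = False
        for child, parent in subclass_pairs:
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


# Illustrative hierarchy only; not taken from the example schema.
pairs = [("Actor", "Person"), ("StuntActor", "Actor"), ("Movie", "Work")]
person_labels = labels_for("Person", pairs)
```

Anything not declared in `pairs` is simply unknown to the expansion, which is the practical meaning of having no runtime inference.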
Early prototype. The core pipeline works: parse SHACL, compile to Cypher/GQL, execute against Neo4j, produce validation reports.
TBD