StructureLineage is a framework for generating synthetic pipelines, building Schema Dependency Graphs (SDGs), and evaluating lineage mappings with precision/recall metrics.
Author: Habib Maicha
Run the full pipeline interactively with one click:
The Quickstart notebook demonstrates how to:
- Clone the StructureLineage repo
- Generate a synthetic project
- Build the Schema Dependency Graph (SDG)
- Evaluate precision/recall against ground truth
- ✅ Auto-verify pipeline success
- 📊 Visualize the SDG (tables vs views)
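As a rough illustration of the "build the SDG" step, the sketch below derives dependency edges from view definitions. It is not the project's implementation: the view names and SQL are hypothetical, and a simple regular expression stands in for a real SQL parser such as sqlglot, so it will miss subqueries, CTEs, and quoted identifiers.

```python
import re

# Hypothetical view definitions; the real generator's output may look different.
VIEW_SQL = {
    "v_orders_clean": "SELECT o.id, o.total FROM orders AS o WHERE o.total > 0",
    "v_order_customers": (
        "SELECT c.name, v.total FROM v_orders_clean AS v "
        "JOIN customers AS c ON c.id = v.id"
    ),
}

# Crude stand-in for a SQL parser: capture the identifier after FROM or JOIN.
SOURCE_PATTERN = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_.]*)", re.IGNORECASE
)


def extract_sources(sql):
    """Return the sorted, de-duplicated relation names a query reads from."""
    return sorted(set(SOURCE_PATTERN.findall(sql)))


def build_edges(view_sql):
    """Return a set of (source, view) edges, one per dependency."""
    edges = set()
    for view, sql in view_sql.items():
        for source in extract_sources(sql):
            edges.add((source, view))
    return edges


SDG_EDGES = build_edges(VIEW_SQL)
```

Each edge points from the relation being read to the view that reads it, so a view built on another view simply appears as both a target and a source.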
Here’s a simplified layered view of tables (blue) and views (green):
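One way such a layered view could be computed is by grouping nodes into topological "generations". The sketch below is an assumption about the approach, not the notebook's actual plotting code. It uses the standard-library `graphlib` module (available from Python 3.9), a hypothetical four-node graph, and the simplifying rule that any node without upstream dependencies is a table.

```python
from graphlib import TopologicalSorter

# Hypothetical SDG: each node maps to the set of nodes it reads from.
DEPENDS_ON = {
    "orders": set(),
    "customers": set(),
    "v_orders_clean": {"orders"},
    "v_order_customers": {"v_orders_clean", "customers"},
}

# Colors follow the description above: tables in blue, views in green.
KIND_COLORS = {"table": "blue", "view": "green"}


def layer_nodes(depends_on):
    """Group nodes into layers so every node sits below all of its sources."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose sources are all placed
        layers.append(ready)
        sorter.done(*ready)
    return layers


def node_color(name, depends_on):
    """Assumption: a node with no upstream dependencies is a base table."""
    kind = "view" if depends_on[name] else "table"
    return KIND_COLORS[kind]


LAYERS = layer_nodes(DEPENDS_ON)
```

The resulting layers can then be used as y-coordinates in any plotting library, with `node_color` choosing the fill for each node.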
- Python 3.9+
- sqlglot, duckdb, networkx, pandas, pytest
Install dependencies:
pip install -r requirements.txt

Clone the repository and run the pipeline on your machine:
git clone https://github.com/habiiibo03/StructureLineage.git
cd StructureLineage
pip install -r requirements.txt

Generate a synthetic project:
python -m src.tools.gen_synthetic examples/local_project --n_tables 3 --n_views 3

Build the Schema Dependency Graph:
python -m src.sl_core.build_sdg examples/local_project

Evaluate results:
pytest

Run unit tests with:
pytest
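For readers who want to see what the precision/recall evaluation amounts to, here is a minimal sketch that compares predicted lineage edges with ground-truth edges. The function name, edge format, and sample data are assumptions made for illustration. The project's own evaluation code may define matches differently, for example at column level.

```python
def precision_recall(predicted, truth):
    """Compute precision and recall of predicted edges against ground truth.

    Both arguments are iterables of (source, target) pairs. An empty
    prediction or truth set yields 0.0 for the affected metric.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical ground truth and predictions for a tiny project.
TRUTH = {
    ("orders", "v_orders_clean"),
    ("v_orders_clean", "v_order_customers"),
    ("customers", "v_order_customers"),
}
PREDICTED = {
    ("orders", "v_orders_clean"),
    ("customers", "v_order_customers"),
    ("orders", "v_order_customers"),  # spurious edge: skips the intermediate view
}

PRECISION, RECALL = precision_recall(PREDICTED, TRUTH)
```

In this example two of the three predicted edges are correct and two of the three true edges are found, so both metrics come out at two thirds.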