Preprocessing for Matrix Filter

Overview

Python scripts for preprocessing graphs and training the ML filter used by the CPU subgraph isomorphism (SubIso) application. The pipeline converts graph formats, generates embeddings, and trains similarity models.

Functionality

Rapids format → PyTorch Geometric (.pt)
PyTorch .pt → MatrixGraph embedding (binary)
Ground truth generation (for training)
Model training (similarity + GNN)

Tool Chain

Step	Script	Input	Output
1	`graph_reader.py`	Rapids-format graph	PyTorch .pt
2	`data.py` or `embedding.py`	PyTorch .pt	MatrixGraph embedding (binary)
3	(custom)	—	Ground truth (uint64 array)
4	`train.py`	Config + embeddings + ground truth	Trained model

Parameters

Script	Parameters	Description
`graph_reader.py`	`[input_path] [output_path]`	Convert Rapids graph to PyTorch
`data.py`	`[input_path] [output_path]`	Convert .pt to embedding binary
`train.py`	(config)	Train similarity model

Source

tools/python/graph_reader.py
tools/python/data.py
tools/python/embedding.py
tools/python/train.py

Examples

# Step 1: Rapids → PyTorch
python tools/python/graph_reader.py [path-to-rapids-graph] [output.pt]

# Step 2: PyTorch → embedding
python tools/python/data.py [path-to-binary-torch] [output-embedding]

# Step 4: Train (configure config.yaml first)
python tools/python/train.py

Ground Truth

Ground truth must be a 1D binary array of uint64_t storing candidate vertex IDs for the first query vertex of the pattern. See SubIsoTraining.md for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preprocessing for Matrix Filter

Overview

Functionality

Tool Chain

Parameters

Source

Examples

Ground Truth

FilesExpand file tree

Preprocessing4MatrixFilter.md

Latest commit

History

Preprocessing4MatrixFilter.md

File metadata and controls

Preprocessing for Matrix Filter

Overview

Functionality

Tool Chain

Parameters

Source

Examples

Ground Truth