
Commit ecdd658

Merge branch 'main' of github.com:geometric-intelligence/TopoBenchmark into dev

gbg141 committed Dec 18, 2024
2 parents 30547f3 + 48f9fcf
Showing 11 changed files with 282 additions and 33 deletions.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.ipynb linguist-vendored
44 changes: 27 additions & 17 deletions README.md
@@ -92,6 +92,8 @@ python -m topobenchmark model=cell/cwn dataset=graph/MUTAG

The same CLI override mechanism also applies when modifying finer-grained configurations within a `CONFIG GROUP`. Please refer to the official [`hydra` documentation](https://hydra.cc/docs/intro/) for further details.
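
For instance, a hedged sketch of such a nested override (the parameter paths below are illustrative assumptions about the config group layout, not confirmed keys):

```bash
# Illustrative only: the nested keys are assumptions about the config layout.
python -m topobenchmark \
    model=cell/cwn \
    dataset=graph/MUTAG \
    optimizer.parameters.lr=0.01 \
    trainer.max_epochs=100
```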



## :bike: Experiments Reproducibility
To reproduce Table 1 from the [`TopoBenchmark: A Framework for Benchmarking Topological Deep Learning`](https://arxiv.org/pdf/2406.06642) paper, please run the following command:

@@ -116,6 +118,7 @@ We list the neural networks trained and evaluated by `TopoBenchmark`, organized
| GAT | [Graph Attention Networks](https://openreview.net/pdf?id=rJXMpikCZ) |
| GIN | [How Powerful are Graph Neural Networks?](https://openreview.net/pdf?id=ryGs6iA5Km) |
| GCN | [Semi-Supervised Classification with Graph Convolutional Networks](https://arxiv.org/pdf/1609.02907v4) |
| GraphMLP | [Graph-MLP: Node Classification without Message Passing in Graph](https://arxiv.org/pdf/2106.04051) |

### Simplicial complexes
| Model | Reference |
@@ -145,7 +148,7 @@ We list the neural networks trained and evaluated by `TopoBenchmark`, organized
### Combinatorial complexes
| Model | Reference |
| --- | --- |
| GCCN | [Generalized Combinatorial Complex Neural Networks](https://arxiv.org/pdf/2410.06530) |
| GCCN | [TopoTune: A Framework for Generalized Combinatorial Complex Neural Networks](https://arxiv.org/pdf/2410.06530) |

## :bulb: TopoTune

@@ -178,12 +181,17 @@ python -m topobenchmark \

To use a single augmented Hasse graph expansion, use `model={domain}/topotune_onehasse` instead of `model={domain}/topotune`.
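
For instance, reusing the dataset from the earlier example:

```bash
# Single augmented Hasse graph expansion of the cell-domain TopoTune model.
python -m topobenchmark model=cell/topotune_onehasse dataset=graph/MUTAG
```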

To specify a set of neighborhoods (routes) on the complex, use a list of neighborhoods each specified as `\[\[{source_rank}, {destination_rank}\], {neighborhood}\]`. Currently, the following options for `{neighborhood}` are supported:
- `up_laplacian`, from rank $r$ to $r$
- `down_laplacian`, from rank $r$ to $r$
- `boundary`, from rank $r$ to $r-1$
- `coboundary`, from rank $r$ to $r+1$
- `adjacency`, from rank $r$ to $r$ (stand-in for `up_adjacency`, as `down_adjacency` not yet supported in TopoBenchmark)
To specify a set of neighborhoods on the complex, use a list of neighborhoods, each specified as a string of the form
`r-{neighborhood}-k`, where $k$ is the source cell rank and $r$ is the number of ranks up or down that the selected `{neighborhood}` considers. Currently, the following options for `{neighborhood}` are supported:
- `up_laplacian`, between cells of rank $k$ via cells of rank $k+r$.
- `down_laplacian`, between cells of rank $k$ via cells of rank $k-r$.
- `hodge_laplacian`, between cells of rank $k$ via cells of both ranks $k-r$ and $k+r$.
- `up_adjacency`, between cells of rank $k$ via cells of rank $k+r$.
- `down_adjacency`, between cells of rank $k$ via cells of rank $k-r$.
- `up_incidence`, from rank $k$ to $k+r$.
- `down_incidence`, from rank $k$ to $k-r$.

The number $r$ can be omitted, in which case $r=1$ by default (e.g. `up_incidence-k` represents the incidence from rank $k$ to $k+1$).
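
As a hedged sketch, such a list could be passed on the command line roughly as follows; the override key `model.backbone.neighborhoods` is an assumption about the TopoTune config layout rather than a confirmed path:

```bash
# Illustrative only: the override key is an assumption, not a confirmed config path.
python -m topobenchmark \
    model=cell/topotune \
    dataset=graph/MUTAG \
    'model.backbone.neighborhoods=[1-up_laplacian-0,1-down_incidence-2,up_adjacency-1]'
```

Here `1-up_laplacian-0` connects rank-0 cells via rank-1 cells, `1-down_incidence-2` maps rank-2 cells down to rank-1 cells, and `up_adjacency-1` uses the default $r=1$.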


### Using backbone models from any package
@@ -235,16 +243,18 @@ We list the liftings used in `TopoBenchmark` to transform datasets. Here, a _lif

</details>

## Data Transformations
<details>
<summary><b> Data Transformations </b></summary>

| Transform | Description | Reference |
| --- | --- | --- |
| Message Passing Homophily | Higher-order homophily measure for hypergraphs | [Source](https://arxiv.org/abs/2310.07684) |
| Group Homophily | Higher-order homophily measure for hypergraphs that considers groups of predefined sizes | [Source](https://arxiv.org/abs/2103.11818) |
</details>

## :books: Datasets


### Graphs
| Dataset | Task | Description | Reference |
| --- | --- | --- | --- |
| Cora | Classification | Cocitation dataset. | [Source](https://link.springer.com/article/10.1023/A:1009953814988) |
@@ -264,14 +274,14 @@ We list the liftings used in `TopoBenchmark` to transform datasets. Here, a _lif
| US-county-demos | Regression | Each node attribute is used in turn as the target label. | [Source](https://arxiv.org/pdf/2002.08274) |
| ZINC | Regression | Graph-level regression. | [Source](https://pubs.acs.org/doi/10.1021/ci3001277) |




## :hammer_and_wrench: Development

To join the development of `TopoBenchmark`, you should install the library in dev mode.

For this, you can create an environment using conda or Docker. Please follow the steps in <a href="#jigsaw-get-started">:jigsaw: Get Started</a>.
### Hypergraphs
| Dataset | Task | Description | Reference |
| --- | --- | --- | --- |
| Cora-Cocitation | Classification | Cocitation dataset. | [Source](https://proceedings.neurips.cc/paper_files/paper/2019/file/1efa39bcaec6f3900149160693694536-Paper.pdf) |
| Citeseer-Cocitation | Classification | Cocitation dataset. | [Source](https://proceedings.neurips.cc/paper_files/paper/2019/file/1efa39bcaec6f3900149160693694536-Paper.pdf) |
| PubMed-Cocitation | Classification | Cocitation dataset. | [Source](https://proceedings.neurips.cc/paper_files/paper/2019/file/1efa39bcaec6f3900149160693694536-Paper.pdf) |
| Cora-Coauthorship | Classification | Coauthorship dataset. | [Source](https://proceedings.neurips.cc/paper_files/paper/2019/file/1efa39bcaec6f3900149160693694536-Paper.pdf) |
| DBLP-Coauthorship | Classification | Coauthorship dataset. | [Source](https://proceedings.neurips.cc/paper_files/paper/2019/file/1efa39bcaec6f3900149160693694536-Paper.pdf) |



2 changes: 1 addition & 1 deletion configs/evaluator/default.yaml
@@ -6,5 +6,5 @@ num_classes: ${dataset.parameters.num_classes}
# Automatically selects the default metrics depending on the task
# Classification: [accuracy, precision, recall, auroc]
# Regression: [mae, mse]
metrics: ${get_default_metrics:${evaluator.task}}
metrics: ${get_default_metrics:${evaluator.task},${oc.select:dataset.parameters.metrics,null}}
# Select classification/regression config files to manually define the metrics
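
As a hedged illustration of what this resolver change enables: a dataset config can now declare its own metric list via `dataset.parameters.metrics` (picked up through `oc.select`), falling back to the task defaults otherwise. The override below is a sketch; the `+` prefix assumes the key is absent from the chosen dataset config.

```bash
# Illustrative only: request the custom "example" metric plus MAE for a regression dataset.
python -m topobenchmark dataset=graph/ZINC model=cell/topotune '+dataset.parameters.metrics=[example,mae]'
```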
4 changes: 2 additions & 2 deletions configs/run.yaml
@@ -4,8 +4,8 @@
# order of defaults determines the order in which configs override each other
defaults:
- _self_
- dataset: graph/cocitation_cora
- model: graph/gcn_dgm
- dataset: graph/ZINC
- model: cell/topotune
- transforms: ${get_default_transform:${dataset},${model}} #tree #${get_default_transform:${dataset},${model}} #no_transform
- optimizer: default
- loss: default
38 changes: 33 additions & 5 deletions test/evaluator/test_evaluator.py
@@ -1,15 +1,43 @@
""" Test the TBEvaluator class."""
import pytest

import torch
from topobenchmark.evaluator import TBEvaluator

class TestTBEvaluator:
""" Test the TBXEvaluator class."""

def setup_method(self):
""" Setup the test."""
self.evaluator_multilable = TBEvaluator(task="multilabel classification")
self.evaluator_regression = TBEvaluator(task="regression")
self.classification_metrics = ["accuracy", "precision", "recall", "auroc"]
self.evaluator_classification = TBEvaluator(task="classification", num_classes=3, metrics=self.classification_metrics)
self.evaluator_multilabel = TBEvaluator(task="multilabel classification", num_classes=2, metrics=self.classification_metrics)
self.regression_metrics = ["example", "mae"]
self.evaluator_regression = TBEvaluator(task="regression", num_classes=1, metrics=self.regression_metrics)
with pytest.raises(ValueError):
TBEvaluator(task="wrong")
repr = self.evaluator_multilable.__repr__()
TBEvaluator(task="wrong", num_classes=2, metrics=self.classification_metrics)

def test_repr(self):
"""Test the __repr__ method."""
assert "TBEvaluator" in self.evaluator_classification.__repr__()
assert "TBEvaluator" in self.evaluator_multilabel.__repr__()
assert "TBEvaluator" in self.evaluator_regression.__repr__()

def test_update_and_compute(self):
"""Test the update and compute methods."""
self.evaluator_classification.update({"logits": torch.randn(10, 3), "labels": torch.randint(0, 3, (10,))})
out = self.evaluator_classification.compute()
for metric in self.classification_metrics:
assert metric in out
self.evaluator_multilabel.update({"logits": torch.randn(10, 2), "labels": torch.randint(0, 2, (10, 2))})
out = self.evaluator_multilabel.compute()
for metric in self.classification_metrics:
assert metric in out
self.evaluator_regression.update({"logits": torch.randn(10, 1), "labels": torch.randn(10,)})
out = self.evaluator_regression.compute()
for metric in self.regression_metrics:
assert metric in out

def test_reset(self):
"""Test the reset method."""
self.evaluator_multilabel.reset()
self.evaluator_regression.reset()
3 changes: 3 additions & 0 deletions test/utils/test_config_resolvers.py
@@ -117,6 +117,9 @@ def test_infer_num_cell_dimensions(self):

def test_get_default_metrics(self):
"""Test get_default_metrics."""
out = get_default_metrics("classification", ["accuracy", "precision"])
assert out == ["accuracy", "precision"]

out = get_default_metrics("classification")
assert out == ["accuracy", "precision", "recall", "auroc"]

3 changes: 3 additions & 0 deletions topobenchmark/evaluator/__init__.py
@@ -3,6 +3,8 @@
from torchmetrics.classification import AUROC, Accuracy, Precision, Recall
from torchmetrics.regression import MeanAbsoluteError, MeanSquaredError

from .metrics import ExampleRegressionMetric

# Define metrics
METRICS = {
"accuracy": Accuracy,
@@ -11,6 +13,7 @@
"auroc": AUROC,
"mae": MeanAbsoluteError,
"mse": MeanSquaredError,
"example": ExampleRegressionMetric,
}

from .base import AbstractEvaluator # noqa: E402
8 changes: 6 additions & 2 deletions topobenchmark/evaluator/evaluator.py
@@ -37,14 +37,15 @@ def __init__(self, task, **kwargs):
elif self.task == "multilabel classification":
parameters = {"num_classes": kwargs["num_classes"]}
parameters["task"] = "multilabel"
parameters["num_labels"] = kwargs["num_classes"]
metric_names = kwargs["metrics"]

elif self.task == "regression":
parameters = {}
metric_names = kwargs["metrics"]

else:
raise ValueError(f"Invalid task {kwargs['task']}")
raise ValueError(f"Invalid task {task}")

metrics = {}
for name in metric_names:
@@ -83,7 +84,10 @@ def update(self, model_out: dict):
if self.task == "regression":
self.metrics.update(preds, target.unsqueeze(1))

elif self.task == "classification":
elif (
self.task == "classification"
or self.task == "multilabel classification"
):
self.metrics.update(preds, target)

else:
108 changes: 108 additions & 0 deletions topobenchmark/evaluator/metrics/__init__.py
@@ -0,0 +1,108 @@
"""Init file for custom metrics in evaluator module."""

import importlib
import inspect
import sys
from pathlib import Path
from typing import Any


class LoadManager:
"""Manages automatic discovery and registration of loss classes."""

@staticmethod
def is_metric_class(obj: Any) -> bool:
"""Check if an object is a valid metric class.
Parameters
----------
obj : Any
The object to check if it's a valid loss class.
Returns
-------
bool
True if the object is a valid loss class (non-private class
with 'FeatureEncoder' in name), False otherwise.
"""
try:
from torchmetrics import Metric

return (
inspect.isclass(obj)
and not obj.__name__.startswith("_")
and issubclass(obj, Metric)
and obj is not Metric
)
except ImportError:
return False

@classmethod
def discover_metrics(cls, package_path: str) -> dict[str, type]:
"""Dynamically discover all metric classes in the package.
Parameters
----------
package_path : str
Path to the package's __init__.py file.
Returns
-------
Dict[str, Type]
Dictionary mapping loss class names to their corresponding class objects.
"""
metrics = {}
package_dir = Path(package_path).parent

# Add parent directory to sys.path to ensure imports work
parent_dir = str(package_dir.parent)
if parent_dir not in sys.path:
sys.path.insert(0, parent_dir)

# Iterate through all .py files in the directory
for file_path in package_dir.glob("*.py"):
if file_path.stem == "__init__":
continue

try:
# Use importlib to safely import the module
module_name = f"{package_dir.stem}.{file_path.stem}"
module = importlib.import_module(module_name)

# Find all loss classes in the module
for name, obj in inspect.getmembers(module):
if (
cls.is_metric_class(obj)
and obj.__module__ == module.__name__
):
metrics[name] = obj # noqa: PERF403

except ImportError as e:
print(f"Could not import module {module_name}: {e}")

return metrics


# Dynamically create the loss manager and discover losses
manager = LoadManager()
CUSTOM_METRICS = manager.discover_metrics(__file__)
CUSTOM_METRICS_list = list(CUSTOM_METRICS.keys())

# Combine manual and discovered losses
all_metrics = {**CUSTOM_METRICS}

# Generate __all__
__all__ = [
"CUSTOM_METRICS",
"CUSTOM_METRICS_list",
*list(all_metrics.keys()),
]

# Update locals for direct import
locals().update(all_metrics)

# from .example import ExampleRegressionMetric

# __all__ = [
# "ExampleRegressionMetric",
# ]
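
For reference, a hedged sketch of the kind of module this loader would discover: any file in the package that defines a public `torchmetrics.Metric` subclass gets registered under its class name. The file name and class below are hypothetical and not part of this commit.

```python
# Hypothetical topobenchmark/evaluator/metrics/example_mae.py (illustrative only).
import torch
from torchmetrics import Metric


class ExampleMeanAbsoluteError(Metric):
    """Toy regression metric that LoadManager would auto-discover."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # State tensors are synchronized across processes with a "sum" reduction.
        self.add_state("sum_abs_error", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        """Accumulate absolute errors from one batch."""
        self.sum_abs_error += torch.abs(preds - target).sum()
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        """Return the mean absolute error over all batches seen so far."""
        return self.sum_abs_error / self.total
```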