Hypergraph Heat Kernel Lift (Hypergraph to Simplicial) #58

peekxc · 2024-07-12T15:01:22Z

This PR implements the hypergraph-to-simplicial topological conversion described in the paper below (1), which is aimed at encoding the higher-order information of a hypergraph into a weighted simplicial complex without loss of information. One notable application of this approach is to bring the power of the heat kernel to the realm of hypergraphs.

Note that this is not a 'lift' in the strict sense, since we are going from a higher-order structure to a (theoretically) lower-order one, though the submission requirements mention that this is allowed.

Application example

To illustrate this lift, consider the set of all hypergraphs whose simplicial closures are given by the maximal simplices $S = \{(0,1,2), (1,2,3) \}$. As there are 9 non-maximal faces in the closure, there are $2^9$ distinct hypergraphs with identical closures; for any conversion to be useful, we would like a lift capable of distinguishing between hypergraphs in this set.
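The count above can be checked with a short sketch. The helper name `simplicial_closure` is hypothetical and not part of this PR; it simply enumerates every non-empty face of the given maximal simplices.

```python
from itertools import combinations

def simplicial_closure(maximal):
    """Return all non-empty faces of the given simplices as sorted tuples."""
    faces = set()
    for s in maximal:
        s = tuple(sorted(s))
        for k in range(1, len(s) + 1):
            faces.update(combinations(s, k))
    return faces

maximal = [(0, 1, 2), (1, 2, 3)]
closure = simplicial_closure(maximal)
# Faces of the closure that are not themselves maximal simplices
non_maximal = closure - {tuple(sorted(s)) for s in maximal}
# Each non-maximal face is either a hyperedge or not, independently
n_hypergraphs = 2 ** len(non_maximal)
print(len(closure), len(non_maximal), n_hypergraphs)  # 11 9 512
```

The closure has 4 vertices, 5 edges, and 2 triangles; removing the 2 maximal triangles leaves the 9 faces whose presence or absence distinguishes the hypergraphs.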

Of course, a natural weight map is the indicator map $f : S_H \to \{0,1\}$ given by $f(\sigma) = 1$ if $\sigma \in H$ and $0$ otherwise; however, for various reasons, this assignment is not very useful.

Instead, below are 5 hypergraphs and their simplicial lifts, weighted using the lift implemented here, along with two featurizations on the right. The first shows the amount of heat diffused at each vertex over time, starting with a unit amount of heat at the red vertex; the second shows the heat kernel signature (HKS), a natural featurization to associate with this lift. In keeping with simplicial weights representing thermal conductivity in the Laplacian sense, the 'width' of each edge is inversely proportional to its weight derived from the input hypergraph.

To illustrate the diffusion curves, consider the 3rd row: the red vertex (0) starts with a unit amount of heat, which it diffuses through the weighted edges (0,1) and (0,2) via the heat kernel. The blue vertex heats up faster than the green due to having higher conductivity, and the orange vertex (3) heats up the slowest due to being not adjacent to the heat source (0).
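A minimal sketch of such a diffusion curve follows, assuming the plain combinatorial graph Laplacian with edge weights as conductivities. The edge weights are made up for illustration and are not the ones produced by this lift, and `graph_laplacian` and `diffuse` are hypothetical helper names.

```python
import numpy as np

def graph_laplacian(n, weighted_edges):
    """Combinatorial Laplacian L = D - W of a weighted undirected graph."""
    L = np.zeros((n, n))
    for (i, j), w in weighted_edges.items():
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return L

def diffuse(L, source, times):
    """Heat u(t) = exp(-t L) u0 at each vertex, with unit heat at `source`."""
    evals, evecs = np.linalg.eigh(L)  # L is symmetric
    u0 = np.zeros(L.shape[0])
    u0[source] = 1.0
    coeffs = evecs.T @ u0
    return np.array([evecs @ (np.exp(-t * evals) * coeffs) for t in times])

# Hypothetical conductivities on the 1-skeleton of {(0,1,2), (1,2,3)}
edges = {(0, 1): 2.0, (0, 2): 0.5, (1, 2): 1.0, (1, 3): 1.0, (2, 3): 1.0}
L = graph_laplacian(4, edges)
times = np.linspace(0.0, 5.0, 6)
heat = diffuse(L, source=0, times=times)  # shape (len(times), 4)
early = diffuse(L, source=0, times=[0.1])[0]
```

With these weights, vertex 1 warms faster than vertex 2 at early times because edge (0,1) has the higher conductivity, and vertex 3 warms slowest because it is not adjacent to the source. Total heat is conserved at every time point.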

Observe from above that not only is the HKS able to distinguish between different hypergraphs, but also symmetrically related graphs are handled naturally. To demonstrate this, below is a plot of the MDS embedding computed from the Euclidean distance matrix over the HKS-features, for a heuristic choice of time points $t_1, t_2, \dots, t_k$.
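As a rough sketch of the features behind such an embedding, the HKS of vertex $v$ at time $t$ is $\sum_k e^{-t \lambda_k} \phi_k(v)^2$, where $(\lambda_k, \phi_k)$ are the eigenpairs of the Laplacian. The code below assumes a plain weighted graph Laplacian, made-up edge weights, and an arbitrary geometric grid of time points; none of these are taken from the PR, and the function names are hypothetical.

```python
import numpy as np

def graph_laplacian(n, weighted_edges):
    """Combinatorial Laplacian L = D - W of a weighted undirected graph."""
    L = np.zeros((n, n))
    for (i, j), w in weighted_edges.items():
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return L

def heat_kernel_signature(L, times):
    """HKS matrix: rows index vertices, columns index time points."""
    evals, evecs = np.linalg.eigh(L)
    return (evecs ** 2) @ np.exp(-np.outer(evals, times))

def pairwise_hks_distances(laplacians, times):
    """Euclidean distances between flattened HKS feature vectors."""
    feats = np.stack([heat_kernel_signature(L, times).ravel() for L in laplacians])
    diff = feats[:, None, :] - feats[None, :, :]
    return np.linalg.norm(diff, axis=-1)

base = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
weights_a = {e: 1.0 for e in base}       # hypothetical weights
weights_b = {**weights_a, (0, 1): 3.0}   # same graph, one edge reweighted
times = np.geomspace(0.05, 5.0, 8)       # heuristic time grid (assumption)
D = pairwise_hks_distances(
    [graph_laplacian(4, weights_a), graph_laplacian(4, weights_b)], times
)
```

The resulting distance matrix `D` is the kind of input an MDS routine would take. Note that flattening the HKS matrix in vertex order makes the features depend on vertex labeling; this sketch does not attempt to handle relabeled graphs.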

Though not exhibiting perfect symmetry, many of the distances are quite intuitive, and each of the $2^9$ distinct hypergraphs is indeed distinguished by the HKS.

Overview

The high-level algorithmic pipeline of this lift is as follows:

Define an undirected hypergraph $H = (V, \mathcal{E})$ to be converted to simplicial complex
Take the downward / simplicial closure $S$ of $H$
Compute a weight map $w: S \to \mathbb{R}_+$ that assigns each simplex $\sigma \in S$ a topological weight $w(\sigma) = w_\sigma$ by combining its affinity weight $\omega_\sigma$ with the topological weights $w_{\sigma'}$ of its cofacets:

$$w(\sigma) = \omega_\sigma + \sum\limits_{\sigma' \in \mathrm{cofacet}(\sigma)} w_{\sigma'}$$

The computed weight function $w(\sigma)$ induces an inner product $\langle \cdot, \cdot \rangle_w$ on the cochain space $C^p(S, \mathbb{R})$ of $S$. To capture higher-order interactions exploiting this inner product, choose a weighted Hodge Laplacian operator $\mathcal{L}_p$:

$$\langle f, g \rangle_w = \sum\limits_{\sigma \in S, \mathrm{dim}(\sigma) = p} w_\sigma \cdot f([\sigma]) g([\sigma]), \hspace{1em} \mathcal{L}_p = L_p^{\mathrm{up}} + L_p^{\mathrm{dn}}$$

Choose a weight-dependent featurization of $\mathcal{L}_p$ for learning purposes; a classical featurization is the Heat Kernel Signature, which captures multiscale diffusion-based information via the heat kernel.
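The closure and weight-recursion steps of the pipeline above can be sketched as follows. This is only an illustration of the recursion $w(\sigma) = \omega_\sigma + \sum w_{\sigma'}$: the default affinity used here (the number of times $\sigma$ itself occurs as a hyperedge) is a placeholder assumption, not the scheme of Eqs. (63-65) in (1), and it does not attempt to satisfy the normalization properties listed in the next section. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def downward_closure(hyperedges):
    """All non-empty subsets of every hyperedge, as sorted tuples."""
    faces = set()
    for h in hyperedges:
        h = tuple(sorted(h))
        for k in range(1, len(h) + 1):
            faces.update(combinations(h, k))
    return faces

def topological_weights(hyperedges, affinity=None):
    """Apply w(s) = affinity(s) + sum of w over the cofacets of s."""
    counts = Counter(tuple(sorted(h)) for h in hyperedges)
    if affinity is None:
        # Placeholder affinity (assumption): multiplicity of s as a hyperedge
        affinity = lambda s: float(counts.get(s, 0))
    weights = {}
    # Visit faces from highest to lowest dimension so cofacets come first
    for s in sorted(downward_closure(hyperedges), key=len, reverse=True):
        cofacet_sum = sum(
            w for t, w in weights.items()
            if len(t) == len(s) + 1 and set(s) <= set(t)
        )
        weights[s] = affinity(s) + cofacet_sum
    return weights

H = [(0, 1, 2), (1, 2)]
w = topological_weights(H)
print(w[(0, 1, 2)], w[(1, 2)], w[(0,)], w[(1,)])  # 1.0 2.0 2.0 3.0
```

In this example the edge (1,2) receives weight 2 (once as a hyperedge, once from its cofacet), so two hypergraphs with the same closure but different hyperedges already receive different weights.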

Properties of the weights

There are many ways to map higher-order interactions to simplicial weights; to define a notion of 'topological weight', the authors of (1) define a weighting scheme that satisfies:

$$ \begin{cases} w_\sigma > 0 & \text{for every face } \sigma \in S, \\ w_\sigma \geq \sum\limits_{\tau \supset \sigma} w_\tau & \text{for every face } \sigma \in S \text{, summing over its codim. 1 cofaces } \tau \in S, \\ w_v = \sum\limits_{h \in H} \mathbb{1}(v \in h) & \text{for all } v \in V. \end{cases} $$
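A small checker for these three conditions can make them concrete. It assumes weights are stored in a dict keyed by sorted vertex tuples; the function name and the hand-picked example weights are hypothetical and are not output of this PR's lift.

```python
def check_weight_properties(weights, hyperedges, tol=1e-12):
    """Check positivity, coface dominance, and the vertex-degree condition."""
    positive = all(w > 0 for w in weights.values())
    dominates = True
    for s, w in weights.items():
        # Sum over codimension-1 cofaces of s
        coface_sum = sum(
            wt for t, wt in weights.items()
            if len(t) == len(s) + 1 and set(s) <= set(t)
        )
        if w + tol < coface_sum:
            dominates = False
    degree = all(
        abs(w - sum(s[0] in h for h in hyperedges)) <= tol
        for s, w in weights.items() if len(s) == 1
    )
    return {"positive": positive, "dominates": dominates, "degree": degree}

H = [(0, 1, 2)]
# Hand-picked weights on the closure of H (illustrative assumption)
good = {
    (0, 1, 2): 0.5,
    (0, 1): 0.5, (0, 2): 0.5, (1, 2): 0.5,
    (0,): 1.0, (1,): 1.0, (2,): 1.0,
}
bad = {**good, (0,): 2.0}  # vertex 0 lies in only one hyperedge
result_good = check_weight_properties(good, H)
result_bad = check_weight_properties(bad, H)
```

Here `good` passes all three checks, while `bad` fails only the degree condition, since vertex 0 belongs to a single hyperedge but carries weight 2.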

This lifting implements the weighting scheme defined by Eq. 3 & Eqs. (63-65) from (1), which not only satisfies all these properties, but can also be computed quickly for the $d$-skeleton $S_d \subseteq S$.

For hypergraphs, the affinity weight is deduced from the number of times $c$ a face $\sigma \in S$ appears in a hyperedge $h \in H$ of order $n$, while the topological weight depends only on the dimensions of the faces that contain the given simplex. For an example of the weighting scheme, see the picture below:

The primary use case for this simplex-weight mapping is to define a valid inner product on the cochain space that captures higher-order interactions using only simplicial structure. In particular, multiscale invariants such as those derived from the heat kernel were shown in (1) to yield more information gain than in the unweighted setting.

Implementation details

This lift implementation includes:

An implementation of the topological + affinity weighting scheme from (1)
Fast downward closure code for restricting to $d$-skeletons
Three additional hypergraph datasets (1 toy and 2 real)

NOTE: Two of the three supplied datasets come from Google Drive links, which require the gdown package to download (we provide these scripts as valid loaders in the pipeline).

Proof of concept modeling code taken from the topomodelx tutorial page

The lifting code itself requires the hirola package to efficiently compute the $d$-skeleton. This was added as a dependency to the pyproject.toml.
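For intuition only, restricting the downward closure to the $d$-skeleton can be sketched in plain Python as below. This does not use hirola and is not the PR's implementation; the function name `skeleton` is hypothetical. The point is that faces above dimension $d$ are never enumerated, which avoids generating all $2^{|h|}$ subsets of a large hyperedge $h$.

```python
from itertools import combinations

def skeleton(hyperedges, d):
    """Faces of dimension <= d in the downward closure, sorted by size."""
    faces = set()
    for h in hyperedges:
        h = tuple(sorted(set(h)))
        # A k-vertex face has dimension k - 1, so stop at k = d + 1
        for k in range(1, min(len(h), d + 1) + 1):
            faces.update(combinations(h, k))
    return sorted(faces, key=lambda s: (len(s), s))

H = [(0, 1, 2, 3)]
one_skeleton = skeleton(H, 1)  # 4 vertices + 6 edges
two_skeleton = skeleton(H, 2)  # plus 4 triangles
print(len(one_skeleton), len(two_skeleton))  # 10 14
```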

References

Baccini, Federica, Filippo Geraci, and Ginestra Bianconi. "Weighted simplicial complexes and their representation power of higher-order network data and topology." Physical Review E 106.3 (2022): 034319.
Sun, Jian, Maks Ovsjanikov, and Leonidas Guibas. "A concise and provably informative multi‐scale signature based on heat diffusion." Computer graphics forum. Vol. 28. No. 5. Oxford, UK: Blackwell Publishing Ltd, 2009.

gbg141 · 2024-07-12T16:48:28Z

Hello @peekxc! Thank you for your submission. As we near the end of the challenge, I am collecting participant info for the purpose of selecting and announcing winners. Please email me (or have one member of your team email me) at guillermo_bernardez@ucsb.edu so I can share access to the voting form. In your email, please include:

your first and last name (as well as any other team members)
the title of the method you implemented
the input domain of the method you implemented
the output domain of the method you implemented
your pull request number (Hypergraph Heat Kernel Lift (Hypergraph to Simplicial) #58)

Before July 12, make sure that your submission respects all Submission Requirements laid out on the challenge page. Any submission that fails to meet these criteria will be automatically disqualified.

gbg141 · 2024-07-19T09:01:18Z

Hi @peekxc! While checking your submission I found out that you did not provide a tutorial notebook (in the corresponding folder there is only a .py file).

Did you forget to generate it?

peekxc · 2024-07-19T12:58:01Z

@gbg141 Ahh yes, sorry, I tend to prefer developing using the percent format (or Jupyter cells in VSCode); I can generate a .ipynb file with one click.

Would you like me to convert it?

peekxc and others added 19 commits July 9, 2024 22:10

basic abstract class structure

d9500f7

Adding to data transform.py

7631355

yaml config + basic tests

6173e7f

Add requirements for a working environment

c1da336

Adding test cases (not working yet)

c790efd

bug fix related to import statement

110926b

adding contact config

e6a7611

getting closer to actually importing a data set

d740049

config change to reflect new data

9873591

Loaders update

57b3339

Lift updates

459a3f9

dataset config changes

a8d1451

initial version passing tests

8d838e7

Merge branch 'main' of ssh://github.com/peekxc/challenge-icml-2024

6500824

linting

a6c5688

Ruff linting fixes

4c57465

More ruff linting fixes

2d79ca1

Mor eruff linting fixes + added hirola dependency

b038f49

Small bug fix

cbbc903

gbg141 added challenge-icml-2024 award-category-1 Lifting to Simplicial or Cell Domain award-category-4 Connectivity-based Lifting labels Jul 12, 2024

Expanded test cases

94d8593

gbg141 added Winner Awarded submission and removed challenge-icml-2024 labels Oct 31, 2024
