A Random Latent Clique Lifting from Graphs to Simplicial Complexes #63

Open

wants to merge 38 commits into main
Conversation


@mauriciogtec mauriciogtec commented Jul 13, 2024

A Random Latent Clique Lifting from Graphs to Simplicial Complexes

TL;DR We propose a lifting that ensures both 1) small-world property and 2) edge/cell sparsity. Combining these two properties is very attractive for Topological Deep Learning (TDL) because it ensures computational efficiency due to the reduced number of higher-order connections: only a few message-passing layers connect any two nodes.

Background. A graph is sparse if its number of edges grows proportional to the number of nodes. Many real-world graphs are sparse, but they contain many densely connected subgraphs and exhibit high clustering coefficients. Moreover, such real-world graphs frequently exhibit the small-world property, where any two nodes are connected by a short path of length proportional to the logarithm of the number of nodes. For instance, these are well-known properties of social networks, biological networks, and the Internet.

Contributions. In this notebook, we present a novel random lifting procedure from graphs to simplicial complexes. The procedure is based on a recently proposed Bayesian nonparametric random graph model for random clique covers (Williamson & Tec, 2020). Specifically, the model can learn latent clique complexes that are consistent with the input graph. It captures power-law degree distributions, global sparsity, and a non-vanishing local clustering coefficient. The small-world property is also guaranteed, which is very attractive for Topological Deep Learning (TDL).


In the original work [1], the distribution was used as a prior on an observed input graph. In particular, in the Bayesian setting, the model yields a distribution over latent clique complexes, i.e., a specific class of simplicial complexes whose 1-skeleton structural properties are consistent with those of the input graph used to compute the likelihood. Indeed, one feature of the posterior distribution from which the latent complex is sampled is that the set of latent 1-simplices (edges) is a superset of the set of edges of the input graph.


In the context of Topological Deep Learning [2][3] and the very recently emerged paradigm of Latent Topology Inference (LTI) [4], it is natural to look at the model in [1] as a novel LTI method able to infer a random latent simplicial complex from an input graph. Or, in other words, to use [1] as a novel random lifting procedure from graphs to simplicial complexes.

Next, we provide a quick introduction to the model in [1]. For a more in-depth exposition, please refer to the paper. To the best of our knowledge, this is the first random lifting procedure relying on Bayesian arguments.

To summarize, this lifting is:

  • non-deterministic,
  • not present in the literature as a lifting procedure,
  • based on connectivity,
  • modifying the initial connectivity of the graph by adding edges (thus, it can also be considered a graph rewiring method).

The Random Clique Cover Model

Let $G=(V,E)$ be a graph with $V$ the set of vertices and $E$ the set of edges. Denote the number of nodes as $N=|V|$. A clique cover can be described as a matrix $Z$ of size $K \times N$, where $K$ is the number of cliques, such that $Z_{k, i} = 1$ if node $i$ is in clique $k$ and $Z_{k, i} = 0$ otherwise. The Random Clique Cover (RCC) model, defined in [1], is a probabilistic model for the matrix $Z$. This matrix can have an infinite number of rows and columns, but only a finite number of them will be active. The model is based on the Indian Buffet Process (IBP), a distribution over binary matrices with a possibly infinite number of rows and columns, or more specifically, the Stable Beta IBP described in [5]. While the mathematics behind the IBP are complex, the model admits a highly intuitive representation, described below.

First, recall that a clique is a fully connected subset of vertices. Therefore, a clique cover $Z$ induces an adjacency matrix by the formula $A = \min(Z^T Z - \operatorname{diag}(Z^T Z), 1)$, where $\min$ is applied element-wise. The IBP model can be described recursively as follows:
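For concreteness, the induced adjacency matrix can be computed from a small, hypothetical clique cover (the matrix `Z` below is an illustrative example, not data from the paper):

```python
import numpy as np

# Hypothetical clique cover: K = 3 cliques over N = 5 nodes.
# Z[k, i] = 1 iff node i belongs to clique k.
Z = np.array([
    [1, 1, 1, 0, 0],  # clique 0: nodes {0, 1, 2}
    [0, 0, 1, 1, 0],  # clique 1: nodes {2, 3}
    [0, 0, 0, 1, 1],  # clique 2: nodes {3, 4}
])

# A = min(Z^T Z - diag(Z^T Z), 1): two nodes are adjacent iff they
# share at least one clique; the diagonal (self-loops) is zeroed out.
ZtZ = Z.T @ Z
A = np.minimum(ZtZ - np.diag(np.diag(ZtZ)), 1)
```

Here $(Z^T Z)_{ij}$ counts the cliques shared by nodes $i$ and $j$, so clipping at 1 after removing the diagonal yields a simple, symmetric adjacency matrix.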

Conditional on $Z_1, Z_2, \ldots, Z_{K-1}$, where $Z_j$ denotes the $j$-th row of $Z$, the next row $Z_K$ is drawn as follows:

  1. $Z_K$ includes a number of new, previously unobserved nodes drawn as
    $$Z_K \mid Z_1, Z_2, \ldots, Z_{K-1} \sim \text{Poisson}\left(\alpha \frac{\Gamma(1 + c)\Gamma(N + c + \sigma - 1)}{\Gamma(N + \sigma)\Gamma(c + \sigma)}\right)$$
  2. Each previously observed node $i$ belongs to $Z_K$ with probability proportional to the number of cliques it already appears in. Specifically, letting $m_i=\sum_{k=1}^{K-1} Z_{k, i}$, then
    $$P(Z_{K,i}=1 \mid Z_1, Z_2, \ldots, Z_{K-1}) = \frac{m_i + \sigma}{K + c - 1}.$$

The last expression captures an intuitive rich-get-richer dynamic: the more cliques a node already belongs to, the more likely it is to appear in the next one.
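The two-step generative process above can be sketched in a few lines of Python. This is a minimal illustration with an assumed parameterization (`sample_rcc` is a hypothetical helper written for this notebook, not code from [1]; see [1] and [5] for the exact model):

```python
import numpy as np
from math import exp, lgamma

def sample_rcc(K, alpha, c, sigma, seed=0):
    """Sample K cliques from the RCC generative process (sketch).

    Returns a list of cliques, each a set of node indices.
    """
    rng = np.random.default_rng(seed)
    cliques = []
    m = []  # m[i] = number of cliques node i already belongs to
    for k in range(K):  # drawing the (k + 1)-th clique
        N = len(m)  # nodes observed so far
        clique = set()
        # Step 2: a previously observed node i joins the new clique
        # with probability (m_i + sigma) / (k + c); with 1-indexed
        # cliques this is the (m_i + sigma) / (K + c - 1) in the text.
        for i in range(N):
            if rng.random() < (m[i] + sigma) / (k + c):
                clique.add(i)
                m[i] += 1
        # Step 1: new, previously unseen nodes enter via a Poisson
        # draw; the gamma ratio from the text is evaluated with
        # lgamma for numerical stability.
        if N > 0:
            rate = alpha * exp(
                lgamma(1 + c) + lgamma(N + c + sigma - 1)
                - lgamma(N + sigma) - lgamma(c + sigma)
            )
        else:
            rate = alpha  # first clique: all of its nodes are new
        for _ in range(rng.poisson(rate)):
            clique.add(len(m))
            m.append(1)
        cliques.append(clique)
    return cliques
```

Stacking the indicator rows of the sampled cliques recovers the matrix $Z$, from which the adjacency matrix follows as above.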

The RCC model depends on four parameters $\alpha, c, \sigma, \pi$. The first three are part of the IBP; explaining them in detail is beyond the scope of this notebook, but the reader may refer to [5]. Fortunately, the learned (posterior) values of $\alpha, \sigma, c$ are strongly determined by the data itself. By contrast, $\pi$ is approximately the probability that an edge is missing from the graph. Generally, the lower $\pi$ is, the fewer cliques there will be and the less interconnected their nodes will be.

Importantly, because the lifting can introduce latent inferred edges, it superimposes the small-world property on the input graph.

References


[1] Williamson, Sinead A., and Mauricio Tec. "Random clique covers for graphs with local density and global sparsity." Uncertainty in Artificial Intelligence (UAI). PMLR, 2020.

[2] Papamarkou, Theodore, et al. "Position paper: Challenges and opportunities in topological deep learning." arXiv preprint arXiv:2402.08871 (2024).

[3] Hajij, Mustafa, et al. "Topological deep learning: Going beyond graph data." arXiv preprint arXiv:2206.00606 (2022).

[4] Battiloro, Claudio, et al. "From latent graph to latent topology inference: Differentiable cell complex module." The Twelfth International Conference on Learning Representations (ICLR), 2024.

[5] Teh, Yee Whye, and Dilan Görür. "Indian Buffet Processes with Power-law Behavior." Advances in neural information processing systems. 2009.



@mauriciogtec mauriciogtec changed the title Lifting integrate upstream A Random Latent Clique Lifting from Graphs to Simplicial Complexes Jul 13, 2024
@gbg141 gbg141 added challenge-icml-2024 award-category-1 Lifting to Simplicial or Cell Domain award-category-4 Connectivity-based Lifting labels Jul 13, 2024
@gbg141
Member

gbg141 commented Jul 13, 2024

Hello @mauriciogtec! Thank you for your submission. As we near the end of the challenge, I am collecting participant info for the purpose of selecting and announcing winners. Please email me (or have one member of your team email me) at guillermo_bernardez@ucsb.edu so I can share access to the voting form. In your email, please include:

Before July 12, make sure that your submission respects all Submission Requirements laid out on the challenge page. Any submission that fails to meet these criteria will be automatically disqualified.

@gbg141 gbg141 added Winner Awarded submission and removed challenge-icml-2024 labels Oct 31, 2024