A Random Latent Clique Lifting from Graphs to Simplicial Complexes #63
Conversation
First version
Documentation and improved defaults
Improve the description of latent observability parameter
rename to latent clique lifting
added data transform
added testing file
Wrapup first submission
Replaced default README with model explanation.
Hello @mauriciogtec! Thank you for your submission. As we near the end of the challenge, I am collecting participant info for the purpose of selecting and announcing winners. Please email me (or have one member of your team email me) at guillermo_bernardez@ucsb.edu so I can share access to the voting form. In your email, please include:
Before July 12, make sure that your submission respects all Submission Requirements laid out on the challenge page. Any submission that fails to meet these criteria will be automatically disqualified.
A Random Latent Clique Lifting from Graphs to Simplicial Complexes
TL;DR: We propose a lifting that ensures both 1) the small-world property and 2) edge/cell sparsity. Combining these two properties is very attractive for Topological Deep Learning (TDL) because it keeps the number of higher-order connections small, ensuring computational efficiency, while only a few message-passing layers are needed to connect any two nodes.
Background. A graph is sparse if its number of edges grows proportionally to its number of nodes. Many real-world graphs are sparse, yet they contain many densely connected subgraphs and exhibit high clustering coefficients. Moreover, such graphs frequently exhibit the small-world property: any two nodes are connected by a short path whose length is proportional to the logarithm of the number of nodes. These are well-known properties of social networks, biological networks, and the Internet, for instance.
Contributions. In this notebook, we present a novel random lifting procedure from graphs to simplicial complexes. The procedure is based on a recently proposed Bayesian nonparametric random graph model for random clique covers (Williamson & Tec, 2020) [1]. Specifically, the model can learn latent clique complexes that are consistent with the input graph. It captures power-law degree distributions, global sparsity, and a non-vanishing local clustering coefficient. The small-world property is also guaranteed, which is very attractive for Topological Deep Learning (TDL).
In the original work [1], the distribution is used as a prior on an observed input graph. In this Bayesian setting, the model yields a posterior distribution over latent clique complexes, i.e. a specific class of simplicial complexes, whose 1-skeleton structural properties are consistent with those of the input graph used to compute the likelihood. Indeed, one feature of the posterior distribution from which the latent complex is sampled is that the set of latent 1-simplices (edges) is a superset of the set of edges of the input graph.
In the context of Topological Deep Learning [2][3] and the recently emerged paradigm of Latent Topology Inference (LTI) [4], it is natural to view the model in [1] as a novel LTI method able to infer a random latent simplicial complex from an input graph, or, in other words, to use [1] as a novel random lifting procedure from graphs to simplicial complexes.
Next, we provide a quick introduction to the model in [1]; for a more in-depth exposition, please refer to the paper. To the best of our knowledge, this is the first random lifting procedure relying on Bayesian arguments.
The Random Clique Cover Model
Let $G=(V,E)$ be a graph with $V$ the set of vertices and $E$ the set of edges, and denote the number of nodes by $N=|V|$. A clique cover can be described as a binary matrix $Z$ of size $K \times N$, where $K$ is the number of cliques, such that $Z_{k, i} = 1$ if node $i$ is in clique $k$ and $Z_{k, i} = 0$ otherwise. The Random Clique Cover (RCC) model, defined in [1], is a probabilistic model for the matrix $Z$. This matrix can have an infinite number of rows and columns, but only a finite number of them will be active. The model is based on the Indian Buffet Process (IBP), a distribution over binary matrices with a possibly infinite number of rows and columns; more specifically, on the Stable Beta IBP described in [5]. While the mathematics behind the IBP is involved, the model admits a highly intuitive representation, described below.
First, recall that a clique is a fully connected subset of vertices. Therefore, a clique cover $Z$ induces an adjacency matrix by the formula $A = \min(Z^T Z - \mathrm{diag}(Z^T Z), 1)$, where $\min$ denotes the element-wise minimum.
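For concreteness, here is a minimal NumPy sketch of this formula on a toy clique cover (the matrix $Z$ below is illustrative, not drawn from the model):

```python
import numpy as np

# Illustrative clique cover: K = 2 cliques over N = 4 nodes.
# Clique 0 = {0, 1, 2}, clique 1 = {2, 3}.
Z = np.array([
    [1, 1, 1, 0],
    [0, 0, 1, 1],
])

# A = min(Z^T Z - diag(Z^T Z), 1): two nodes are adjacent iff they
# share at least one clique; the diagonal is zeroed out.
ZtZ = Z.T @ Z
A = np.minimum(ZtZ - np.diag(np.diag(ZtZ)), 1)
print(A)
# [[0 1 1 0]
#  [1 0 1 0]
#  [1 1 0 1]
#  [0 0 1 0]]
```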
The IBP model can be described recursively as follows. Conditional on $Z_1, Z_2, \ldots, Z_{K-1}$, where $Z_k$ denotes the $k$-th row of $Z$, the row $Z_K$ is drawn in two steps: each existing node $i$ is included in clique $K$ with probability $\frac{m_i - \sigma}{K - 1 + c}$, where $m_i = \sum_{k < K} Z_{k,i}$ is the number of previous cliques containing node $i$; in addition, clique $K$ contains $\mathrm{Poisson}\left(\alpha \frac{\Gamma(1+c)\,\Gamma(K-1+c+\sigma)}{\Gamma(K+c)\,\Gamma(c+\sigma)}\right)$ previously unseen nodes (see [5] for details).
The inclusion probability is highly intuitive: the number of cliques a node will appear in is proportional to the number of cliques it already belongs to.
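Below is a minimal sketch of this recursive construction: a plain NumPy implementation of the Stable Beta IBP predictive rule stated above. The function name and interface are ours for illustration, not the code released with [1].

```python
import numpy as np
from scipy.special import gammaln

def sample_rcc_cliques(K, alpha, c, sigma, rng=None):
    """Draw K cliques (rows of Z) from the Stable Beta IBP [5].

    Returns a binary (K, N) matrix, where N is the number of nodes
    that became active over the K draws.
    """
    rng = np.random.default_rng(rng)
    cliques = []   # cliques[k] = set of node indices in clique k
    counts = []    # counts[i] = number of cliques node i belongs to
    for k in range(1, K + 1):
        clique = set()
        # Each existing node i joins clique k with probability
        # (m_i - sigma) / (k - 1 + c).
        for i, m_i in enumerate(counts):
            if rng.random() < (m_i - sigma) / (k - 1 + c):
                clique.add(i)
        # The number of previously unseen nodes in clique k is Poisson
        # with rate alpha * G(1+c) * G(k-1+c+sigma) / (G(k+c) * G(c+sigma)),
        # computed in log space for numerical stability (G = Gamma).
        log_rate = (np.log(alpha) + gammaln(1 + c) + gammaln(k - 1 + c + sigma)
                    - gammaln(k + c) - gammaln(c + sigma))
        n_new = rng.poisson(np.exp(log_rate))
        clique.update(range(len(counts), len(counts) + n_new))
        counts.extend([0] * n_new)
        for i in clique:
            counts[i] += 1
        cliques.append(clique)
    # Assemble the binary clique-membership matrix Z.
    Z = np.zeros((K, len(counts)), dtype=int)
    for k, clique in enumerate(cliques):
        Z[k, list(clique)] = 1
    return Z
```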
The RCC model depends on four parameters $\alpha, c, \sigma, \pi$. The first three are part of the IBP; explaining them in detail is beyond the scope of this notebook, so we refer the reader to [5]. Fortunately, the learned (posterior) values of $\alpha, \sigma, c$ are strongly determined by the data itself. By contrast, $\pi$ is approximately the probability that an edge is missing from the graph: generally, the lower $\pi$ is, the fewer cliques there will be and the less interconnected the nodes within each clique will be.
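Putting the pieces together, a toy generative run might look as follows. It reuses the hypothetical `sample_rcc_cliques` sketch above; the parameter values are arbitrary, and the role of $\pi$ is emulated by dropping each latent edge independently with probability $\pi$:

```python
rng = np.random.default_rng(0)
Z = sample_rcc_cliques(K=20, alpha=3.0, c=1.0, sigma=0.3, rng=rng)

# Latent adjacency induced by the sampled clique cover.
ZtZ = Z.T @ Z
A_latent = np.minimum(ZtZ - np.diag(np.diag(ZtZ)), 1)

# pi ~ probability that a latent edge is missing from the observed
# graph: drop each latent edge independently with probability pi.
pi = 0.1
keep = np.triu(rng.random(A_latent.shape) < 1 - pi, 1)
A_obs = A_latent * (keep | keep.T)

# The observed edge set is (by construction) a subset of the latent
# one, mirroring the superset property of the posterior in [1].
assert (A_obs <= A_latent).all()
```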
Importantly, by leveraging latent inferred edges, the lifting superimposes the small-world property on the graph.
References
[1] Williamson, Sinead A., and Mauricio Tec. "Random clique covers for graphs with local density and global sparsity." Uncertainty in Artificial Intelligence (UAI). PMLR, 2020.
[2] Papamarkou, Theodore, et al. "Position paper: Challenges and opportunities in topological deep learning." arXiv preprint arXiv:2402.08871 (2024).
[3] Hajij, Mustafa, et al. "Topological deep learning: Going beyond graph data." arXiv preprint arXiv:2206.00606 (2022).
[4] Battiloro, Claudio, et al. "From latent graph to latent topology inference: Differentiable cell complex module." The Twelfth International Conference on Learning Representations (ICLR), 2024.
[5] Teh, Yee Whye, and Dilan Görür. "Indian Buffet Processes with Power-law Behavior." Advances in Neural Information Processing Systems (NeurIPS), 2009.