[FEA]: Multiplex leiden clustering #4828
Comments
This is something we can explore. Within our current cugraph framework, we could potentially support this as follows:
Does this seem like a reasonable approach? We would need to determine when to address this in our roadmap.
In this approach, would it be possible for one layer to have directed edges and another to have undirected edges?
As it doesn't exist yet, we can certainly pursue this idea. The biggest complication is that we don't have a directed version of Leiden yet, so we would have to update our Louvain/Leiden implementation to support directed graphs.

Regarding some layers being directed and other layers being undirected: is this inherent in the construction of these layers, or a virtual abstraction you choose to lay on top of them? That is, there are two possibilities:

- When you create layer X, the data is inherently an undirected graph and you will always treat it as undirected; when you create layer Y, the data is inherently a directed graph and you will always treat it as directed.
- When you create a layer, the edges are directed, and you sometimes want to treat those directed edges as undirected and at other times want to treat the edges of the same layer as directed.

The reason I ask is this: in libcugraph we store a directed graph (think of a CSR data structure). If we want to represent an undirected graph, we symmetrize the directed graph when we construct the CSR structure. If a layer you create is inherently directed or undirected and will always be treated that way, then we can easily construct that layer as symmetric or asymmetric when we create the graph, and everything will just work. If you need to be able to treat a layer as symmetric (undirected) sometimes and asymmetric (directed) at other times, then we need to think about a different solution.
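To make the symmetrization point concrete, here is a minimal pure-Python sketch of building a CSR structure from a directed edge list, with an optional step that adds the reverse of every edge. It is only an illustration of the idea, not libcugraph code; the function name `build_csr` and its parameters are hypothetical.

```python
def build_csr(num_vertices, edges, symmetrize=False):
    """Build CSR (offsets, indices) from directed (src, dst) pairs.

    Hypothetical helper for illustration only. With symmetrize=True the
    reverse of every edge is added, so the stored directed graph
    represents an undirected one.
    """
    edge_set = set(edges)
    if symmetrize:
        edge_set |= {(dst, src) for src, dst in edge_set}

    # Group destinations by source vertex, in sorted order.
    adjacency = [[] for _ in range(num_vertices)]
    for src, dst in sorted(edge_set):
        adjacency[src].append(dst)

    # offsets[v]..offsets[v + 1] delimits the neighbors of v in indices.
    offsets = [0]
    indices = []
    for neighbors in adjacency:
        indices.extend(neighbors)
        offsets.append(len(indices))
    return offsets, indices


directed_layer = build_csr(3, [(0, 1), (1, 2)])
undirected_layer = build_csr(3, [(0, 1), (1, 2)], symmetrize=True)
```

Because the choice is made once at construction time, a layer built this way is fixed as either symmetric or asymmetric, which is why treating the same layer both ways would need a different design.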
In the use cases I can think of, each layer would be inherently directed or undirected, though other people will probably come up with use cases where this is not so. But given that the Leiden implementation is undirected so far, it would probably be easier to add multiplex clustering first, before adding support for directed graphs. At least that would be my priority. Directed graph clustering would be a nice-to-have but not really necessary: in most cases I have seen, whether the underlying graph is directed is simply ignored and the graph is treated as undirected.
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Critical (currently preventing usage)
Please provide a clear description of problem this feature solves
The (most used) Leiden implementation in leidenalg supports multiplex clustering, where you can cluster multiple graphs with the same vertices jointly. In the field of single-cell transcriptomics and spatially resolved transcriptomics this can be used to cluster multi-modality data (as done in muon) or to jointly cluster cells based on their features and spatial neighborhoods (as done in SpatialLeiden).
As datasets grow (hundreds of thousands to millions of cells/vertices), the runtime of Leiden clustering on the CPU becomes a limiting factor for exploring various parameter combinations.
Describe your ideal solution
The leiden function should support a list (or similar) of graphs as input. The resolution parameter would therefore also need to be extended to accept one resolution per graph (layer). Furthermore, a new parameter would be needed that gives each layer a weight corresponding to its "importance".
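As a rough illustration of what the per-layer resolution and weight parameters would control, the sketch below scores a single shared membership vector as a weighted sum of per-layer modularity values. This is one common formulation of a multiplex objective, written in plain Python; it is not cugraph's API, the function names `layer_modularity` and `multiplex_quality` are hypothetical, and the choice to normalize each layer by its own total edge weight is an assumption.

```python
from collections import defaultdict


def layer_modularity(edges, membership, resolution=1.0):
    """Modularity of one undirected layer given as (u, v, weight) triples."""
    total_weight = 0.0
    degree = defaultdict(float)
    internal = defaultdict(float)
    for u, v, w in edges:
        total_weight += w
        degree[u] += w
        degree[v] += w
        if membership[u] == membership[v]:
            internal[membership[u]] += w
    if total_weight == 0:
        return 0.0

    community_degree = defaultdict(float)
    for vertex, k in degree.items():
        community_degree[membership[vertex]] += k

    # Q = sum_c [ L_c / m - resolution * (d_c / 2m)^2 ]
    return sum(
        internal[c] / total_weight
        - resolution * (d_c / (2 * total_weight)) ** 2
        for c, d_c in community_degree.items()
    )


def multiplex_quality(layers, membership, resolutions, layer_weights):
    """Weighted sum of per-layer modularity for one shared membership."""
    return sum(
        weight * layer_modularity(edges, membership, resolution)
        for edges, resolution, weight in zip(layers, resolutions, layer_weights)
    )


# Layer 1: two triangles joined by one edge. Layer 2: edges across the two groups.
layer_1 = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
           (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 1.0)]
layer_2 = [(0, 3, 1.0), (1, 4, 1.0), (2, 5, 1.0)]
membership = [0, 0, 0, 1, 1, 1]

quality = multiplex_quality(
    [layer_1, layer_2], membership,
    resolutions=[1.0, 1.0], layer_weights=[1.0, 0.5],
)
```

In this toy example the second layer disagrees with the partition, so lowering its weight reduces how much it pulls the combined score down. A real implementation would optimize the membership against this objective rather than just evaluate it.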
Describe any alternatives you have considered
No response
Additional context
No response