[FEA]: Multiplex leiden clustering #4828

niklasmueboe · 2024-12-11T09:00:07Z

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Critical (currently preventing usage)

Please provide a clear description of problem this feature solves

The (most widely used) Leiden implementation, leidenalg, supports multiplex clustering, in which multiple graphs that share the same vertices are clustered jointly. In single-cell transcriptomics and spatially resolved transcriptomics, this can be used to cluster multi-modality data (as done in muon) or to jointly cluster cells based on their features and spatial neighborhoods (as done in SpatialLeiden).
As datasets grow (hundreds of thousands to millions of cells/vertices), the runtime of Leiden clustering on the CPU becomes a limiting factor for exploring various parameter combinations.

Describe your ideal solution

The leiden function should accept a list (or similar collection) of graphs as input. The resolution parameter would then also need to be extended to accept one resolution per graph (layer). In addition, a new parameter that assigns each layer a weight corresponding to its "importance" would be needed.
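To make the requested inputs concrete, here is a minimal pure-Python sketch of the quantity a multiplex optimizer would maximize: a weighted sum of per-layer quality values, with one resolution and one weight per layer. The function names (`layer_quality`, `multiplex_quality`) and parameter names (`resolutions`, `layer_weights`) are hypothetical and are not part of cuGraph. The per-layer quality used here is normalized modularity with a resolution term, which is an assumption made for illustration; other quality functions could be substituted.

```python
from collections import defaultdict


def layer_quality(edges, membership, resolution=1.0):
    """Resolution-scaled modularity of one undirected, weighted layer.

    edges: iterable of (u, v, weight); each undirected edge is listed once.
    membership: sequence mapping vertex index -> community id.
    """
    total_weight = 0.0             # m: total edge weight in this layer
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed weighted degree of each community
    for u, v, w in edges:
        total_weight += w
        degree[membership[u]] += w
        degree[membership[v]] += w
        if membership[u] == membership[v]:
            internal[membership[u]] += w
    if total_weight == 0:
        return 0.0
    two_m = 2.0 * total_weight
    return sum(
        internal[c] / total_weight - resolution * (degree[c] / two_m) ** 2
        for c in degree
    )


def multiplex_quality(layers, membership, resolutions, layer_weights):
    """Weighted sum of per-layer qualities for one shared membership.

    layers, resolutions, and layer_weights are parallel sequences
    (hypothetical parameter names, one entry per layer).
    """
    if not (len(layers) == len(resolutions) == len(layer_weights)):
        raise ValueError("need one resolution and one weight per layer")
    return sum(
        weight * layer_quality(edges, membership, gamma)
        for edges, gamma, weight in zip(layers, resolutions, layer_weights)
    )


# Two layers over the same four vertices, one shared community assignment.
feature_layer = [(0, 1, 1.0), (2, 3, 1.0)]
spatial_layer = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
membership = [0, 0, 1, 1]
score = multiplex_quality(
    [feature_layer, spatial_layer], membership,
    resolutions=[1.0, 1.0], layer_weights=[1.0, 0.5],
)
```

All layers share a single membership vector, which is what distinguishes multiplex clustering from clustering each graph separately.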

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

I agree to follow cuGraph's Code of Conduct
I have searched the open feature requests and have found no duplicates for this feature request

abs51295 · 2024-12-11T22:18:38Z

~~I would also consider adding support for directed weighted graphs since scanpy.tl.leiden uses a directed weighted graph with leidenalg package.~~ (Never mind, since they are moving to igraph.) Also, support for fixing the membership labels for part of the graph is useful when merging two different datasets: https://www.nature.com/articles/s41598-020-71805-1.

ChuckHastings · 2025-01-06T19:39:55Z

This is something we can explore. Within our current cugraph framework, we could potentially support this as follows:

Define edge types for each layer (number the layers from 0 to n)
Create a variation of the Leiden algorithm that considers the layers
Allow for different resolution values for each layer
Allow for certain layers to be ignored (so you don't have to recreate the graph in different scenarios)
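The steps above could be sketched roughly as follows, using a single edge list in which every edge carries an integer edge type identifying its layer. This is only an illustration in plain Python: `TypedEdge` and `active_layers` are hypothetical names, not existing cugraph APIs, and the layer-aware Leiden variant itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedEdge:
    src: int
    dst: int
    weight: float
    edge_type: int  # layer index, numbered from 0 to n - 1


def active_layers(edges, resolutions, ignored=frozenset()):
    """Group edges by layer, attach each layer's resolution, skip ignored layers.

    resolutions: one resolution value per layer, indexed by edge_type.
    ignored: layer indices to leave out without rebuilding the edge list.
    """
    layers = {}
    for edge in edges:
        if edge.edge_type in ignored:
            continue
        if not 0 <= edge.edge_type < len(resolutions):
            raise ValueError(f"no resolution given for layer {edge.edge_type}")
        layer = layers.setdefault(
            edge.edge_type,
            {"resolution": resolutions[edge.edge_type], "edges": []},
        )
        layer["edges"].append((edge.src, edge.dst, edge.weight))
    return layers


# One stored edge list, three layers; layer 1 is ignored for this run.
edges = [
    TypedEdge(0, 1, 1.0, edge_type=0),
    TypedEdge(1, 2, 0.5, edge_type=1),
    TypedEdge(2, 3, 2.0, edge_type=2),
    TypedEdge(0, 3, 1.5, edge_type=0),
]
selected = active_layers(edges, resolutions=[1.0, 0.8, 1.2], ignored={1})
```

Keeping all layers in one typed edge list means a different combination of layers can be clustered by changing only the `ignored` set.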

Does this seem like a reasonable approach? We would need to determine when to address this in our road map.

niklasmueboe · 2025-01-22T13:28:59Z

In this approach would it be possible for one layer to have directed and another to have undirected edges?

ChuckHastings · 2025-01-22T14:22:06Z

As it doesn't exist yet, we can certainly pursue this idea.

The biggest complication is that we don't have a directed version of Leiden yet, so we would have to update our Louvain/Leiden implementation to support directed graphs.

Regarding some layers being directed and other layers being undirected: is this inherent in the construction of these layers, or a virtual abstraction you choose to lay on top of it? That is, when you create layer X, is the data inherently an undirected graph that you will always treat as undirected, and when you create layer Y, is the data inherently a directed graph that you will always treat as directed? Or are the edges directed when you create a layer, and you sometimes want to treat those edges as undirected and at other times want to treat the edges of the same layer as directed?

The reason I ask is this... in libcugraph we store a directed graph (think of a CSR data structure). If we want to represent an undirected graph then we symmetrize the directed graph when we construct the CSR structure. If a layer you create is inherently directed or undirected and will always be treated that way, then we can easily construct that layer as symmetric or asymmetric when we create the graph and everything will just work. If you need to be able to treat a layer as symmetric (undirected) sometimes and asymmetric (directed) other times, then we need to think about a different solution.
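As a rough illustration of the storage model described above, the following pure-Python sketch builds a CSR structure from a directed edge list and optionally symmetrizes it at construction time to represent an undirected graph. It is not libcugraph code; `build_csr` is a hypothetical helper, and dropping duplicate edges is a simplification made for this example.

```python
def build_csr(num_vertices, edges, symmetrize=False):
    """Build CSR (offsets, indices) from directed (src, dst) pairs.

    With symmetrize=True the reverse of every edge is added, so the stored
    directed structure represents an undirected graph. Duplicate edges are
    dropped, which is a simplification for this sketch.
    """
    edge_set = set(edges)
    if symmetrize:
        edge_set |= {(dst, src) for src, dst in edges}
    ordered = sorted(edge_set)  # sorted by source, then destination

    # Count out-edges per source, then prefix-sum the counts into offsets.
    offsets = [0] * (num_vertices + 1)
    for src, _ in ordered:
        offsets[src + 1] += 1
    for i in range(num_vertices):
        offsets[i + 1] += offsets[i]

    indices = [dst for _, dst in ordered]
    return offsets, indices


edges = [(0, 1), (1, 2)]
directed = build_csr(3, edges)                      # layer kept asymmetric
undirected = build_csr(3, edges, symmetrize=True)   # layer stored symmetric
```

Because the choice is made when the structure is built, a layer that must be treated as directed in one run and undirected in another would need either two stored copies or a different mechanism, which is the complication raised above.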

niklasmueboe · 2025-01-22T14:54:31Z

In the use cases I can think of, each layer would be inherently directed or undirected, but other people will probably come up with use cases where this is not the case.

But given that the Leiden implementation is undirected so far, it would probably be easier to add multiplex clustering first, before also adding support for directed graphs. At least that would be the priority for me. Directed graph clustering would be a nice-to-have but not really necessary: in most cases I have seen, whether the underlying graph is directed is simply ignored and it is treated as an undirected graph.

niklasmueboe added the "? - Needs Triage" (Need team to review and classify) and "feature request" (New feature or request) labels Dec 11, 2024

ChuckHastings mentioned this issue Jan 22, 2025

New Algorithms Parking Lot #3337

Open

ChuckHastings removed the "? - Needs Triage" (Need team to review and classify) label Jan 22, 2025

abs51295 mentioned this issue Jan 22, 2025

[FEA]: Fix labels for a subset of nodes in Leiden clustering #4880

Open

2 tasks

niklasmueboe mentioned this issue Jan 28, 2025

GPU support for SpatialLeiden HiDiHlabs/SpatialLeiden#11

Open
