Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix eigen decomposition; new feature to store all optimization tries; and updates to documentation #95

Merged
merged 12 commits into from
Nov 14, 2024
20 changes: 11 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ There are also additional post-processing and analysis functions, including:
Optimal scale selection [6] is performed by default with the run function but can be repeated with different parameters if needed, see `pygenstability/optimal_scales.py`. To reduce noise, e.g., one can increase the parameter values for `block_size` and `window_size`. The optimal network partitions can then be plotted given a NetworkX nx_graph.

```python
results = pgs.identify_optimal_scales(results, block_size=10, window_size=5)
results = pgs.identify_optimal_scales(results, kernel_size=10, window_size=5)
pgs.plot_optimal_partitions(nx_graph, results)
```

Expand Down Expand Up @@ -134,7 +134,7 @@ We provide an easy-to-use interface in our `pygenstability.data_clustering.py` m

```python
clustering = pgs.DataClustering(
graph_method="cknn",
graph_method="cknn-mst",
k=5,
constructor="continuous_normalized")

Expand All @@ -146,7 +146,7 @@ clustering.scale_selection(kernel_size=0.2)
clustering.plot_scan()
```

We currently support $k$-Nearest Neighbor (kNN) and Continuous $k$-Nearest Neighbor (CkNN) [10] graph constructions (specified by `graph_method`) and `k` refers to the number of neighbours considered in the construction. See documentation for a list of all parameters. All functionalities of PyGenStability including plotting and scale selection are also available for data clustering. For example, given two-dimensional coordinates of the data points one can plot the optimal partitions directly:
We currently support $k$-Nearest Neighbor (kNN) and Continuous $k$-Nearest Neighbor (CkNN) [10] graph constructions (specified by `graph_method`) augmented with the minimum spanning tree to guarentee connectivity, where `k` refers to the number of neighbours considered in the construction. See documentation for a list of all parameters. All functionalities of PyGenStability including plotting and scale selection are also available for data clustering. For example, given two-dimensional coordinates of the data points one can plot the optimal partitions directly:

```python
# plot robust partitions
Expand All @@ -168,11 +168,13 @@ Please cite our paper if you use this code in your own work:
```
@article{pygenstability,
author = {Arnaudon, Alexis and Schindler, Dominik J. and Peach, Robert L. and Gosztolai, Adam and Hodges, Maxwell and Schaub, Michael T. and Barahona, Mauricio},
title = {PyGenStability: Multiscale community detection with generalized Markov Stability},
publisher = {arXiv},
year = {2023},
doi = {10.48550/ARXIV.2303.05385},
url = {https://arxiv.org/abs/2303.05385}
title = {Algorithm 1044: PyGenStability, a Multiscale Community Detection Framework with Generalized Markov Stability},
journal = {ACM Trans. Math. Softw.},
volume = {50},
number = {2},
pages = {15:1–15:8}
year = {2024},
doi = {10.1145/3651225}
}
```

Expand Down Expand Up @@ -248,7 +250,7 @@ If you are interested in trying our other packages, see the below list:

[6] D. J. Schindler, J. Clarke, and M. Barahona, 'Multiscale Mobility Patterns and the Restriction of Human Movement', *Royal Society Open Science*, vol. 10, no. 10, p. 230405, Oct. 2023, doi: 10.1098/rsos.230405.

[7] A. Arnaudon, D. J. Schindler, R. L. Peach, A. Gosztolai, M. Hodges, M. T. Schaub, and M. Barahona, 'PyGenStability: Multiscale community detection with generalized Markov Stability', *arXiv pre-print*, Mar. 2023, doi: 10.48550/arXiv.2303.05385.
[7] A. Arnaudon, D. J. Schindler, R. L. Peach, A. Gosztolai, M. Hodges, M. T. Schaub, and M. Barahona, 'Algorithm 1044: PyGenStability, a Multiscale Community Detection Framework with Generalized Markov Stability', *ACM Trans. Math. Softw.*, vol. 50, no. 2, p. 15:1–15:8, Jun. 2024, doi: 10.1145/3651225.

[8] S. Gomez, P. Jensen, and A. Arenas, 'Analysis of community structure in networks of correlated data'. *Physical Review E*, vol. 80, no. 1, p. 016114, Jul. 2009, doi: 10.1103/PhysRevE.80.016114.

Expand Down
21 changes: 12 additions & 9 deletions docs/index_readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ There are also additional post-processing and analysis functions, including:
Optimal scale selection [6] is performed by default with the run function but can be repeated with different parameters if needed, see `pygenstability/optimal_scales.py`. To reduce noise, e.g., one can increase the parameter values for `block_size` and `window_size`. The optimal network partitions can then be plotted given a NetworkX nx_graph.

```python
results = pgs.identify_optimal_scales(results, block_size=10, window_size=5)
results = pgs.identify_optimal_scales(results, kernel_size=10, window_size=5)
pgs.plot_optimal_partitions(nx_graph, results)
```

Expand Down Expand Up @@ -119,7 +119,7 @@ We provide an easy-to-use interface in our `pygenstability.data_clustering.py` m

```python
clustering = pgs.DataClustering(
graph_method="cknn",
graph_method="cknn-mst",
k=5,
constructor="continuous_normalized")

Expand All @@ -131,7 +131,7 @@ clustering.scale_selection(kernel_size=0.2)
clustering.plot_scan()
```

We currently support $k$-Nearest Neighbor (kNN) and Continuous $k$-Nearest Neighbor (CkNN) [10] graph constructions (specified by `graph_method`) and `k` refers to the number of neighbours considered in the construction. See documentation for a list of all parameters. All functionalities of PyGenStability including plotting and scale selection are also available for data clustering. For example, given two-dimensional coordinates of the data points one can plot the optimal partitions directly:
We currently support $k$-Nearest Neighbor (kNN) and Continuous $k$-Nearest Neighbor (CkNN) [10] graph constructions (specified by `graph_method`) augmented with the minimum spanning tree to guarentee connectivity, where `k` refers to the number of neighbours considered in the construction. See documentation for a list of all parameters. All functionalities of PyGenStability including plotting and scale selection are also available for data clustering. For example, given two-dimensional coordinates of the data points one can plot the optimal partitions directly:

```python
# plot robust partitions
Expand All @@ -153,11 +153,13 @@ Please cite our paper if you use this code in your own work:
```
@article{pygenstability,
author = {Arnaudon, Alexis and Schindler, Dominik J. and Peach, Robert L. and Gosztolai, Adam and Hodges, Maxwell and Schaub, Michael T. and Barahona, Mauricio},
title = {PyGenStability: Multiscale community detection with generalized Markov Stability},
publisher = {arXiv},
year = {2023},
doi = {10.48550/ARXIV.2303.05385},
url = {https://arxiv.org/abs/2303.05385}
title = {Algorithm 1044: PyGenStability, a Multiscale Community Detection Framework with Generalized Markov Stability},
journal = {ACM Trans. Math. Softw.},
volume = {50},
number = {2},
pages = {15:1–15:8}
year = {2024},
doi = {10.1145/3651225}
}
```

Expand Down Expand Up @@ -233,7 +235,7 @@ If you are interested in trying our other packages, see the below list:

[6] D. J. Schindler, J. Clarke, and M. Barahona, 'Multiscale Mobility Patterns and the Restriction of Human Movement', *Royal Society Open Science*, vol. 10, no. 10, p. 230405, Oct. 2023, doi: 10.1098/rsos.230405.

[7] A. Arnaudon, D. J. Schindler, R. L. Peach, A. Gosztolai, M. Hodges, M. T. Schaub, and M. Barahona, 'PyGenStability: Multiscale community detection with generalized Markov Stability', *arXiv pre-print*, Mar. 2023, doi: 10.48550/arXiv.2303.05385.
[7] A. Arnaudon, D. J. Schindler, R. L. Peach, A. Gosztolai, M. Hodges, M. T. Schaub, and M. Barahona, 'Algorithm 1044: PyGenStability, a Multiscale Community Detection Framework with Generalized Markov Stability', *ACM Trans. Math. Softw.*, vol. 50, no. 2, p. 15:1–15:8, Jun. 2024, doi: 10.1145/3651225.

[8] S. Gomez, P. Jensen, and A. Arenas, 'Analysis of community structure in networks of correlated data'. *Physical Review E*, vol. 80, no. 1, p. 016114, Jul. 2009, doi: 10.1103/PhysRevE.80.016114.

Expand All @@ -248,3 +250,4 @@ This program is free software: you can redistribute it and/or modify it under th
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

100 changes: 74 additions & 26 deletions src/pygenstability/constructors.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,13 @@ def load_constructor(constructor, graph, **kwargs):
"""Load a constructor from its name, or as a custom Constructor class."""
if isinstance(constructor, str):
if graph is None:
raise Exception(f"No graph was provided with a generic constructor {constructor}")
raise Exception(
f"No graph was provided with a generic constructor {constructor}"
)
try:
return getattr(sys.modules[__name__], f"constructor_{constructor}")(graph, **kwargs)
return getattr(sys.modules[__name__], f"constructor_{constructor}")(
graph, **kwargs
)
except AttributeError as exc:
raise Exception(f"Could not load constructor {constructor}") from exc
if not isinstance(constructor, Constructor):
Expand All @@ -53,8 +57,8 @@ def limit(*args, **kwargs):


def _compute_spectral_decomp(matrix):
"""Solve eigenalue problem."""
lambdas, v = la.eig(matrix.toarray())
"""Solve eigenalue problem for symmetric matrix."""
lambdas, v = la.eigh(matrix.toarray())
vinv = la.inv(v.real)
return lambdas.real, v.real, vinv

Expand All @@ -67,7 +71,9 @@ def _check_total_degree(degrees):

def _get_spectral_gap(laplacian):
"""Compute spectral gap."""
spectral_gap = np.round(max(np.real(sp.linalg.eigs(laplacian, which="SM", k=2)[0])), 8)
spectral_gap = np.round(
max(np.real(sp.linalg.eigs(laplacian, which="SM", k=2)[0])), 8
)
L.info("Spectral gap = 10^%s", np.around(np.log10(spectral_gap), 2))
return spectral_gap

Expand All @@ -80,7 +86,9 @@ class Constructor:
to return quality matrix, null model, and possible global shift.
"""

def __init__(self, graph, with_spectral_gap=False, exp_comp_mode="spectral", **kwargs):
def __init__(
self, graph, with_spectral_gap=False, exp_comp_mode="spectral", **kwargs
):
"""The constructor calls the prepare method upon initialisation.

Args:
Expand All @@ -105,7 +113,9 @@ def __init__(self, graph, with_spectral_gap=False, exp_comp_mode="spectral", **k
def _get_exp(self, scale):
"""Compute matrix exponential at a given scale."""
if self.exp_comp_mode == "expm":
exp = sp.linalg.expm(-scale * self.partial_quality_matrix.toarray().astype(_DTYPE))
exp = sp.linalg.expm(
-scale * self.partial_quality_matrix.toarray().astype(_DTYPE)
)
if self.exp_comp_mode == "spectral":
lambdas, v, vinv = self.spectral_decomp
exp = v @ np.diag(np.exp(-scale * lambdas)) @ vinv
Expand Down Expand Up @@ -168,20 +178,28 @@ def _get_data(self, scale):
class constructor_continuous_combinatorial(Constructor):
r"""Constructor for continuous combinatorial Markov Stability.

The quality matrix is:
This implementation follows equation (16) in [1]_. The quality matrix is:

.. math::

F(t) = \Pi\exp(-Lt)
F(t) = \Pi\exp(-tL/<d>)

where :math:`L=D-A` is the combinatorial Laplacian and :math:`\Pi=\mathrm{diag}(\pi)`,
where :math:`<d>=(\boldsymbol{1}^T D \boldsymbol{1})/N` is the average degree,
:math:`L=D-A` is the combinatorial Laplacian and :math:`\Pi=\mathrm{diag}(\pi)`,
with null model :math:`v_1=v_2=\pi=\frac{\boldsymbol{1}}{N}`.

References:
.. [1] Lambiotte, R., Delvenne, J.-C., & Barahona, M. (2019). Random Walks, Markov
Processes and the Multiscale Modular Organization of Complex Networks.
IEEE Trans. Netw. Sci. Eng., 1(2), p. 76-90.
"""

@_limit_numpy
def prepare(self, **kwargs):
"""Prepare the constructor with non-scale dependent computations."""
laplacian, degrees = sp.csgraph.laplacian(self.graph, return_diag=True, normed=False)
laplacian, degrees = sp.csgraph.laplacian(
self.graph, return_diag=True, normed=False
)
_check_total_degree(degrees)
laplacian /= degrees.mean()
pi = np.ones(self.graph.shape[0]) / self.graph.shape[0]
Expand All @@ -208,20 +226,27 @@ def _get_data(self, scale):
class constructor_continuous_normalized(Constructor):
r"""Constructor for continuous normalized Markov Stability.

The quality matrix is:
This implementation follows equation (10) in [1]_. The quality matrix is:

.. math::

F(t) = \Pi\exp(-Lt)
F(t) = \Pi\exp(-tL)

where :math:`L=D^{-1}(D-A)` is the random-walk normalized Laplacian and
:math:`\Pi=\mathrm{diag}(\pi)` with null model :math:`v_1=v_2=\pi=\frac{d}{2M}`.

References:
.. [1] Lambiotte, R., Delvenne, J.-C., & Barahona, M. (2019). Random Walks, Markov
Processes and the Multiscale Modular Organization of Complex Networks.
IEEE Trans. Netw. Sci. Eng., 1(2), p. 76-90.
"""

@_limit_numpy
def prepare(self, **kwargs):
"""Prepare the constructor with non-scale dependent computations."""
laplacian, degrees = sp.csgraph.laplacian(self.graph, return_diag=True, normed=False)
laplacian, degrees = sp.csgraph.laplacian(
self.graph, return_diag=True, normed=False
)
_check_total_degree(degrees)
normed_laplacian = sp.diags(1.0 / degrees).dot(laplacian)

Expand Down Expand Up @@ -250,12 +275,12 @@ def _get_data(self, scale):
class constructor_signed_modularity(Constructor):
"""Constructor of signed modularity.

This implementation is equation (18) of [1]_, where quality is the adjacency matrix and
This implementation is equation (18) of [2]_, where quality is the adjacency matrix and
the null model is the difference between the standard modularity null models based on
positive and negative degree vectors.

References:
.. [1] Gomez, S., Jensen, P., & Arenas, A. (2009). Analysis of community structure in
.. [2] Gomez, S., Jensen, P., & Arenas, A. (2009). Analysis of community structure in
networks of correlated data. Physical Review E, 80(1), 016114.
"""

Expand Down Expand Up @@ -293,16 +318,20 @@ def _get_data(self, scale):
class constructor_signed_combinatorial(Constructor):
r"""Constructor for continuous signed combinatorial Markov Stability.

The quality matrix is:
This implementation follows equation (19) in [3]_. The quality matrix is:

.. math::

F(t) = \exp(-Lt)^T\exp(-Lt)
F(t) = \exp(-tL)^T\exp(-tL)

where :math:`L=D_{\mathrm{abs}}-A` is the signed combinatorial Laplacian,
:math:`D_{\mathrm{abs}}=\mathrm{diag}(d_\mathrm{abs})` the diagonal matrix of absolute node
strengths :math:`d_\mathrm{abs}`, and the associated null model is given by
:math:`v_1=v_2=\boldsymbol{0}`, where :math:`\boldsymbol{0}` is the vector of zeros.

References:
.. [3] Schaub, M., Delvenne, J.-C., Lambiotte, R., & Barahona, M. (2019). Multiscale
dynamical embeddings of complex networks. Physical Review E, 99(6), 062308.
"""

def prepare(self, **kwargs):
Expand Down Expand Up @@ -337,7 +366,8 @@ class constructor_directed(Constructor):
where :math:`I` denotes the identity matrix, :math:`M(\alpha)` is the transition matrix of a
random walk with teleportation and damping factor :math:`0\le \alpha < 1`, and
:math:`\Pi=\mathrm{diag}(\pi)` for the associated null model :math:`v_1=v_2=\pi` given by the
eigenvector solving :math:`\pi M(\alpha) = \pi`, which is related to PageRank.
eigenvector solving :math:`\pi M(\alpha) = \pi`, which is related to PageRank. See [1]_ for
details.

The transition matrix :math:`M(\alpha)` is given by

Expand All @@ -349,12 +379,19 @@ class constructor_directed(Constructor):
where :math:`D` denotes the diagonal matrix of out-degrees with :math:`D_{ii}=1` if the
out-degree :math:`d_i=0` and :math:`a` denotes the vector of dangling nodes, i.e. :math:`a_i=1`
if the out-degree :math:`d_i=0` and :math:`a_i=0` otherwise.

References:
.. [1] Lambiotte, R., Delvenne, J.-C., & Barahona, M. (2019). Random Walks, Markov
Processes and the Multiscale Modular Organization of Complex Networks.
IEEE Trans. Netw. Sci. Eng., 1(2), p. 76-90.
"""

@_limit_numpy
def prepare(self, **kwargs):
"""Prepare the constructor with non-scale dependent computations."""
assert self.exp_comp_mode == "expm"
assert (
self.exp_comp_mode == "expm"
), 'exp_comp_mode="expm" is required for "constructor_directed"'

alpha = kwargs.get("alpha", 0.8)
n_nodes = self.graph.shape[0]
Expand All @@ -366,11 +403,17 @@ def prepare(self, **kwargs):

self.partial_quality_matrix = sp.csr_matrix(
alpha * np.diag(dinv).dot(self.graph.toarray())
+ ((1 - alpha) * np.diag(np.ones(n_nodes)) + np.diag(alpha * (dinv == 0.0))).dot(ones)
+ (
(1 - alpha) * np.diag(np.ones(n_nodes)) + np.diag(alpha * (dinv == 0.0))
).dot(ones)
- np.eye(n_nodes)
)

pi = abs(sp.linalg.eigs(self.partial_quality_matrix.transpose(), which="SM", k=1)[1][:, 0])
pi = abs(
sp.linalg.eigs(self.partial_quality_matrix.transpose(), which="SM", k=1)[1][
:, 0
]
)
pi /= pi.sum()
self.partial_null_model = np.array([pi, pi])

Expand Down Expand Up @@ -424,9 +467,10 @@ def prepare(self, **kwargs):

self.partial_quality_matrix = sp.csr_matrix(
alpha * np.diag(dinv).dot(self.graph.toarray())
+ ((1 - alpha) * np.diag(np.ones(n_nodes)) + np.diag(alpha * (dinv == 0.0))).dot(
ones
)
+ (
(1 - alpha) * np.diag(np.ones(n_nodes))
+ np.diag(alpha * (dinv == 0.0))
).dot(ones)
- np.eye(n_nodes)
)

Expand All @@ -435,7 +479,11 @@ def prepare(self, **kwargs):
sp.diags(dinv).dot(self.graph) - sp.diags(np.ones(n_nodes))
)

pi = abs(sp.linalg.eigs(self.partial_quality_matrix.transpose(), which="SM", k=1)[1][:, 0])
pi = abs(
sp.linalg.eigs(self.partial_quality_matrix.transpose(), which="SM", k=1)[1][
:, 0
]
)
pi /= pi.sum()
self.partial_null_model = np.array([pi, pi])

Expand Down
Loading
Loading