Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] [BUG] [ENH] [WIP] Bug fixes and enhancements for time-resolved spectral connectivity estimation #104

Merged
merged 51 commits into from
Nov 21, 2022

Conversation

ruuskas
Copy link
Contributor

@ruuskas ruuskas commented Aug 31, 2022

PR Description

This PR closes #90, closes #84, and closes #17. Issues discussed in #70 and #73 are addressed. Please see the commit messages for description of the changes. Unfortunately, many fixes and enhancements are bundled together, as it's impossible to separate them without breaking functionality.

In summary,

  • API is streamlined with mne_connectivity.spectral_connectivity_epochs.
  • The bug mentioned in [BUG] Enable averaging connectivity over tapers instead of averaging PSD over tapers #84 is fixed and connectivity is now averaged over tapers, instead of the power spectra.
  • The issue with too high PLV values mentioned in [DOC] Example for spectral connectivity methods comparing epochs and time #73 is fixed, time-resolved PLV is now correctly averaged over time points before taking the absolute value.
  • An array of connectivity methods is now supported and an array of corresponding connectivity results is returned.
  • The block_size parameter now corresponds to actual block size instead of number of blocks.
  • wPLI and PLI connectivity metrics are added.
  • Connectivity can be averaged over Epochs and frequencies
  • Frequencies for multitapering are obtained automatically similarly to spectral_connectivity_epochs.

The following ZIP archive contains a comparison between the current release version, the version of this PR, and the HyPyP implementation of spectral connectivity (mne_connectivity_testing_random_data.html). This shows that the issue with too high connectivity scores with random data is now fixed. The archive also contains a comparison with HyPyP using MNE-Python example data.

examples.zip

I marked this as WIP as I still have some concerns. Namely,

  • What is the purpose of spectral smoothing in connectivity estimation, is it necessary and is there some good reference material?
  • Are the computed connectivity values correct, how to verify this?
  • The implementation is extremely slow compared to HyPyP, even with parallel computation enabled. This is due to the way connectivity scores are computed between individual signals in a for-loop. I'm considering vectorizing the computation and using numpy.einsum as is done in HyPyP. With the current implementation, computing source space connectivity scores for the example dataset takes about 20 minutes per connectivity metric when the aparc_sub parcellation (~400 parcels) is used, whereas HyPyP takes less than two minutes for the same task on my local hardware (please see the attached example for details).
  • The regression test fails due to assertion error. However, this is explained by the (apparently) erroneous code in the original.
  • Should there be more logging?

Merge checklist

Maintainer, please confirm the following before merging:

  • All comments resolved
  • This is not your own PR
  • All CIs are happy
  • PR title starts with [MRG]
  • whats_new.rst is updated
  • PR description includes phrase "closes <#issue-number>"

…, then average

FIX: add working support for computation of multiple connectivity metrics at once, as indicated by existing docstring

FIX: correct calculation of PLV and coherence connectivity metrics

FIX: block_size parameter now actually corresponds to the size of blocks instead of number of blocks

ENH: add PLI and wPLI connectivity metrics

ENH: improve docstring and typechecks in code

ENH: streamline the public API with mne_connectivity.spectral_connectivity_epochs

ENH: enable averaging connectivity results over frequencies and epochs
Copy link
Member

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @ruuskas! Excuse me for asking possibly dumb questions, but I'm not as familiar with the time-resolved connectivity as epochs.

My understanding is that spectral_connectivity_epochs is that each Epoch is seen as a "sample" of connectivity, and then the difference is that the spectral connectivity either preserves the time-dimension, or not by estimating connectivity over time.

How does this differ from your proposal for spectral_connectivity_time? Is there a good reference that explains the differences? I suspect it would be nice to document this for users and our own sake to keep things straight in our head.

Copy link
Member

@britta-wstnr britta-wstnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your PR! I did not make it all the way through right now, but left a few comments and questions. My biggest question is: is this function trying to do too much? Can we split it in two functions, e.g. based on the different underlying methods used (Morlet wavelets, multitapers etc)? If I get it correctly, some of the parameters rule each other out right now?
I wonder if the amount of parameters will be overwhelming for the user. What do you think, @adam2392 ?

mne_connectivity/spectral/time.py Outdated Show resolved Hide resolved
Comment on lines 88 to 90
cwt_freqs : array
Array of frequencies of interest for time-frequency decomposition.
Only used in 'cwt_morlet' mode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to fmin and fmax if this is supplied?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_compute_freq_mask from mne_connectivity.spectral.epochs.py is used, and hence the frequencies outside the range specified by fmin and fmax are not used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, but this behavior is not clear from the documentation.Hence why I wondered if this should be two different functions to avoid such confusion,.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it might be a good idea to break this up into two wrapper functions and make the existing function private. How would you go about naming such new functions? If this is done, then I believe spectral_connectivity_epochs should also be broken up, which comes with the cost of breaking the code of everyone currently using it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh fair point. Alternatively we could introduce a methods parameter and then be very clear in the documentation of which other parameters belong to which method. And throw clear error messages if "illegal" combinations are used. Thoughts, @adam2392, and maybe @larsoner ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in other words, I'm somewhat in favor of keeping this as one function, if the only reason for splitting it is that you want to expose a few non-overlapping params from each of the two methods)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely one option. I don't have a strong opinion here, what about @adam2392, @britta-wstnr ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay for changing then to method_kwargs to keep in line with mne-Python!

Copy link
Contributor Author

@ruuskas ruuskas Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at going for a **method_kw argument like in the case of psd above. However, I think this only adds confusion, as some of the method parameters are required, not optional as you would expect from additional keyword arguments. Some of the method parameters are also shared, and both shared and non-shared parameters are also used in the spectral_connectivity_time function itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah in that case 2 separate functions is probably the right choice then

mne_connectivity/spectral/time.py Show resolved Hide resolved
ruuskas and others added 3 commits September 1, 2022 10:25
Co-authored-by: Adam Li <adam2392@gmail.com>
Co-authored-by: Adam Li <adam2392@gmail.com>
@ruuskas
Copy link
Contributor Author

ruuskas commented Sep 1, 2022

Reply to @adam2392

My understanding is that spectral_connectivity_epochs is that each Epoch is seen as a "sample" of connectivity, and then the difference is that the spectral connectivity either preserves the time-dimension, or not by estimating connectivity over time.

How does this differ from your proposal for spectral_connectivity_time? Is there a good reference that explains the differences? I suspect it would be nice to document this for users and our own sake to keep things straight in our head.

I apologize if this answer is too basic, but here's how I understand the difference between time-resolved connectivity and connectivity evaluated across trials.

My understanding is that when connectivity is computed over epochs (or trials), each epoch is seen as an independent realization of the same event, and the connectivity scores are computed as phase differences between the different trials at each time point. That is, the difference in phase is evaluated across trials. Therefore, the connectivity score measures how consistent the phase is across trials, and this type of connectivity estimation is suitable for evoked responses.

When computing time-resolved connectivity, the difference in phase is evaluated over time within each epoch. Hence, the connectivity score measures the consistency of the phase difference over time within an epoch. The scores for each epoch can then be averaged, or we could compute the standard deviation to assess dynamic connectivity. This type of connectivity estimation is useful for resting state measurements.

Equations 1 and 2 in Bruna et al., 2018 show the difference between connectivity over trials and time-resolved connectivity for the phase locking value. Mike X Cohen also covers the topic in this short YouTube video.

ruuskas and others added 2 commits September 1, 2022 11:13
…putation may be None

Co-authored-by: Britta Westner <britta.wstnr@gmail.com>
Co-authored-by: Britta Westner <britta.wstnr@gmail.com>
@adam2392
Copy link
Member

adam2392 commented Sep 1, 2022

Thank you for your PR! I did not make it all the way through right now, but left a few comments and questions. My biggest question is: is this function trying to do too much? Can we split it in two functions, e.g. based on the different underlying methods used (Morlet wavelets, multitapers etc)? If I get it correctly, some of the parameters rule each other out right now?
I wonder if the amount of parameters will be overwhelming for the user. What do you think, @adam2392 ?

Makes sense to break it up. Imo I ALWAYS get confused if I don't look at the function for awhile, so I think for a user, it must be worse. I think tho refactoring the code could be a downstream PR, where we possibly even introduce a deprecation cycle?

For now, based on #104 (comment), we can move forward by just having sensible error checks/messages? How does that sound? @ruuskas and @britta-wstnr

@ruuskas
Copy link
Contributor Author

ruuskas commented Sep 2, 2022

Quote from @adam2392

Makes sense to break it up. Imo I ALWAYS get confused if I don't look at the function for awhile, so I think for a user, it must be worse. I think tho refactoring the code could be a downstream PR, where we possibly even introduce a deprecation cycle?

For now, based on #104 (comment), we can move forward by just having sensible error checks/messages? How does that sound? @ruuskas and @britta-wstnr

I will add some error checks to the main spectral_connectivity_time function, and we'll see if it adds too much clutter/should be moved to a separate helper function.

@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 9, 2022 via email

@adam2392
Copy link
Member

adam2392 commented Nov 9, 2022

Both this and the previous implementation are incorrect as far as my understanding goes. One could argue that the new implementation is more incorrect based on the tests I posted in #84.

Ahh... I see. So perhaps, let's figure out what's going on via #84 before this PR moves forward? You can always keep this PR open and open up a larger one whenever we converge in the GH discussion. WDYT?

@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 10, 2022

Ahh... I see. So perhaps, let's figure out what's going on via #84 before this PR moves forward? You can always keep this PR open and open up a larger one whenever we converge in the GH discussion. WDYT?

Sure. We shall continue the discussion there.

Compute a weighted average of the tapered cross spectra when using
the multitaper mode. Weighting is
derived from the concentration ratios between the DPSS windows.
@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 14, 2022

Hi @adam2392. I made the changes here as well, so the CSD is now correctly aggregated with a weighted average and connectivity is computed thereafter. This closes #84.

@@ -410,11 +411,25 @@ def _spectral_connectivity(data, method, kernel, foi_idx,
data, sfreq, freqs, n_cycles=n_cycles, output='complex',
decim=decim, n_jobs=n_jobs, **kw_cwt)
out = np.expand_dims(out, axis=2) # same dims with multitaper
weights = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments perhaps in the docstring for future developers of what the weights are intended to doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this and tried to make the docstring better in general. For some reason, Sphinx now gives this error

mne-connectivity/mne_connectivity/spectral/time.py:docstring of mne_connectivity.spectral.time.spectral_connectivity_time:59: WARNING: Inline literal start-string without end-string.

I don't see any rogue inline literal start-strings and the HTML looks good. A quick Google search suggests there might be an issue with the configuration (or it's just me).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the changes into the PR and I can take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added them already after posting the comment above.

Comment on lines 420 to 433
if isinstance(n_cycles, (int, float)):
n_cycles = [n_cycles] * len(freqs)
mt_bandwidth = mt_bandwidth if mt_bandwidth else 4
n_tapers = int(np.floor(mt_bandwidth - 1))
weights = np.zeros((n_tapers, len(freqs), out.shape[-1]))
for i, (f, n_c) in enumerate(zip(freqs, n_cycles)):
window_length = np.arange(0., n_c / float(f), 1.0 / sfreq).shape[0]
half_nbw = mt_bandwidth / 2.
n_tapers = int(np.floor(mt_bandwidth - 1))
_, eigvals = dpss_windows(window_length, half_nbw, n_tapers,
sym=False)
weights[:, i, :] = np.sqrt(eigvals[:, np.newaxis])
# weights have shape (n_tapers, n_freqs, n_times)
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larsoner, @drammock, @britta-wstnr how does this look to you?

Made the docstring more stylish, removed unnecessary things and
added better compliance with MNE-Python style guidelines.
The _spectral_connectivity function doesn't need defaults as these are
already spelled out in the main spectral_connectivity_time function
signature.
@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 15, 2022

Hey @ruuskas sorry for the delay. Been swamped. I will try to do a more comprehensive review tonight!

Hi @adam2392. No worries, thanks! In the meanwhile, I've done four of the things I listed in an earlier comment, they're in the fork on separate branches for now.

  • Adding corrected imaginary PLV as a connectivity metric
  • Making the code more efficient by computing all spectral measures with a single run of the pairwise computation part. This should make the computation faster, as the most time-consuming step w[:, w_y] * np.conj(w[:, w_x]) would only be computed once instead of separately for each metric.
  • Adding the possibility of specifying the frequencies of interest instead of frequency bands.
  • (Perhaps) Adding filter-Hilbert as a method of obtaining the analytic signal.
  • (Perhaps) Removing the spectral smoothing, which would allow for a more efficient implementation of PLV (and ciPLV), using the algorithm presented in Bruna et al., 2018. This cannot be done if you want to keep the option for smoothing. I don't know if people use smoothing in general. Based on the tests in #73, smoothing seems to increase the ratio between the PLV of coupled signals and noise. However, those simulations were done with the earlier different implementation of PLV, whose interpretation is more complex.

What do you think of this, do you want any of this to be merged into this module? The first three points are done and have been merged into the main branch in my fork. The 'hilbert' branch contains the fourth one although its otherwise outdated as I started to think that maybe Hilbert is not necessary for me.

I think it would be good to at least follow the suggestion from @britta-wstnr and @drammock and split the function into spectral_connectivity_time_multitaper, spectral_connectivity_time_morlet and potentially spectral_connectivity_time_hilbert.

If any of this is included, would it be best to put everything in separate PRs, or make another big one?

Copy link
Member

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @ruuskas !

We just want to fix the CI stuff and add an entry to whats_new.rst now

@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 17, 2022

Nice work @ruuskas !

We just want to fix the CI stuff and add an entry to whats_new.rst now

Thanks @adam2392 ! I added the whats_new.rst entry though I'm not sure if it's good.

Notably, averaging over time when computing the metrics is strictly speaking not a bugfix, rather it is the behavior one would expect given the co-existing spectral_connectivity_epochs function (i.e., the result is static connectivity over time vs static connectivity over trials).

The time-resolved implementation found in frites.conn.conn_spec introduces a bias to the metrics and is harder to interpret (requires surrogate data which is not practical due to computational constraints). This was discussed in #73. Hence, I would suggest that we merge this PR as it is a more standard-ish way to implement resting-state connectivity analysis and corresponds to the formulations of the metrics in the relevant papers. If time-resolved connectivity is still wanted, it should probably be ported again since frites.conn.conn_spec has changed quite a bit after this function was originally ported over, although the implementation in this PR could also easily be changed to compute time-resolved connectivity. In any case such an approach isn't really suitable for all-to-all connectivity as connectivity matrices would have shape (n_epochs, n_channels, n_channels, n_freqs, n_times).

@larsoner
Copy link
Member

CIs all better

@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 17, 2022

CIs all better

Thanks @larsoner. Is it a bug that :func: and :class: are required for Sphinx?

@larsoner
Copy link
Member

They are not strictly required, it's just more explicit. Without them the default_role kicks in, so if it ever changes, the ones without the :func: etc. would start failing. The real fix was the

``whatever``th

to

``whatever``-th

the non-alphanumeric made it clear that the inline code markup had ended

Copy link
Member

@adam2392 adam2392 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now! Thanks a lot @ruuskas.

As a next step, we can tackle the issues you listed in a single PR, or multiple PRs.

@ruuskas ruuskas changed the title [BUG] [ENH] [WIP] Bug fixes and enhancements for time-resolved spectral connectivity estimation [MRG] [BUG] [ENH] [WIP] Bug fixes and enhancements for time-resolved spectral connectivity estimation Nov 18, 2022
@ruuskas
Copy link
Contributor Author

ruuskas commented Nov 21, 2022

Thanks @adam2392. It will probably be easiest to combine the further updates into one PR again as there have been breaking changes here after they were initially implemented.

@adam2392
Copy link
Member

Okay I am merging this for now. Any issues, feel free to make a GH issue!

@ruuskas feel free to start the next PR! Will review whenever yiu need me to. Thanks for all the great contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants