Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GSOC] Add decoding module #193

Merged
merged 43 commits into from
Jun 19, 2024
Merged

[GSOC] Add decoding module #193

merged 43 commits into from
Jun 19, 2024

Conversation

tsbinns
Copy link
Collaborator

@tsbinns tsbinns commented May 29, 2024

As described in #182.

This adds support for a base class with fit and transform methods, as well as daughter classes for the CaCoh and MIC methods. The code takes advantage of the MNE-Python classes for computing the CSD and the existing MNE-Connectivity estimator classes for computing the filters and patterns.

As of now there are no unit tests. These will be added.

As of now there are no examples/tutorials. These will be added.

As of now there are no plotting methods. These will come in a separate PR.

As of now there is only support for a single component. This is described more in #183 and will come in a separate PR.

To slightly reduce the size of this PR's diff, I opened a separate one (#192) which adds support for storing the filters used for CaCoh and MIC.

@tsbinns tsbinns closed this May 29, 2024
@tsbinns tsbinns reopened this May 29, 2024
Copy link
Collaborator

@wmvanvliet wmvanvliet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start and super neat code. There are some design choices here regarding what to implement at what layer of abstraction that raise some questions (see comments in code).

mne_connectivity/decoding/decomposition.py Outdated Show resolved Hide resolved
mne_connectivity/decoding/decomposition.py Outdated Show resolved Hide resolved
mne_connectivity/decoding/decomposition.py Show resolved Hide resolved
mne_connectivity/spectral/epochs_multivariate.py Outdated Show resolved Hide resolved
mne_connectivity/decoding/decomposition.py Show resolved Hide resolved
@wmvanvliet
Copy link
Collaborator

I like the tutorial (cannot really call it an "example" anymore, but mne-connectivity does not have a distinction between examples and tutorials). It explained a lot.

@tsbinns
Copy link
Collaborator Author

tsbinns commented Jun 13, 2024

That's comments of @wmvanvliet and @larsoner addressed!

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wmvanvliet feel free to merge if you're happy!

@drammock
Copy link
Member

please wait, I am in the middle of reviewing now

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: I didn't review the tests.

mne_connectivity/spectral/epochs_multivariate.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs_multivariate.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs_multivariate.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs_multivariate.py Outdated Show resolved Hide resolved
mne_connectivity/decoding/decomposition.py Outdated Show resolved Hide resolved
mne_connectivity/decoding/decomposition.py Outdated Show resolved Hide resolved
mne_connectivity/decoding/decomposition.py Outdated Show resolved Hide resolved
Comment on lines 205 to 208
# As you can see, the connectivity profile of the transformed data using filters fit on
# the first 30 epochs is very similar to the connectivity profile when using filters fit
# on the last 30 epochs. This shows that the filters are generalisable, able to extract
# the same components of connectivity which they were trained on from new data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth pointing out that connectivity estimates outside the 15-20 Hz band are lower when the filters are trained on different data (i.e., better SNR)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see that @wmvanvliet had a similar comment to mine, in a resolved comment below. I do think though that it's worth mentioning the SNR point here too, in addition to the overfitting point made later using the sliding window example.

Copy link
Collaborator Author

@tsbinns tsbinns Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can add this. How about something like:

"As you can see, the connectivity profile of the transformed data using filters fit on the first 30 epochs is very similar to the connectivity profile when using filters fit on the last 30 epochs. This shows that the filters are generalisable, able to extract the same components of connectivity which they were trained on from new data. Additionally, by fitting the filters to only a select frequency band, we can avoid overfitting in frequency bins where no interactions are present (here outside of the 15-20 Hz band) and thereby enhance the overall signal-to-noise ratio of our connectivity estimates."

examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
Comment on lines 419 to 421
# We use the latter approach below, fitting the filters to the 15-20 Hz band and using
# the ``"imcoh"`` method in the call to the ``spectral_connectivity_...()`` functions.
# Plotting the results, we see a peak in connectivity at 15-20 Hz.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something is confusing me here. The first example faked some data with simulated connectivity in the 15-20Hz band, fit filters from 5-35 Hz, and recovered the fact that connectivity was higher in the 15-20Hz band. But in this example, you load real data and fit filters on the 15-20Hz band, and transform the data using those filters... so IIUC the plot for case 2 basically just shows that the filter worked. Do I have that right?

If so, what I find confusing about it is how similar the plots are styled, as if they're showing basically the same thing on fake vs real data. But case 1 is showing "we can recover actual (simulated) connectivity and we don't see spurious connectivity where it doesn't exist", while case 2 is showing "the spatial filters actually capture connectivity in the band we asked for, and (mostly) suppress connectivity in frequencies outside that band". I'm not totally sure how best to improve the tutorial to overcome this confusion; maybe first you can confirm that I'm actually understanding / interpreting correctly, then we can talk about how to improve it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

case 1 is showing "we can recover actual (simulated) connectivity and we don't see spurious connectivity where it doesn't exist", while case 2 is showing "the spatial filters actually capture connectivity in the band we asked for, and (mostly) suppress connectivity in frequencies outside that band"

So case 1 is just about showing that you can fit filters to one piece of data and apply it to another (not necessarily about showing there is no spurious connectivity), and case 2 is about fitting filters to and transforming the same piece of data (not necessarily that the filters are capturing connectivity in a given band and suppressing it elsewhere). For either/both of these I could have used real/simulated data.

For case 1 I looked at using the fieldtrip_cmc data for this, but found it difficult to identify a connectivity component that was present across multiple parts of the recording duration (which is necessary if you want to fit the filters to one set of epochs and apply them to another). I'm sure I could have explored this more, but in the interest of time I settled on using simulated data where I know there is a consistent connectivity component throughout the whole length of data.

For case 2 I could have used simulated data again, but I always think these tutorials look more convincing with some real data examples, and for this case since we are fitting to and transforming the same data, identifying a connectivity component which is consistent across multiple parts of the recording is not necessary.


The first example faked some data with simulated connectivity in the 15-20Hz band, fit filters from 5-35 Hz, and recovered the fact that connectivity was higher in the 15-20Hz band

For case 1, I start by showing the connectivity of the data that our simulated data indeed involves an interaction at 15-20 Hz. I compute connectivity using the spectral_connectivity_epochs() function where filters are fit to each frequency bin.

I then use the new decomposition class to fit filters to only the 15-20 Hz range which are trained on the first 30 epochs and applied to the last 30 epochs. Alongside this I show the results of the spectral_connectivity_epochs() function for the last 30 epochs (i.e. fitting and transforming these epochs, with filters fit to each frequency bin).

This is to demonstrate that the filters trained on one piece of data can extract this same connectivity component from a new piece of data. The point is not to show that the filters do not pick up spurious connectivity.


you load real data and fit filters on the 15-20Hz band, and transform the data using those filters... so IIUC the plot for case 2 basically just shows that the filter worked.

In case 2 I show that you can also use the decomposition class to fit and transform the same piece of data, similar to what is happening in the spectral_connectivity_... functions (except with fitting filters to a single band, not each frequency bin).

In the same way as for case 1 I am still only fitting filters to 15-20 Hz using the decomposition class. The first plot where the results of fitting & transforming the same data using the decomposition classes is shown alone is just to demonstrate that there is connectivity at 15-20 Hz in the data.

Then I compute connectivity using the spectral_connectivity_epochs() function (where filters are fit to each frequency bin) to show that fitting and transforming the same piece of data using the decomposition class gives a near-identical result to the spectral_connectivity_... approach, at least for the 15-20 Hz range where the decomposition class fit the filters.

The point is to demonstrate that fitting to and transforming the same piece of data with the decomposition class is a valid use-case. The point is not that connectivity outside the given frequency band is suppressed (it's not actively suppressed, the filters just don't care about optimising this connectivity). It does show that "the spatial filters actually capture connectivity in the band we asked for", but this is also shown in case 1.


Does this make it any clearer about what each example is trying to show? These are the key points:

  • case 1 is just about showing that you can fit filters to one piece of data and apply it to another
  • case 2 is about fitting filters to and transforming the same piece of data

Maybe having a final summary after case 1 & 2 helps get this across.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to demonstrate that the filters trained on one piece of data can extract this same connectivity component from a new piece of data. The point is not to show that the filters do not pick up spurious connectivity.

But perhaps the tutorial can be changed to include the point that the filters do not pick up spurious connectivity. This is an important argument in support of the statement "can extract the same connectivity component".

At any rate, I feel this is something we want users to think about when they make decisions regarding what parts of the signal to fit the filters on and what parts to apply the filters to. The tutorial can help set them on the right path.

@tsbinns
Copy link
Collaborator Author

tsbinns commented Jun 14, 2024

Thanks @drammock for the detailed feedback!

I have addressed everything except for the bigger points about the example's clarity. Will try to work on those now.

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always think these tutorials look more convincing with some real data examples

I generally agree, as long as the tutorial says things like "from the literature, we expect to find XYZ [cite,cite] reflecting the brain doing ABC" (so, e.g., "connectivity in beta band reflecting a cortico-cerebellar loop for motor planning" or whatever), and there's a dataset available that shows the effect.

case 1 is just about showing that you can fit filters to one piece of data and apply it to another (not necessarily about showing there is no spurious connectivity), and case 2 is about fitting filters to and transforming the same piece of data

I think if the same data were used for both of these, the tutorial could be easier to follow. Something like:

  1. simulate data
  2. fit the filters to first half of epochs
  3. apply to same epochs & visualize result (& compare with spect_conn_epo function)
  4. now apply to held-out epochs & visualize result

The narrative could be something like "Case 1: the new Decomp Class can do what the existing function did (fit/transform to same data). Case 2: the new Decomp Class can also be used in this other way (fit/transform on separate train/test data)."

it's not actively suppressed, the filters just don't care about optimising this connectivity

Fair enough, "suppressed" was not a good word choice on my part. I think it's relevant to discuss the out-of-band frequencies though. E.g., you could say things like "here we don't see much out-of-band connectivity, but in theory a filter trained on 15-25Hz might also pick up connectivity in other bands, if the spatial pattern of the connectivity in those other bands happened to coincide with the spatial pattern that maximizes our band of interest."

examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
@tsbinns
Copy link
Collaborator Author

tsbinns commented Jun 18, 2024

@drammock

I think if the same data were used for both of these, the tutorial could be easier to follow.

Okay, then I will switch to simulated data for both.

I think it's relevant to discuss the out-of-band frequencies though.

Sure, and that also align's with @wmvanvliet's comment. I will add a section on this as well.

@tsbinns
Copy link
Collaborator Author

tsbinns commented Jun 18, 2024

Have restructured the example/tutorial based on people's comment. Thanks again @drammock & @wmvanvliet!

For the point about how activity outside of the fitting freq. range is handled, I just added text explaining this. Could also add some examples showing this, but I would only do that if people think it is necessary.

########################################################################################
# Component specificity of filters
# --------------------------------
# We have spoken much about how the filters extract particular components of
# connectivity, which we elaborate on here. The filters act as spatial weights,
# controlling how much each channel contributes to the given connectivity component.
# Although we fit these filters to a specific frequency band, they do not operate in a
# frequency-specific manner.
#
# For example, say you have two sets of data: *Data 1* with an interaction at 15-20 Hz;
# and *Data 2* with an interaction at 5-10 Hz. We fit the filters at 15-20 Hz to *Data
# 1*, and apply the filters to *Data 2*.
#
# If the connectivity components in *Data 1* and *Data 2* have different spatial
# distributions (i.e. different channels contribute to connectivity in each set of
# data), the filters fit to 15-20 Hz on *Data 1* will not extract the 5-10 Hz
# connectivity from *Data 2*.
#
# On the other hand, if the connectivity components in *Data 1* and *Data 2* have the
# same spatial distribution (i.e. the same channels contribute to connectivity in both
# sets of data), the filters fit to 15-20 Hz on *Data 1* will extract the 5-10 Hz
# connectivity from *Data 2*. Because of this, it is generally recommended that you only
# consider the connectivity results for those frequencies where you originally fit the
# filters.
#
# Furthermore, if *Data 1* and *Data 2* both have interactions at the same frequency
# band (e.g. 15-20 Hz) but with different spatial distributions, the filters fit to
# 15-20 Hz on *Data 1* will not extract the 15-20 Hz connectivity from *Data 2*. This is
# because the filters extract connectivity components according to particular spatial
# distributions, and if the spatial distributions differ, these interactions are by
# definition distinct components, even if they occur at the same frequencies.

@tsbinns tsbinns changed the title [GSOC] [WIP] Add decoding module [GSOC] Add decoding module Jun 18, 2024
Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice improvements. Just a few minor nitpicks / wordsmithings. (also it looks like you have a merge conflict to resolve)

examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
examples/decoding/cohy_decomposition.py Outdated Show resolved Hide resolved
@tsbinns
Copy link
Collaborator Author

tsbinns commented Jun 19, 2024

Thanks again @drammock! Incorporated all your suggestions and fixed the merge conflict.

@tsbinns tsbinns merged commit 57a1271 into mne-tools:main Jun 19, 2024
12 checks passed
@wmvanvliet
Copy link
Collaborator

Thanks heaps @tsbinns !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants