[GSOC] Add decoding module #193
Conversation
Great start and super neat code. There are some design choices here regarding what to implement at what layer of abstraction that raise some questions (see comments in code).
I like the tutorial (cannot really call it an "example" anymore, but mne-connectivity does not have a distinction between examples and tutorials). It explained a lot.
That's the comments of @wmvanvliet and @larsoner addressed!
@wmvanvliet feel free to merge if you're happy!
Please wait, I am in the middle of reviewing now.
NB: I didn't review the tests.
# As you can see, the connectivity profile of the transformed data using filters fit on
# the first 30 epochs is very similar to the connectivity profile when using filters fit
# on the last 30 epochs. This shows that the filters are generalisable, able to extract
# the same components of connectivity which they were trained on from new data.
is it worth pointing out that connectivity estimates outside the 15-20 Hz band are lower when the filters are trained on different data (i.e., better SNR)?
I now see that @wmvanvliet had a similar comment to mine, in a resolved comment below. I do think though that it's worth mentioning the SNR point here too, in addition to the overfitting point made later using the sliding window example.
Sure, I can add this. How about something like:
"As you can see, the connectivity profile of the transformed data using filters fit on the first 30 epochs is very similar to the connectivity profile when using filters fit on the last 30 epochs. This shows that the filters are generalisable, able to extract the same components of connectivity which they were trained on from new data. Additionally, by fitting the filters to only a select frequency band, we can avoid overfitting in frequency bins where no interactions are present (here outside of the 15-20 Hz band) and thereby enhance the overall signal-to-noise ratio of our connectivity estimates."
# We use the latter approach below, fitting the filters to the 15-20 Hz band and using
# the ``"imcoh"`` method in the call to the ``spectral_connectivity_...()`` functions.
# Plotting the results, we see a peak in connectivity at 15-20 Hz.
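(As an aside for readers following along: the quoted text boils down to a call like the one below. This is a minimal, self-contained sketch using noise data; the tutorial's real epochs and seed/target definition are not reproduced here.)

```python
import numpy as np
from mne_connectivity import spectral_connectivity_epochs

# Toy stand-in for the tutorial's (transformed) data: 30 epochs, 4 channels,
# 5 s at 200 Hz. With real data containing a 15-20 Hz interaction, the
# imaginary coherency would peak in that band.
rng = np.random.default_rng(0)
data = rng.standard_normal((30, 4, 1000))
indices = (np.array([0, 1]), np.array([2, 3]))  # seed -> target channel pairs

con = spectral_connectivity_epochs(
    data,
    method="imcoh",   # imaginary part of coherency
    indices=indices,
    sfreq=200.0,
    fmin=5.0,
    fmax=35.0,        # broad range, so a 15-20 Hz peak (if present) stands out
)
# ``con.get_data()`` then gives the imaginary coherency per connection and frequency.
```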
something is confusing me here. The first example faked some data with simulated connectivity in the 15-20Hz band, fit filters from 5-35 Hz, and recovered the fact that connectivity was higher in the 15-20Hz band. But in this example, you load real data and fit filters on the 15-20Hz band, and transform the data using those filters... so IIUC the plot for case 2 basically just shows that the filter worked. Do I have that right?
If so, what I find confusing about it is how similar the plots are styled, as if they're showing basically the same thing on fake vs real data. But case 1 is showing "we can recover actual (simulated) connectivity and we don't see spurious connectivity where it doesn't exist", while case 2 is showing "the spatial filters actually capture connectivity in the band we asked for, and (mostly) suppress connectivity in frequencies outside that band". I'm not totally sure how best to improve the tutorial to overcome this confusion; maybe first you can confirm that I'm actually understanding / interpreting correctly, then we can talk about how to improve it.
case 1 is showing "we can recover actual (simulated) connectivity and we don't see spurious connectivity where it doesn't exist", while case 2 is showing "the spatial filters actually capture connectivity in the band we asked for, and (mostly) suppress connectivity in frequencies outside that band"
So case 1 is just about showing that you can fit filters to one piece of data and apply it to another (not necessarily about showing there is no spurious connectivity), and case 2 is about fitting filters to and transforming the same piece of data (not necessarily that the filters are capturing connectivity in a given band and suppressing it elsewhere). For either/both of these I could have used real/simulated data.
For case 1 I looked at using the fieldtrip_cmc data for this, but found it difficult to identify a connectivity component that was present across multiple parts of the recording duration (which is necessary if you want to fit the filters to one set of epochs and apply them to another). I'm sure I could have explored this more, but in the interest of time I settled on using simulated data where I know there is a consistent connectivity component throughout the whole length of data.
For case 2 I could have used simulated data again, but I always think these tutorials look more convincing with some real data examples, and for this case since we are fitting to and transforming the same data, identifying a connectivity component which is consistent across multiple parts of the recording is not necessary.
The first example faked some data with simulated connectivity in the 15-20Hz band, fit filters from 5-35 Hz, and recovered the fact that connectivity was higher in the 15-20Hz band
For case 1, I start by showing, via the connectivity of the data, that our simulated data indeed involves an interaction at 15-20 Hz. I compute connectivity using the `spectral_connectivity_epochs()` function, where filters are fit to each frequency bin.
I then use the new decomposition class to fit filters to only the 15-20 Hz range, which are trained on the first 30 epochs and applied to the last 30 epochs. Alongside this I show the results of the `spectral_connectivity_epochs()` function for the last 30 epochs (i.e. fitting and transforming these epochs, with filters fit to each frequency bin).
This is to demonstrate that the filters trained on one piece of data can extract this same connectivity component from a new piece of data. The point is not to show that the filters do not pick up spurious connectivity.
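For concreteness, here is a rough, self-contained sketch of that case-1 workflow (fit filters on one set of epochs, apply them to another). The `CoherencyDecomposition` class name and its arguments follow the decomposition API as eventually released in mne-connectivity and may not match this revision of the PR exactly; the data here are plain noise standing in for the tutorial's simulated interaction.

```python
import mne
import numpy as np
from mne_connectivity import spectral_connectivity_epochs
from mne_connectivity.decoding import CoherencyDecomposition  # name per released API

# Toy stand-in for the tutorial's simulated data: 60 epochs, 10 channels, 5 s at 200 Hz.
rng = np.random.default_rng(44)
sfreq, n_epochs, n_channels, n_times = 200.0, 60, 10, 1000
data = rng.standard_normal((n_epochs, n_channels, n_times))
epochs = mne.EpochsArray(data, mne.create_info(n_channels, sfreq, "eeg"))
seeds, targets = np.arange(0, 5), np.arange(5, 10)

# Fit CaCoh filters to the 15-20 Hz band on the first 30 epochs only...
decomp = CoherencyDecomposition(
    info=epochs.info,
    method="cacoh",
    indices=(seeds, targets),
    fmin=15,
    fmax=20,
)
decomp.fit(epochs.get_data()[:30])

# ...then apply them to the held-out last 30 epochs.
transformed = decomp.transform(epochs.get_data()[30:])

# Connectivity between the single seed and target component, over a broad
# frequency range, to check where the extracted interaction lives.
con = spectral_connectivity_epochs(
    transformed,
    method="imcoh",
    indices=(np.array([0]), np.array([1])),
    sfreq=sfreq,
    fmin=5,
    fmax=35,
)
```

With the tutorial's actual simulated interaction, the imaginary coherency of the transformed data would peak at 15-20 Hz; with this noise stand-in it will of course be flat.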
you load real data and fit filters on the 15-20Hz band, and transform the data using those filters... so IIUC the plot for case 2 basically just shows that the filter worked.
In case 2 I show that you can also use the decomposition class to fit and transform the same piece of data, similar to what is happening in the `spectral_connectivity_...` functions (except with fitting filters to a single band, not each frequency bin).
In the same way as for case 1, I am still only fitting filters to 15-20 Hz using the decomposition class. The first plot, where the results of fitting & transforming the same data using the decomposition class are shown alone, is just there to demonstrate that there is connectivity at 15-20 Hz in the data.
Then I compute connectivity using the `spectral_connectivity_epochs()` function (where filters are fit to each frequency bin) to show that fitting and transforming the same piece of data using the decomposition class gives a near-identical result to the `spectral_connectivity_...` approach, at least for the 15-20 Hz range where the decomposition class fit the filters.
The point is to demonstrate that fitting to and transforming the same piece of data with the decomposition class is a valid use-case. The point is not that connectivity outside the given frequency band is suppressed (it's not actively suppressed, the filters just don't care about optimising this connectivity). It does show that "the spatial filters actually capture connectivity in the band we asked for", but this is also shown in case 1.
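A corresponding self-contained sketch of that case-2 usage (again with noise data, and again with the `CoherencyDecomposition` name and arguments assumed from the released API rather than taken from this revision):

```python
import mne
import numpy as np
from mne_connectivity import spectral_connectivity_epochs
from mne_connectivity.decoding import CoherencyDecomposition  # name per released API

# Same kind of toy setup as before: 60 noise epochs, 10 channels, 5 s at 200 Hz.
rng = np.random.default_rng(44)
data = rng.standard_normal((60, 10, 1000))
epochs = mne.EpochsArray(data, mne.create_info(10, 200.0, "eeg"))
seeds, targets = np.arange(0, 5), np.arange(5, 10)

# Fit to and transform the *same* epochs, restricted to the 15-20 Hz band.
decomp = CoherencyDecomposition(
    info=epochs.info, method="cacoh", indices=(seeds, targets), fmin=15, fmax=20
)
decomp.fit(epochs.get_data())
transformed = decomp.transform(epochs.get_data())

# Reference result: the multivariate spectral_connectivity_epochs() approach,
# where filters are instead fit to each frequency bin of the original channels.
con_ref = spectral_connectivity_epochs(
    epochs, method="cacoh", indices=([seeds], [targets]), fmin=15, fmax=20
)
```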
Does this make it any clearer about what each example is trying to show? These are the key points:
- case 1 is just about showing that you can fit filters to one piece of data and apply it to another
- case 2 is about fitting filters to and transforming the same piece of data
Maybe having a final summary after case 1 & 2 helps get this across.
This is to demonstrate that the filters trained on one piece of data can extract this same connectivity component from a new piece of data. The point is not to show that the filters do not pick up spurious connectivity.
But perhaps the tutorial can be changed to include the point that the filters do not pick up spurious connectivity. This is an important argument in support of the statement "can extract the same connectivity component".
At any rate, I feel this is something we want users to think about when they make decisions regarding what parts of the signal to fit the filters on and what parts to apply the filters to. The tutorial can help set them on the right path.
Thanks @drammock for the detailed feedback! I have addressed everything except for the bigger points about the example's clarity. Will try to work on those now.
I always think these tutorials look more convincing with some real data examples
I generally agree, as long as the tutorial says things like "from the literature, we expect to find XYZ [cite,cite] reflecting the brain doing ABC" (so, e.g., "connectivity in beta band reflecting a cortico-cerebellar loop for motor planning" or whatever), and there's a dataset available that shows the effect.
case 1 is just about showing that you can fit filters to one piece of data and apply it to another (not necessarily about showing there is no spurious connectivity), and case 2 is about fitting filters to and transforming the same piece of data
I think if the same data were used for both of these, the tutorial could be easier to follow. Something like:
- simulate data
- fit the filters to first half of epochs
- apply to same epochs & visualize result (& compare with `spect_conn_epo` function)
- now apply to held-out epochs & visualize result
The narrative could be something like "Case 1: the new Decomp Class can do what the existing function did (fit/transform to same data). Case 2: the new Decomp Class can also be used in this other way (fit/transform on separate train/test data)."
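In code, that proposed ordering might look roughly like the skeleton below, reusing the toy `epochs`, `seeds`, and `targets` and the same assumed `CoherencyDecomposition` API as in the sketches above (plotting and the `spectral_connectivity_epochs()` comparison omitted):

```python
# Skeleton only; ``epochs``, ``seeds`` and ``targets`` are the toy objects
# defined in the earlier sketches, and the class/arguments are assumptions.
decomp = CoherencyDecomposition(
    info=epochs.info, method="cacoh", indices=(seeds, targets), fmin=15, fmax=20
)

# Case 1 (proposed): fit and transform the same (first half of the) epochs.
first_half = epochs.get_data()[:30]
decomp.fit(first_half)
same_data = decomp.transform(first_half)

# Case 2 (proposed): apply the already-fitted filters to the held-out half.
held_out = decomp.transform(epochs.get_data()[30:])
```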
it's not actively suppressed, the filters just don't care about optimising this connectivity
Fair enough, "suppressed" was not a good word choice on my part. I think it's relevant to discuss the out-of-band frequencies though. E.g., you could say things like "here we don't see much out-of-band connectivity, but in theory a filter trained on 15-25Hz might also pick up connectivity in other bands, if the spatial pattern of the connectivity in those other bands happened to coincide with the spatial pattern that maximizes our band of interest."
Okay, then I will switch to simulated data for both.
Sure, and that also aligns with @wmvanvliet's comment. I will add a section on this as well.
Have restructured the example/tutorial based on people's comments. Thanks again @drammock & @wmvanvliet! For the point about how activity outside of the fitting freq. range is handled, I just added text explaining this (see `mne-connectivity/examples/decoding/cohy_decomposition.py`, lines 493 to 523 at 29f2b53). Could also add some examples showing this, but I would only do that if people think it is necessary.
Really nice improvements. Just a few minor nitpicks / wordsmithings. (also it looks like you have a merge conflict to resolve)
Thanks again @drammock! Incorporated all your suggestions and fixed the merge conflict.
Thanks heaps @tsbinns!
As described in #182.
This adds support for a base class with `fit` and `transform` methods, as well as daughter classes for the CaCoh and MIC methods. The code takes advantage of the MNE-Python classes for computing the CSD and the existing MNE-Connectivity estimator classes for computing the filters and patterns.

As of now there are no unit tests. These will be added.
As of now there are no examples/tutorials. These will be added.
As of now there are no plotting methods. These will come in a separate PR.
As of now there is only support for a single component. This is described more in #183 and will come in a separate PR.
To slightly reduce the size of this PR's diff, I opened a separate one (#192) which adds support for storing the filters used for CaCoh and MIC.
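For context, a rough sketch of the kind of usage this enables. The single `CoherencyDecomposition` class and its arguments shown here follow the API as eventually released in mne-connectivity (this revision instead describes a base class with CaCoh/MIC daughter classes), so treat the names as illustrative rather than definitive:

```python
import mne
import numpy as np
from mne_connectivity.decoding import CoherencyDecomposition  # name per released API

# Dummy data: 40 epochs, 8 channels, 2 s at 250 Hz.
rng = np.random.default_rng(193)
data = rng.standard_normal((40, 8, 500))
info = mne.create_info(8, 250.0, "eeg")
epochs = mne.EpochsArray(data, info)

# MIC filters fit to a single frequency band between two channel groups.
decomp = CoherencyDecomposition(
    info=info,
    method="mic",
    indices=(np.arange(0, 4), np.arange(4, 8)),
    fmin=15,
    fmax=20,
)
decomp.fit(epochs.get_data())                      # estimates the filters (and patterns)
components = decomp.transform(epochs.get_data())   # one seed + one target component per epoch
```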