[GSOC] Add decoding module #193
Conversation
Great start and super neat code. There are some design choices here regarding what to implement at what layer of abstraction that raise some questions (see comments in code).
I like the tutorial (cannot really call it an "example" anymore, but mne-connectivity does not have a distinction between examples and tutorials). It explained a lot.
That's the comments of @wmvanvliet and @larsoner addressed!
@wmvanvliet feel free to merge if you're happy!
Please wait, I am in the middle of reviewing now.
NB: I didn't review the tests.
# As you can see, the connectivity profile of the transformed data using filters fit on
# the first 30 epochs is very similar to the connectivity profile when using filters fit
# on the last 30 epochs. This shows that the filters are generalisable, able to extract
# the same components of connectivity which they were trained on from new data.
is it worth pointing out that connectivity estimates outside the 15-20 Hz band are lower when the filters are trained on different data (i.e., better SNR)?
I now see that @wmvanvliet had a similar comment to mine, in a resolved comment below. I do think though that it's worth mentioning the SNR point here too, in addition to the overfitting point made later using the sliding window example.
Sure, I can add this. How about something like:
"As you can see, the connectivity profile of the transformed data using filters fit on the first 30 epochs is very similar to the connectivity profile when using filters fit on the last 30 epochs. This shows that the filters are generalisable, able to extract the same components of connectivity which they were trained on from new data. Additionally, by fitting the filters to only a select frequency band, we can avoid overfitting in frequency bins where no interactions are present (here outside of the 15-20 Hz band) and thereby enhance the overall signal-to-noise ratio of our connectivity estimates."
# We use the latter approach below, fitting the filters to the 15-20 Hz band and using
# the ``"imcoh"`` method in the call to the ``spectral_connectivity_...()`` functions.
# Plotting the results, we see a peak in connectivity at 15-20 Hz.
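(As an aside for readers following along: the quoted text boils down to a call like the one below. This is a minimal, self-contained sketch using noise data; the tutorial's real epochs and seed/target definition are not reproduced here.)

```python
import numpy as np
from mne_connectivity import spectral_connectivity_epochs

# Toy stand-in for the tutorial's (transformed) data: 30 epochs, 4 channels,
# 5 s at 200 Hz. With real data containing a 15-20 Hz interaction, the
# imaginary coherency would peak in that band.
rng = np.random.default_rng(0)
data = rng.standard_normal((30, 4, 1000))
indices = (np.array([0, 1]), np.array([2, 3]))  # seed -> target channel pairs

con = spectral_connectivity_epochs(
    data,
    method="imcoh",   # imaginary part of coherency
    indices=indices,
    sfreq=200.0,
    fmin=5.0,
    fmax=35.0,        # broad range, so a 15-20 Hz peak (if present) stands out
)
# ``con.get_data()`` then gives the imaginary coherency per connection and frequency.
```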
something is confusing me here. The first example faked some data with simulated connectivity in the 15-20Hz band, fit filters from 5-35 Hz, and recovered the fact that connectivity was higher in the 15-20Hz band. But in this example, you load real data and fit filters on the 15-20Hz band, and transform the data using those filters... so IIUC the plot for case 2 basically just shows that the filter worked. Do I have that right?
If so, what I find confusing about it is how similar the plots are styled, as if they're showing basically the same thing on fake vs real data. But case 1 is showing "we can recover actual (simulated) connectivity and we don't see spurious connectivity where it doesn't exist", while case 2 is showing "the spatial filters actually capture connectivity in the band we asked for, and (mostly) suppress connectivity in frequencies outside that band". I'm not totally sure how best to improve the tutorial to overcome this confusion; maybe first you can confirm that I'm actually understanding / interpreting correctly, then we can talk about how to improve it.
case 1 is showing "we can recover actual (simulated) connectivity and we don't see spurious connectivity where it doesn't exist", while case 2 is showing "the spatial filters actually capture connectivity in the band we asked for, and (mostly) suppress connectivity in frequencies outside that band"
So case 1 is just about showing that you can fit filters to one piece of data and apply it to another (not necessarily about showing there is no spurious connectivity), and case 2 is about fitting filters to and transforming the same piece of data (not necessarily that the filters are capturing connectivity in a given band and suppressing it elsewhere). For either/both of these I could have used real/simulated data.
For case 1 I looked at using the fieldtrip_cmc data for this, but found it difficult to identify a connectivity component that was present across multiple parts of the recording duration (which is necessary if you want to fit the filters to one set of epochs and apply them to another). I'm sure I could have explored this more, but in the interest of time I settled on using simulated data where I know there is a consistent connectivity component throughout the whole length of data.
For case 2 I could have used simulated data again, but I always think these tutorials look more convincing with some real data examples, and for this case since we are fitting to and transforming the same data, identifying a connectivity component which is consistent across multiple parts of the recording is not necessary.
The first example faked some data with simulated connectivity in the 15-20Hz band, fit filters from 5-35 Hz, and recovered the fact that connectivity was higher in the 15-20Hz band
For case 1, I start by showing, via the connectivity of the data, that our simulated data indeed involves an interaction at 15-20 Hz. I compute connectivity using the `spectral_connectivity_epochs()` function, where filters are fit to each frequency bin.
I then use the new decomposition class to fit filters to only the 15-20 Hz range, which are trained on the first 30 epochs and applied to the last 30 epochs. Alongside this I show the results of the `spectral_connectivity_epochs()` function for the last 30 epochs (i.e. fitting and transforming these epochs, with filters fit to each frequency bin).
This is to demonstrate that the filters trained on one piece of data can extract this same connectivity component from a new piece of data. The point is not to show that the filters do not pick up spurious connectivity.
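For concreteness, here is a rough, self-contained sketch of that case-1 workflow (fit filters on one set of epochs, apply them to another). The `CoherencyDecomposition` class name and its arguments follow the decomposition API as eventually released in mne-connectivity and may not match this revision of the PR exactly; the data here are plain noise standing in for the tutorial's simulated interaction.

```python
import mne
import numpy as np
from mne_connectivity import spectral_connectivity_epochs
from mne_connectivity.decoding import CoherencyDecomposition  # name per released API

# Toy stand-in for the tutorial's simulated data: 60 epochs, 10 channels, 5 s at 200 Hz.
rng = np.random.default_rng(44)
sfreq, n_epochs, n_channels, n_times = 200.0, 60, 10, 1000
data = rng.standard_normal((n_epochs, n_channels, n_times))
epochs = mne.EpochsArray(data, mne.create_info(n_channels, sfreq, "eeg"))
seeds, targets = np.arange(0, 5), np.arange(5, 10)

# Fit CaCoh filters to the 15-20 Hz band on the first 30 epochs only...
decomp = CoherencyDecomposition(
    info=epochs.info,
    method="cacoh",
    indices=(seeds, targets),
    fmin=15,
    fmax=20,
)
decomp.fit(epochs.get_data()[:30])

# ...then apply them to the held-out last 30 epochs.
transformed = decomp.transform(epochs.get_data()[30:])

# Connectivity between the single seed and target component, over a broad
# frequency range, to check where the extracted interaction lives.
con = spectral_connectivity_epochs(
    transformed,
    method="imcoh",
    indices=(np.array([0]), np.array([1])),
    sfreq=sfreq,
    fmin=5,
    fmax=35,
)
```

With the tutorial's actual simulated interaction, the imaginary coherency of the transformed data would peak at 15-20 Hz; with this noise stand-in it will of course be flat.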
you load real data and fit filters on the 15-20Hz band, and transform the data using those filters... so IIUC the plot for case 2 basically just shows that the filter worked.
In case 2 I show that you can also use the decomposition class to fit and transform the same piece of data, similar to what is happening in the `spectral_connectivity_...` functions (except with fitting filters to a single band, not each frequency bin).
In the same way as for case 1, I am still only fitting filters to 15-20 Hz using the decomposition class. The first plot, where the results of fitting & transforming the same data using the decomposition class are shown alone, is just there to demonstrate that there is connectivity at 15-20 Hz in the data.
Then I compute connectivity using the `spectral_connectivity_epochs()` function (where filters are fit to each frequency bin) to show that fitting and transforming the same piece of data using the decomposition class gives a near-identical result to the `spectral_connectivity_...` approach, at least for the 15-20 Hz range where the decomposition class fit the filters.
The point is to demonstrate that fitting to and transforming the same piece of data with the decomposition class is a valid use-case. The point is not that connectivity outside the given frequency band is suppressed (it's not actively suppressed, the filters just don't care about optimising this connectivity). It does show that "the spatial filters actually capture connectivity in the band we asked for", but this is also shown in case 1.
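A corresponding self-contained sketch of that case-2 usage (again with noise data, and again with the `CoherencyDecomposition` name and arguments assumed from the released API rather than taken from this revision):

```python
import mne
import numpy as np
from mne_connectivity import spectral_connectivity_epochs
from mne_connectivity.decoding import CoherencyDecomposition  # name per released API

# Same kind of toy setup as before: 60 noise epochs, 10 channels, 5 s at 200 Hz.
rng = np.random.default_rng(44)
data = rng.standard_normal((60, 10, 1000))
epochs = mne.EpochsArray(data, mne.create_info(10, 200.0, "eeg"))
seeds, targets = np.arange(0, 5), np.arange(5, 10)

# Fit to and transform the *same* epochs, restricted to the 15-20 Hz band.
decomp = CoherencyDecomposition(
    info=epochs.info, method="cacoh", indices=(seeds, targets), fmin=15, fmax=20
)
decomp.fit(epochs.get_data())
transformed = decomp.transform(epochs.get_data())

# Reference result: the multivariate spectral_connectivity_epochs() approach,
# where filters are instead fit to each frequency bin of the original channels.
con_ref = spectral_connectivity_epochs(
    epochs, method="cacoh", indices=([seeds], [targets]), fmin=15, fmax=20
)
```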
Does this make it any clearer about what each example is trying to show? These are the key points:
- case 1 is just about showing that you can fit filters to one piece of data and apply it to another
- case 2 is about fitting filters to and transforming the same piece of data
Maybe having a final summary after case 1 & 2 helps get this across.
This is to demonstrate that the filters trained on one piece of data can extract this same connectivity component from a new piece of data. The point is not to show that the filters do not pick up spurious connectivity.
But perhaps the tutorial can be changed to include the point that the filters do not pick up spurious connectivity. This is an important argument in support of the statement "can extract the same connectivity component".
At any rate, I feel this is something we want users to think about when they make decisions regarding what parts of the signal to fit the filters on and what parts to apply the filters to. The tutorial can help set them on the right path.
Thanks @drammock for the detailed feedback! I have addressed everything except for the bigger points about the example's clarity. Will try to work on those now.
I always think these tutorials look more convincing with some real data examples
I generally agree, as long as the tutorial says things like "from the literature, we expect to find XYZ [cite,cite] reflecting the brain doing ABC" (so, e.g., "connectivity in beta band reflecting a cortico-cerebellar loop for motor planning" or whatever), and there's a dataset available that shows the effect.
case 1 is just about showing that you can fit filters to one piece of data and apply it to another (not necessarily about showing there is no spurious connectivity), and case 2 is about fitting filters to and transforming the same piece of data
I think if the same data were used for both of these, the tutorial could be easier to follow. Something like:
- simulate data
- fit the filters to first half of epochs
- apply to same epochs & visualize result (& compare with `spect_conn_epo` function)
- now apply to held-out epochs & visualize result
The narrative could be something like "Case 1: the new Decomp Class can do what the existing function did (fit/transform to same data). Case 2: the new Decomp Class can also be used in this other way (fit/transform on separate train/test data)."
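In code, that proposed ordering might look roughly like the skeleton below, reusing the toy `epochs`, `seeds`, and `targets` and the same assumed `CoherencyDecomposition` API as in the sketches above (plotting and the `spectral_connectivity_epochs()` comparison omitted):

```python
# Skeleton only; ``epochs``, ``seeds`` and ``targets`` are the toy objects
# defined in the earlier sketches, and the class/arguments are assumptions.
decomp = CoherencyDecomposition(
    info=epochs.info, method="cacoh", indices=(seeds, targets), fmin=15, fmax=20
)

# Case 1 (proposed): fit and transform the same (first half of the) epochs.
first_half = epochs.get_data()[:30]
decomp.fit(first_half)
same_data = decomp.transform(first_half)

# Case 2 (proposed): apply the already-fitted filters to the held-out half.
held_out = decomp.transform(epochs.get_data()[30:])
```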
it's not actively suppressed, the filters just don't care about optimising this connectivity
Fair enough, "suppressed" was not a good word choice on my part. I think it's relevant to discuss the out-of-band frequencies though. E.g., you could say things like "here we don't see much out-of-band connectivity, but in theory a filter trained on 15-25Hz might also pick up connectivity in other bands, if the spatial pattern of the connectivity in those other bands happened to coincide with the spatial pattern that maximizes our band of interest."
Okay, then I will switch to simulated data for both.
Sure, and that also aligns with @wmvanvliet's comment. I will add a section on this as well.
Have restructured the example/tutorial based on people's comments. Thanks again @drammock & @wmvanvliet! For the point about how activity outside of the fitting freq. range is handled, I just added text explaining this (see `mne-connectivity/examples/decoding/cohy_decomposition.py`, lines 493 to 523 at 29f2b53). Could also add some examples showing this, but I would only do that if people think it is necessary.
Really nice improvements. Just a few minor nitpicks / wordsmithings. (also it looks like you have a merge conflict to resolve)
Thanks again @drammock! Incorporated all your suggestions and fixed the merge conflict.
Thanks heaps @tsbinns!
As described in #182.
This adds support for a base class with `fit` and `transform` methods, as well as daughter classes for the CaCoh and MIC methods. The code takes advantage of the MNE-Python classes for computing the CSD and the existing MNE-Connectivity estimator classes for computing the filters and patterns.

As of now there are no unit tests. These will be added.
As of now there are no examples/tutorials. These will be added.
As of now there are no plotting methods. These will come in a separate PR.
As of now there is only support for a single component. This is described more in #183 and will come in a separate PR.
To slightly reduce the size of this PR's diff, I opened a separate one (#192) which adds support for storing the filters used for CaCoh and MIC.
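For context, a rough sketch of the kind of usage this enables. The single `CoherencyDecomposition` class and its arguments shown here follow the API as eventually released in mne-connectivity (this revision instead describes a base class with CaCoh/MIC daughter classes), so treat the names as illustrative rather than definitive:

```python
import mne
import numpy as np
from mne_connectivity.decoding import CoherencyDecomposition  # name per released API

# Dummy data: 40 epochs, 8 channels, 2 s at 250 Hz.
rng = np.random.default_rng(193)
data = rng.standard_normal((40, 8, 500))
info = mne.create_info(8, 250.0, "eeg")
epochs = mne.EpochsArray(data, info)

# MIC filters fit to a single frequency band between two channel groups.
decomp = CoherencyDecomposition(
    info=info,
    method="mic",
    indices=(np.arange(0, 4), np.arange(4, 8)),
    fmin=15,
    fmax=20,
)
decomp.fit(epochs.get_data())                      # estimates the filters (and patterns)
components = decomp.transform(epochs.get_data())   # one seed + one target component per epoch
```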