Pluralistic Approach to Decoding Motor Activity from Different Regions of the Rodent Brain

By Pod-96-manymanifolds

Project Contributors:

Narotam Singh (@narotamsingh)
Prakriti Nayak (@PrakritiNayak)
Rishika Mohanta (@neurorishika)
Vaibhav Sharma (@VaibhavSharma08)

Scientific Question

Can neuronal spikes and population activity in different motor implicated regions of the rodent brain predict the motor output and directional motor output accurately?

Introduction

There exists a high degree of correlation and redundancy in the activity of neural populations across different regions of the brain. Motor function is a result of multiple activities coordinating over various regions, governing from the decision to carry the output to the performance of the same. In this project, we explored different approaches to decode activity from motor implicated regions in rodents using spike and population activity data.

Methods

Original Experiment Description

Read about the details of the experiment at https://doi.org/10.1038/s41586-019-1787-x. We are concerned with the motor activity (motion of the wheel) during the duration when the visual stimulus is not correlated with motion of the wheel to avoid confounders.

Data source

The original data is from "Distributed coding of choice, action and engagement across the mouse brain." Steinmetz et al. (2019). The original raw data is available at https://figshare.com/articles/steinmetz/9598406. We use a precleaned version provided for us by Neuromatch Academy available under the Open Science Framework (OSF) at https://osf.io/agvxh (part 1), https://osf.io/uv3mw (part 2), and https://osf.io/ehmw2 (part 3).

We then further cleaned to consider only recordings from motor related areas with more than 50 neurons from atleast 2 mice (figures below). We only considered the the open loop condition ie. data between stimulus onset and go cue to avoid representations of moving stimulus from appearing in the neural data we are analysing.

Fig: A Schematic of the Motor Control System with regions involved and associated regions in Steinmetz et. al. 2019 Dataset

Fig: Summary of the Cleaned Data

A Pluralistic Strategy

Our mentor @JasonRitt made a really inspiring statement which amounted to something like "Sometimes, there is no one best way to do something. Find a reasonably good way to go ahead, and acknowledge the limitations." We took it to heart and decided to take a pluralistic approach to answering our question.

Population Spike Code Approach

We implemented a General Linear Model with Spiking History and L2 regularization pipeline in python 3.6 to decode the motor output (wheel) from the neural spike data from 50 randomly sampled neurons from 100 randomly sampled trials from different (Session, Brain Area) pairs. We implemented this using Cross Validated Ridge Regressor in the scikit-learn package. The length of the temporal kernel ie. the spiking history required for the decoding models to decode optimally can vary from region to region. So we evaluate different kernel sizes between 50 to 250ms and choose the optimal kernel size that maximizes R^2 score for analysis.

Latent Information Approach

We implemented a pipeline to compute latent dimensions of the neural spike data from 50 randomly sampled neurons from 100 randomly sampled trials using the Gaussian Process Factor Analysis (Yu, 2009) implemention in the ELEctroPHysiology ANalysis Toolbox (Elephant package in python 3.6). This was followed by a Ridge Regression to reconstruct motor output (wheel) and Linear Discriminant Analysis (LDA) to classify left motor output, right motor output and no motor output.

Results

Dataset for Left-Right motion is not completely balanced.

Fig: Within trial distribution of motor output (`wheel` velocity) in our duration of interest

We found that our data set is not very balanced in terms of motor putput. For a majority of the trial open loop duration the animal is not moving and is stationary. We have to account for this in our analysis.

Spikes from brain areas roughly predict Motor Output to different degrees

In our analysis of the Spike Trains using our time-resolved L2 regularised Linear Model based decoding model, we evaluated the model R^2 score and prediction-ground truth Pearson's correlation coefficient for 20 80%-20% train-test splits.

Fig: Time-Resolved Linear Regression Goodness-of-Fit Analysis

We find that not all regions of the brain are equally good at predicting the motor output. Certain motor implicated regions can predict motion much better than others such as the primary motor cortex. Interstingly, regions associated with motor feedback such as the somatosensory cortex could also predict motor output effectively, while other regions such as mediodorsal nucleus of thalamus fails to predict it accurate.

Fig: Corelation between Goodness-of-Fit measures

Futher, the two regression measures we used were highly correlated.

Fig: Senstivity of R^2 Score to Session ID

Even within the same regions there is huge variability in the ability to predict the motor output. This is true not only across mice and sessions but also within multiple samples from the same session.

Fig: Optimal Length of Temporal History required for predicting motor output

The optimal temporal history required to predict the motor output highly varied in our constrained range of 10-250 ms suggesting different timescales of motor activity in the brain.

Fig: Sensitivity of length of temporal history to Session ID

The optimal temporal history itself has some sensitivity to Session ID

Amount of Motor Information coded in the brain is different across Brain Regions

Here, we try to quantify the information stored in the coefficients of the Linear Model coefficients. We look at the temporal coefficients for the 50 neurons and quantify the amount of information in the distribution of coefficients. If there is a diverity of kernels, it may suggest that different neurons are responsible for different properties of the motor output. We quantify the information in kernels by considering different metrics for information such as average entropy of coefficient distribution over time, average variability in coefficients over time, and fraction of PCs required to explain 90% variability.

Fig: Metrics for Information Content in Kernels

We found that certain regions have average higher variance and entropy of kernel information. This seems to suggest that there is more diversity of information in certain regions vs others. the VPM while it doesnt do very well at coding for motor output the diversity of the kernel seems to suggest more diversity of information.

Fig: Correlation between metrics for Information Content in Kernels

We find that the entropy measure and variation measure are highly correlated across brain regions but the PCA participation fraction is quite variable and the correlation with other measures is region dependent. This might be becuase the PCA measure is normalised wrt kernel size.

Fig: Sensitivity of metrics for Information Content in Kernels to Session ID

As expected by the virtue of variability of biology and electrode sampling there is some degree of senstitivity to Session ID

Latent Neural modes of different areas don’t separate Motor output Equally

Fig: Measurement for optimal dimensionality for GPFA for 5 randomly sampled regions and session (Fitted with Sigmoid)

Fig: Motor Output mapped on GPFA Representation (Best and Worst Seperation of left vs right vs no motion)

Fig: LDA for Motor Output in 20-dimensional GPFA Reduced Space (Best and Worst Seperation of left vs right vs no motion)

Fig: LDA Based Classification Results adjusted for Label Imbalanced Dataset

References

Steinmetz, Nicholas A., Peter Zatka-Haas, Matteo Carandini, and Kenneth D. Harris. "Distributed coding of choice, action and engagement across the mouse brain." Nature 576, no. 7786 (2019): 266-273.
Shenoy, Krishna V., Maneesh Sahani, and Mark M. Churchland. "Cortical control of arm movements: a dynamical systems perspective." Annual review of neuroscience 36 (2013).
Gallego, Juan A., Matthew G. Perich, Lee E. Miller, and Sara A. Solla. "Neural manifolds for the control of movement." Neuron 94, no. 5 (2017): 978-984.
Byron, M. Yu, John P. Cunningham, Gopal Santhanam, Stephen I. Ryu, Krishna V. Shenoy, and Maneesh Sahani. "Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity." In Advances in neural information processing systems, pp. 1881-1888. 2009.
Yegenoglu, Alper, Detlef Holstein, Long Duc Phan, Michael Denker, Andrew Davison, and Sonja Grün. Elephant–open-source tool for the analysis of electrophysiological data sets. No. FZJ-2015-06042. Computational and Systems Neuroscience, 2015.
Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel et al. "Scikit-learn: Machine learning in Python." the Journal of machine Learning research 12 (2011): 2825-2830.
Waskom, M. "seaborn: statistical data visualization. Python 2.7 and 3.5."

Acknowledgements

The project members express their gratitude to the entire NMA team for making and conducting the Neuromatch Academy 2020. A special thanks to our Mentor Jason Ritt. And last but not the least our TA Shagun Ajmera. We also like to acknowledge the beautiful cartoons made by @zekeart_ which we edited for our illustrations.

This project was done as part of Neuromatch Academy July 13-31 2020 + Project Code and Data Archive under MIT Licence. 2020

Name		Name	Last commit message	Last commit date
Latest commit History 58 Commits
Documentation		Documentation
Latent Information Approach		Latent Information Approach
Population Spike Code Approach		Population Spike Code Approach
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pod-96-manimanifolds.png		pod-96-manimanifolds.png

License

neurorishika/pod-96-manymanifolds

Folders and files

Latest commit

History

Repository files navigation