Non-local means denoising of multi-channel electrophysiology timeseries using PyTorch.
The software is in the alpha stage of development. Testers and contributers are welcome to assist.
Electrophysiology recordings contain spike waveforms superimposed on a noisy background signal. Each detected neuron may fire hundreds or even thousands of times over the duration of a recording, producing a very similar voltage trace at each event. Here we aim to leverage this redundancy in order to denoise the recording in a preprocessing step, prior to spike sorting or other analyses. Our approach is to use non-local means to suppress the noisy signal while retaining the part of the signal that repeats.
Non-local means is a denoising algorithm that has primarily been used for 2d images, but it can also be adapted for use with 1d signals. Let be the multi-channel voltage trace for and where is the number of electrode channels, and is the number of timepoints. If computation time were not a concern, we would ideally denoise each half-overlapping clip (or snippet) by the formula:
where and are -dimensional vectors representing clips with timepoints, is the denoised vector/clip,
is a weighting function, and the summation is over all clips in the entire recording (including all translations). Thus, each of the half-overlapping target clips are replaced by a weighted average of all clips . A back-of-the-envelope calculation reveals that the computational burden is prohibitive for even relatively short recordings of a few minutes, but we will outline below the various strategies we employ to overcome this challenge in our software.
The term appearing in the exponent is the normalized distance between the clips given by
where is a matrix that whitens the clip vectors based on the noise model empirically derived as follows:
where
is the singular-value decomposition of the matrix of all clips and is the rank-K diagonal matrix that picks out the top singular vectors. Here, is chosen to capture a user-specified percentage of the total variance in the recording. The noise level is also empirically estimated from the data to represent the expected distance between two noise clips (i.e., clips without detectable spikes). The software also allows for a scaling factor for adjusting .
As a practical matter, we need to be able to perform the above denoising procedure within a reasonable timeframe relative to other processing steps. Below we discuss several strategies we use to speed up the computation.
The first simplification is to split the recording into discrete time blocks, typically 30 seconds or a minute, and denoise independently in each block. For firing rates greater around 1 Hz or higher this is okay since we only need a few dozen representative events for each neuron. In the future we may provide a way to also probe beyond the boundaries of the discrete blocks, but for now the user must choose a fixed duration for block sizes, with a tradeoff between block duration and computational efficiency.
The time-consuming part of the non-local means formula is the summation over all clips in the time block of size . Fortunately, the weighted average can be viewed as a statistical procedure which may be substantially sped up using strategic subsampling. Because firing events are usually sparse, the majority of clips are close to the backround noise and therefore have a very large number of nearby neighbors such that is close to the maximum of . For these cases it is acceptable to perform vast subsampling, perhaps summing over only a couple hundred randomly-selected clips. On the other extreme, some clips may include spikes that are relatively rare, and thus there will be a small number of source clips that contribute anything substantial to the sum.
While algorithms such as clustering or k-nearest neighbors could be used to more intelligently sample, we would like to avoid these types of methods in this preprocessing step. We view clustering and classification as part of the spike sorting step, and not this denoising operation which seeks only to isolate the signal from the noise.
The procedure we use for adaptive subsampling involves computing the summation in batches and selectively dropping out clips from both the sources () and the targets () between each batch. In the first batch we compute for all 's and a small subset of the 's and seperately accumulate both the numerator and the denominator . We then use the size of the denominator as a criterion for determining which clips to exclude from the subsequent batches, the idea being that a large denominator means that a large number of nearby neighbors have already contributed to the weighted average. The user sets a threshold (e.g., 30) for dropping out clips in this way.
In addition to dropping out target clips from the computation, it is crucial to also drop out source clips. A large denominator for a target clip means that it must have a relatively large number of nearby neighbors. Therefore, the source clips that are overlapping (in time) to such a target clip would presumably also have a large number of neighbors, and in fact all of its nearby neighbors would be expected to have a large number of nearby neighbors. Thus it should be safe to drop out source clips as well based on the denominator criterion for the time-overlapping target clips.
In summary, adaptive subsampling is achieved by computing the weighted sum by accumulating the numerators and denominators in batches, while dropping out both source and target clips based on the denominator threshold criterion.
Here we need to describe how we keep track of numerators and denominators for each target clip, how exactly we apply the denominator dropout criterion, and how to combine the values for time-overlapping clips.
The method of non-local means works well when the signal repeats many times throughout the time block. But when neural events overlap in time the resulting waveform is a superposition that will usually not match any other waveform. Even if the same two neurons fire simultaneously in multiple instances, the time offset between the two events is expected to be variable, thus producing a spectrum of different superpositioned waveforms. Overlapping spike events are thus expected to form isolated clips that have few if any nearby neighbors.
While our method cannot be expected to denoise such events, we can expect the noisy signal of the superimposed waveforms to survive the denoising process. This is because only one source term (the clip itself) is expected to contribute substantially (always with a weight of ) to the average. Therefore, we expect that the denoised recording will retain overlapping or other rare events, but without denoising them.
Describe the procedure for denoising in neighborhoods.
Describe how we use PyTorch to efficiently compute the matrix-matrix multiplications needed in the above-described algorithm.
- Python (tested on 3.6 and 3.7)
- PyTorch (tested on v1.0.0)
- CUDA toolkit - if using GPU (recommended)
- MKL - if using CPU instead of CUDA
To test whether PyTorch and CUDA are set up properly, run the following in ipython:
import torch
if torch.cuda.is_available():
print('CUDA is available for PyTorch!')
else:
print('CUDA is NOT available for PyTorch.')
Recommended
- SpikeInterface --
pip install spikeinterface
- SpikeForest
pip install --upgrade ephys_nlm
After cloning this repository:
cd ephys_nlm
pip install -e .
# Then in subsequent updates:
git pull
pip install -e .
The following is taken from a notebook in the examples/ directory. It generates a short synthetic recording and denoise it.
from ephys_nlm import ephys_nlm_v1, ephys_nlm_v1_opts
import spikeextractors as se
import spikewidgets as sw
import matplotlib.pyplot as plt
# Create a synthetic recording for purposes of demo
recording, sorting_true = se.example_datasets.toy_example(duration=30, num_channels=4, K=20, seed=4)
# Specify the denoising options
opts = ephys_nlm_v1_opts(
multi_neighborhood=False, # False means all channels will be denoised in one neighborhood
block_size_sec=30, # Size of block in seconds -- each block is denoised separately
clip_size=30, # Size of a clip (aka snippet) in timepoints
sigma='auto', # Auto compute the noise level
sigma_scale_factor=1, # Scale factor for auto-computed noise level
whitening='auto', # Auto compute the whitening matrix
whitening_pctvar=90, # Percent of variance to retain - controls number of SVD components to keep
denom_threshold=30 # Higher values lead to a slower but more accurate calculation.
)
# Do the denoising
recording_denoised, runtim_info = ephys_nlm_v1(
recording=recording,
opts=opts,
device='cpu', # cuda is recommended for non-demo situations
verbose=1
)
Also included in the notebook is SpikeInterface code used to view the original and denoised timeseries:
# View the original and denoised timeseries
plt.figure(figsize=(16,5))
sw.TimeseriesWidget(recording=recording, trange=(0, 0.2), ax=plt.gca()).plot();
plt.figure(figsize=(16,5))
sw.TimeseriesWidget(recording=recording_denoised, trange=(0, 0.2), ax=plt.gca()).plot();
This should produce output similar to the following:
Apache-2.0 -- We request that you acknowledge the original authors in any derivative work.
Jeremy Magland, Center for Computational Mathematics (CCM), Flatiron Institute
Alex Barnett, James Jun, and members of CCM for many useful discussions