Merge pull request #70 from dxenes1/master

Adds MICrONS Co-registration notebook from NeuroData ReHack 2023
dandi · Aug 27, 2024 · 47f71b9
2 parents 1f72bb5 + 7c4b3d1
Showing 9 changed files with 1,933 additions and 0 deletions.
diff --git a/000402/MICrONS/coregistration/README.md b/000402/MICrONS/coregistration/README.md
@@ -0,0 +1,23 @@
+# MICrONS Co-registration Analysis Notebook
+**This is an example notebook showing how to query functional data from DANDI and update it with co-registration information so that structure-function data can be analyzed in one place.**
+
+This project was developed during the 2023 NWB NeuroData ReHack workshop held in Granada, Spain. Please contact us at info@bossdb.org if you have any questions. 
+
+## Why is this needed?
+Comparison and generation of functional connectivity networks and structural connectivity networks is a key goal of the MICrONS project and requires joint analysis across archives. This notebook provides a starting point for those who want a configurable and dynamic way of obtaining the latest co-registered functional and structural data.
+
+Our goal with this notebook is to provide a seamless integration of these data to generate novel insight into the stimulus-dependent and time-dependent activation of functional subnetworks. This will also allow direct investigation of causal connectivity estimation using functional data. Follow-on work could investigate improvements to functional connectivity estimation methods. Even more exciting, however, is the potential to generate a wide range of hypotheses relating the structure and function of neural networks within mammalian cortex.
+
+## How to run
+You'll need a fresh Python virtual environment (version 3.8 or higher) to run this notebook.
+
+1. Install the Python requirements from `requirements.txt`, or create the conda environment from the `environment.yml` file.
+2. Run the `microns_nwb_coreg_notebook` notebook for a complete end-to-end demo of how to integrate the latest co-registered cells from CAVE with the MICrONS functional data stored in DANDI.
+3. Feel free to import/modify `microns_nwb_coreg.py` or `ng_utils.py` script functions for your own use-case!
+
+## Citations 
+> O. Rübel et al., “The Neurodata Without Borders ecosystem for neurophysiological data science,” eLife, vol. 11, p. e78362, Oct. 2022, doi: 10.7554/eLife.78362.
+
+> J. A. Bae et al., “Functional connectomics spanning multiple areas of mouse visual cortex,” bioRxiv, p. 2021.07.28.454025, Jan. 2021, doi: 10.1101/2021.07.28.454025.
+
+> Z. Ding et al., “Functional connectomics reveals general wiring rule in mouse visual cortex,” Neuroscience, preprint, Mar. 2023. doi: 10.1101/2023.03.13.531369.
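
The environment requirements in the README's "How to run" section can be checked before opening the notebook. The following is a minimal sketch, not part of this PR: the helper name `check_environment` and the `REQUIRED_MODULES` list are illustrative assumptions, with the module names taken from the top-level imports in `microns_nwb_coreg.py` and the minimum version taken from the README.

```python
import sys
from importlib.util import find_spec

# Assumption: top-level modules imported by microns_nwb_coreg.py in this PR.
REQUIRED_MODULES = ["dandi", "caveclient", "fsspec", "h5py", "pynwb", "tqdm", "pandas"]


def check_environment(min_python=(3, 8), modules=REQUIRED_MODULES):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # The README asks for Python 3.8 or higher.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"python too old: need >= {wanted}")
    # find_spec returns None when a top-level module cannot be located.
    for name in modules:
        if find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

An empty result only says the modules can be located; it does not confirm that the installed versions match the pins in `environment.yml`.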
diff --git a/000402/MICrONS/coregistration/ScanUnit.pkl b/000402/MICrONS/coregistration/ScanUnit.pkl
diff --git a/000402/MICrONS/coregistration/environment.yml b/000402/MICrONS/coregistration/environment.yml
@@ -0,0 +1,187 @@
+name: neurodatarehack2023
+channels:
+  - defaults
+dependencies:
+  - bzip2=1.0.8=h620ffc9_4
+  - ca-certificates=2023.12.12=hca03da5_0
+  - libffi=3.4.4=hca03da5_0
+  - ncurses=6.4=h313beb8_0
+  - openssl=3.0.13=h1a28f6b_0
+  - pip=23.3.1=py311hca03da5_0
+  - python=3.11.7=hb885b13_0
+  - readline=8.2=h1a28f6b_0
+  - setuptools=68.2.2=py311hca03da5_0
+  - sqlite=3.41.2=h80987f9_0
+  - tk=8.6.12=hb8d0fd4_0
+  - wheel=0.41.2=py311hca03da5_0
+  - xz=5.4.5=h80987f9_0
+  - zlib=1.2.13=h5a0b063_0
+  - pip:
+      - aiohttp==3.9.3
+      - aiosignal==1.3.1
+      - appdirs==1.4.4
+      - appnope==0.1.4
+      - arrow==1.3.0
+      - asciitree==0.3.3
+      - asttokens==2.4.1
+      - attrs==23.2.0
+      - bidsschematools==0.7.2
+      - blessed==1.20.0
+      - boto3==1.34.44
+      - botocore==1.34.44
+      - brotli==1.1.0
+      - cachetools==5.3.2
+      - caveclient==5.15.2
+      - certifi==2024.2.2
+      - chardet==5.2.0
+      - charset-normalizer==3.3.2
+      - ci-info==0.3.0
+      - click==8.1.7
+      - click-didyoumean==0.3.0
+      - cloud-files==4.21.1
+      - cloud-volume==8.29.1
+      - comm==0.2.1
+      - compressed-segmentation==2.2.2
+      - compresso==3.2.2
+      - contourpy==1.2.0
+      - crackle-codec==0.7.1
+      - crc32c==2.3.post0
+      - cycler==0.12.1
+      - cython==3.0.8
+      - dandi==0.59.1
+      - dandischema==0.8.4
+      - debugpy==1.8.1
+      - decorator==5.1.1
+      - deflate==0.5.0
+      - dill==0.3.8
+      - dnspython==2.6.1
+      - dracopy==1.4.0
+      - email-validator==2.1.0.post1
+      - etelemetry==0.3.1
+      - executing==2.0.1
+      - fasteners==0.19
+      - fastremap==1.14.1
+      - fonttools==4.49.0
+      - fpzip==1.2.3
+      - fqdn==1.5.1
+      - frozenlist==1.4.1
+      - fscacher==0.4.0
+      - fsspec==2024.2.0
+      - gevent==24.2.1
+      - google-api-core==2.17.1
+      - google-auth==2.28.0
+      - google-cloud-core==2.4.1
+      - google-cloud-storage==2.14.0
+      - google-crc32c==1.5.0
+      - google-resumable-media==2.7.0
+      - googleapis-common-protos==1.62.0
+      - greenlet==3.0.3
+      - h5py==3.10.0
+      - hdmf==3.12.2
+      - humanize==4.9.0
+      - idna==3.6
+      - importlib-metadata==7.0.1
+      - inflection==0.5.1
+      - iniconfig==2.0.0
+      - interleave==0.2.1
+      - ipykernel==6.29.2
+      - ipython==8.21.0
+      - isodate==0.6.1
+      - isoduration==20.11.0
+      - jaraco-classes==3.3.1
+      - jedi==0.19.1
+      - jmespath==1.0.1
+      - joblib==1.3.2
+      - json5==0.9.14
+      - jsonpointer==2.4
+      - jsonschema==4.21.1
+      - jsonschema-specifications==2023.12.1
+      - jupyter-client==8.6.0
+      - jupyter-core==5.7.1
+      - keyring==24.3.0
+      - keyrings-alt==5.0.0
+      - kiwisolver==1.4.5
+      - markdown==3.5.2
+      - matplotlib==3.8.3
+      - matplotlib-inline==0.1.6
+      - more-itertools==10.2.0
+      - multidict==6.0.5
+      - multiprocess==0.70.16
+      - natsort==8.4.0
+      - nest-asyncio==1.6.0
+      - networkx==3.2.1
+      - numcodecs==0.12.1
+      - numpy==1.23.2
+      - nwbinspector==0.4.33
+      - oldest-supported-numpy==2023.12.21
+      - orjson==3.9.14
+      - packaging==23.2
+      - pandas==2.2.0
+      - parso==0.8.3
+      - pathos==0.3.2
+      - pexpect==4.9.0
+      - pillow==10.2.0
+      - platformdirs==4.2.0
+      - pluggy==1.4.0
+      - posix-ipc==1.1.1
+      - pox==0.3.4
+      - ppft==1.7.6.8
+      - prompt-toolkit==3.0.43
+      - protobuf==4.25.3
+      - psutil==5.9.8
+      - ptyprocess==0.7.0
+      - pure-eval==0.2.2
+      - pyarrow==11.0.0
+      - pyasn1==0.5.1
+      - pyasn1-modules==0.3.0
+      - pybind11==2.11.1
+      - pycryptodomex==3.20.0
+      - pydantic==1.10.14
+      - pygments==2.17.2
+      - pynwb==2.5.0
+      - pyout==0.7.3
+      - pyparsing==3.1.1
+      - pysimdjson==6.0.2
+      - pyspng-seunglab==1.1.1
+      - pytest==8.0.1
+      - python-dateutil==2.8.2
+      - python-jsonschema-objects==0.5.2
+      - pytz==2024.1
+      - pyyaml==6.0.1
+      - pyzmq==25.1.2
+      - referencing==0.33.0
+      - requests==2.31.0
+      - rfc3339-validator==0.1.4
+      - rfc3987==1.3.8
+      - rpds-py==0.18.0
+      - rsa==4.9
+      - ruamel-yaml==0.18.6
+      - ruamel-yaml-clib==0.2.8
+      - s3fs==0.4.2
+      - s3transfer==0.10.0
+      - scipy==1.12.0
+      - semantic-version==2.10.0
+      - simplejpeg==1.7.2
+      - six==1.16.0
+      - stack-data==0.6.3
+      - tenacity==8.2.3
+      - tornado==6.4
+      - tqdm==4.66.2
+      - traitlets==5.14.1
+      - types-python-dateutil==2.8.19.20240106
+      - typing-extensions==4.9.0
+      - tzdata==2024.1
+      - uri-template==1.3.0
+      - urllib3==2.0.7
+      - wcwidth==0.2.13
+      - webcolors==1.13
+      - yarl==1.9.4
+      - zarr==2.17.0
+      - zarr-checksum==0.4.0
+      - zfpc==0.1.2
+      - zfpy==1.0.0
+      - zipp==3.17.0
+      - zope-event==5.0
+      - zope-interface==6.2
+      - zstandard==0.22.0
+
diff --git a/000402/MICrONS/coregistration/microns_nwb_coreg.py b/000402/MICrONS/coregistration/microns_nwb_coreg.py
@@ -0,0 +1,145 @@
+##############################################################################################
+# Copyright 2023 The Johns Hopkins University Applied Physics Laboratory LLC
+# All rights reserved.
+# Permission is hereby granted, free of charge, to any person obtaining a copy of this 
+# software and associated documentation files (the "Software"), to deal in the Software 
+# without restriction, including without limitation the rights to use, copy, modify, 
+# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to 
+# permit persons to whom the Software is furnished to do so.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 
+# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 
+# PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 
+# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE 
+# OR OTHER DEALINGS IN THE SOFTWARE.
+##############################################################################################
+
+"""
+This script contains all functions necessary to update a MICrONS NWB file with the latest automated coregistration from CAVE. 
+
+Check the DANDISET_ID and CAVE_COREG_TABLE variables before running to make sure they point to the dataset and table
+you want to pull.
+
+DEV NOTE: Unable to update NWB file for session 5 scan 7. 
+
+Example Usage:
+```
+from microns_nwb_coreg import get_microns_nwb_file, update_microns_nwb_file
+
+microns_nwb = get_microns_nwb_file(session_no=4, scan_no=4)
+microns_nwb = update_microns_nwb_file(microns_nwb)
+```
+"""
+
+from dandi.dandiapi import DandiAPIClient
+from caveclient import CAVEclient
+
+from fsspec.implementations.cached import CachingFileSystem
+from fsspec import filesystem
+from h5py import File
+from pynwb import NWBHDF5IO
+from pynwb.file import NWBFile
+
+from tqdm import tqdm
+import pandas as pd
+
+from pynwb.ophys import PlaneSegmentation
+
+
+DANDISET_ID = "000402"
+CAVE_COREG_TABLE = "apl_functional_coreg_forward_v5"
+
+
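+def get_microns_nwb_file(session_no: int, scan_no: int) -> NWBFile:
+    """Stream one MICrONS NWB file from DANDI and return the NWBFile object read from it."""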
+    file_path = f"sub-17797/sub-17797_ses-{session_no}-scan-{scan_no}_behavior+image+ophys.nwb"
+    with DandiAPIClient() as client:
+        asset = client.get_dandiset(DANDISET_ID, 'draft').get_asset_by_path(file_path)
+        s3_url = asset.get_content_url(follow_redirects=1, strip_query=True)
+
+    # First, create a virtual filesystem based on the http protocol
+    fs = filesystem("http")
+
+    # Create a cache to save downloaded data to disk (optional)
+    fs = CachingFileSystem(
+        fs=fs,
+        cache_storage="nwb-cache",  # Local folder for the cache
+    )
+
+    # Next, open the file with NWBHDF5IO
+    file_system = fs.open(s3_url, "rb")
+    file = File(file_system, mode="r")
+    io = NWBHDF5IO(file=file, load_namespaces=True)
+
+    microns_data = io.read()
+    return microns_data
+
+def create_new_plane_segmentation(old, df, descriptions):
+    ps = PlaneSegmentation(
+        name=old.name, 
+        description=old.description, 
+        imaging_plane=old.imaging_plane,
+        id=df.index.tolist()
+    )
+
+    for col in df.columns:
+        if col in old.colnames:
+            old_col = find_column_by_name(old, col)
+            ps.add_column(name=old_col.name, description=old_col.description, data=df[col].tolist())
+        else:
+            ps.add_column(name=col, description=descriptions[col], data=df[col].tolist())
+    return ps
+
+
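+def find_column_by_name(table, col_name):
+    """Return the first column of `table` named `col_name`, or None if there is none."""
+    for c in table.columns:
+        if c.name == col_name:
+            return c
+    return None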
+
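+def update_microns_nwb_file(nwb: NWBFile):
+    """Rebuild each PlaneSegmentation in `nwb` with unit IDs and CAVE co-registration matches.
+
+    Returns the updated NWBFile, or None for session 5 scan 7,
+    whose file does not contain a unit_id column (see DEV NOTE above).
+    """
+
+    # Pre-requisite data loading
+    cave = CAVEclient("minnie65_phase3_v1")
+    coreg = cave.materialize.query_table(CAVE_COREG_TABLE)
+    scan_units = pd.read_pickle("./ScanUnit.pkl")
+
+    # The parsing below assumes session_id has the form "<session>-scan-<scan_idx>"
+    session_parts = nwb.session_id.split('-')
+    session, scan_idx = int(session_parts[0]), int(session_parts[2])
+    if session == 5 and scan_idx == 7:
+        print("Error: This file does not contain a unit_id column")
+        return None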
+    scan_units_modified = scan_units[(scan_units['session']==session) & (scan_units['scan_idx']==scan_idx)]
+
+    image_segmentation = nwb.processing["ophys"].data_interfaces["ImageSegmentation"]
+
+    all_ps = list(image_segmentation.plane_segmentations)
+    for ps_name in tqdm(all_ps):
+
+        ps = image_segmentation.plane_segmentations.pop(ps_name)
+        field = int(ps_name[-1])
+        field_scan_units = scan_units_modified[scan_units_modified['field'] == field]
+        ps_df = ps[:]
+        ps_df['mask_id'] = ps_df.index
+        ps_df_with_units = ps_df.merge(field_scan_units, on='mask_id', how='left').drop(columns=[
+            'mask_id', 'session', 'scan_idx', 'field'
+        ])
+
+        coreg_units = coreg[
+            (coreg['session']==session) & 
+            (coreg['scan_idx']==scan_idx) & 
+            (coreg['field'] == field)
+        ][['target_id', 'unit_id']]
+
+        if len(coreg_units):
+            ps_df_with_units = ps_df_with_units.merge(coreg_units, on='unit_id').rename(
+                columns={
+                    'target_id': 'auto_match_cave_nuclei_id', 
+                    'cave_ids': 'manual_match_cave_nuclei_id'
+                }
+            )
+
+        description = {x: "Placeholder" for x in ps_df_with_units.columns}
+        new_ps = create_new_plane_segmentation(ps, ps_df_with_units, description)
+        image_segmentation.plane_segmentations.add(new_ps)
+
+    return nwb
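
The two merges in `update_microns_nwb_file` can be tried offline on toy tables. The sketch below is not part of this PR and uses made-up values throughout; only the column names (`mask_id`, `unit_id`, `target_id`, `session`, `scan_idx`, `field`) come from the code above. It shows that the second merge, which uses the pandas default `how="inner"`, keeps only ROIs that have a co-registration match, and it contrasts this with `how="left"`. Whether dropping unmatched ROIs is intended is not stated in the PR.

```python
import pandas as pd

# Toy stand-in for one PlaneSegmentation table (ps[:]); the index plays the role of the ROI id.
ps_df = pd.DataFrame({"roi_label": ["a", "b", "c"]}, index=[0, 1, 2])
ps_df["mask_id"] = ps_df.index

# Toy stand-in for the ScanUnit.pkl rows of a single session/scan/field (values are made up).
field_scan_units = pd.DataFrame({
    "session": [4, 4, 4],
    "scan_idx": [4, 4, 4],
    "field": [1, 1, 1],
    "mask_id": [0, 1, 2],
    "unit_id": [10, 11, 12],
})

# Toy stand-in for the filtered CAVE co-registration table: unit 11 has no match.
coreg_units = pd.DataFrame({"target_id": [9001, 9002], "unit_id": [10, 12]})

# First merge, as in the PR: attach unit_id to every ROI, then drop the bookkeeping columns.
with_units = ps_df.merge(field_scan_units, on="mask_id", how="left").drop(
    columns=["mask_id", "session", "scan_idx", "field"]
)

rename_map = {"target_id": "auto_match_cave_nuclei_id"}

# Second merge, as in the PR: the default inner join drops ROIs without a match.
inner = with_units.merge(coreg_units, on="unit_id").rename(columns=rename_map)

# Alternative for comparison: a left join keeps every ROI and leaves NaN where no match exists.
left = with_units.merge(coreg_units, on="unit_id", how="left").rename(columns=rename_map)

print(len(with_units), len(inner), len(left))
```

With the left join, unmatched ROIs remain in the table with a missing `auto_match_cave_nuclei_id`, which would keep the rebuilt `PlaneSegmentation` the same length as the original one.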