Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dandi export fixes #1095

Merged
merged 20 commits into from
Oct 3, 2024
Merged

Dandi export fixes #1095

merged 20 commits into from
Oct 3, 2024

Conversation

samuelbray32
Copy link
Collaborator

@samuelbray32 samuelbray32 commented Sep 9, 2024

Description

Fixes for the Dandi export process

  • Solves Dandi export of nwb w/ external files #1092
    • Copy files with external paths (e.g. videos) rather than symlink when compiling dandi set (to prevent editing original)
    • Allow dandi.organize to edit the copied nwb to resolve external path issues
  • Solves external_link nwb objects
    • Copied raw file (session_.nwb) contains external links for e-series data (not permitted by dandi)
    • During export, trace links to source file, upload that instead. Ensures all raw data uploaded for Dandi
    • When fetching nwb from Dandi, if entry is not in DandiPath check for non-copied file name. If so, stream that file
  • Other improvements
    • speed up translation of names to dandi path in translate_name_to_dandi by using lookup of the object id when available
    • Minor fixes in update_analysis_for_dandi_standards()
Development notes Todo:
  • Handle Raw nwb files

    • Original (.nwb)
    • Copy (_.nwb)
      • included in the accessed files list in testing
      • copied to solve Dandi export of nwb w/ external files #1092
        • This breaks the relative filepath for linked nwb objects (e.g. e-series, analog)
        • copying the original nwb into the same folder would resolve but expensive~
      • NEW ERROR
        • cannot be uploaded to DANDI to to external links to nwb objects
        • Will instead need to upload the original nwb file
  • Alternative (?)

    • Upload the original nwb but not the copy (_.nwb) file
      • Update This is required by DANDI not allowing external links
    • make new copy nwb file when necessary in the docker image from the original on Dandi

Checklist:

  • N This PR should be accompanied by a release: (yes/no/unsure)
  • NA If release, I have updated the CITATION.cff
  • N This PR makes edits to table definitions: (yes/no)
  • NA If table edits, I have included an alter snippet for release notes.
  • NA If this PR makes changes to position, I ran the relevant tests locally.
  • I have updated the CHANGELOG.md with PR number and description.
  • NA I have added/edited docs/notebooks to reflect the changes

@edeno edeno added the enhancement New feature or request label Sep 25, 2024
@samuelbray32 samuelbray32 marked this pull request as ready for review September 30, 2024 21:41
Copy link
Collaborator

@edeno edeno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor things.

src/spyglass/utils/nwb_helper_fn.py Outdated Show resolved Hide resolved
src/spyglass/common/common_dandi.py Outdated Show resolved Hide resolved
src/spyglass/utils/nwb_helper_fn.py Outdated Show resolved Hide resolved
@edeno edeno requested a review from CBroz1 September 30, 2024 22:05
Copy link
Member

@CBroz1 CBroz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some formatting/efficiency comments. If these fixes can fit in before your vacation, great. If not, we can merge what we have and I'll do a clean-up pass later

src/spyglass/common/common_dandi.py Outdated Show resolved Hide resolved
src/spyglass/common/common_dandi.py Outdated Show resolved Hide resolved
@@ -272,6 +274,26 @@ def make(self, key):
query.list_file_paths(paper_key) + restr_graph.file_paths
)

# Check for linked nwb objects and add them to the export
exclude_files = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like you're adding items to file_paths, just to remove them again with the index, is that right? I think I would structure this as on overwrite, rather than a gather + remove

keep_links = set() 
for file in file_paths:
    if not (links := get_linked(...)):
         keep_files.add(file)
    logger.warning(...)
file_paths = {'file_path': link for link in keep_links}

unique_dicts is slow. Adding things to a set enforces uniqueness as you go

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, I 'll put in a cleaner method along these lines

src/spyglass/utils/nwb_helper_fn.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@edeno edeno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on my end (pending changelog conflicts)

@edeno edeno requested a review from CBroz1 October 2, 2024 22:53
@CBroz1 CBroz1 merged commit f77483d into master Oct 3, 2024
7 checks passed
@CBroz1 CBroz1 deleted the export_fixes branch October 3, 2024 14:34
@CBroz1 CBroz1 mentioned this pull request Oct 3, 2024
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants