Refactor gnps classes #169
Merged
Add the detection of GNPS workflow "METABOLOMICS-SNETS-V2"
- rename enum names - change enum values to GNPS workflow names - add docstrings
Add the scenario of GNPSFormat.Unknown
- replace workflow name with enum value - update docstrings
- replace workflow name with enum value - update docstrings
- move existing GNPS files - add new file `ProteoSAFe-METABOLOMICS-SNETS-V2-189e8bf1-download_clustered_spectra.zip` - add new file `ProteoSAFe-Unknown.zip`
Move some fixtures from top level to metabolomics level
- add tests for workflow SNETSV2 and Unknown - use new fixtures `gnps_zip_files` and `gnps_file_mappings_files`
- use httpx to replace urllib - remove private functions not needed any more
Add GNPSFormat checking
Checking GNPSFormat has been moved to `__init__`
- add detection of gnps website availability - add tests for all supported GNPS workflows
- add detection of GNPS workflow at initialization - change method `extract` to private `_extract` - rewrite the private methods based on GNPS workflow types - add detailed docstring to explain which file is extracted and renamed
update extractor
- update URL for GNPS USI - add `_validate` to validate annotation file (.tsv) - add `_load` method to modularise loading code - change method `get_annotations` to property `annotations` - update docstring
- add `_validate` to validate file mappings file - add `_load*` methods to modularise loading code - change method `mappings` to property - add detailed docstring
- add `_validate` method - change method `families` to property - refactor `_load` method to make loading code more modular - add detailed docstring
- replace mgf parser with community package `pyteomics` - add `_validate` method - add `_load` method - add detailed docstring
…t_antismash_data` to make sure deletion of exttract_path can always happen.
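The loader commits above repeat one pattern: a private `_validate`, a private `_load`, and a read-only property in place of a getter method. A minimal sketch of that pattern follows. It is not the code from this PR: the class name `AnnotationLoaderSketch` and the column names `spectrum_id` and `name` are hypothetical, chosen only for illustration.

```python
import csv
from pathlib import Path


class AnnotationLoaderSketch:
    """Illustrative loader: validate first, then load, expose data as a property."""

    # Hypothetical required columns; the real annotation file may differ.
    REQUIRED_COLUMNS = ("spectrum_id", "name")

    def __init__(self, file):
        self._file = Path(file)
        self._annotations = {}
        self._validate()
        self._load()

    @property
    def annotations(self):
        """Mapping of spectrum id to the annotation row (read-only access)."""
        return self._annotations

    def _validate(self):
        """Raise ValueError if the TSV header lacks a required column."""
        with self._file.open(newline="", encoding="utf-8") as f:
            header = next(csv.reader(f, delimiter="\t"), [])
        missing = [c for c in self.REQUIRED_COLUMNS if c not in header]
        if missing:
            raise ValueError(f"{self._file.name}: missing columns {missing}")

    def _load(self):
        """Read every row of the TSV file into the annotations mapping."""
        with self._file.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f, delimiter="\t"):
                self._annotations[row["spectrum_id"]] = row
```

Because validation runs in `__init__`, a malformed file fails at construction time rather than on first access to `annotations`.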
gcroci2
approved these changes
Aug 29, 2023
Minor comments :)
gcroci2
approved these changes
Aug 29, 2023
This PR adds checks on GNPS data, refactors the classes and functions for downloading, extracting and loading GNPS data, adds detailed docstrings explaining how GNPS data is processed, and updates the unit tests to cover all types of GNPS data.
👉 It's easier to review the code commit by commit or class by class.
Although this PR has many commits, the changes to the GNPS classes follow a similar pattern.
Major changes
- … `requests` and only use `httpx` to handle http requests
- … `pyteomics` to parse MGF file
- … (`SNETS-V2`) to enum `GNPSFormat`
- … `gnps`, including `GNPSDownloader`, `GNPSExtractor`, `GNPSAnnotationLoader`, `GNPSFileMappingLoader`, `GNPSMolecularFamilyLoader`, `GNPSSpectrumLoader`, and functions from `gnps_format.py`
- … `GNPSFormat`
- … `gnps` module … `utils.py`
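As a rough illustration of the `GNPSFormat` change listed above, here is a sketch of an enum whose values are GNPS workflow names, with an `Unknown` fallback. Only the members `SNETSV2` and `Unknown` are taken from this PR's commit messages. The helper `gnps_format_from_archive_name` and its file-name matching are assumptions made for the example and may not match how the PR detects the workflow.

```python
from __future__ import annotations

from enum import Enum, unique
from pathlib import Path


@unique
class GNPSFormat(Enum):
    """GNPS workflow types; each value is the GNPS workflow name."""

    SNETSV2 = "METABOLOMICS-SNETS-V2"
    Unknown = "Unknown"


def gnps_format_from_archive_name(zip_file: str | Path) -> GNPSFormat:
    """Hypothetical helper: infer the workflow from the archive file name."""
    name = Path(zip_file).name
    for fmt in GNPSFormat:
        # Skip the fallback member so it is only returned when nothing matches.
        if fmt is not GNPSFormat.Unknown and fmt.value in name:
            return fmt
    return GNPSFormat.Unknown
```

With the two test archives named in the commits, `ProteoSAFe-METABOLOMICS-SNETS-V2-189e8bf1-download_clustered_spectra.zip` would map to `GNPSFormat.SNETSV2` and `ProteoSAFe-Unknown.zip` to `GNPSFormat.Unknown`.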
Expected failed tests
- … `pairedomics` folder … `test_loader.py`