Feature/podp_downloaded_bgc_data #316
base: dev
Local strain mappings (NPLinker#310)
…n for better readabiliy
…ieval into distinct functions for improved clarity
…/nplinker into feature/antismash-jobs
…_md5_sums for consistency
…on for better organization
…ror handling in antismash_job_is_done
…remove return value documentation
Pull Request Overview
This PR enhances the handling of antiSMASH BGC data in PODP mode by checking for pre‐downloaded data before fetching new data and refactors various related modules. Key changes include:
- Renaming and updating schema fields (e.g., "resolved_refseq_id" to "resolved_id" and "resolve_attempted" to "failed_previously") in tests and production code.
- Introducing new functions such as process_existing_antismash_data, download_and_extract_from_antismash_db, and enhancing antiSMASH API and NCBI download handling.
- Updating dependency types and workflow configuration to support the new functionality.
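The check-before-download behavior summarized above can be pictured with a minimal sketch. This is not the PR's code: `find_existing_bgc_dir`, `get_bgc_data`, the `download` callback, and the directory layout (one folder per antiSMASH ID holding `.gbk` files) are all assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def find_existing_bgc_dir(root: Path, antismash_id: str) -> Path | None:
    """Return the data directory for `antismash_id` if it already holds .gbk files.

    Hypothetical helper: the real layout used by the PR may differ.
    """
    candidate = root / antismash_id
    if candidate.is_dir() and any(candidate.glob("*.gbk")):
        return candidate
    return None


def get_bgc_data(
    root: Path,
    antismash_id: str,
    download: Callable[[Path, str], Path],
) -> Path:
    """Reuse pre-downloaded data when present; otherwise call `download`.

    `download` is a stand-in for whatever fetches and extracts the data.
    """
    existing = find_existing_bgc_dir(root, antismash_id)
    if existing is not None:
        return existing
    return download(root, antismash_id)
```

Passing the downloader in as a callback keeps the sketch free of network access and makes the "skip if present" branch easy to exercise in a unit test.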
Reviewed Changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tests/unit/schemas/test_genome_status_schema.py | Updated test cases to reflect refactored genome status fields. |
tests/unit/genomics/* | Adjusted tests to use new naming convention and integration with updated genome status and downloader functions. |
src/nplinker/genomics/antismash/podp_antismash_downloader.py | Refactored genome status handling and BGC data retrieval using new resolver and downloader methods. |
src/nplinker/genomics/antismash/antismash_downloader.py | Updated API for downloading/extracting antiSMASH data; added support for both API and database sources. |
src/nplinker/genomics/antismash/antismash_api_client.py | Introduced job submission and status APIs for antiSMASH with minor spelling issues corrected. |
pyproject.toml and GitHub workflows | Updated dependency lists to reflect new type stubs. |
Files not reviewed (1)
- src/nplinker/schemas/genome_status_schema.json: Language not supported
Comments suppressed due to low confidence (1)
src/nplinker/genomics/antismash/antismash_downloader.py:123
- The parameter name 'antimash_id' appears to be a misspelling; it should be 'antismash_id' to be consistent.
def extract_antismash_data(archive: str | PathLike, extract_root: str | PathLike, antimash_id: str) -> None:
respose_data = response.json()

if "state" not in respose_data:
    raise ValueError(f"Job state missing in response for job_id: {job_id}")

job_state = respose_data["state"]
Copilot AI, Apr 1, 2025
The variable 'respose_data' seems to be misspelled; consider renaming it to 'response_data' for clarity.
Suggested change, original lines:
respose_data = response.json()
if "state" not in respose_data:
    raise ValueError(f"Job state missing in response for job_id: {job_id}")
job_state = respose_data["state"]
Suggested change, replacement lines:
response_data = response.json()
if "state" not in response_data:
    raise ValueError(f"Job state missing in response for job_id: {job_id}")
job_state = response_data["state"]
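For reference, the renamed check can be tried in isolation with a small sketch. The HTTP response is replaced by a plain dict so it runs without network access; `parse_job_state` is a hypothetical helper name, not a function from the PR, and the state value in the usage note is only an illustrative placeholder.

```python
def parse_job_state(response_data: dict, job_id: str) -> str:
    """Return the job state from a decoded JSON payload.

    Hypothetical helper mirroring the reviewed snippet; raises ValueError
    when the payload has no "state" key.
    """
    if "state" not in response_data:
        raise ValueError(f"Job state missing in response for job_id: {job_id}")
    return response_data["state"]
```

Calling `parse_job_state({"state": "done"}, "job-1")` returns the stored value, while an empty payload raises `ValueError` with the job ID in the message.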
This pull request enhances the handling of antiSMASH BGC data in PODP mode by checking for already downloaded files before fetching new data.
It builds upon pull request #315 and introduces the following improvements:
bgc_path