Skip to content

Conversation

liannette
Copy link
Contributor

@liannette liannette commented Mar 21, 2025

This pull request enhances the handling of antiSMASH BGC data in PODP mode by checking for already downloaded files before fetching new data.

It builds upon pull request #315 and introduces the following improvements:

  • Before attempting to obtain BGC data, the project downloads directory is checked to see if the file is already present.
  • This check is performed even when the genome status does not contain a bgc_path
  • This feature allows users to deposit pre-existing antiSMASH data directly into the project downloads directory, reducing redundant downloads and unnecessary API requests

liannette and others added 30 commits March 18, 2025 17:58
…ieval into distinct functions for improved clarity
@liannette liannette changed the title Feature/podp_downloadedbgc data Feature/podp_downloaded_bgc_data Mar 21, 2025
@CunliangGeng CunliangGeng requested a review from Copilot April 1, 2025 15:02
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the handling of antiSMASH BGC data in PODP mode by checking for pre‐downloaded data before fetching new data and refactors various related modules. Key changes include:

  • Renaming and updating schema fields (e.g., "resolved_refseq_id" to "resolved_id" and "resolve_attempted" to "failed_previously") in tests and production code.
  • Introducing new functions such as process_existing_antismash_data, download_and_extract_from_antismash_db, and enhancing antiSMASH API and NCBI download handling.
  • Updating dependency types and workflow configuration to support the new functionality.

Reviewed Changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/unit/schemas/test_genome_status_schema.py Updated test cases to reflect refactored genome status fields.
tests/unit/genomics/* Adjusted tests to use new naming convention and integration with updated genome status and downloader functions.
src/nplinker/genomics/antismash/podp_antismash_downloader.py Refactored genome status handling and BGC data retrieval using new resolver and downloader methods.
src/nplinker/genomics/antismash/antismash_downloader.py Updated API for downloading/extracting antiSMASH data; added support for both API and database sources.
src/nplinker/genomics/antismash/antismash_api_client.py Introduced job submission and status APIs for antiSMASH with minor spelling issues corrected.
pyproject.toml and GitHub workflows Updated dependency lists to reflect new type stubs.
Files not reviewed (1)
  • src/nplinker/schemas/genome_status_schema.json: Language not supported
Comments suppressed due to low confidence (1)

src/nplinker/genomics/antismash/antismash_downloader.py:123

  • The parameter name 'antimash_id' appears to be a misspelling; it should be 'antismash_id' to be consistent.
def extract_antismash_data(archive: str | PathLike, extract_root: str | PathLike, antimash_id: str) -> None:

Comment on lines +65 to +70
respose_data = response.json()

if "state" not in respose_data:
raise ValueError(f"Job state missing in response for job_id: {job_id}")

job_state = respose_data["state"]
Copy link

Copilot AI Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable 'respose_data' seems to be misspelled; consider renaming it to 'response_data' for clarity.

Suggested change
respose_data = response.json()
if "state" not in respose_data:
raise ValueError(f"Job state missing in response for job_id: {job_id}")
job_state = respose_data["state"]
response_data = response.json()
if "state" not in response_data:
raise ValueError(f"Job state missing in response for job_id: {job_id}")
job_state = response_data["state"]

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

1 participant