feat(ena-submission): Create ena samples #2312

Closed
wants to merge 10 commits into from

Conversation

anna-parker
Contributor

@anna-parker anna-parker commented Jul 19, 2024

resolves #2313, #2398

preview URL: https://create-ena-samples.loculus.org/

Summary

Uses the same principle as create_ena_projects to keep submission state in the DB. Please review that PR first :-)

This adds the following rule to the ena-submission Snakemake file:

  • create_samples rule. This function continuously (in a loop) scans for new sequences that need a sample and triggers sample creation. It also updates both the submission_table and the sample_table.
  • It also adds some basic unit tests for sub-functions used by create_samples.
  • It uses the metadata mapping defined in: feat(ena-submission): Decide on (sample) metadata fields mapping #2313 (comment)

High-level overview of sample_creation:

In a loop:

  1. Get sequences in submission_table in state SUBMITTED_PROJECT
     • if (there exists an entry in the sample_table for the corresponding (accession, version)):
       -- if (entry is in status SUBMITTED): update submission_table to SUBMITTED_SAMPLE.
       -- else: update submission_table to SUBMITTING_SAMPLE.
     • else: create sample entry in sample_table for (accession, version).
  2. Get sequences in submission_table in state SUBMITTING_SAMPLE
     • if (corresponding sample_table entry is in state SUBMITTED): update entries to state SUBMITTED_SAMPLE.
  3. Get sequences in sample_table in state READY, prepare submission object, set status to SUBMITTING
     • if (submission succeeds): set status to SUBMITTED and fill in results. The results of a successful submission are an sra_run_accession (starting with ERS), a biosample_accession (starting with SAM) and an ENA-internal ena_submission_accession.
     • else: set status to HAS_ERRORS and fill in errors.
  4. Get sequences in sample_table that have been in state HAS_ERRORS for over 15 min and sequences that have been in status SUBMITTING for over 15 min: #TODO handle failure (ena-submission: Recover from failed project/sample/assembly submission #2311); currently this just throws an error.
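The loop above can be sketched as a small state machine. This is only an illustration under stated assumptions, not the code in this PR: the two tables are modelled as in-memory dicts instead of DB tables, and the names `SampleEntry`, `sync_submission_table`, `submit_ready_samples`, `find_stuck_samples` and the `submit` callback are hypothetical. It also assumes that a newly created sample entry starts in status READY, which the description implies but does not state.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class SubmissionStatus(Enum):
    SUBMITTED_PROJECT = "SUBMITTED_PROJECT"
    SUBMITTING_SAMPLE = "SUBMITTING_SAMPLE"
    SUBMITTED_SAMPLE = "SUBMITTED_SAMPLE"


class SampleStatus(Enum):
    READY = "READY"
    SUBMITTING = "SUBMITTING"
    SUBMITTED = "SUBMITTED"
    HAS_ERRORS = "HAS_ERRORS"


Key = Tuple[str, int]  # (accession, version)


@dataclass
class SampleEntry:
    # Assumption: new entries start in READY.
    status: SampleStatus = SampleStatus.READY
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    result: Optional[dict] = None
    errors: Optional[str] = None


def sync_submission_table(
    submission_table: Dict[Key, SubmissionStatus],
    sample_table: Dict[Key, SampleEntry],
) -> None:
    """Steps 1 and 2: move submission_table entries based on sample_table."""
    for key, status in list(submission_table.items()):
        sample = sample_table.get(key)
        if status is SubmissionStatus.SUBMITTED_PROJECT:
            if sample is None:
                sample_table[key] = SampleEntry()
            elif sample.status is SampleStatus.SUBMITTED:
                submission_table[key] = SubmissionStatus.SUBMITTED_SAMPLE
            else:
                submission_table[key] = SubmissionStatus.SUBMITTING_SAMPLE
        elif status is SubmissionStatus.SUBMITTING_SAMPLE:
            if sample is not None and sample.status is SampleStatus.SUBMITTED:
                submission_table[key] = SubmissionStatus.SUBMITTED_SAMPLE


def submit_ready_samples(
    sample_table: Dict[Key, SampleEntry],
    submit: Callable[[Key], dict],
    now: datetime,
) -> None:
    """Step 3: submit READY samples and record the result or the error."""
    for key, sample in sample_table.items():
        if sample.status is not SampleStatus.READY:
            continue
        sample.status = SampleStatus.SUBMITTING
        sample.started_at = now
        try:
            sample.result = submit(key)  # hypothetical call to ENA
            sample.status = SampleStatus.SUBMITTED
        except Exception as exc:
            sample.status = SampleStatus.HAS_ERRORS
            sample.errors = str(exc)


def find_stuck_samples(
    sample_table: Dict[Key, SampleEntry],
    now: datetime,
    timeout: timedelta = timedelta(minutes=15),
) -> List[Key]:
    """Step 4: entries in HAS_ERRORS or SUBMITTING for longer than the timeout."""
    stuck = (SampleStatus.HAS_ERRORS, SampleStatus.SUBMITTING)
    return [
        key
        for key, sample in sample_table.items()
        if sample.status in stuck and now - sample.started_at > timeout
    ]
```

In this sketch a sequence needs more than one pass of the loop to reach SUBMITTED_SAMPLE: the first pass only creates the sample entry, and later passes pick up the result of the submission.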

ENA Sample

This PR will create samples in ENA with the following attributes, plus additional sample attributes defined by the metadata mapping in #2313 (comment):

<SAMPLE_SET>
        <SAMPLE center_name="{{ institution }}" alias="{{ loculus accession}}:{{organism}}:{{unique_id}}">
                <TITLE>{{scientific name}}: Genome sequencing</TITLE>
                <SAMPLE_NAME>
                        <TAXON_ID>{{ taxon_id}}</TAXON_ID>
                        <SCIENTIFIC_NAME>{{scientific name}}</SCIENTIFIC_NAME>
                </SAMPLE_NAME>
                <DESCRIPTION>Automated upload of {{scientific name}} sequences submitted by {{institution}} from {{db}}</DESCRIPTION>
                <SAMPLE_LINKS>
                        <SAMPLE_LINK>
                                <XREF_LINK>
                                        <DB>{{db}}</DB>
                                        <ID>{{ loculus accession}}</ID>
                                </XREF_LINK>
                        </SAMPLE_LINK>
                </SAMPLE_LINKS>
                <SAMPLE_ATTRIBUTES>
                        <SAMPLE_ATTRIBUTE>
                                <TAG>geographic location (country and/or sea)</TAG>
                                <VALUE>China</VALUE>
                        </SAMPLE_ATTRIBUTE>
                </SAMPLE_ATTRIBUTES>
        </SAMPLE>
</SAMPLE_SET>
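For illustration, the template above could be filled in with Python's standard `xml.etree.ElementTree`. This is a minimal sketch, not the serialization code used in the PR: the function name `build_sample_set`, its parameters and the example values are all hypothetical, and `attributes` stands in for whatever the metadata mapping produces.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_sample_set(
    institution: str,
    accession: str,
    organism: str,
    unique_id: str,
    scientific_name: str,
    taxon_id: int,
    db: str,
    attributes: Dict[str, str],
) -> str:
    """Render one SAMPLE inside a SAMPLE_SET, following the template above."""
    sample_set = ET.Element("SAMPLE_SET")
    sample = ET.SubElement(
        sample_set,
        "SAMPLE",
        center_name=institution,
        alias=f"{accession}:{organism}:{unique_id}",
    )
    ET.SubElement(sample, "TITLE").text = f"{scientific_name}: Genome sequencing"

    sample_name = ET.SubElement(sample, "SAMPLE_NAME")
    ET.SubElement(sample_name, "TAXON_ID").text = str(taxon_id)
    ET.SubElement(sample_name, "SCIENTIFIC_NAME").text = scientific_name

    ET.SubElement(sample, "DESCRIPTION").text = (
        f"Automated upload of {scientific_name} sequences "
        f"submitted by {institution} from {db}"
    )

    # Cross-reference back to the source database entry.
    links = ET.SubElement(sample, "SAMPLE_LINKS")
    xref = ET.SubElement(ET.SubElement(links, "SAMPLE_LINK"), "XREF_LINK")
    ET.SubElement(xref, "DB").text = db
    ET.SubElement(xref, "ID").text = accession

    # One SAMPLE_ATTRIBUTE per (tag, value) pair from the metadata mapping.
    sample_attributes = ET.SubElement(sample, "SAMPLE_ATTRIBUTES")
    for tag, value in attributes.items():
        attribute = ET.SubElement(sample_attributes, "SAMPLE_ATTRIBUTE")
        ET.SubElement(attribute, "TAG").text = tag
        ET.SubElement(attribute, "VALUE").text = value

    return ET.tostring(sample_set, encoding="unicode")


# Example call with made-up placeholder values.
example_xml = build_sample_set(
    institution="Example Institute",
    accession="LOC_000001",
    organism="example-organism",
    unique_id="abc123",
    scientific_name="Example virus",
    taxon_id=12345,
    db="Loculus",
    attributes={"geographic location (country and/or sea)": "China"},
)
```

Building the tree with `ElementTree` rather than string formatting means special characters in metadata values (for example `&` in an institution name) are escaped automatically.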

PR Checklist

  • Test locally on ENA dev instance
  • Test on preview on ENA dev instance:
    -- upload to submission_table and project creation work as intended.
    -- sample creation starts.
    -- sadly the dev instance is currently down, but the error message is correctly added to the sample_table, and this correctly triggers a Slack notification.
    -- manual intervention (setting the sample_table entry to SUBMITTED) sets the submission_table to state SUBMITTED_SAMPLE as intended.

@anna-parker anna-parker changed the title Create ena samples feat(ena-submission): Create ena samples Jul 19, 2024
@anna-parker anna-parker force-pushed the create_ena_samples branch 2 times, most recently from 7329cc0 to 1c378aa Compare August 9, 2024 10:15
@anna-parker anna-parker marked this pull request as ready for review August 9, 2024 11:50
@anna-parker anna-parker force-pushed the create_ena_samples branch 4 times, most recently from 6e1830d to 7453de9 Compare August 14, 2024 07:03
@anna-parker anna-parker added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Aug 14, 2024
@anna-parker anna-parker added the preview Triggers a deployment to argocd label Aug 29, 2024
@corneliusroemer corneliusroemer added the deposition related to ENA/INSDC deposition label Aug 29, 2024
@anna-parker anna-parker removed the preview Triggers a deployment to argocd label Aug 29, 2024
@anna-parker anna-parker added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Aug 29, 2024
@corneliusroemer corneliusroemer added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels Sep 16, 2024
@anna-parker
Contributor Author

Closing as this was part of #2417

Labels
deposition related to ENA/INSDC deposition
Development

Successfully merging this pull request may close these issues.

feat(ena-submission): Decide on (sample) metadata fields mapping
2 participants