Add GitHub Action for Nextclade annotations #158

Merged 7 commits on Apr 1, 2024

Commits on Mar 25, 2024

  1. Prototype GitHub Action for Nextclade annotations

    Adds rules, config, and GitHub Action file to support running Nextclade
    on all available sequences. Not yet tested.
    huddlej committed Mar 25, 2024
    7ee96b5

Commits on Mar 26, 2024

  1. Simplify Nextclade build config

    Remove unnecessary configuration parameters from the Nextclade build
    config and update the workflow to allow these parameters to be missing.
    Since Snakemake evaluates the Python code in each rule's inputs,
    outputs, and params, rules that we don't plan to run in the workflow can
    produce key errors when their config parameters are not defined.
    huddlej committed Mar 26, 2024
    d748234
  2. Simplify Nextclade dataset logic

    Simplifies the logic to get Nextclade datasets by following the same
    pattern as the flu_frequencies workflow [1] where we grab the default
    dataset for a given lineage and segment instead of specifying a
    reference name. The "broad" and more recent references for H3N2 HA, for
    example, are not too different from each other, but the Nextclade
    annotations for the former are far more verbose than for the latter. We
    also want the files produced by this workflow to plug directly into the
    flu_frequencies workflow logic, so it is best to use the same approach
    here.
    
    [1] https://github.com/nextstrain/flu_frequencies/blob/6e4298fac3361f4a6751d85bcb963064dbb9eee1/Snakefile#L95
    huddlej committed Mar 26, 2024
    d18e7f2
  3. Run Nextclade for NA

    Adds NA to the list of segments, since we want to know the subclade
    annotations for NA as well as HA and to use these data to estimate
    frequencies.
    huddlej committed Mar 26, 2024
    3ba4c13
  4. Trigger Nextclade on PR

    Add temporary trigger for Nextclade workflow on PR event. This should
    trigger the workflow when I push the update to the PR. If it works, I
    should drop this commit again.
    huddlej committed Mar 26, 2024
    b4464b0
  5. Remove pull request trigger

    The workflow ran successfully, so removing this trigger.
    huddlej committed Mar 26, 2024
    4adf731
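
The key-error behavior described in "Simplify Nextclade build config" (d748234) can be illustrated in plain Python, since a Snakemake config behaves like a nested dictionary. This is a minimal sketch, not the change made in the commit: the key names (`tree`, `clock_rate`) and the helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical, trimmed-down build config. The key names here are
# illustrative assumptions, not the keys used in this repository.
config = {"segments": ["ha", "na"]}


def strict_lookup(cfg):
    # Direct indexing raises KeyError as soon as the expression is
    # evaluated, even if the rule that uses it would never be run.
    return cfg["tree"]["clock_rate"]


def tolerant_lookup(cfg, default=None):
    # Chained .get() calls allow the parameter to be missing.
    return cfg.get("tree", {}).get("clock_rate", default)


try:
    strict_lookup(config)
    strict_failed = False
except KeyError:
    strict_failed = True

tolerant_value = tolerant_lookup(config)
```

Another option, under the same assumptions, is to wrap the lookup in a function such as `lambda wildcards: ...`, which defers evaluation until a job for that rule is actually being prepared.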

Commits on Mar 29, 2024

  1. Set Nextclade threads to a factor of 36

    Reduce threads requested for Nextclade runs from 16 to 12 so we can run 3 Nextclade jobs at once (one per lineage) on a 36-core instance of AWS Batch.
    huddlej authored Mar 29, 2024
    94cdfe8
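
The arithmetic behind the thread change in 94cdfe8 can be checked with a short sketch. It assumes the scheduler packs jobs onto the instance purely by their declared thread counts; the function name is hypothetical.

```python
def max_concurrent_jobs(total_cores: int, threads_per_job: int) -> int:
    # Number of jobs that fit at once when each job reserves
    # threads_per_job cores out of total_cores.
    return total_cores // threads_per_job


cores = 36  # instance size named in the commit message

jobs_at_16 = max_concurrent_jobs(cores, 16)  # previous setting
jobs_at_12 = max_concurrent_jobs(cores, 12)  # new setting

# Cores left unused under each setting.
idle_at_16 = cores - jobs_at_16 * 16
idle_at_12 = cores - jobs_at_12 * 12
```

With 16 threads per job only two jobs fit and four cores sit idle, whereas 12 threads per job fills all 36 cores with three jobs, one per lineage.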