Suggested improvements for Metamist analysis IO #146

@EddieLF

CPG Flow could interact more cleanly with Metamist analysis records in two places: during workflow startup, when existing analyses are queried, and after stages complete successfully, when new analyses are written.

Input

  1. Querying for all cram and gvcf analyses upon startup
    I can see the merit in collecting all cram and gvcf paths, but it isn't necessary for every workflow. For example, long-read cohorts and RNAseq cohorts do not need to check for gvcf existence. The query is done here, in _populate_analysis, which runs for every workflow.

  2. Async queries
    We should prefer asynchronous queries to Metamist.

  3. Using enums
    To address "Analysis types in metamist.py not consistent with Metamist" (#128), CPG Flow should query the Metamist enum tables for the allowed values and validate against them. See here for an implementation that queries the enum table and returns the allowed values for each enum, not just the Analysis types.
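
To make the three input suggestions concrete, here is a minimal sketch of how they could fit together: a workflow requests only the analysis types it needs, those types are validated against an allowed set, and the queries run concurrently. Everything named here is an assumption for illustration: `ALLOWED_ANALYSIS_TYPES` stands in for whatever the Metamist enum table returns, and `fetch_analyses` is a hypothetical placeholder, not a Metamist or CPG Flow function.

```python
import asyncio

# Hypothetical stand-in for the values a Metamist enum table would return.
ALLOWED_ANALYSIS_TYPES = frozenset({"cram", "gvcf", "custom"})


def validate_analysis_types(requested, allowed):
    """Return the requested types in order, raising if any is not allowed."""
    unknown = sorted(set(requested) - set(allowed))
    if unknown:
        raise ValueError(f"Unknown analysis types: {unknown}")
    return list(requested)


async def fetch_analyses(analysis_type):
    """Hypothetical placeholder for one asynchronous Metamist query."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"type": analysis_type, "paths": []}


async def populate_analyses(requested, allowed=ALLOWED_ANALYSIS_TYPES):
    """Query only the analysis types this workflow asks for, concurrently."""
    types = validate_analysis_types(requested, allowed)
    results = await asyncio.gather(*(fetch_analyses(t) for t in types))
    return {result["type"]: result["paths"] for result in results}


if __name__ == "__main__":
    # A long-read or RNAseq workflow could request crams only and skip gvcfs.
    print(asyncio.run(populate_analyses(["cram"])))
```

The design choice worth discussing is where the list of requested types comes from (workflow config versus each stage declaring what it needs); the sketch takes no position on that.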

Output

  1. Allow writing of multiple analysis outputs to a single record. See "Allow writing multiple files into a single analysis record" (#126).
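
As a rough illustration of the output suggestion, the sketch below groups several files produced by one stage under a single record, keyed by a short name. The `AnalysisRecord` class, its fields, and `build_record` are hypothetical and do not reflect the Metamist schema or the design proposed in #126; the bucket paths are made up.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisRecord:
    """Hypothetical in-memory shape for one analysis record."""

    analysis_type: str
    sequencing_group_ids: list
    outputs: dict = field(default_factory=dict)  # output name -> file path

    def add_output(self, name, path):
        """Attach one named output file, refusing to overwrite silently."""
        if name in self.outputs:
            raise KeyError(f"Output {name!r} is already set on this record")
        self.outputs[name] = path


def build_record(analysis_type, sequencing_group_ids, stage_outputs):
    """Collect every output of a stage into one record instead of one each."""
    record = AnalysisRecord(analysis_type, list(sequencing_group_ids))
    for name, path in stage_outputs.items():
        record.add_output(name, str(path))
    return record


if __name__ == "__main__":
    record = build_record(
        "custom",
        ["SG1"],
        {
            "vcf": "gs://example-bucket/out.vcf.gz",
            "index": "gs://example-bucket/out.vcf.gz.tbi",
        },
    )
    print(record.outputs)
```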
