List process container images in preview mode #4069

Merged
pditommaso merged 71 commits into master from 3340-preview-container-images
Aug 24, 2023

Conversation

bentsherman
Member

@bentsherman bentsherman commented Jun 30, 2023

Close #3340

Currently, this PR just lists the container image for each process during a preview run. It creates a basic task config for each process and tries to resolve the container directive. Even if the directive is dynamic, as long as it's defined in terms of variables that are defined at the pipeline level (e.g. workflow, ext directive from config), it will work.

If for some reason the container is defined in terms of some task-specific property, the resolve will fail and just print null. But in practice I think this is extremely rare.
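
As a rough illustration of the behaviour described above, here is a toy model in Python. It is not Nextflow's actual Groovy implementation; `resolve_directive`, the context keys and the `example/...` image names are all hypothetical. A dynamic directive is treated as a function of whatever variables exist at preview time:

```python
from typing import Any, Callable, Mapping, Optional, Union

# A directive is either a plain string or a function of the available context.
Directive = Union[str, Callable[[Mapping[str, Any]], str]]


def resolve_directive(directive: Directive, context: Mapping[str, Any]) -> Optional[str]:
    """Resolve a static or dynamic directive against pipeline-level variables.

    Returns None when the directive needs a value missing from the context,
    e.g. a task-specific property that only exists at run time.
    """
    if not callable(directive):
        return directive  # static string, nothing to resolve
    try:
        return directive(context)
    except KeyError:
        return None


# Pipeline-level context only: a preview run has no per-task values.
preview_context = {"workflow": {"profile": "test"}, "ext": {"tag": "1.0"}}

static = "quay.io/nf-core/ubuntu:20.04"
dynamic = lambda ctx: f"example/tool:{ctx['ext']['tag']}"
task_specific = lambda ctx: f"example/tool:{ctx['task']['attempt']}"

for directive in (static, dynamic, task_specific):
    print(resolve_directive(directive, preview_context))
```

The first two directives resolve because they only depend on pipeline-level values; the third looks up a `task` entry that the preview context does not have, so it comes back as `None`.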

Successful tests:

  • rnaseq
  • sarek
  • differentialabundance
  • demultiplex
  • nanoseq
  • scrnaseq

Remaining questions:

  1. Do we want to create a separate download command for this?
  2. What should the output format be? CSV? JSON?

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@ewels
Member

ewels commented Jun 30, 2023

2. What should the output format be? CSV? JSON?

Personally, I quite like the idea of JSON. It's likely that containers could be shared between multiple processes in a lot of pipelines. If we use JSON then we could structure the output so that we have a deduplicated list of container URLs, each with an array of process names that correspond to it.

Having said that, it's not too difficult to deduplicate a CSV for a user. So... maybe both? :trollface:
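
The deduplicated structure suggested above could look something like the following sketch. The record layout (`name` / `container` keys) and the `group_by_container` helper are assumptions for illustration, not a format the PR had settled on at this point:

```python
from collections import defaultdict


def group_by_container(records):
    """Map each container image to the sorted list of processes that use it."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["container"]].append(record["name"])
    # Sort both images and process names so the output is stable.
    return {image: sorted(names) for image, names in sorted(grouped.items())}


# Hypothetical input records, one per process.
records = [
    {"name": "PREPARE_GENOME:GUNZIP_GTF", "container": "nf-core/ubuntu:20.04"},
    {"name": "PREPARE_GENOME:CAT_ADDITIONAL_FASTA", "container": "biocontainers/python:3.9--1"},
    {"name": "PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA", "container": "nf-core/ubuntu:20.04"},
]

for image, processes in group_by_container(records).items():
    print(image, processes)
```

The same grouping works whether the records come from JSON or from a parsed CSV, which is why the choice of output format matters less than it first appears.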

@bentsherman
Member Author

Instead of making a download command, I think we should make it be just another report for the run command, like the workflow diagram. That way, you can generate the container manifest with the exact command line you would use to run the pipeline. Then the downloading could be handled by a Bash script, or a separate Nextflow command that takes the container manifest file as input.

Member

@pditommaso pditommaso left a comment

That's nice! I think at this point it could make sense to make it generic and allow the user to provide the list of directives they want to preview

modules/nextflow/src/main/groovy/nextflow/Session.groovy (two review threads, outdated and resolved)
@pditommaso pditommaso marked this pull request as draft July 4, 2023 16:30
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@bentsherman
Member Author

Okay, I added some config options for preview. Now it will write a JSON file as a report, and you can specify which process directives to preview. Here is a sample of the default preview with nf-core/rnaseq:

{
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF": {
        "container": "nf-core/ubuntu:20.04",
        "cpus": 1,
        "memory": "6 GB",
        "time": "4h"
    },
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA": {
        "container": "nf-core/ubuntu:20.04",
        "cpus": 1,
        "memory": "6 GB",
        "time": "4h"
    },
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CAT_ADDITIONAL_FASTA": {
        "container": "biocontainers/python:3.9--1",
        "cpus": 1,
        "memory": "6 GB",
        "time": "4h"
    },
    // ...
}

@pditommaso if you want to enable it separately, we could add a preview.enabled config option. But that might be confusing with the -preview CLI option. A separate option like -preview-directives could be fine, but also I could see the preview report expanding to include other information. I think it would be fine to just enable it with -preview, as this is the kind of thing users are expecting from the preview mode anyway.

@pditommaso
Member

But that might be confusing with the -preview

What if we call this inspect and have it turn on the preview flag implicitly?

@bentsherman
Member Author

I think that would be more confusing 😆 In that case I would rather have a preview.enabled config option that implicitly enables -preview. That's the best way I can think of to enable this report separately from the existing preview behavior.

cc @ewels @drpatelh for their thoughts

@ewels
Member

ewels commented Jul 6, 2023

I asked @bentsherman to summarise this for me:

  1. -preview CLI option automatically enables the preview report
  2. preview.enabled config option enables the preview report and implicitly enables -preview (but not the other way around)
  3. same as (2), except we called it inspect.enabled (or -inspect CLI option)

For the use cases we had been thinking about, I think a CLI flag makes most sense. This is information that the end user wants in a one-off way (usually for deployment). It's not a report that would normally need to be generated for every run. It's also not really specific to the pipeline, but rather to the user, so it wouldn't make sense to put it in a pipeline config. Sure, you can use a user config, but one-off and user-level spells CLI flag to me.

So my +1 is for either -preview or -inspect, don't really mind which.

@bentsherman
Member Author

I agree that it should be a CLI option. If it's going to be separate from -preview, then I think it should be of the form -preview-*, like -preview-config or -preview-report, so that it's clear that it's an extension of the preview feature rather than a completely separate thing.

Calling it -inspect would draw the same confusion that we get over similar operators like distinct() vs unique(), combine() vs cross() vs join(), collect() vs toList(), etc.

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@bentsherman bentsherman marked this pull request as ready for review July 7, 2023 14:40
@pditommaso
Member

To move forward on this, I think we should focus on the primary use case, which is previewing the pipeline container configuration. To this end, I'd suggest the following:

  1. specialise the implementation for the container setting, instead of arbitrary directives
  2. enable it with the CLI option -preview-containers
  3. make it Wave-aware; this may not be so trivial because it will require a TaskRun instance to invoke this code

@bentsherman
Member Author

Okay, the report is now just containers, and it uses a fake task run to preview the container so that it can be Wave-aware. The report can be JSON or Nextflow config based on the file extension.

nextflow run nf-core/rnaseq -preview-containers preview.json

{
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF": "quay.io/nf-core/ubuntu:20.04",
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA": "quay.io/nf-core/ubuntu:20.04",
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CAT_ADDITIONAL_FASTA": "quay.io/biocontainers/python:3.9--1",
    // ...
}

nextflow run nf-core/rnaseq -preview-containers preview.config

process { withName: 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF' { container = 'quay.io/nf-core/ubuntu:20.04' } }
process { withName: 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA' { container = 'quay.io/nf-core/ubuntu:20.04' } }
process { withName: 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CAT_ADDITIONAL_FASTA' { container = 'quay.io/biocontainers/python:3.9--1' } }
// ...
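
To make the two output shapes above concrete, here is a small Python sketch of choosing the report format from the file extension. It only mirrors the samples shown in this comment; it is not the Groovy code in the PR, and `render_report` is a hypothetical name:

```python
import json
from pathlib import Path


def render_report(containers: dict, filename: str) -> str:
    """Render a process -> container mapping as JSON or Nextflow config text.

    The format is picked from the file extension, as in the samples above.
    """
    suffix = Path(filename).suffix
    if suffix == ".json":
        return json.dumps(containers, indent=4)
    if suffix == ".config":
        # One `withName` selector per process, matching the sample layout.
        return "\n".join(
            f"process {{ withName: '{name}' {{ container = '{image}' }} }}"
            for name, image in containers.items()
        )
    raise ValueError(f"unsupported report extension: {suffix!r}")


containers = {
    "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF": "quay.io/nf-core/ubuntu:20.04",
}

print(render_report(containers, "preview.json"))
print(render_report(containers, "preview.config"))
```

The config form has the practical advantage that it can be fed straight back into a run as an extra config file, while the JSON form is easier for other tools to consume.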

@drpatelh
Contributor

drpatelh commented Aug 1, 2023

This is awesome 🤩

…o/nextflow into 3340-preview-container-images
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@ewels
Member

ewels commented Aug 23, 2023

Nice! This certainly looks like what we want, and I like the separate command 👍🏻

Some quick questions:

  1. Does it work with singularity containers too?
  2. Can you pass parameters / configs to it? (eg. as we use here)
  3. Any intentions to add other things like cpus, memory, time etc as done further up?

It would be great to test some of the larger / more weird nf-core pipelines with it. Ben's initial PR comment had a good list.

@bentsherman
Member Author

  1. Does it work with singularity containers too?

Yes, it is independent of the container runtime so it works with all of them.

  2. Can you pass parameters / configs to it? (eg. as we use here)

You can specify config files and profiles, but not CLI params or a params file. Currently params can only be included through config files.

  3. Any intentions to add other things like cpus, memory, time etc as done further up?

I think that's a great idea...

Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@bentsherman
Member Author

nf-core pipelines appear to be working as before. Paolo didn't change any of the core logic, just the user interface

@pditommaso
Member

Does it work with singularity containers too?

Yes. Above all, we are going to have Singularity native builds via Wave. Local conversion is not going to be needed any more!

Can you pass parameters / configs to it? (eg. as we use here)

Currently only the profile, but adding support for params is straightforward. I'm going to add it.

Any intentions to add other things like cpus, memory, time etc as done further up?

Not in the very short term, but the use of a dedicated command would make it possible to easily extend this functionality.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso
Member

pditommaso commented Aug 23, 2023

Added support for -revision, -params-file and pipeline custom params. Fixed quiet mode. Almost ready. This is the output for rnaseq 👇.

I'm curious about the feedback from @drpatelh. I remember he was working on a python script doing something similar. Wonder if this addresses his problem.

[
    {
        "name": "NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC",
        "container": "quay.io/biocontainers/fastqc:0.11.9--0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDCLIP",
        "container": "quay.io/biocontainers/ucsc-bedclip:377--h0b8a92a_2"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:DESEQ2_QC_STAR_SALMON",
        "container": "quay.io/biocontainers/mulled-v2-8849acf39a43cdd6c839a369a74c0adc823e2f91:ab110436faf952a33575c64dd74615a84011450b-0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE",
        "container": "quay.io/biocontainers/python:3.9--1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:UNTAR_SALMON_INDEX",
        "container": "quay.io/nf-core/ubuntu:20.04"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN",
        "container": "quay.io/biocontainers/mulled-v2-1fa26d1ce03c295fe2fdcf85831a92fbcbd7e8c2:1df389393721fc66f3fd8778ad938ac711951107-0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS",
        "container": "quay.io/biocontainers/multiqc:1.14--pyhdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE",
        "container": "quay.io/biocontainers/trim-galore:0.6.7--hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDGRAPHTOBIGWIG",
        "container": "quay.io/biocontainers/ucsc-bedgraphtobigwig:377--h446ed27_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:CAT_FASTQ",
        "container": "quay.io/nf-core/ubuntu:20.04"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:SAMTOOLS_INDEX",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CUSTOM_GETCHROMSIZES",
        "container": "quay.io/biocontainers/samtools:1.16.1--h6899075_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_TX2GENE",
        "container": "quay.io/biocontainers/python:3.9--1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_QUANT",
        "container": "quay.io/biocontainers/salmon:1.10.1--h7e5ed60_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_GENE_SCALED",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDCLIP",
        "container": "quay.io/biocontainers/ucsc-bedclip:377--h0b8a92a_2"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:DESEQ2_QC_SALMON",
        "container": "quay.io/biocontainers/mulled-v2-8849acf39a43cdd6c839a369a74c0adc823e2f91:ab110436faf952a33575c64dd74615a84011450b-0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF",
        "container": "quay.io/nf-core/ubuntu:20.04"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:DUPRADAR",
        "container": "quay.io/biocontainers/bioconductor-dupradar:1.28.0--r42hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:FQ_SUBSAMPLE",
        "container": "quay.io/biocontainers/fq:0.9.1--h9ee0642_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:STRINGTIE_STRINGTIE",
        "container": "quay.io/biocontainers/stringtie:2.2.1--hecb563c_2"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES",
        "container": "quay.io/biocontainers/picard:3.0.0--hdfd78af_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:SUBREAD_FEATURECOUNTS",
        "container": "quay.io/biocontainers/subread:2.0.1--hed695b0_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_TXIMPORT",
        "container": "quay.io/biocontainers/bioconductor-tximeta:1.12.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BEDTOOLS_GENOMECOV",
        "container": "quay.io/biocontainers/bedtools:2.30.0--hc088bd4_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INNERDISTANCE",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA",
        "container": "quay.io/nf-core/ubuntu:20.04"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_QUANT",
        "container": "quay.io/biocontainers/salmon:1.10.1--h7e5ed60_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_TXIMPORT",
        "container": "quay.io/biocontainers/bioconductor-tximeta:1.12.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_GENE",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_INDEX",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_TX2GENE",
        "container": "quay.io/biocontainers/python:3.9--1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_BAMSTAT",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDUPLICATION",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BBMAP_BBSPLIT",
        "container": "quay.io/biocontainers/bbmap:39.01--h5c4e2a8_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:INPUT_CHECK:SAMPLESHEET_CHECK",
        "container": "quay.io/biocontainers/python:3.9--1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_SE_GENE",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:BBMAP_BBSPLIT",
        "container": "quay.io/biocontainers/bbmap:39.01--h5c4e2a8_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_GENE_LENGTH_SCALED",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUALIMAP_RNASEQ",
        "container": "quay.io/biocontainers/qualimap:2.2.2d--1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDISTRIBUTION",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_SE_GENE_SCALED",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT",
        "container": "quay.io/biocontainers/salmon:1.10.1--h7e5ed60_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:STAR_GENOMEGENERATE",
        "container": "quay.io/biocontainers/mulled-v2-1fa26d1ce03c295fe2fdcf85831a92fbcbd7e8c2:1df389393721fc66f3fd8778ad938ac711951107-0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF2BED",
        "container": "quay.io/biocontainers/perl:5.26.2"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INFEREXPERIMENT",
        "container": "quay.io/biocontainers/rseqc:3.0.1--py37h516909a_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:MULTIQC",
        "container": "quay.io/biocontainers/multiqc:1.14--pyhdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_TRANSCRIPT",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CAT_ADDITIONAL_FASTA",
        "container": "quay.io/biocontainers/python:3.9--1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT",
        "container": "quay.io/biocontainers/samtools:1.17--h00cdaf9_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_SE_GENE_LENGTH_SCALED",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDGRAPHTOBIGWIG",
        "container": "quay.io/biocontainers/ucsc-bedgraphtobigwig:377--h446ed27_1"
    },
    {
        "name": "NFCORE_RNASEQ:RNASEQ:QUANTIFY_SALMON:SALMON_SE_TRANSCRIPT",
        "container": "quay.io/biocontainers/bioconductor-summarizedexperiment:1.24.0--r41hdfd78af_0"
    }
]

pditommaso and others added 2 commits August 23, 2023 22:49
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@ewels
Member

ewels commented Aug 23, 2023

I remember he was working on a python script doing something similar.

Yup - there are several. nf-core/tools is a python script that does it, and I'm aware of a few others. The problem is that none of them really understand the code and mostly work by just trying to parse the text strings. So whenever we change anything (such as the recent adoption of docker.registry, but there are others), everything breaks 💥 It's also very fragile to whitespace / other insignificant syntactic differences (eg. types of quote marks).
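
The fragility described above is easy to reproduce with a deliberately naive scraper. The sketch below is hypothetical and is not how nf-core/tools or any other real script works; the pattern, the `scrape_container` helper and the `params.registry` variable are made up for illustration:

```python
import re

# Hypothetical scraper: only matches a double-quoted literal after `container`.
NAIVE_PATTERN = re.compile(r'container\s+"([^"]+)"')


def scrape_container(module_source: str):
    """Return the first double-quoted container string, or None if not found."""
    match = NAIVE_PATTERN.search(module_source)
    return match.group(1) if match else None


double_quoted = 'process FOO {\n    container "quay.io/biocontainers/fastqc:0.11.9--0"\n}'
single_quoted = "process FOO {\n    container 'quay.io/biocontainers/fastqc:0.11.9--0'\n}"
dynamic = 'process FOO {\n    container "${params.registry}/fastqc:0.11.9--0"\n}'

print(scrape_container(double_quoted))  # the literal image name
print(scrape_container(single_quoted))  # None: the quote style changed
print(scrape_container(dynamic))        # unresolved "${params.registry}/..." text
```

A text-based approach either misses the directive entirely or returns a string that still contains unevaluated variables, whereas resolving the directive inside Nextflow sidesteps both problems.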

Wonder if this addresses his problem.

Hopefully! That was the motivation for the issue and this resulting PR. It's certainly looking that way, I'm happy! 😅

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso
Member

Made a few changes:

  1. Added a prompt confirmation when Wave + freeze mode is enabled, to prevent an unwanted massive build operation

  2. Added a few more tests

  3. Restructured the JSON as below:

            {
                "processes": [
                    {
                        "name": "proc2",
                        "container": "container2"
                    },
                    {
                        "name": "proc1",
                        "container": "container1"
                    }
                ]
            }

Think it's ready
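
For the download use case that motivated this PR, a consumer of the restructured JSON above might look like the following Python sketch. It assumes the `processes` / `name` / `container` layout shown in this comment and Docker as the container engine; the helper names and the extra `proc3` entry are hypothetical:

```python
import json
import shlex


def unique_images(report_text: str) -> list:
    """Return the sorted, deduplicated container images from a report."""
    report = json.loads(report_text)
    return sorted({proc["container"] for proc in report["processes"]})


def pull_commands(images) -> list:
    """Build one `docker pull` command line per image (not executed here)."""
    return [f"docker pull {shlex.quote(image)}" for image in images]


# Same shape as the sample above, plus a hypothetical proc3 sharing an image.
report_text = """
{
    "processes": [
        {"name": "proc2", "container": "container2"},
        {"name": "proc1", "container": "container1"},
        {"name": "proc3", "container": "container1"}
    ]
}
"""

for command in pull_commands(unique_images(report_text)):
    print(command)
```

The commands are only printed, so the output can be reviewed or redirected into a script before anything is actually pulled.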

@pditommaso
Member

Thinking more about this, I'm not super convinced about adding a prompt confirmation when using Wave to build the container images.

A better solution could be to perform a dry-run request to Wave by default (this does not exist yet), and submit a real build request only when a specific option is provided.

pditommaso and others added 2 commits August 24, 2023 15:17
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
@pditommaso
Member

Thanks Ben for updating the docs. Let's merge this

@pditommaso pditommaso merged commit 090c31c into master Aug 24, 2023
19 checks passed
@pditommaso pditommaso deleted the 3340-preview-container-images branch August 24, 2023 14:26
@drpatelh
Contributor

Sorry, just catching up with this. Thank you 🙏

It is indeed the perfect replacement for the custom Python scripts we have cobbled together to scrape container information.

@pditommaso
Member

Excellent. I need to show you guys the native build for singularity containers. It should help you a lot.

@pditommaso pditommaso added this to the 23.10.0 milestone Sep 10, 2023
abhi18av pushed a commit to abhi18av/nextflow that referenced this pull request Oct 28, 2023
This commit introduces a new nextflow command named `inspect`. 

The inspect command allows resolving a pipeline script or project reporting 
all container images used by the pipeline execution. 

The main advantage of this command over the existing `config` command is that
it's able to resolve container names defined "dynamically" or Wave containers that 
are only determined at execution time. 

The command option `-concretise` when used along with the Wave freeze option 
allows building ahead all the container images required by the pipeline execution.  


Signed-off-by: Ben Sherman <bentshermann@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Development

Successfully merging this pull request may close these issues.

New command/option to download containers for offline usage
8 participants