diff --git a/.editorconfig b/.editorconfig index dd9ffa53..e1058815 100644 --- a/.editorconfig +++ b/.editorconfig @@ -11,6 +11,7 @@ indent_style = space [*.{md,yml,yaml,html,css,scss,js}] indent_size = 2 + # These files are edited and tested upstream in nf-core/modules [/modules/nf-core/**] charset = unset @@ -25,13 +26,12 @@ insert_final_newline = unset trim_trailing_whitespace = unset indent_style = unset + + [/assets/email*] indent_size = unset -# ignore Readme -[README.md] -indent_style = unset -# ignore python +# ignore python and markdown [*.{py,md}] indent_style = unset diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index 0779fb9b..09ba835d 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -19,7 +19,7 @@ If you'd like to write some code for nf-core/smrnaseq, the standard workflow is 1. Check that there isn't already an issue about your idea in the [nf-core/smrnaseq issues](https://github.com/nf-core/smrnaseq/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this 2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/smrnaseq repository](https://github.com/nf-core/smrnaseq) to your GitHub account 3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions) -4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). +4. Use `nf-core pipelines schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). 5. 
Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/). @@ -40,7 +40,7 @@ There are typically two types of tests that run: ### Lint tests `nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to. -To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint ` command. +To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core pipelines lint ` command. If any failures or warnings are encountered, please follow the listed URL for more documentation. @@ -75,7 +75,7 @@ If you wish to contribute a new step, please use the following coding standards: 2. Write the process block (see below). 3. Define the output channel if needed (see below). 4. Add any new parameters to `nextflow.config` with a default (see below). -5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core schema build` tool). +5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core pipelines schema build` tool). 6. Add sanity checks and validation for all relevant parameters. 7. Perform local tests to validate that the new code works as expected. 8. If applicable, add a new test command in `.github/workflow/ci.yml`. 
@@ -86,11 +86,11 @@ If you wish to contribute a new step, please use the following coding standards: Parameters should be initialised / defined with default values in `nextflow.config` under the `params` scope. -Once there, use `nf-core schema build` to add to `nextflow_schema.json`. +Once there, use `nf-core pipelines schema build` to add to `nextflow_schema.json`. ### Default processes resource requirements -Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/master/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. +Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. The process resources can be passed on to the tool dynamically within the process with the `${task.cpus}` and `${task.memory}` variables in the `script:` block. 
@@ -103,7 +103,7 @@ Please use the following naming schemes, to make it easy to understand what is g ### Nextflow version bumping -If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core bump-version --nextflow . [min-nf-version]` +If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core pipelines bump-version --nextflow . [min-nf-version]` ### Images and figures diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index ef59ff45..2544ad43 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -17,8 +17,8 @@ Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/smrn - [ ] If you've fixed a bug or added code that should be tested, add tests! - [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/smrnaseq/tree/master/.github/CONTRIBUTING.md) - [ ] If necessary, also make a PR on the nf-core/smrnaseq _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository. -- [ ] Make sure your code lints (`nf-core lint`). -- [ ] Ensure the test suite passes (`nf-test test main.nf.test -profile test,docker`). +- [ ] Make sure your code lints (`nf-core pipelines lint`). +- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir `). - [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir `). - [ ] Usage Documentation in `docs/usage.md` is updated. - [ ] Output Documentation in `docs/output.md` is updated. 
diff --git a/.github/workflows/awsfulltest.yml b/.github/workflows/awsfulltest.yml index 99c96e77..36fdbad6 100644 --- a/.github/workflows/awsfulltest.yml +++ b/.github/workflows/awsfulltest.yml @@ -1,19 +1,36 @@ name: nf-core AWS full size tests -# This workflow is triggered on published releases. +# This workflow is triggered on PRs opened against the master branch. # It can be additionally triggered manually with GitHub actions workflow dispatch button. # It runs the -profile 'test_full' on AWS batch on: - release: - types: [published] + pull_request: + branches: + - master workflow_dispatch: + pull_request_review: + types: [submitted] + jobs: - run-tower: + run-platform: name: Run AWS full tests - if: github.repository == 'nf-core/smrnaseq' + # run only if the PR is approved by at least 2 reviewers and against the master branch or manually triggered + if: github.repository == 'nf-core/smrnaseq' && github.event.review.state == 'approved' && github.event.pull_request.base.ref == 'master' || github.event_name == 'workflow_dispatch' runs-on: ubuntu-latest steps: - - name: Launch workflow via tower + - uses: octokit/request-action@v2.x + id: check_approvals + with: + route: GET /repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}/reviews + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + - id: test_variables + if: github.event_name != 'workflow_dispatch' + run: | + JSON_RESPONSE='${{ steps.check_approvals.outputs.data }}' + CURRENT_APPROVALS_COUNT=$(echo $JSON_RESPONSE | jq -c '[.[] | select(.state | contains("APPROVED")) ] | length') + test $CURRENT_APPROVALS_COUNT -ge 2 || exit 1 # At least 2 approvals are required + - name: Launch workflow via Seqera Platform uses: seqeralabs/action-tower-launch@v2 # Add full size test data (but still relatively small datasets for few samples) # on the `test_full.config` test runs with only one set of parameters @@ -32,7 +49,7 @@ jobs: - uses: actions/upload-artifact@v4 with: - name: Tower debug log 
file + name: Seqera Platform debug log file path: | - tower_action_*.log - tower_action_*.json + seqera_platform_action_*.log + seqera_platform_action_*.json diff --git a/.github/workflows/awstest.yml b/.github/workflows/awstest.yml index 5386cbc0..398ec4cc 100644 --- a/.github/workflows/awstest.yml +++ b/.github/workflows/awstest.yml @@ -5,13 +5,13 @@ name: nf-core AWS test on: workflow_dispatch: jobs: - run-tower: + run-platform: name: Run AWS tests if: github.repository == 'nf-core/smrnaseq' runs-on: ubuntu-latest steps: - # Launch workflow using Tower CLI tool action - - name: Launch workflow via tower + # Launch workflow using Seqera Platform CLI tool action + - name: Launch workflow via Seqera Platform uses: seqeralabs/action-tower-launch@v2 with: workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }} @@ -23,11 +23,11 @@ jobs: { "outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/smrnaseq/results-test-${{ github.sha }}" } - profiles: test + profiles: test,illumina - uses: actions/upload-artifact@v4 with: - name: Tower debug log file + name: Seqera Platform debug log file path: | - tower_action_*.log - tower_action_*.json + seqera_platform_action_*.log + seqera_platform_action_*.json diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index cdadbf16..dc67eea0 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -4,12 +4,23 @@ on: push: branches: - dev + - master pull_request: + branches: + - dev + - master release: types: [published] + workflow_dispatch: env: NXF_ANSI_LOG: false + NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity + NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity + NFT_VER: "0.9.0" + NFT_WORKDIR: "~" + NFT_DIFF: "pdiff" + NFT_DIFF_ARGS: "--line-numbers --expand-tabs=2" concurrency: group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}" @@ -17,32 +28,56 @@ concurrency: jobs: test: - name: Run pipeline with test data + name: "Run pipeline with test data (${{ 
matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }})" # Only run on push if this is the nf-core dev branch (merged PRs) if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/smrnaseq') }}" runs-on: ubuntu-latest strategy: + fail-fast: false matrix: + shard: [1, 2, 3, 4] NXF_VER: - - "23.04.0" - - "latest-everything" - profile: - - "test" - - "test_no_genome" - - "test_umi" - - "test_index" + - "24.04.2" + profile: ["docker"] + env: + SHARDS: "4" steps: - name: Check out pipeline code - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4 + uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + with: + fetch-depth: 0 + + - uses: actions/setup-python@v4 + with: + python-version: "3.11" + architecture: "x64" - - name: Install Nextflow - uses: nf-core/setup-nextflow@v1 + - name: Install pdiff to see diff between nf-test snapshots + run: | + python -m pip install --upgrade pip + pip install pdiff + + - uses: nf-core/setup-nextflow@v2 with: version: "${{ matrix.NXF_VER }}" - - name: Disk space cleanup - uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 + - uses: nf-core/setup-nf-test@v1 + with: + version: ${{ env.NFT_VER }} - - name: Run pipeline with test data + - name: Run Tests (Shard ${{ matrix.shard }}/${{ env.SHARDS }}) run: | - nextflow run ${GITHUB_WORKSPACE} -profile ${{ matrix.profile }},docker --outdir ./results + nf-test test \ + --ci \ + --shard ${{ matrix.shard }}/${{ env.SHARDS }} \ + --changed-since HEAD^ \ + --profile "+${{ matrix.profile }},ci" \ + --filter pipeline \ + --junitxml=test.xml + + - name: Publish Test Report + uses: mikepenz/action-junit-report@v3 + if: always() # always run even if the previous step fails + with: + report_paths: test.xml + annotate_only: true diff --git a/.github/workflows/download_pipeline.yml b/.github/workflows/download_pipeline.yml index 08622fd5..1552cf2e 100644 --- 
a/.github/workflows/download_pipeline.yml +++ b/.github/workflows/download_pipeline.yml @@ -1,4 +1,4 @@ -name: Test successful pipeline download with 'nf-core download' +name: Test successful pipeline download with 'nf-core pipelines download' # Run the workflow when: # - dispatched manually @@ -8,12 +8,14 @@ on: workflow_dispatch: inputs: testbranch: - description: "The specific branch you wish to utilize for the test execution of nf-core download." + description: "The specific branch you wish to utilize for the test execution of nf-core pipelines download." required: true default: "dev" pull_request: types: - opened + - edited + - synchronize branches: - master pull_request_target: @@ -28,15 +30,20 @@ jobs: runs-on: ubuntu-latest steps: - name: Install Nextflow - uses: nf-core/setup-nextflow@v1 + uses: nf-core/setup-nextflow@v2 - - uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c # v5 + - name: Disk space cleanup + uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 + + - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 with: - python-version: "3.11" + python-version: "3.12" architecture: "x64" - - uses: eWaterCycle/setup-singularity@931d4e31109e875b13309ae1d07c70ca8fbc8537 # v7 + + - name: Setup Apptainer + uses: eWaterCycle/setup-apptainer@4bb22c52d4f63406c49e94c804632975787312b3 # v2.0.0 with: - singularity-version: 3.8.3 + apptainer-version: 1.3.4 - name: Install dependencies run: | @@ -49,24 +56,64 @@ jobs: echo "REPOTITLE_LOWERCASE=$(basename ${GITHUB_REPOSITORY,,})" >> ${GITHUB_ENV} echo "REPO_BRANCH=${{ github.event.inputs.testbranch || 'dev' }}" >> ${GITHUB_ENV} + - name: Make a cache directory for the container images + run: | + mkdir -p ./singularity_container_images + - name: Download the pipeline env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images run: | - nf-core download ${{ env.REPO_LOWERCASE }} \ + nf-core pipelines download ${{ 
env.REPO_LOWERCASE }} \ --revision ${{ env.REPO_BRANCH }} \ --outdir ./${{ env.REPOTITLE_LOWERCASE }} \ --compress "none" \ --container-system 'singularity' \ - --container-library "quay.io" -l "docker.io" -l "ghcr.io" \ + --container-library "quay.io" -l "docker.io" -l "community.wave.seqera.io" \ --container-cache-utilisation 'amend' \ - --download-configuration + --download-configuration 'yes' - name: Inspect download run: tree ./${{ env.REPOTITLE_LOWERCASE }} - - name: Run the downloaded pipeline + - name: Count the downloaded number of container images + id: count_initial + run: | + image_count=$(ls -1 ./singularity_container_images | wc -l | xargs) + echo "Initial container image count: $image_count" + echo "IMAGE_COUNT_INITIAL=$image_count" >> ${GITHUB_ENV} + + - name: Run the downloaded pipeline (stub) + id: stub_run_pipeline + continue-on-error: true env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images NXF_SINGULARITY_HOME_MOUNT: true - run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -stub -profile test,singularity --outdir ./results + run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -stub -profile test,singularity,illumina --outdir ./results + - name: Run the downloaded pipeline (stub run not supported) + id: run_pipeline + if: ${{ job.steps.stub_run_pipeline.status == failure() }} + env: + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images + NXF_SINGULARITY_HOME_MOUNT: true + run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -profile test,singularity --outdir ./results + + - name: Count the downloaded number of container images + id: count_afterwards + run: | + image_count=$(ls -1 ./singularity_container_images | wc -l | xargs) + echo "Post-pipeline run container image count: $image_count" + echo "IMAGE_COUNT_AFTER=$image_count" >> ${GITHUB_ENV} + + - name: Compare 
container image counts + run: | + if [ "${{ env.IMAGE_COUNT_INITIAL }}" -ne "${{ env.IMAGE_COUNT_AFTER }}" ]; then + initial_count=${{ env.IMAGE_COUNT_INITIAL }} + final_count=${{ env.IMAGE_COUNT_AFTER }} + difference=$((final_count - initial_count)) + echo "$difference additional container images were \n downloaded at runtime . The pipeline has no support for offline runs!" + tree ./singularity_container_images + exit 1 + else + echo "The pipeline can be downloaded successfully!" + fi diff --git a/.github/workflows/fix-linting.yml b/.github/workflows/fix-linting.yml index 56151e57..5dbcd658 100644 --- a/.github/workflows/fix-linting.yml +++ b/.github/workflows/fix-linting.yml @@ -13,7 +13,7 @@ jobs: runs-on: ubuntu-latest steps: # Use the @nf-core-bot token to check out so we can push later - - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4 + - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 with: token: ${{ secrets.nf_core_bot_auth_token }} @@ -32,9 +32,9 @@ jobs: GITHUB_TOKEN: ${{ secrets.nf_core_bot_auth_token }} # Install and run pre-commit - - uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c # v5 + - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 with: - python-version: 3.11 + python-version: "3.12" - name: Install pre-commit run: pip install pre-commit diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml index 073e1876..a502573c 100644 --- a/.github/workflows/linting.yml +++ b/.github/workflows/linting.yml @@ -1,6 +1,6 @@ name: nf-core linting # This workflow is triggered on pushes and PRs to the repository. -# It runs the `nf-core lint` and markdown lint tests to ensure +# It runs the `nf-core pipelines lint` and markdown lint tests to ensure # that the code meets the nf-core guidelines. 
on: push: @@ -14,13 +14,12 @@ jobs: pre-commit: runs-on: ubuntu-latest steps: - - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4 + - uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 - - name: Set up Python 3.11 - uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c # v5 + - name: Set up Python 3.12 + uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 with: - python-version: 3.11 - cache: "pip" + python-version: "3.12" - name: Install pre-commit run: pip install pre-commit @@ -32,27 +31,42 @@ jobs: runs-on: ubuntu-latest steps: - name: Check out pipeline code - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4 + uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 - name: Install Nextflow - uses: nf-core/setup-nextflow@v1 + uses: nf-core/setup-nextflow@v2 - - uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c # v5 + - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 with: - python-version: "3.11" + python-version: "3.12" architecture: "x64" + - name: read .nf-core.yml + uses: pietrobolcato/action-read-yaml@1.1.0 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + - name: Install dependencies run: | python -m pip install --upgrade pip - pip install nf-core + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + + - name: Run nf-core pipelines lint + if: ${{ github.base_ref != 'master' }} + env: + GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }} + run: nf-core -l lint_log.txt pipelines lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md - - name: Run nf-core lint + - name: Run nf-core pipelines lint --release + if: ${{ github.base_ref == 'master' }} env: GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} GITHUB_TOKEN: ${{ 
secrets.GITHUB_TOKEN }} GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }} - run: nf-core -l lint_log.txt lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md + run: nf-core -l lint_log.txt pipelines lint --release --dir ${GITHUB_WORKSPACE} --markdown lint_results.md - name: Save PR number if: ${{ always() }} @@ -60,7 +74,7 @@ jobs: - name: Upload linting log file artifact if: ${{ always() }} - uses: actions/upload-artifact@5d5d22a31266ced268874388b861e4b58bb5c2f3 # v4 + uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4 with: name: linting-logs path: | diff --git a/.github/workflows/linting_comment.yml b/.github/workflows/linting_comment.yml index b706875f..42e519bf 100644 --- a/.github/workflows/linting_comment.yml +++ b/.github/workflows/linting_comment.yml @@ -11,7 +11,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Download lint results - uses: dawidd6/action-download-artifact@f6b0bace624032e30a85a8fd9c1a7f8f611f5737 # v3 + uses: dawidd6/action-download-artifact@bf251b5aa9c2f7eeb574a96ee720e24f801b7c11 # v6 with: workflow: linting.yml workflow_conclusion: completed diff --git a/.github/workflows/release-announcements.yml b/.github/workflows/release-announcements.yml index d468aeaa..c6ba35df 100644 --- a/.github/workflows/release-announcements.yml +++ b/.github/workflows/release-announcements.yml @@ -12,7 +12,7 @@ jobs: - name: get topics and convert to hashtags id: get_topics run: | - curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ' >> $GITHUB_OUTPUT + echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" | sed 's/-//g' >> $GITHUB_OUTPUT - uses: rzr/fediverse-action@master with: @@ -25,13 +25,13 @@ jobs: Please see the changelog: ${{ github.event.release.html_url }} - ${{ 
steps.get_topics.outputs.GITHUB_OUTPUT }} #nfcore #openscience #nextflow #bioinformatics + ${{ steps.get_topics.outputs.topics }} #nfcore #openscience #nextflow #bioinformatics send-tweet: runs-on: ubuntu-latest steps: - - uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c # v5 + - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 with: python-version: "3.10" - name: Install dependencies diff --git a/.github/workflows/template_version_comment.yml b/.github/workflows/template_version_comment.yml new file mode 100644 index 00000000..e8aafe44 --- /dev/null +++ b/.github/workflows/template_version_comment.yml @@ -0,0 +1,46 @@ +name: nf-core template version comment +# This workflow is triggered on PRs to check if the pipeline template version matches the latest nf-core version. +# It posts a comment to the PR, even if it comes from a fork. + +on: pull_request_target + +jobs: + template_version: + runs-on: ubuntu-latest + steps: + - name: Check out pipeline code + uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + with: + ref: ${{ github.event.pull_request.head.sha }} + + - name: Read template version from .nf-core.yml + uses: nichmor/minimal-read-yaml@v0.0.2 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + + - name: Install nf-core + run: | + python -m pip install --upgrade pip + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + + - name: Check nf-core outdated + id: nf_core_outdated + run: echo "OUTPUT=$(pip list --outdated | grep nf-core)" >> ${GITHUB_ENV} + + - name: Post nf-core template version comment + uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2 + if: | + contains(env.OUTPUT, 'nf-core') + with: + repo-token: ${{ secrets.NF_CORE_BOT_AUTH_TOKEN }} + allow-repeats: false + message: | + > [!WARNING] + > Newer version of the nf-core template is available. 
+ > + > Your pipeline is using an old version of the nf-core template: ${{ steps.read_yml.outputs['nf_core_version'] }}. + > Please update your pipeline to the latest version. + > + > For more documentation on how to update your pipeline, please see the [nf-core documentation](https://github.com/nf-core/tools?tab=readme-ov-file#sync-a-pipeline-with-the-template) and [Synchronisation documentation](https://nf-co.re/docs/contributing/sync). + # diff --git a/.gitignore b/.gitignore index 4109b5c9..24fc8b91 100644 --- a/.gitignore +++ b/.gitignore @@ -6,4 +6,6 @@ results/ testing/ testing* *.pyc +null/ execution_trace* +.nf-test* diff --git a/.gitpod.yml b/.gitpod.yml index 105a1821..46118637 100644 --- a/.gitpod.yml +++ b/.gitpod.yml @@ -4,17 +4,14 @@ tasks: command: | pre-commit install --install-hooks nextflow self-update - - name: unset JAVA_TOOL_OPTIONS - command: | - unset JAVA_TOOL_OPTIONS vscode: extensions: # based on nf-core.nf-core-extensionpack - - esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code + #- esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code - EditorConfig.EditorConfig # override user/workspace settings with settings found in .editorconfig files - Gruntfuggly.todo-tree # Display TODO and FIXME in a tree view in the activity bar - mechatroner.rainbow-csv # Highlight columns in csv files in different colors - # - nextflow.nextflow # Nextflow syntax highlighting + - nextflow.nextflow # Nextflow syntax highlighting - oderwat.indent-rainbow # Highlight indentation level - streetsidesoftware.code-spell-checker # Spelling checker for source code - charliermarsh.ruff # Code linter Ruff diff --git a/.nf-core.yml b/.nf-core.yml index 8a74623b..98e8cdf5 100644 --- a/.nf-core.yml +++ b/.nf-core.yml @@ -1,5 +1,19 @@ -repository_type: pipeline +bump_version: null lint: nextflow_config: - config_defaults: - params.fastp_known_mirna_adapters +nf_core_version: 3.0.2 
+org_path: null
+repository_type: pipeline
+template:
+  author: "P. Ewels, C. Wang, R. Hammar\xE9n, L. Pantano, A. Peltzer"
+  description: Small RNA-Seq Best Practice Analysis Pipeline.
+  force: false
+  is_nfcore: true
+  name: smrnaseq
+  org: nf-core
+  outdir: .
+  skip_features: null
+  version: 2.4.0
+update: null
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index af57081f..9e9f0e1c 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -3,8 +3,11 @@ repos:
     rev: "v3.1.0"
     hooks:
       - id: prettier
+        additional_dependencies:
+          - prettier@3.2.5
+
   - repo: https://github.com/editorconfig-checker/editorconfig-checker.python
-    rev: "2.7.3"
+    rev: "3.0.3"
     hooks:
       - id: editorconfig-checker
         alias: ec
diff --git a/.prettierignore b/.prettierignore
index 437d763d..610e5069 100644
--- a/.prettierignore
+++ b/.prettierignore
@@ -1,3 +1,4 @@
+
 email_template.html
 adaptivecard.json
 slackreport.json
diff --git a/CHANGELOG.md b/CHANGELOG.md
index c42d00e3..32568137 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,64 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## v2.4.0 - 2024-10-14 - Navy Iron Boxer
+
+- [[#349]](https://github.com/nf-core/smrnaseq/pull/349) - Fix [MIRTOP_QUANT conda issue](https://github.com/nf-core/smrnaseq/issues/347) - change conda-base to conda-forge channel.
+- [[#350]](https://github.com/nf-core/smrnaseq/pull/350) - Fix [MIRTOP_QUANT conda issue](https://github.com/nf-core/smrnaseq/issues/347) - set python version to 3.7 to fix pysam issue.
+- [[#361]](https://github.com/nf-core/smrnaseq/pull/361) - Fix [[#332]](https://github.com/nf-core/smrnaseq/issues/332) - Fix documentation to use only single-end.
+- [[#364]](https://github.com/nf-core/smrnaseq/pull/364) - Fix [Protocol inheritance issue](https://github.com/nf-core/smrnaseq/issues/351) - fixing protocol inheritance from subworkflow with move to config profile(s) for different protocols.
+- [[#372]](https://github.com/nf-core/smrnaseq/pull/372) - Fix [Plain test profile](https://github.com/nf-core/smrnaseq/issues/371) - Updated default protocol value to "custom".
+- [[#374]](https://github.com/nf-core/smrnaseq/pull/374) - Fix [default tests](https://github.com/nf-core/smrnaseq/issues/375) so that they do not require additional profiles in CI. Change GitHub CI fail-fast strategy to false.
+- [[#375]](https://github.com/nf-core/smrnaseq/pull/375) - Test [technical repeats](https://github.com/nf-core/smrnaseq/issues/212) - Test merging of technical repeats.
+- [[#377]](https://github.com/nf-core/smrnaseq/pull/377) - Fix [Linting](https://github.com/nf-core/smrnaseq/issues/369) - Fixed linting warnings and updated modules & subworkflows.
+- [[#378]](https://github.com/nf-core/smrnaseq/pull/378) - Fix [`--mirtrace_species` bug](<(https://github.com/nf-core/smrnaseq/issues/348)>) - Make `MIRTRACE` process conditional. Add mirgenedb test.
+- [[#380]](https://github.com/nf-core/smrnaseq/pull/380) - Fix [edgeR_mirBase.R](https://github.com/nf-core/smrnaseq/issues/187) - Fix checking number of samples which causes error in plotMDS. Add nf-tests for local modules using custom R scripts.
+- [[#381]](https://github.com/nf-core/smrnaseq/pull/381) - Update [Convert tests to nf-tests](https://github.com/nf-core/smrnaseq/issues/379) - CI tests to nf-tests.
+- [[#382]](https://github.com/nf-core/smrnaseq/pull/382) - Add [collapse_mirtop.R](https://github.com/nf-core/smrnaseq/issues/174) - Add nf-tests for local modules using custom R scripts.
+- [[#383]](https://github.com/nf-core/smrnaseq/pull/383) - Fix [parameter `--skip_fastp` throws an error](https://github.com/nf-core/smrnaseq/issues/263) - Fix parameter --skip_fastp.
+- [[#384]](https://github.com/nf-core/smrnaseq/pull/384) - Fix [filter status bug fix](https://github.com/nf-core/smrnaseq/issues/360) - Fix filter stats module and add filter contaminants test profile.
+- [[#386]](https://github.com/nf-core/smrnaseq/pull/386) - Fix [Nextflex trimming support](https://github.com/nf-core/smrnaseq/issues/365) - Fix Nextflex trimming support.
+- [[#387]](https://github.com/nf-core/smrnaseq/pull/387) - Add [contaminant filter failure because the Docker image for BLAT cannot be pulled](https://github.com/nf-core/smrnaseq/issues/354) - Add nf-test to local module `blat_mirna` and fixes . Adds a small test profile to test contaminant filter results.
+- [[#388]](https://github.com/nf-core/smrnaseq/pull/388) - Fix [igenomes fix](https://github.com/nf-core/smrnaseq/issues/360) - Fix workflow scripts so that they can use igenome parameters.
+- [[#391]](https://github.com/nf-core/smrnaseq/pull/391) - Fix [error because of large chromosomes](https://github.com/nf-core/smrnaseq/issues/132) - Change `.bai` index for `.csi` index in `samtools_index` to fix .
+- [[#392]](https://github.com/nf-core/smrnaseq/pull/392) - Update [Reduce tests](https://github.com/nf-core/smrnaseq/issues/389) - Combine and optimize tests, and reduce samplesheets sizes.
+- [[#397]](https://github.com/nf-core/smrnaseq/pull/397) - Fix [contaminant filter failure because of the Docker image for BLAT](https://github.com/nf-core/smrnaseq/issues/354) - Improvements to contaminant filter subworkflow and replacement for nf-core modules.
+- [[#398]](https://github.com/nf-core/smrnaseq/pull/398) - Update [Input channels](https://github.com/nf-core/smrnaseq/issues/390) - Updated channel and params handling through workflows.
+- [[#405]](https://github.com/nf-core/smrnaseq/pull/405) - Fix [Umicollapse algo wrong set](https://github.com/nf-core/smrnaseq/issues/404) - Fix potential bug in Umicollapse (not effective as we do not allow PE data in smrnaseq - but for consistency)
+- [[#420]](https://github.com/nf-core/smrnaseq/pull/420) - Fix [mirTrace produces an error in test nextflex](https://github.com/nf-core/smrnaseq/issues/419) - Allow config mode to be used in mirtrace/qc
+- [[#425]](https://github.com/nf-core/smrnaseq/pull/425) - Raise [minimum required NXF version for pipeline](https://github.com/nf-core/smrnaseq/issues/424) - usage of `arity` in some modules now requires this
+- [[#426]](https://github.com/nf-core/smrnaseq/pull/426) - Add [nf-core mirtop](https://github.com/nf-core/smrnaseq/issues/426) - replace local for nf-core `mirtop`
+- [[#427]](https://github.com/nf-core/smrnaseq/pull/427) - Add [nf-core pigz uncompress](https://github.com/nf-core/smrnaseq/issues/422) - replace local `mirdeep_pigz`
+- [[#429]](https://github.com/nf-core/smrnaseq/pull/429) - Make [saving of intermediate files optional](https://github.com/nf-core/smrnaseq/issues/424) - Allows user to choose whether to save intermediate files or not. Replaces several params that referred to the same such as `params.save_aligned` and `params.save_aligned_mirna_quant`.
+- [[#430]](https://github.com/nf-core/smrnaseq/pull/430) - Emit a [warning if paired-end end data is used](https://github.com/nf-core/smrnaseq/issues/423) - pipeline handles SE data
+- [[#432]](https://github.com/nf-core/smrnaseq/pull/432) - Update [MultiQC and all modules to latest version](https://github.com/nf-core/smrnaseq/issues/428) - Include UMIcollapse module in MultiQC.
+- [[#435]](https://github.com/nf-core/smrnaseq/pull/435) - Replace local instances of bowtie for nf-core [`bowtie2`](https://github.com/nf-core/smrnaseq/issues/434) and [`bowtie1`](https://github.com/nf-core/smrnaseq/issues/433) - Additionally adds a `bioawk` module that cleans fasta files.
+- [[#438]](https://github.com/nf-core/smrnaseq/pull/438) - Update [Mirtop to latest version](https://github.com/nf-core/smrnaseq/issues/437) - Process samples separately and join results with `CSVTK_JOIN`.
+- [[#439]](https://github.com/nf-core/smrnaseq/pull/439) - Fix [Fix paired end samples processing](https://github.com/nf-core/smrnaseq/issues/415) - Fix paired end sample handling and add test profile.
+- [[#441]](https://github.com/nf-core/smrnaseq/pull/441) - Migrate [local contaminant bowtie to nf-core](https://github.com/nf-core/smrnaseq/issues/436) - Replace local processes with `BOWTIE2_ALIGN`.
+- [[#443]](https://github.com/nf-core/smrnaseq/pull/443) - Migrate [mirna and genome_quant bowtie to nf-core](https://github.com/nf-core/smrnaseq/issues/436) - Replace local processes with `BOWTIE_ALIGN`.
+- [[#447]](https://github.com/nf-core/smrnaseq/pull/447) - Fix [Minor fixes and general pipeline cleanup](https://github.com/nf-core/smrnaseq/issues/400) - Update variable and processes names, update channel comments, remove unused modules and params.
+- [[#448]](https://github.com/nf-core/smrnaseq/pull/448) - Migrate local mirdeep to [nf-core mirdeep2 modules and subworkflow](https://github.com/nf-core/smrnaseq/issues/443) and generate [test profile for mirdeep2](https://github.com/nf-core/smrnaseq/issues/399).
+- [[#452]](https://github.com/nf-core/smrnaseq/pull/452) - Fix [Fix ch_bowtie_index channel structure](https://github.com/nf-core/smrnaseq/issues/451) and replace untarfiles with untar [replace untarfiles with untar](https://github.com/nf-core/smrnaseq/issues/449).
+- [[#457]](https://github.com/nf-core/smrnaseq/pull/457) - QC all input [fasta files and clean them](https://github.com/nf-core/smrnaseq/issues/455).
+- [[#459]](https://github.com/nf-core/smrnaseq/pull/459) - Update modules and subworkflows [and fix linting](https://github.com/nf-core/smrnaseq/issues/458).
+- [[#462]](https://github.com/nf-core/smrnaseq/pull/462) - Remove automatic wrapping of fasta files by `seqkit replace`. Minor documentation updates.
+- [[#464]](https://github.com/nf-core/smrnaseq/pull/464) - Added [proper licences and authorship information to scripts in `bin` folder](https://github.com/nf-core/smrnaseq/issues/465)
+
+### Software dependencies
+
+| Dependency | Old version | New version |
+| ---------- | ----------- | ----------- |
+| `bioawk`   | -           | 1.0         |
+| `bowtie`   | 1.3.1       | 1.3.0       |
+| `bowtie2`  | 2.4.5       | 2.5.2       |
+| `csvtk`    | -           | 0.30        |
+| `gawk`     | -           | 5.3.0       |
+| `mirtop`   | 0.4.25      | 0.4.28      |
+| `multiqc`  | 1.21        | 1.25.1      |
+| `samtools` | 1.19.2      | 1.21        |
+| `seqkit`   | 2.6.1       | 2.8.1       |
+
 ## v2.3.1 - 2024-04-18 - Gray Zinc Dalmation Patch
 
 - [[#328]](https://github.com/nf-core/smrnaseq/pull/328) - Fix [casting issue](https://github.com/nf-core/smrnaseq/issues/327) in mirtrace module
diff --git a/README.md b/README.md
index ccb136f0..a546cc1b 100644
--- a/README.md
+++ b/README.md
@@ -9,11 +9,11 @@
 [![GitHub Actions Linting Status](https://github.com/nf-core/smrnaseq/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/smrnaseq/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/smrnaseq/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.10696391?labelColor=000000)](https://doi.org/10.5281/zenodo.10696391)
 [![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)
 
-[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A523.04.0-23aa62.svg)](https://www.nextflow.io/) +[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.2-23aa62.svg)](https://www.nextflow.io/) [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/) [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/) -[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://tower.nf/launch?pipeline=https://github.com/nf-core/smrnaseq) +[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/smrnaseq) [![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23smrnaseq-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/smrnaseq)[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core) @@ -78,7 +78,15 @@ You can find numerous talks on the nf-core events page from various topics inclu > [!NOTE] > If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data. 
-First, prepare a samplesheet with your input data that looks as follows: +You can test the pipeline as follows: + +```bash +nextflow run nf-core/smrnaseq \ + -profile test,docker \ + --outdir +``` + +In order to use the pipeline with your own data, first prepare a samplesheet with your input data that looks as follows: `samplesheet.csv`: @@ -100,17 +108,18 @@ Now, you can run the pipeline using: ```bash nextflow run nf-core/smrnaseq \ - -profile \ + -profile , \ --input samplesheet.csv \ --genome 'GRCh37' \ --mirtrace_species 'hsa' \ - --protocol 'illumina' \ --outdir ``` +> [!IMPORTANT] +> Remember to add a protocol as an additional profile (such as `illumina`, `nexttflex`, `qiaseq` or `cats`) when running with your own data. If no protocol is indicated via -profile, the pipeline will likely fail. Alternatively, if needed to run a custom protocol, parameters must be set manually, and auto-detect feature is available. See [usage documentation](https://nf-co.re/smrnaseq/usage) for more details about these profiles. + > [!WARNING] -> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; -> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files). +> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files). For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/smrnaseq/usage) and the [parameter documentation](https://nf-co.re/smrnaseq/parameters). 
@@ -124,9 +133,14 @@ For more details about the output files and reports, please refer to the nf-core/smrnaseq was originally written by P. Ewels, C. Wang, R. Hammarén, L. Pantano, A. Peltzer. +Lorena Pantano ([@lpantano](https://github.com/lpantano)) from MIT updated the pipeline to Nextflow DSL2. + We thank the following people for their extensive assistance in the development of this pipeline: -Lorena Pantano ([@lpantano](https://github.com/lpantano)) from MIT updated the pipeline to Nextflow DSL2. +- [@atrigila] Anabella Trigila +- [@nschcolnicov] Nicolás Alejandro Schcolnicov +- [@christopher-mohr] Christopher Mohr +- [@grst] Gregor Sturm ## Contributions and Support diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml index 9264f6fa..55973984 100644 --- a/assets/multiqc_config.yml +++ b/assets/multiqc_config.yml @@ -1,8 +1,7 @@ report_comment: > - This report has been generated by the nf-core/smrnaseq + This report has been generated by the nf-core/smrnaseq analysis pipeline. For information about how to interpret these results, please see the - documentation. - + documentation. report_section_order: "nf-core-smrnaseq-methods-description": order: -1000 @@ -31,3 +30,6 @@ module_order: info: "This section of the report shows FastQC results after UMI-based deduplication." 
path_filters: - "**/*.deduplicated_fastqc.zip" +sp: + mirtop: + fn: mirtop_stats.log diff --git a/assets/schema_input.json b/assets/schema_input.json index 892b1996..b5face8a 100644 --- a/assets/schema_input.json +++ b/assets/schema_input.json @@ -1,5 +1,5 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/nf-core/smrnaseq/master/assets/schema_input.json", "title": "nf-core/smrnaseq pipeline - params.input schema", "description": "Schema for the file provided with params.input", diff --git a/bin/collapse_mirtop.r b/bin/collapse_mirtop.r index 6b5c77f1..49c95832 100755 --- a/bin/collapse_mirtop.r +++ b/bin/collapse_mirtop.r @@ -1,4 +1,7 @@ #!/usr/bin/env Rscript + +# Written by Lorena Pantano and released under the MIT license. See LICENSE https://github.com/nf-core/smrnaseq/blob/master/LICENSE for details. + library(data.table) # Command line arguments args = commandArgs(trailingOnly=TRUE) diff --git a/bin/edgeR_miRBase.r b/bin/edgeR_miRBase.r index 5be691fc..5c561437 100755 --- a/bin/edgeR_miRBase.r +++ b/bin/edgeR_miRBase.r @@ -1,5 +1,8 @@ #!/usr/bin/env Rscript +# Originally written by Phil Ewels and Chuan Wang and released under the MIT license. +# Contributions by Alexander Peltzer, Anabella Trigila, James Fellows Yates, Sarah Djebali, Kevin Menden, Konrad Stawinski and Lorena Pantano also released under the MIT license. See LICENSE https://github.com/nf-core/smrnaseq/blob/master/LICENSE for details. 
+ # Command line arguments args = commandArgs(trailingOnly=TRUE) @@ -79,7 +82,7 @@ for (i in 1:2) { } # Make MDS plot (only perform with 3 or more samples) - if (length(filelist[[1]]) > 2){ + if (ncol(dataNorm$counts) > 2){ pdf(paste(header,"_edgeR_MDS_plot.pdf",sep="")) MDSdata <- plotMDS(dataNorm) dev.off() @@ -111,6 +114,8 @@ for (i in 1:2) { # Write clustered distance values to file write.table(hmap$carpet, paste(header,"_log2CPM_sample_distances.txt",sep=""), quote=FALSE, sep="\t") + } else { + warning("Not enough samples to create an MDS plot. At least 3 samples are required.") } } diff --git a/conf/base.config b/conf/base.config index 544ed42d..7d3a72eb 100644 --- a/conf/base.config +++ b/conf/base.config @@ -10,9 +10,9 @@ process { - cpus = { check_max( 1 * task.attempt, 'cpus' ) } - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 4.h * task.attempt, 'time' ) } + cpus = { 1 * task.attempt } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' } maxRetries = 1 @@ -24,30 +24,30 @@ process { // adding in your processes. 
// See https://www.nextflow.io/docs/latest/config.html#config-process-selectors withLabel:process_single { - cpus = { check_max( 1 , 'cpus' ) } - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 4.h * task.attempt, 'time' ) } + cpus = { 1 } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } } withLabel:process_low { - cpus = { check_max( 2 * task.attempt, 'cpus' ) } - memory = { check_max( 12.GB * task.attempt, 'memory' ) } - time = { check_max( 6.h * task.attempt, 'time' ) } + cpus = { 2 * task.attempt } + memory = { 12.GB * task.attempt } + time = { 4.h * task.attempt } } withLabel:process_medium { - cpus = { check_max( 6 * task.attempt, 'cpus' ) } - memory = { check_max( 36.GB * task.attempt, 'memory' ) } - time = { check_max( 8.h * task.attempt, 'time' ) } + cpus = { 6 * task.attempt } + memory = { 36.GB * task.attempt } + time = { 8.h * task.attempt } } withLabel:process_high { - cpus = { check_max( 12 * task.attempt, 'cpus' ) } - memory = { check_max( 72.GB * task.attempt, 'memory' ) } - time = { check_max( 10.h * task.attempt, 'time' ) } + cpus = { 12 * task.attempt } + memory = { 72.GB * task.attempt } + time = { 16.h * task.attempt } } withLabel:process_long { - time = { check_max( 20.h * task.attempt, 'time' ) } + time = { 20.h * task.attempt } } withLabel:process_high_memory { - memory = { check_max( 200.GB * task.attempt, 'memory' ) } + memory = { 200.GB * task.attempt } } withLabel:error_ignore { errorStrategy = 'ignore' @@ -56,7 +56,4 @@ process { errorStrategy = 'retry' maxRetries = 2 } - withName:CUSTOM_DUMPSOFTWAREVERSIONS { - cache = false - } } diff --git a/conf/ci.config b/conf/ci.config new file mode 100644 index 00000000..0c63ba51 --- /dev/null +++ b/conf/ci.config @@ -0,0 +1,13 @@ +// CI max resource settings +process { + withLabel:'.*' { + cpus = 2 + memory = 6.GB + time = 6.h + } + withLabel:process_single { + cpus = 2 + memory = 6.GB + time = 6.h + } +} diff --git a/conf/igenomes_ignored.config 
b/conf/igenomes_ignored.config new file mode 100644 index 00000000..b4034d82 --- /dev/null +++ b/conf/igenomes_ignored.config @@ -0,0 +1,9 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for iGenomes paths +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Empty genomes dictionary to use when igenomes is ignored. +---------------------------------------------------------------------------------------- +*/ + +params.genomes = [:] diff --git a/conf/modules.config b/conf/modules.config index e67745fe..2abe9be6 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -42,6 +42,14 @@ process { ] } + withName: '.*:PREPARE_GENOME:UNTAR_BOWTIE_INDEX' { + publishDir = [ + mode: params.publish_dir_mode, + enabled: params.save_intermediates, + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + // // FASTQ_FASTQC_UMITOOLS_FASTP // @@ -49,7 +57,6 @@ process { ext.args = [ "", params.trim_fastq ? "" : "--disable_adapter_trimming", params.clip_r1 > 0 ? "--trim_front1 ${params.clip_r1}" : "", // Remove bp from the 5' end of read 1. - params.three_prime_clip_r1 > 0 ? "--trim_tail1 ${params.three_prime_clip_r1}" : "", // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed. params.fastp_min_length > 0 ? "-l ${params.fastp_min_length}" : "", params.fastp_max_length > 0 ? "--max_len1 ${params.fastp_max_length}" : "", params.three_prime_adapter == "auto-detect" ? 
"" : "--adapter_sequence ${params.three_prime_adapter}" @@ -70,6 +77,37 @@ process { mode: params.publish_dir_mode, pattern: "*.fail.fastq.gz", enabled: params.save_trimmed_fail + ], + [ + path: { "${params.outdir}/fastp/fastq" }, + mode: params.publish_dir_mode, + pattern: "*.fastp.fastq.gz", + enabled: params.save_merged + ] + ] + } + // + // FASTQ_FASTQC_UMITOOLS_FASTP + // + withName: '.*:FASTP3' { + ext.prefix = { "${meta.id}.fastp3" } + ext.args = [ "", + "--disable_adapter_trimming", + "--disable_quality_filtering", + params.three_prime_clip_r1 > 0 ? "--trim_tail1 ${params.three_prime_clip_r1}" : "", // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed. + params.fastp_min_length > 0 ? "-l ${params.fastp_min_length}" : "", + params.fastp_max_length > 0 ? "--max_len1 ${params.fastp_max_length}" : "", + ].join(" ").trim() + publishDir = [ + [ + path: { "${params.outdir}/fastp/on_raw" }, + mode: params.publish_dir_mode, + pattern: "*.{json,html}" + ], + [ + path: { "${params.outdir}/fastp/on_raw/log" }, + mode: params.publish_dir_mode, + pattern: "*.log" ] ] } @@ -80,6 +118,7 @@ process { publishDir = [ path: { "${params.outdir}/fastqc/raw" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -89,6 +128,7 @@ process { publishDir = [ path: { "${params.outdir}/fastqc/trimmed" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -131,6 +171,18 @@ process { publishDir = [ path: { "${params.outdir}/bowtie_index/genome" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + + withName: 'CLEAN_FASTA' { + ext.args = "-c fastx '{gsub(/[^ATGCatgc]/, \"N\", \$seq); sub(/ .*/, \"\", \$name); print \">\"\$name\"\\n\"\$seq}'" + ext.prefix = {"${meta.id}_clean.fa"} + publishDir = [ + path: { "${params.outdir}/bowtie_index/genome" }, + mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -140,7 +192,7 @@ process { // withName: '.*:UMICOLLAPSE_FASTQ' { - ext.args = { meta.single_end ? "--algo ${params.umitools_method} --two-pass" : "--method ${params.umitools_method} --two-pass --paired --remove-unpaired --remove-chimeric" } + ext.args = { meta.single_end ? "--algo ${params.umitools_method} --two-pass" : "--algo ${params.umitools_method} --two-pass --paired --remove-unpaired --remove-chimeric" } ext.prefix = { "${meta.id}.umi_dedup.sorted" } publishDir = [ path: { "${params.outdir}/umi_dedup/bam_deduplicated" }, @@ -175,10 +227,9 @@ process { // // MIRTRACE QC // - withName: 'MIRTRACE_RUN' { + withName: 'MIRTRACE_QC' { publishDir = [ - //"mirtrace" already part of the published folder - path: { "${params.outdir}" }, + path: { "${params.outdir}/mirtrace/${meta.id}" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] @@ -191,6 +242,81 @@ process { publishDir = [ path: { "${params.outdir}/contaminant_filter/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, mode: params.publish_dir_mode, + enabled: false, + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:BLAT.*' { + ext.args = '-out=blast8' + ext.prefix = {"${meta.id}_${meta2.id}"} + tag = {"${meta.id} ${meta2.id}"} + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:GAWK.*' { + ext.prefix = {"significant_hits_${meta.id}"} + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:SEQKIT_GREP.*' { + ext.prefix = {"filtered_${meta.id}"} + ext.args = '-v' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:BOWTIE2_ALIGN.*' { + ext.args = '--very-sensitive-local -k 1' + ext.prefix = {"${meta.contaminant}_${meta.id}"} + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:STATS_GAWK_RRNA' { + ext.prefix = {"${meta.contaminant}_${meta.id}"} + ext.suffix = "stats" + ext.args2 = '\'BEGIN {tot=0} {if(NR==4 || NR==5){tot+=\$1}} END {print "\\"' + "rRNA" + '\\": " tot}\'' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:STATS_GAWK_TRNA' { + ext.prefix = {"${meta.contaminant}_${meta.id}"} + ext.suffix = "stats" + ext.args2 = '\'BEGIN {tot=0} {if(NR==4 || NR==5){tot+=\$1}} END {print "\\"' + "tRNA" + '\\": " tot}\'' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:STATS_GAWK_CDNA' { + ext.prefix = {"${meta.contaminant}_${meta.id}"} + ext.suffix = "stats" + ext.args2 = '\'BEGIN {tot=0} {if(NR==4 || NR==5){tot+=\$1}} END {print "\\"' + "cDNA" + '\\": " tot}\'' + publishDir = [ enabled: false ] + } + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:STATS_GAWK_NCRNA' { + ext.prefix = {"${meta.contaminant}_${meta.id}"} + ext.suffix = "stats" + ext.args2 = '\'BEGIN {tot=0} {if(NR==4 || NR==5){tot+=\$1}} END {print "\\"' + "ncRNA" + '\\": " tot}\'' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:STATS_GAWK_PIRNA' { + ext.prefix = {"${meta.contaminant}_${meta.id}"} 
+ ext.suffix = "stats" + ext.args2 = '\'BEGIN {tot=0} {if(NR==4 || NR==5){tot+=\$1}} END {print "\\"' + "piRNA" + '\\": " tot}\'' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:STATS_GAWK_OTHER' { + ext.prefix = {"${meta.contaminant}_${meta.id}"} + ext.suffix = "stats" + ext.args2 = '\'BEGIN {tot=0} {if(NR==4 || NR==5){tot+=\$1}} END {print "\\"' + "other" + '\\": " tot}\'' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:CONTAMINANT_FILTER:FILTER_STATS' { + publishDir = [ + path: { "${params.outdir}/contaminant_filter/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }, + mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -202,6 +328,7 @@ process { publishDir = [ path: { "${params.outdir}/mirna_quant/reference" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -209,6 +336,7 @@ process { publishDir = [ path: { "${params.outdir}/mirna_quant/reference" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -216,6 +344,7 @@ process { publishDir = [ path: { "${params.outdir}/bowtie_index/mirna_mature" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -223,15 +352,24 @@ process { publishDir = [ path: { "${params.outdir}/bowtie_index/mirna_hairpin" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } ] } withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BOWTIE_MAP_MATURE' { + ext.args = [ "", + "-t", + "-k 50", + "--best", + "--strata", + "-e 99999", + "--chunkmbs 2048", + ].join(" ").trim() publishDir = [ path: { "${params.outdir}/mirna_quant/bam/mature" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - enabled: params.save_aligned_mirna_quant + enabled: params.save_intermediates ] } withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BAM_STATS_MATURE:.*' { @@ -239,15 +377,24 @@ process { publishDir = [ path: { "${params.outdir}/mirna_quant/bam/mature" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BOWTIE_MAP_HAIRPIN' { + ext.args = [ "", + "-t", + "-k 50", + "--best", + "--strata", + "-e 99999", + "--chunkmbs 2048", + ].join(" ").trim() publishDir = [ path: { "${params.outdir}/mirna_quant/bam/hairpin" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - enabled: params.save_aligned_mirna_quant + enabled: params.save_intermediates ] } withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BAM_STATS_HAIRPIN:.*' { @@ -255,6 +402,7 @@ process { publishDir = [ path: { "${params.outdir}/mirna_quant/bam/hairpin" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } @@ -265,30 +413,73 @@ process { saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } - withName: 'SEQCLUSTER_SEQUENCES' { + withName: 'SEQCLUSTER_COLLAPSE' { publishDir = [ path: { "${params.outdir}/mirna_quant/seqcluster" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } ] + ext.args = "-m 1 --min_size 15" + ext.prefix = {"${meta.id}_seqcluster"} } withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BOWTIE_MAP_SEQCLUSTER' { + ext.args = [ "", + "-t", + "-k 50", + "--best", + "--strata", + "-e 99999", + "--chunkmbs 2048", + ].join(" ").trim() publishDir = [ path: { "${params.outdir}/mirna_quant/bam/seqcluster" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - enabled: params.save_aligned_mirna_quant + enabled: params.save_intermediates ] } - withName: 'MIRTOP_QUANT' { + + + // Mirtop + + withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BAM_STATS_MIRNA_MIRTOP:.*' { publishDir = [ - //mirtop already part of the output folder - path: { "${params.outdir}/mirna_quant/" }, + path: { "${params.outdir}/mirna_quant/mirtop" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } - withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:TABLE_MERGE' { + + withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BAM_STATS_MIRNA_MIRTOP:MIRTOP_COUNTS' { + ext.args = '--add-extra' + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BAM_STATS_MIRNA_MIRTOP:MIRTOP_STATS' { + publishDir = [ enabled: false ] + } + + withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:BAM_STATS_MIRNA_MIRTOP:MIRTOP_GFF' { + publishDir = [ + path: { "${params.outdir}/mirna_quant/mirtop/gff" }, + mode: params.publish_dir_mode, + saveAs: { filename -> filename.equals('versions.yml') ? null : filename } + ] + } + + withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:CSVTK_JOIN' { + ext.args = "--fields 'UID,Read,miRNA,Variant,iso_5p,iso_3p,iso_add3p,iso_snp,iso_5p_nt,iso_3p_nt,iso_add3p_nt,iso_snp_nt' --tabs --outer-join --na \"0\" --out-delimiter \"\t\"" + ext.prefix = "joined_samples_mirtop" + publishDir = [ + path: { "${params.outdir}/mirna_quant/mirtop" }, + mode: params.publish_dir_mode, + saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } + ] + } + + + withName: 'NFCORE_SMRNASEQ:MIRNA_QUANT:DATATABLE_MERGE' { publishDir = [ path: { "${params.outdir}/mirna_quant/mirtop" }, mode: params.publish_dir_mode, @@ -297,6 +488,7 @@ process { } + // // GENOME_QUANT // @@ -305,9 +497,18 @@ process { publishDir = [ path: { "${params.outdir}/genome_quant/bam" }, mode: params.publish_dir_mode, + enabled: params.save_intermediates, saveAs: { filename -> filename.equals('versions.yml') ? null : filename } ] } + + withName: 'SAMTOOLS_INDEX' { + ext.args = '-c' + publishDir = [ + enabled: params.save_intermediates, + ] + } + withName: 'NFCORE_SMRNASEQ:GENOME_QUANT:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:.*' { ext.prefix = { "${meta.id}.sorted" } publishDir = [ @@ -317,11 +518,19 @@ process { ] } withName: 'NFCORE_SMRNASEQ:GENOME_QUANT:BOWTIE_MAP_GENOME' { + ext.args = [ "", + "-t", + "-k 50", + "--best", + "--strata", + "-e 99999", + "--chunkmbs 2048", + ].join(" ").trim() publishDir = [ path: { "${params.outdir}/genome_quant/bam" }, mode: params.publish_dir_mode, saveAs: { filename -> filename.equals('versions.yml') ? null : filename }, - enabled: params.save_aligned + enabled: params.save_intermediates ] } @@ -329,31 +538,30 @@ process { // // MIRDEEP // - withName: 'NFCORE_SMRNASEQ:MIRDEEP2:MIRDEEP2_MAPPER' { - publishDir = [ - path: { "${params.outdir}/mirdeep2/mapper" }, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] + + withName: 'MIRDEEP2_MAPPER' { + ext.args = "-c -j -m -v" + publishDir = [ enabled: false ] } - withName: 'NFCORE_SMRNASEQ:MIRDEEP2:MIRDEEP2_RUN' { - publishDir = [ - path: { "${params.outdir}/mirdeep2/run" }, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] + + withName: 'SEQKIT_REPLACE' { + ext.args = '-p "\\s+|\\." 
-w 0' + ext.suffix = "fasta" + publishDir = [ enabled: false ] + } + + withName: 'SEQKIT_FQ2FA' { + publishDir = [ enabled: false ] + } + + withName: 'MIRDEEP2_MIRDEEP2' { + errorStrategy = { task.exitStatus in (255) ? 'ignore' : '' } } // // reports // - withName: 'CUSTOM_DUMPSOFTWAREVERSIONS' { - publishDir = [ - path: { "${params.outdir}/pipeline_info" }, - mode: params.publish_dir_mode, - pattern: '*_versions.yml' - ] - } + withName: 'MULTIQC' { ext.args = params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' publishDir = [ diff --git a/conf/protocol_cats.config b/conf/protocol_cats.config new file mode 100644 index 00000000..c7e38014 --- /dev/null +++ b/conf/protocol_cats.config @@ -0,0 +1,6 @@ +//This profile handles CATs miRNA defaults. Include it as an additional profile to set certain pipeline parameters appropriately. +params{ + clip_r1 = 3 + three_prime_clip_r1 = 0 + three_prime_adapter = "AAAAAAAA" +} diff --git a/conf/protocol_illumina.config b/conf/protocol_illumina.config new file mode 100644 index 00000000..d86e4e3f --- /dev/null +++ b/conf/protocol_illumina.config @@ -0,0 +1,6 @@ +//This profile handles Illumina miRNA defaults. Include it as an additional profile to set certain pipeline parameters appropriately. +params{ + clip_r1 = 0 + three_prime_clip_r1 = 0 + three_prime_adapter = "TGGAATTCTCGGGTGCCAAGG" +} diff --git a/conf/protocol_nextflex.config b/conf/protocol_nextflex.config new file mode 100644 index 00000000..7992a38f --- /dev/null +++ b/conf/protocol_nextflex.config @@ -0,0 +1,6 @@ +//This profile handles Nextflex miRNA defaults. Include it as an additional profile to set certain pipeline parameters appropriately. 
+params{ + clip_r1 = 4 + three_prime_clip_r1 = 4 + three_prime_adapter = "TGGAATTCTCGGGTGCCAAGG" +} diff --git a/conf/protocol_qiaseq.config b/conf/protocol_qiaseq.config new file mode 100644 index 00000000..da59ac1a --- /dev/null +++ b/conf/protocol_qiaseq.config @@ -0,0 +1,6 @@ +//This profile handles QIASEQ miRNA defaults. Include it as an additional profile to set certain pipeline parameters appropriately. +params{ + clip_r1 = 0 + three_prime_clip_r1 = 0 + three_prime_adapter = "AACTGTAGGCACCATCAAT" +} diff --git a/conf/test.config b/conf/test.config index a56b2e96..b3952ad0 100644 --- a/conf/test.config +++ b/conf/test.config @@ -10,25 +10,30 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'Test profile' config_profile_description = 'Minimal test dataset to check pipeline function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' - // Input data input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet.csv' fasta = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' + bowtie_index = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/bowtie_index.tar.gz' mirtrace_species = 'hsa' - protocol = 'illumina' skip_mirdeep = true save_merged = false - save_aligned_mirna_quant = false - cleanup = true //Otherwise tests dont run through properly. 
} + +// Include illumina config to run test without additional profiles + +includeConfig 'protocol_illumina.config' diff --git a/conf/test_contamination.config b/conf/test_contamination.config new file mode 100644 index 00000000..266c288c --- /dev/null +++ b/conf/test_contamination.config @@ -0,0 +1,35 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. + + Use as follows: + nextflow run nf-core/smrnaseq -profile test_contamination, --outdir + +---------------------------------------------------------------------------------------- +*/ + +params { + config_profile_name = 'Test profile' + config_profile_description = 'Minimal test dataset to check pipeline function with contamination filter' + + // Input data + + input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet.csv' + fasta = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' + + mirtrace_species = 'hsa' + skip_mirdeep = true + save_merged = false + + + filter_contamination = true + cdna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.cdna.all.fa" + ncrna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.ncrna.fa" + trna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/hg19-tRNAs.fa" +} + +// Include illumina config to run test without additional profiles + +includeConfig 'protocol_illumina.config' diff --git a/conf/test_contamination_tech_reps.config b/conf/test_contamination_tech_reps.config new file mode 100644 index 00000000..86f1dbfa --- /dev/null +++ b/conf/test_contamination_tech_reps.config @@ -0,0 +1,36 @@ +/* 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. + + Use as follows: + nextflow run nf-core/smrnaseq -profile test_contamination_tech_reps, --outdir + +---------------------------------------------------------------------------------------- +*/ +// Test covers techincal_repeats, skip_fastqc, filter_contamination and running without genome. + +params { + config_profile_name = 'Test technical repeats profile' + config_profile_description = 'Minimal test dataset to check pipeline function' + + // Input data + input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet_technical_repeats_short.csv' + + mirtrace_species = 'hsa' + save_intermediates = true + + skip_multiqc = true + skip_mirdeep = true + skip_fastqc = true + + filter_contamination = true + cdna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.cdna.all.fa" + ncrna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.ncrna.fa" + trna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/hg19-tRNAs.fa" +} + +// Include illumina config to run test without additional profiles + +includeConfig 'protocol_illumina.config' diff --git a/conf/test_full.config b/conf/test_full.config index 964dc5b2..cc5ecd92 100644 --- a/conf/test_full.config +++ b/conf/test_full.config @@ -18,7 +18,6 @@ params { input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet-full.csv' genome = 'GRCh37' mirtrace_species = 'hsa' - protocol = 'illumina' } diff --git a/conf/test_full_filter_contamination.config b/conf/test_full_filter_contamination.config new file mode 100644 index 00000000..7d8da991 --- /dev/null +++ 
b/conf/test_full_filter_contamination.config @@ -0,0 +1,30 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running full-size tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a full size pipeline test. + + Use as follows: + nextflow run nf-core/smrnaseq -profile test_full_filter_contamination, --outdir + +---------------------------------------------------------------------------------------- +*/ + +params { + config_profile_name = 'Full test profile' + config_profile_description = 'Full test dataset to check pipeline function with filter contamination feature' + + // Input data for full size test + genome = 'GRCh37' + input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet-full.csv' + mirtrace_species = 'hsa' + three_prime_adapter = 'auto-detect' + filter_contamination = true + cdna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.cdna.all.fa" + ncrna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.ncrna.fa" + trna = "https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/hg19-tRNAs.fa" +} + +includeConfig 'protocol_qiaseq.config' + + diff --git a/conf/test_mirgenedb.config b/conf/test_mirgenedb.config new file mode 100644 index 00000000..097c73fe --- /dev/null +++ b/conf/test_mirgenedb.config @@ -0,0 +1,35 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. 
+ + Use as follows: + nextflow run nf-core/smrnaseq -profile test_mirgenedb, --outdir + +---------------------------------------------------------------------------------------- +*/ + +params { + config_profile_name = 'Test profile with mirgeneDB inputs and run mirdeep2' + config_profile_description = 'Minimal test dataset to check pipeline function with mirgeneDB inputs and run mirdeep2' + + // Input data + input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet_test_short.csv' + fasta = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' + + mirgenedb = true + + mirgenedb_mature = "https://github.com/nf-core/test-datasets/raw/smrnaseq/MirGeneDB/mirgenedb_hsa_mature.fa" + mirgenedb_hairpin = "https://github.com/nf-core/test-datasets/raw/smrnaseq/MirGeneDB/mirgenedb_hsa_hairpin.fa" + mirgenedb_gff = "https://github.com/nf-core/test-datasets/raw/smrnaseq/MirGeneDB/mirgenedb_hsa.gff" + mirgenedb_species = "Hsa" + + skip_mirdeep = false + save_intermediates = true + +} + +// Include illumina config to run test without additional profiles + +includeConfig 'protocol_illumina.config' diff --git a/conf/test_nextflex.config b/conf/test_nextflex.config new file mode 100644 index 00000000..93d817e2 --- /dev/null +++ b/conf/test_nextflex.config @@ -0,0 +1,33 @@ +/* +======================================================================================== + Nextflow config file for running minimal tests +======================================================================================== + Defines input files and everything required to run a fast and simple pipeline test. 
+ + Use as follows: + nextflow run nf-core/smrnaseq -profile test_nextflex, + +---------------------------------------------------------------------------------------- +*/ +// This test profile tests nextflex without genome and paired end sample handling + +params { + config_profile_name = 'Nextflex Test profile' + config_profile_description = 'Minimal test dataset to check pipeline function' + + // Input data + input = 'https://raw.githubusercontent.com/nf-core/test-datasets/smrnaseq/samplesheet/v2.0/samplesheet_test_nextflex.csv' + mature = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/mature.fa' + hairpin = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hairpin.fa' + mirna_gtf = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hsa.gff3' + mirtrace_species = 'hsa' + + skip_mirdeep = true + save_intermediates = true + //skip_fastp // this profile should not be used with skip_fastq to allow for testing paired end sample handling + +} + +// Include nextflex config to run test without additional profiles + +includeConfig 'protocol_nextflex.config' diff --git a/conf/test_no_genome.config b/conf/test_no_genome.config deleted file mode 100644 index aae8ce91..00000000 --- a/conf/test_no_genome.config +++ /dev/null @@ -1,31 +0,0 @@ -/* -======================================================================================== - Nextflow config file for running minimal tests -======================================================================================== - Defines input files and everything required to run a fast and simple pipeline test. 
- - Use as follows: - nextflow run nf-core/smrnaseq -profile test, - ----------------------------------------------------------------------------------------- -*/ - -params { - config_profile_name = 'Test profile' - config_profile_description = 'Minimal test dataset to check pipeline function' - - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' - - // Input data - input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet.csv' - mature = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/mature.fa' - hairpin = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hairpin.fa' - mirna_gtf = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hsa.gff3' - mirtrace_species = 'hsa' - skip_mirdeep = true - protocol = 'illumina' - -} diff --git a/conf/test_index.config b/conf/test_skipfastp.config similarity index 60% rename from conf/test_index.config rename to conf/test_skipfastp.config index bb9f4707..de786332 100644 --- a/conf/test_index.config +++ b/conf/test_skipfastp.config @@ -5,31 +5,24 @@ Defines input files and everything required to run a fast and simple pipeline test. 
Use as follows: - nextflow run nf-core/smrnaseq -profile test_index, --outdir + nextflow run nf-core/smrnaseq -profile test_skipfastp, --outdir ---------------------------------------------------------------------------------------- */ +// Test covers running with genome, index and skipfastp params { - config_profile_name = 'Test index profile' - config_profile_description = 'Minimal test dataset to check pipeline function with bowtie index' - - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' + config_profile_name = 'Test profile' + config_profile_description = 'Minimal test dataset to check pipeline function skipping trimming' // Input data - input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet.csv' + input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet_skipfastp.csv' fasta = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' bowtie_index = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/bowtie_index.tar.gz' mirtrace_species = 'hsa' - protocol = 'illumina' skip_mirdeep = true - save_merged = false - save_aligned_mirna_quant = false - - cleanup = true //Otherwise tests dont run through properly. + skip_fastp = true + save_intermediates = true } diff --git a/conf/test_technical_repeats.config b/conf/test_technical_repeats.config new file mode 100644 index 00000000..2c969ddd --- /dev/null +++ b/conf/test_technical_repeats.config @@ -0,0 +1,28 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. 
+ + Use as follows: + nextflow run nf-core/smrnaseq -profile test_technical_repeats, --outdir + +---------------------------------------------------------------------------------------- +*/ + +params { + config_profile_name = 'Test technical repeats profile' + config_profile_description = 'Minimal test dataset to check pipeline function' + + // Input data + + input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet_technical_repeats.csv' + fasta = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' + + mirtrace_species = 'hsa' + skip_mirdeep = true + save_intermediates = true + + skip_fastqc = true + skip_multiqc = true +} diff --git a/conf/test_umi.config b/conf/test_umi.config index c7d0db15..5945efcf 100644 --- a/conf/test_umi.config +++ b/conf/test_umi.config @@ -14,23 +14,23 @@ params { config_profile_name = 'Test profile' config_profile_description = 'Minimal test dataset to check pipeline function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' - // Input data input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet_umi.csv' fasta = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' + bowtie_index = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/bowtie_index.tar.gz' mirtrace_species = 'hsa' - protocol = 'illumina' skip_mirdeep = true //UMI Specific testcase with_umi = true - umitools_extract_method = 'regex' - umitools_bc_pattern = '.+(?PAACTGTAGGCACCATCAAT){s<=2}(?P.{12})(?P.*)' - save_umi_intermeds = true + umitools_extract_method = 'regex' + umitools_bc_pattern = '.+(?PAACTGTAGGCACCATCAAT){s<=2}(?P.{12})(?P.*)' + save_umi_intermeds = true + save_intermediates = true } + +// Include illumina config to run test without additional profiles + +includeConfig 'protocol_illumina.config' diff --git a/docs/images/mqc_fastqc_adapter.png 
b/docs/images/mqc_fastqc_adapter.png deleted file mode 100755 index 361d0e47..00000000 Binary files a/docs/images/mqc_fastqc_adapter.png and /dev/null differ diff --git a/docs/images/mqc_fastqc_counts.png b/docs/images/mqc_fastqc_counts.png deleted file mode 100755 index cb39ebb8..00000000 Binary files a/docs/images/mqc_fastqc_counts.png and /dev/null differ diff --git a/docs/images/mqc_fastqc_quality.png b/docs/images/mqc_fastqc_quality.png deleted file mode 100755 index a4b89bf5..00000000 Binary files a/docs/images/mqc_fastqc_quality.png and /dev/null differ diff --git a/docs/output.md b/docs/output.md index 10d3e677..39ee17a3 100644 --- a/docs/output.md +++ b/docs/output.md @@ -6,12 +6,13 @@ This document describes the output produced by the pipeline. Most of the plots are taken from the MultiQC report, which summarises results at the end of the pipeline. -The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory. +The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level `/results` directory. 
## Pipeline overview The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps: +- [Preprocessing](#preprocessing) - Preprocessing of reference files - [FastQC](#fastqc) - read quality control - [UMI-tools extract](#umi-tools-extract) - UMI barcode extraction - [UMI-collapse deduplicate](#umicollapse-deduplicate) - read deduplication @@ -27,6 +28,20 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d - [MultiQC](#multiqc) - aggregate report, describing results of the whole pipeline - [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution +If `--save_intermediates` is specified, intermediate files generated by each process will be saved in the output directory. + +## Preprocessing + +
+Output files + +- `bowtie_index/genome`: Cleaned genome.fa fasta. +- `untar/bowtie_index`: Uncompressed bowtie index file. + +
+ +Preprocessing formats the reference files before they are used in the workflow. It includes [`untar`](https://www.gnu.org/software/tar/manual/) and [`bioawk`](https://github.com/lh3/bioawk). If the provided `bowtie_index` file is in gzip format, it will be processed by `untar`. The provided FASTA file will be cleaned using `bioawk`. + ### FastQC
@@ -47,7 +62,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
Output files -- `umitools/` +- `umi_dedup/fastq_extracted_umi/` - `*.fastq.gz`: If `--save_umi_intermeds` is specified, FastQ files **after** UMI extraction will be placed in this directory. - `*.log`: Log file generated by the UMI-tools `extract` command. @@ -59,13 +74,13 @@ To facilitate processing of input data which has the UMI barcode already embedde ## FastP -[FastP](https://github.com/OpenGene/fastp) is used for removal of adapter contamination and trimming of low quality regions. +[FastP](https://github.com/OpenGene/fastp) is used for removal of adapter contamination and trimming of low-quality regions. MultiQC reports the percentage of bases removed by FastP in the _General Statistics_ table, along some further information on the results. **Output directory: `results/fastp`** -Contains FastQ files with quality and adapter trimmed reads for each sample, along with a log file describing the trimming. +Contains FastQ files with quality and adapter-trimmed reads for each sample, along with a log file describing the trimming. - `sample_fastp.json` - JSON report file with information on parameters and trimming metrics - `sample_fastp.html` - HTML report with some visualizations of trimming metrics @@ -77,8 +92,7 @@ FastP can automatically detect adapter sequences when not specified directly by
Output files -- `umi_dedup/` - - `*.log`: Results statistics files detailing the UMI deduplication results. +- `umi_dedup/bam_deduplicated` - `*.fastq.gz`: If `--save_umi_intermeds` is specified, the deduplicated fastq.gz files **after** UMI deduplication will be placed in this directory.
@@ -86,29 +100,38 @@ FastP can automatically detect adapter sequences when not specified directly by ## Bowtie2 -[Bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) is used to align the reads to user-defined databases of contaminants. +[Bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) is used to align the reads to user-defined databases and to build indexes for `--filter_contaminant` files. MultiQC reports the number of reads that were removed by each of the contaminant databases. ## Bowtie -[Bowtie](http://bowtie-bio.sourceforge.net/index.shtml) is used for mapping adapter trimmed reads against the mature miRNAs and miRNA precursors (hairpins) of the chosen database [miRBase](http://www.mirbase.org/) or [MirGeneDB](https://mirgenedb.org/). +[Bowtie](http://bowtie-bio.sourceforge.net/index.shtml) is used for building the index for the fasta genome, if needed. It is also used for mapping adapter trimmed reads against the mature miRNAs and miRNA precursors (hairpins) of the chosen database [miRBase](http://www.mirbase.org/) or [MirGeneDB](https://mirgenedb.org/). + +**Output directory: `results/`** -**Output directory: `results/samtools`** +- `bowtie_index/` + - `mirna_hairpin/bowtie`: hairpin.fa bowtie index files. + - `mirna_mature/bowtie`: mature.fa bowtie index files. +- `genome_quant/` + - `genome_quant/bam/.*bam`: The aligned BAM file results. + - `genome_quant/bam/.*unmapped.fastq.gz`: Unmapped reads results. 
+- `mirna_quant/` -- `sample_mature.bam`: The aligned BAM file of alignment against mature miRNAs -- `sample_mature_unmapped.fq.gz`: Unmapped reads against mature miRNAs _This file will be used as input for the alignment against miRNA precursors (hairpins)_ -- `sample_mature_hairpin.bam`: The aligned BAM file of alignment against miRNA precursors (hairpins) that didn't map to the mature -- `sample_mature_hairpin_unmapped.fq.gz`: Unmapped reads against miRNA precursors (hairpins) -- `sample_mature_hairpin_genome.bam`: The aligned BAM file of alignment against that didn't map to the precursor. + - `mirna_quant/bam/{hairpin,mature,seqcluster}/.*bam`: The aligned BAM file results against hairpin, mature or seqcluster. + - `mirna_quant/bam/{hairpin,mature,seqcluster}/.*unmapped.fastq.gz`: Unmapped reads for hairpin, mature or seqcluster. + +If `--save_intermediates` is specified, these files will be placed in this directory. ## SAMtools [SAMtools](http://samtools.sourceforge.net/) is used for sorting and indexing the output BAM files from Bowtie. In addition, the numbers of features are counted with the `idxstats` option. -**Output directory: `results/samtools/samtools_stats`** +**Output directory: `results/{genome_quant,mirna_quant}/bam`** + +These files will be saved in this directory if `--save_intermediates` is specified. In any case, these stats will always be available in the MultiQC report. -- `stats|idxstats|flagstat`: BAM stats for each of the files listed above. +- `.*stats|.*idxstats|.*flagstat`: BAM stats for each of the files listed above. ![samtools](images/samtools_alignment_plot.png) @@ -116,7 +139,7 @@ MultiQC reports the number of reads that were removed by each of the contaminant [edgeR](https://bioconductor.org/packages/release/bioc/html/edgeR.html) is an R package used for differential expression analysis of RNA-seq expression profiles. 
-**Output directory: `results/edgeR`** +**Output directory: `results/mirna_quant/edger_qc`** - `[mature/hairpin]_normalized_CPM.txt` TMM normalized counts of reads aligned to mature miRNAs/miRNA precursors (hairpins) - `[mature/hairpin]_edgeR_MDS_plot.pdf` Multidimensional scaling plot of all samples based on the expression profile of mature miRNAs/miRNA precursors (hairpins) @@ -134,11 +157,11 @@ MultiQC reports the number of reads that were removed by each of the contaminant [mirtop](https://github.com/miRTop/mirtop) is used to parse the BAM files from `bowtie` alignment, and produce a [mirgff3](https://github.com/miRTop/mirGFF3) file with information about miRNAs and isomirs. -**Output directory: `results/mirtop`** +**Output directory: `results/mirna_quant/mirtop`** -- `mirtop.gff`: [mirgff3](https://github.com/miRTop/mirGFF3) file -- `mirtop.tsv`: tabular file of the previous file for easy integration with downstream analysis. -- `mirtop_rawData.tsv`: File compatible with [isomiRs](http://lpantano.github.io/isomiRs/reference/IsomirDataSeqFromMirtop.html) Bioconductor package to perform isomiRs analysis. +- `gff/{sample.id}.gff`: [mirgff3](https://github.com/miRTop/mirGFF3) file +- `joined_samples_mirtop.tsv`: a tabular version of the previous file for easy integration with downstream analysis. +- `export/{sample.id}_mirtop_rawData.tsv`: File compatible with [isomiRs](http://lpantano.github.io/isomiRs/reference/IsomirDataSeqFromMirtop.html) Bioconductor package to perform isomiRs analysis. - `mirna.tsv`: tabular file with miRNA counts after summarizing unique isomiRs for each miRNA ## miRDeep2 @@ -147,15 +170,15 @@ MultiQC reports the number of reads that were removed by each of the contaminant **Output directory: `results/mirdeep2`** -- `mirdeep/timestamp_sample.bed` File with the known and novel miRNAs in bed format. -- `mirdeep/timestamp_sample.csv` File with an overview of all detected miRNAs (known and novel) in csv format. 
-- `mirdeep/timestamp_sample.html` A HTML report with an overview of all detected miRNAs (known and novel) in html format. +- `mirdeep2/result_{sample.id}.bed` File with the known and novel miRNAs in bed format. +- `mirdeep2/result_{sample.id}.csv` File with an overview of all detected miRNAs (known and novel) in csv format. +- `mirdeep2/result_{sample.id}.html` A HTML report with an overview of all detected miRNAs (known and novel) in html format. ## miRTrace -[miRTrace](https://github.com/friedlanderlab/mirtrace) is a quality control specifically for small RNA sequencing data (smRNA-Seq). Each sample is characterized by profiling sequencing quality, read length, sequencing depth and miRNA complexity and also the amounts of miRNAs versus undesirable sequences (derived from tRNAs, rRNAs and sequencing artifacts). By default, the pipeline sets the PHRED-offset to the most common +33, so if you need to adjust this, use the `params.phred_offset` option to include this accordingly for your FASTQ files. +[miRTrace](https://github.com/friedlanderlab/mirtrace) is a quality control specifically for small RNA sequencing data (smRNA-Seq). Each sample is characterized by profiling sequencing quality, read length, sequencing depth and miRNA complexity and also the amounts of miRNAs versus undesirable sequences (derived from tRNAs, rRNAs and sequencing artifacts). By default, the pipeline sets the PHRED offset to the most common value of +33, so if you need to adjust this, use the `params.phred_offset` option to include this accordingly for your FASTQ files. 
-**Output directory: `results/mirtrace`** +**Output directory: `results/mirtrace/${sample.id}`** - `mirtrace-report.html` An interactive HTML report summarizing all output statistics from miRTrace - `mirtrace-results.json` A JSON file with all output statistics from miRTrace @@ -163,7 +186,7 @@ MultiQC reports the number of reads that were removed by each of the contaminant - `qc_passed_reads.all.collapsed` FASTA file per sample with sequence reads that passed QC in miRTrace - `qc_passed_reads.rnatype_unknown.collapsed` FASTA file per sample with unknown reads in the RNA type analysis -Refer to the [tool manual](https://github.com/friedlanderlab/mirtrace/blob/master/release-bundle-includes/manual.pdf) for detailed specifications about output files. Here is an example of the RNA types plot that you will see: +The files for each sample can also be visualized in a single plot in the MultiQC report. Refer to the [tool manual](https://github.com/friedlanderlab/mirtrace/blob/master/release-bundle-includes/manual.pdf) for detailed specifications about output files. Here is an example of the RNA types plot that you will see: ![mirtrace](images/mirtrace_plot.png) @@ -171,9 +194,8 @@ Refer to the [tool manual](https://github.com/friedlanderlab/mirtrace/blob/maste ![MultiQC - FastQC adapter content plot](images/mqc_fastqc_adapter.png) -:::note -The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality. -::: +> [!NOTE] +> The FastQC plots displayed in the MultiQC report show _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality. ### MultiQC @@ -191,6 +213,9 @@ The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They m Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQC.
The pipeline has special steps which also allow the software versions to be reported in the MultiQC output for future traceability. For more information about how to use MultiQC reports, see . +> [!NOTE] +> There may be a discrepancy between the read counts that MultiQC displays for the original FASTQ files and for the BAM files. This is due to secondary alignments being reported by the aligner, which can inflate the total read count in the BAM files. [More info about this behavior can be found here](https://github.com/nf-core/smrnaseq/issues/94). + ### Pipeline information
@@ -198,7 +223,7 @@ Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQ - `pipeline_info/` - Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`. - - Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline. + - Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameters are used when running the pipeline. - Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`. - Parameters used by the pipeline run: `params.json`. diff --git a/docs/usage.md b/docs/usage.md index 881bb2ff..46fbe6eb 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -10,38 +10,48 @@ This option indicates the experimental protocol used for the sample preparation. 
Currently supporting: -- 'illumina': adapter (`TGGAATTCTCGGGTGCCAAGG`) -- 'nextflex': adapter (`TGGAATTCTCGGGTGCCAAGG`), clip_r1 (`4`), three_prime_clip_r1 (`4`) -- 'qiaseq': adapter (`AACTGTAGGCACCATCAAT`) -- 'cats': adapter (`GATCGGAAGAGCACACGTCTG`), clip_r1(`3) -- 'custom' (where the user can indicate the `three_prime_adapter`, `clip_r1` and `three_prime_clip_r1` manually) +- 'illumina': three_prime_adapter (`TGGAATTCTCGGGTGCCAAGG`), clip_r1 (`0`), three_prime_clip_r1 (`0`) +- 'nextflex': three_prime_adapter (`TGGAATTCTCGGGTGCCAAGG`), clip_r1 (`4`), three_prime_clip_r1 (`4`) +- 'qiaseq': three_prime_adapter (`AACTGTAGGCACCATCAAT`), clip_r1 (`0`), three_prime_clip_r1 (`0`) +- 'cats': three_prime_adapter (`AAAAAAAA`), clip_r1(`3`), three_prime_clip_r1 (`0`) + +This option is not chosen as a parameter but as an additional profile that sets the corresponding `three_prime_adapter`, `clip_r1` and `three_prime_clip_r1` parameters accordingly. You can choose to either use any of the provided profiles by running the pipeline with e.g. `illumina` to set the defaults as described above in a more convenient way. + +```bash +-profile your_other_profiles,illumina +``` + +In case you have a custom protocol, please supply the `three_prime_adapter`, `clip_r1` and `three_prime_clip_r1` manually. The parameter `--three_prime_adapter` is set to the Illumina TruSeq single index adapter sequence `AGATCGGAAGAGCACACGTCTGAACTCCAGTCA`. This is also to ensure, that the auto-detect functionality of `FASTP` is disabled. Please make sure to adapt this adapter sequence accordingly for your run. -:warning: At least the `custom` protocol has to be specified, otherwise the pipeline won't run. In case you specify the `custom` protocol, ensure that the parameters above are set accordingly or the defaults will be applied. If you want to auto-detect the adapters using `fastp`, please set `--three_prime_adapter` to `auto-detect`. 
+:warning: If you do not choose a profile that sets the `three_prime_adapter`, `clip_r1` and `three_prime_clip_r1` options, the pipeline won't run. If you want to auto-detect the adapters using `fastp`, please set `--three_prime_adapter` to `auto-detect`. ### `mirtrace_species` or `mirgenedb_species` -It should point to the 3-letter species name used by [miRBase](https://www.mirbase.org/help/genome_summary.shtml) or [MirGeneDB](https://www.mirgenedb.org/browse). Note the difference in case for the two databases. +It should point to the 3-letter species name used by [miRBase](https://www.mirbase.org/browse) or [MirGeneDB](https://www.mirgenedb.org/browse). Note the difference in case for the two databases. ### miRNA related files Different parameters can be set for the two supported databases. By default `miRBase` will be used with the parameters below. - `mirna_gtf`: If not supplied by the user, then `mirna_gtf` will point to the latest GFF3 file in miRbase: `https://mirbase.org/download/CURRENT/genomes/${params.mirtrace_species}.gff3` -- `mature`: points to the FASTA file of mature miRNA sequences. `https://mirbase.org/download/mature.fa` -- `hairpin`: points to the FASTA file of precursor miRNA sequences. `https://mirbase.org/download/hairpin.fa` +- `mature`: points to the FASTA file of mature miRNA sequences. Default: `https://mirbase.org/download/mature.fa` +- `hairpin`: points to the FASTA file of precursor miRNA sequences. Default: `https://mirbase.org/download/hairpin.fa` If MirGeneDB should be used instead it needs to be specified using `--mirgenedb` and use the parameters below. -- `mirgenedb_gff`: The data can not be downloaded automatically (URLs are created with short term tokens in it), thus the user needs to supply the gff file for either his species, or all species downloaded from `https://mirgenedb.org/download`. The total set will automatically be subsetted to the species specified with `--mirgenedb_species`. 
-- `mirgenedb_mature`: points to the FASTA file of mature miRNA sequences. Download from `https://mirgenedb.org/download`. -- `mirgenedb_hairpin`: points to the FASTA file of precursor miRNA sequences. Download from `https://mirgenedb.org/download`. Note that MirGeneDB does not have a dedicated `hairpin` file, but the `Precursor sequences` are to be used. +- `mirgenedb_gff`: The GFF file cannot be downloaded automatically due to the presence of short-term tokens in the URLs. Therefore, the user must manually provide the GFF file, either for their species of interest or for all species, by downloading it from [MirGeneDB](https://mirgenedb.org/download). The provided dataset will be automatically filtered based on the species specified with the `--mirgenedb_species` parameter. +- `mirgenedb_mature`: This parameter should point to the FASTA file containing mature miRNA sequences. The file can be manually downloaded from [MirGeneDB](https://mirgenedb.org/download). +- `mirgenedb_hairpin`: This parameter should point to the FASTA file containing precursor miRNA sequences. Note that MirGeneDB does not offer a dedicated hairpin file, but the precursor sequences can be downloaded from [MirGeneDB](https://mirgenedb.org/download) and used instead. ### Genome - `fasta`: the reference genome FASTA file -- `bt_indices`: points to the folder containing the `bowtie2` indices for the genome reference specified by `fasta`. **Note:** if the FASTA file in `fasta` is not the same file used to generate the `bowtie2` indices, then the pipeline will fail. +- `bowtie_index`: points to the folder containing the `bowtie` indices for the genome reference specified by `fasta`. + +> [!NOTE] +> if the FASTA file in `fasta` is not the same file used to generate the `bowtie` indices, then the pipeline will fail. 
### Contamination filtering @@ -56,6 +66,12 @@ Contamination filtering of the sequencing reads is optional and can be invoked u - `pirna`: Used to supply a FASTA file containing piRNA contamination sequence. e.g. The FASTA file is first compared to the available miRNA sequences and overlaps are removed. - `other_contamination`: Used to supply an additional filtering set. The FASTA file is first compared to the available miRNA sequences and overlaps are removed. +## mirDeep2 + +If the software encounters an error with exit status 255, it will be ignored, and the pipeline will continue to complete. In such cases, the pipeline will log a note that includes the path to the work directory where the issue occurred. You can inspect this work directory to examine your input data and troubleshoot the issue. + +Error 255 is typically related to the core algorithm of miRDeep generating empty output files. This often happens when the reads being processed do not correspond to putative mature miRNA sequences, or if the provided precursors do not meet the criteria for valid miRNA precursors, both of which may stem from the input reads used. A common cause of this error is running the pipeline with a small subset of the input reads. + ### UMI handling The pipeline handles UMIs with two tools. Umicollapse to deduplicate on entire read sequence after 3'adapter removal. Followed by Umitools-extract to extract the miRNA adapter and UMI. This can be achieved by using the parameters for UMI handling as follows (in this case for QIAseq miRNA Library Kit): @@ -64,9 +80,8 @@ The pipeline handles UMIs with two tools. Umicollapse to deduplicate on entire r --with_umi --umitools_extract_method regex --umitools_bc_pattern = '.+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)' ``` -:::note -You will have to specify custom umitools_bc_pattern patterns if your UMI read structure is different. Please check the required capability in your UMI handling manual.
It should be set in a way, that only the insert sequence of the RNA molecule is left after extraction. Please refer to the manual of the used kit for the expected read structure. -::: +> [!NOTE] +> If your UMI read structure differs, you'll need to specify custom `umitools_bc_pattern` patterns. Ensure that the pattern is set so that only the insert sequence of the RNA molecule remains after extraction. For details, refer to the UMI handling manual or the documentation of the kit you're using for the expected read structure. ## Samplesheet input @@ -91,9 +106,12 @@ CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz ### Full samplesheet -The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire. However, there is a strict requirement for the first 3 columns to match those defined in the table below. +The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet must have at least 2 columns (`sample` and `fastq1`). A third column can be added if the sample is paired-end (`fastq2`). + +> [!NOTE] +> Most of the tools used can't accommodate paired end reads, so whenever paired-end samples are used as inputs, only the R1 files are used by the pipeline. -A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice. +A final samplesheet file consisting of single-end data and may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice. 
```console sample,fastq_1 @@ -106,10 +124,11 @@ TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz ``` -| Column | Description | -| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). | -| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | +| Column | Description | Requirement | +| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- | +| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). | Mandatory | +| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | Mandatory | +| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | Optional | An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline. @@ -136,9 +155,8 @@ If you wish to repeatedly use the same parameters for multiple runs, rather than Pipeline settings can be provided in a `yaml` or `json` file via `-params-file `. -:::warning -Do not use `-c ` to specify parameters as this will result in errors. 
Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args). -::: +> [!WARNING] +> Do not use `-c ` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args). The above pipeline run specified with a params file in yaml format: @@ -146,9 +164,9 @@ The above pipeline run specified with a params file in yaml format: nextflow run nf-core/smrnaseq -profile docker -params-file params.yaml ``` -with `params.yaml` containing: +with: -```yaml +```yaml title="params.yaml" input: './samplesheet.csv' outdir: './results/' genome: 'GRCh37' @@ -157,6 +175,10 @@ genome: 'GRCh37' You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch). +## Optional parameters + +If `--save_intermediates` is specified, the intermediate files generated in the pipeline will be saved in the output directory. + ### Updating the pipeline When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. 
To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline: @@ -182,15 +204,13 @@ The `bin` directory contains some scripts used by the pipeline which may also be To further assist in reproducbility, you can use share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter. -:::tip -If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles. -::: +> [!TIP] +> If you wish to share such a profile (such as uploading it as supplementary material for academic publications), make sure not to include cluster-specific paths to files, nor institution-specific profiles. ## Core Nextflow arguments -:::note -These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen). -::: +> [!NOTE] +> These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen). ### `-profile` @@ -198,9 +218,8 @@ Use this parameter to choose a configuration profile. Profiles can give configur Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below. -:::info -We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported. -::: +> [!TIP] +> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported. 
The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to see if your system is available in these configs please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation). @@ -224,6 +243,8 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof - A generic configuration profile to be used with [Charliecloud](https://hpc.github.io/charliecloud/) - `apptainer` - A generic configuration profile to be used with [Apptainer](https://apptainer.org/) +- `wave` + - A generic configuration profile to enable [Wave](https://seqera.io/wave/) containers. Use together with one of the above (requires Nextflow ` 24.03.0-edge` or later). - `conda` - A generic configuration profile to be used with [Conda](https://conda.io/docs/). Please only use Conda as a last resort i.e. when it's not possible to run the pipeline with Docker, Singularity, Podman, Shifter, Charliecloud, or Apptainer. @@ -265,14 +286,6 @@ See the main [Nextflow documentation](https://www.nextflow.io/docs/latest/config If you have any questions or issues please send us a message on [Slack](https://nf-co.re/join/slack) on the [`#configs` channel](https://nfcore.slack.com/channels/configs). -## Azure Resource Requests - -To be used with the `azurebatch` profile by specifying the `-profile azurebatch`. -We recommend providing a compute `params.vm_type` of `Standard_D16_v3` VMs by default but these options can be changed if required. - -Note that the choice of VM size depends on your quota and the overall workload during the analysis. -For a thorough list, please refer the [Azure Sizes for virtual machines in Azure](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes). 
- ## Running in the background Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished. diff --git a/main.nf b/main.nf index cd13268a..60791ecb 100644 --- a/main.nf +++ b/main.nf @@ -9,8 +9,6 @@ ---------------------------------------------------------------------------------------- */ -nextflow.enable.dsl = 2 - /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ IMPORT FUNCTIONS / MODULES / SUBWORKFLOWS / WORKFLOWS @@ -18,6 +16,7 @@ nextflow.enable.dsl = 2 */ include { NFCORE_SMRNASEQ } from './workflows/smrnaseq' +include { PREPARE_GENOME } from './subworkflows/local/prepare_genome' include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_nfcore_smrnaseq_pipeline' include { PIPELINE_COMPLETION } from './subworkflows/local/utils_nfcore_smrnaseq_pipeline' include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_smrnaseq_pipeline' @@ -28,9 +27,17 @@ include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_smrn ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -params.fasta = getGenomeAttribute('fasta') -params.mirtrace_species = getGenomeAttribute('mirtrace_species') -params.bowtie_index = getGenomeAttribute('bowtie') +params.fasta = getGenomeAttribute('fasta') +params.mirtrace_species = getGenomeAttribute('mirtrace_species') +params.bowtie_index = getGenomeAttribute('bowtie') +params.mirna_gtf = getGenomeAttribute('mirna_gtf') //not in igenomes yet +params.rrna = getGenomeAttribute('rrna') //not in igenomes yet +params.trna = getGenomeAttribute('trna') //not in igenomes yet +params.cdna = getGenomeAttribute('cdna') //not in igenomes yet +params.ncrna = getGenomeAttribute('ncrna') //not in igenomes yet +params.pirna = getGenomeAttribute('pirna') //not in igenomes yet +params.other_contamination = getGenomeAttribute('other_contamination') //not in igenomes yet + /* 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -42,28 +49,61 @@ workflow { main: ch_versions = Channel.empty() + // + // SUBWORKFLOW : Prepare reference genome files + // + PREPARE_GENOME ( + params.fasta, + params.bowtie_index, + params.mirtrace_species, + params.rrna, + params.trna, + params.cdna, + params.ncrna, + params.pirna, + params.other_contamination, + params.fastp_known_mirna_adapters, + params.mirna_gtf + ) + // // SUBWORKFLOW: Run initialisation tasks // PIPELINE_INITIALISATION ( params.version, - params.help, params.validate_params, params.monochrome_logs, args, params.outdir, - params.input + params.input, + params.three_prime_adapter, + params.phred_offset ) // // WORKFLOW: Run main workflow // NFCORE_SMRNASEQ ( - Channel.of(file(params.input, checkIfExists: true)), + PREPARE_GENOME.out.has_fasta, + PREPARE_GENOME.out.has_mirtrace_species, + PREPARE_GENOME.out.mirna_adapters, + PREPARE_GENOME.out.mirtrace_species, + PREPARE_GENOME.out.reference_mature, + PREPARE_GENOME.out.reference_hairpin, + PREPARE_GENOME.out.mirna_gtf, + PREPARE_GENOME.out.fasta, + PREPARE_GENOME.out.bowtie_index, + PREPARE_GENOME.out.rrna, + PREPARE_GENOME.out.trna, + PREPARE_GENOME.out.cdna, + PREPARE_GENOME.out.ncrna, + PREPARE_GENOME.out.pirna, + PREPARE_GENOME.out.other_contamination, + ch_versions, PIPELINE_INITIALISATION.out.samplesheet, - ch_versions + PIPELINE_INITIALISATION.out.three_prime_adapter, + PIPELINE_INITIALISATION.out.phred_offset ) - // // SUBWORKFLOW: Run completion tasks // diff --git a/modules.json b/modules.json index 109997b3..6f19df08 100644 --- a/modules.json +++ b/modules.json @@ -5,69 +5,159 @@ "https://github.com/nf-core/modules.git": { "modules": { "nf-core": { - "cat/cat": { + "bioawk": { "branch": "master", - "git_sha": "9437e6053dccf4aafa022bfd6e7e9de67e625af8", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "blat": { + "branch": "master", + "git_sha": 
"666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "bowtie/align": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "bowtie/build": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "bowtie2/align": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "bowtie2/build": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cat/fastq": { "branch": "master", - "git_sha": "0997b47c93c06b49aa7b3fefda87e728312cf2ca", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "csvtk/join": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "fastp": { "branch": "master", - "git_sha": "95cf5fe0194c7bf5cb0e3027a2eb7e7c89385080", - "installed_by": ["fastq_fastqc_umitools_fastp", "modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["fastq_fastqc_umitools_fastp"] }, "fastqc": { "branch": "master", - "git_sha": "285a50500f9e02578d90b3ce6382ea3c30216acd", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["fastq_fastqc_umitools_fastp"] }, + "gawk": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "mirdeep2/mapper": { + "branch": "master", + "git_sha": "26757a6a54d05c3133c01c564c192ff617c5ea33", + "installed_by": ["fastq_find_mirna_mirdeep2"] + }, + "mirdeep2/mirdeep2": { + "branch": "master", + "git_sha": "757f60e5656283122cd6ec37d4679483bebb7312", + "installed_by": ["fastq_find_mirna_mirdeep2"] + }, + "mirtop/counts": { + "branch": "master", + "git_sha": "196062335bb9ec979075bf2212f64a369b927b0d", + "installed_by": ["bam_stats_mirna_mirtop"] + }, + "mirtop/export": { + 
"branch": "master", + "git_sha": "196062335bb9ec979075bf2212f64a369b927b0d", + "installed_by": ["bam_stats_mirna_mirtop"] + }, + "mirtop/gff": { + "branch": "master", + "git_sha": "196062335bb9ec979075bf2212f64a369b927b0d", + "installed_by": ["bam_stats_mirna_mirtop"] + }, + "mirtop/stats": { + "branch": "master", + "git_sha": "196062335bb9ec979075bf2212f64a369b927b0d", + "installed_by": ["bam_stats_mirna_mirtop"] + }, + "mirtrace/qc": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, "multiqc": { "branch": "master", - "git_sha": "b7ebe95761cd389603f9cc0e0dc384c0f663815a", + "git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d", "installed_by": ["modules"] }, "samtools/flagstat": { "branch": "master", - "git_sha": "f4596fe0bdc096cf53ec4497e83defdb3a94ff62", - "installed_by": ["bam_stats_samtools", "modules"] + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", + "installed_by": ["bam_stats_samtools"] }, "samtools/idxstats": { "branch": "master", - "git_sha": "f4596fe0bdc096cf53ec4497e83defdb3a94ff62", - "installed_by": ["bam_stats_samtools", "modules"] + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", + "installed_by": ["bam_stats_samtools"] }, "samtools/index": { "branch": "master", - "git_sha": "f4596fe0bdc096cf53ec4497e83defdb3a94ff62", - "installed_by": ["bam_sort_stats_samtools", "modules"] + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", + "installed_by": ["bam_sort_stats_samtools"] }, "samtools/sort": { "branch": "master", - "git_sha": "4352dbdb09ec40db71e9b172b97a01dcf5622c26", - "installed_by": ["bam_sort_stats_samtools", "modules"] + "git_sha": "b7800db9b069ed505db3f9d91b8c72faea9be17b", + "installed_by": ["bam_sort_stats_samtools"] }, "samtools/stats": { "branch": "master", - "git_sha": "f4596fe0bdc096cf53ec4497e83defdb3a94ff62", - "installed_by": ["bam_stats_samtools", "modules"] + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", + "installed_by": 
["bam_stats_samtools"] + }, + "seqcluster/collapse": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "seqkit/fq2fa": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["fastq_find_mirna_mirdeep2"] + }, + "seqkit/grep": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "seqkit/replace": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["fastq_find_mirna_mirdeep2"] }, "umicollapse": { "branch": "master", - "git_sha": "b97197968ac12dde2463fa54541f6350c46f2035", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "umitools/extract": { "branch": "master", - "git_sha": "d2c5e76f291379f3dd403e48e46ed7e6ba5da744", - "installed_by": ["fastq_fastqc_umitools_fastp", "modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["fastq_fastqc_umitools_fastp"] }, - "untarfiles": { + "untar": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] } } @@ -76,32 +166,42 @@ "nf-core": { "bam_sort_stats_samtools": { "branch": "master", - "git_sha": "4352dbdb09ec40db71e9b172b97a01dcf5622c26", + "git_sha": "763d4b5c05ffda3ac1ac969dc67f7458cfb2eb1d", + "installed_by": ["subworkflows"] + }, + "bam_stats_mirna_mirtop": { + "branch": "master", + "git_sha": "196062335bb9ec979075bf2212f64a369b927b0d", "installed_by": ["subworkflows"] }, "bam_stats_samtools": { "branch": "master", - "git_sha": "f4596fe0bdc096cf53ec4497e83defdb3a94ff62", + "git_sha": "763d4b5c05ffda3ac1ac969dc67f7458cfb2eb1d", "installed_by": ["bam_sort_stats_samtools"] }, "fastq_fastqc_umitools_fastp": { "branch": "master", - "git_sha": "cabcc0dadf8366aa7a9930066a7b3dd90d9825d5", + "git_sha": 
"46eca555142d6e597729fcb682adcc791796f514", + "installed_by": ["subworkflows"] + }, + "fastq_find_mirna_mirdeep2": { + "branch": "master", + "git_sha": "757f60e5656283122cd6ec37d4679483bebb7312", "installed_by": ["subworkflows"] }, "utils_nextflow_pipeline": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "3aa0aec1d52d492fe241919f0c6100ebf0074082", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "1b6b9a3338d011367137808b49b923515080e3ba", "installed_by": ["subworkflows"] }, - "utils_nfvalidation_plugin": { + "utils_nfschema_plugin": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "bbd5a41f4535a8defafe6080e00ea74c45f4f96c", "installed_by": ["subworkflows"] } } diff --git a/modules/local/blat_mirna.nf b/modules/local/blat_mirna.nf deleted file mode 100644 index aa0d3d51..00000000 --- a/modules/local/blat_mirna.nf +++ /dev/null @@ -1,62 +0,0 @@ -process BLAT_MIRNA { - tag "$fasta" - label 'process_medium' - - conda 'bioconda::blat=36' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/blat:36--0' : - 'biocontainers/blat:36--0' }" - - input: - val db_type - path mirna - path contaminants - - - output: - path 'filtered.fa' , emit: filtered_set - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - if ( db_type == "cdna" ) - """ - echo $db_type - awk '/^>/ { x=index(\$6, "transcript_biotype:miRNA") } { if(!x) print }' $contaminants > subset.fa - blat -out=blast8 $mirna subset.fa /dev/stdout | awk 'BEGIN{FS="\t"}{if(\$11 < 1e-5)print \$1;}' | uniq > mirnahit.txt - awk 'BEGIN { while((getline<"mirnahit.txt")>0) l[">"\$1]=1 } /^>/ {x = l[\$1]} {if(!x) print }' subset.fa > filtered.fa - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - blat: \$(echo \$(blat) | grep Standalone | awk '{ if (match(\$0,/[0-9]*[0-9]/,m)) print m[0] }') - END_VERSIONS - """ - - else if ( db_type == "ncrna" ) - """ - echo $db_type - awk '/^>/ { x=(index(\$6, "transcript_biotype:rRNA") || index(\$6, "transcript_biotype:miRNA")) } { if(!x) print }' $contaminants > subset.fa - blat -out=blast8 $mirna subset.fa /dev/stdout | awk 'BEGIN{FS="\t"}{if(\$11 < 1e-5)print \$1;}' | uniq > mirnahit.txt - awk 'BEGIN { while((getline<"mirnahit.txt")>0) l[">"\$1]=1 } /^>/ {x = l[\$1]} {if(!x) print }' subset.fa > filtered.fa - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - blat: \$(echo \$(blat) | grep Standalone | awk '{ if (match(\$0,/[0-9]*[0-9]/,m)) print m[0] }') - END_VERSIONS - """ - - else - """ - echo $db_type - blat -out=blast8 $mirna $contaminants /dev/stdout | awk 'BEGIN{FS="\t"}{if(\$11 < 1e-5)print \$1;}' | uniq > mirnahit.txt - awk 'BEGIN { while((getline<"mirnahit.txt")>0) l[">"\$1]=1 } /^>/ {x = l[\$1]} {if(!x) print }' $contaminants > filtered.fa - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - blat: \$(echo \$(blat) | grep Standalone | awk '{ if (match(\$0,/[0-9]*[0-9]/,m)) print m[0] }') - END_VERSIONS - """ - -} diff --git 
a/modules/local/bowtie_contaminants.nf b/modules/local/bowtie_contaminants.nf deleted file mode 100644 index cf02de31..00000000 --- a/modules/local/bowtie_contaminants.nf +++ /dev/null @@ -1,29 +0,0 @@ -process INDEX_CONTAMINANTS { - label 'process_medium' - - conda 'bowtie2=2.4.5' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/bowtie2:2.4.5--py39hd2f7db1_2' : - 'biocontainers/bowtie2:2.4.5--py39hd2f7db1_2'}" - - input: - path fasta - - output: - path 'fasta_bidx*' , emit: index - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - bowtie2-build ${fasta} fasta_bidx --threads ${task.cpus} - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') - END_VERSIONS - """ - -} diff --git a/modules/local/bowtie_genome.nf b/modules/local/bowtie_genome.nf deleted file mode 100644 index 17ea9253..00000000 --- a/modules/local/bowtie_genome.nf +++ /dev/null @@ -1,36 +0,0 @@ -process INDEX_GENOME { - tag "$fasta" - label 'process_medium' - - conda 'bioconda::bowtie=1.3.1' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/bowtie:1.3.1--py310h7b97f60_6' : - 'biocontainers/bowtie:1.3.1--py310h7b97f60_6' }" - - input: - tuple val(meta2), path(fasta) - - output: - path 'genome*ebwt' , emit: index - path 'genome.edited.fa', emit: fasta - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - # Remove any special base characters from reference genome FASTA file - sed '/^[^>]/s/[^ATGCatgc]/N/g' $fasta > genome.edited.fa - sed -i 's/ .*//' genome.edited.fa - - # Build bowtie index - bowtie-build genome.edited.fa genome --threads ${task.cpus} - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') - END_VERSIONS - """ - -} diff --git a/modules/local/bowtie_map_contaminants.nf b/modules/local/bowtie_map_contaminants.nf deleted file mode 100644 index c9863ab3..00000000 --- a/modules/local/bowtie_map_contaminants.nf +++ /dev/null @@ -1,48 +0,0 @@ -process BOWTIE_MAP_CONTAMINANTS { - label 'process_medium' - - conda 'bowtie2=2.4.5' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/bowtie2:2.4.5--py39hd2f7db1_2' : - 'biocontainers/bowtie2:2.4.5--py39hd2f7db1_2' }" - - input: - tuple val(meta), path(reads) - path index - val contaminant_type - - output: - tuple val(meta), path("*sam") , emit: bam - tuple val(meta), path('*.filter.unmapped.contaminant.fastq'), emit: unmapped - path "versions.yml" , emit: versions - path "filtered.*.stats" , emit: stats - - when: - task.ext.when == null || task.ext.when - - script: - def args = task.ext.args ?: "" - - """ - INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\\.rev.1.bt2\$//"` - bowtie2 \\ - -x \$INDEX \\ - -U ${reads} \\ - --threads ${task.cpus} \\ - --un ${meta.id}.${contaminant_type}.filter.unmapped.contaminant.fastq \\ - --very-sensitive-local \\ - -k 1 \\ - -S ${meta.id}.filter.contaminant.sam \\ - ${args} \\ - > ${meta.id}.contaminant_bowtie.log 2>&1 - - # extracting number of reads from bowtie logs - awk -v type=${contaminant_type} 'BEGIN{tot=0} {if(NR==4 || NR == 5){tot += \$1}} END {print "\\""type"\\": "tot }' ${meta.id}.contaminant_bowtie.log | tr -d , > filtered.${meta.id}_${contaminant_type}.stats - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//' | tr -d '\0') - END_VERSIONS - """ - -} diff --git a/modules/local/bowtie_map_mirna.nf b/modules/local/bowtie_map_mirna.nf deleted file mode 100644 index d6b0ea8f..00000000 --- a/modules/local/bowtie_map_mirna.nf +++ /dev/null @@ -1,54 +0,0 @@ -process BOWTIE_MAP_SEQ { - tag "$meta.id" - label 'process_medium' - - conda 'bowtie=1.3.0 bioconda::samtools=1.13' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:40128b496751b037e2bd85f6789e83d4ff8a4837-0' : - 'biocontainers/mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:40128b496751b037e2bd85f6789e83d4ff8a4837-0' }" - - input: - tuple val(meta), path(reads) - path index - - output: - tuple val(meta), path("*bam") , emit: bam - tuple val(meta), path('unmapped/*fq.gz'), emit: unmapped - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - INDEX=`find -L ./ -name "*.3.ebwt" | sed 's/.3.ebwt//'` - bowtie \\ - -x \$INDEX \\ - -q <(zcat $reads) \\ - -p ${task.cpus} \\ - -t \\ - -k 50 \\ - --best \\ - --strata \\ - -e 99999 \\ - --chunkmbs 2048 \\ - --un ${meta.id}_unmapped.fq -S > ${meta.id}.sam - - samtools view -bS ${meta.id}.sam > ${meta.id}.bam - - if [ ! -f "${meta.id}_unmapped.fq" ] - then - touch ${meta.id}_unmapped.fq - fi - gzip ${meta.id}_unmapped.fq - mkdir unmapped - mv ${meta.id}_unmapped.fq.gz unmapped/. - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') - samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') - END_VERSIONS - """ - -} diff --git a/modules/local/bowtie_mirna.nf b/modules/local/bowtie_mirna.nf deleted file mode 100644 index 733d816e..00000000 --- a/modules/local/bowtie_mirna.nf +++ /dev/null @@ -1,29 +0,0 @@ -process INDEX_MIRNA { - label 'process_medium' - - conda 'bioconda::bowtie=1.3.1' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/bowtie:1.3.1--py310h7b97f60_6' : - 'biocontainers/bowtie:1.3.1--py310h7b97f60_6' }" - - input: - tuple val(meta2), path(fasta) - - output: - path 'fasta_bidx*' , emit: index - path "versions.yml", emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - bowtie-build ${fasta} fasta_bidx --threads ${task.cpus} - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') - END_VERSIONS - """ - -} diff --git a/modules/local/datatable_merge.nf b/modules/local/datatable_merge/main.nf similarity index 86% rename from modules/local/datatable_merge.nf rename to modules/local/datatable_merge/main.nf index c71b9c4d..e231a738 100644 --- a/modules/local/datatable_merge.nf +++ b/modules/local/datatable_merge/main.nf @@ -1,13 +1,13 @@ -process TABLE_MERGE { +process DATATABLE_MERGE { label 'process_medium' - conda 'conda-base::r-data.table=1.12.2' + conda 'conda-forge::r-data.table=1.12.2' container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
'https://depot.galaxyproject.org/singularity/r-data.table:1.12.2' : 'biocontainers/r-data.table:1.12.2' }" input: - path mirtop + tuple val(meta), path(mirtop) output: path "mirna.tsv" , emit: mirna_tsv diff --git a/modules/local/datatable_merge/tests/datatable_merge.nf.test b/modules/local/datatable_merge/tests/datatable_merge.nf.test new file mode 100644 index 00000000..c7485af8 --- /dev/null +++ b/modules/local/datatable_merge/tests/datatable_merge.nf.test @@ -0,0 +1,71 @@ +nextflow_process { + + name "Test Process DATATABLE_MERGE" + script "../main.nf" + process "DATATABLE_MERGE" + tag "modules" + tag "modules_local" + tag "datatable_merge" + + test("Contains hsa-miR-365b-3p, hsa-miR-7-5p, hsa-miR-103a-3p") { + + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = [[],file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/datatable_merge/small_mirtop_dataset.txt", checkIfExists: true)] + """ + } + } + + then { + assert process.success + assert snapshot(process.out).match() + + with(process.out.mirna_tsv) { + with(get(0)) { + assert get(0).endsWith(".tsv") + + // Check for specific miRNAs + def lines = path(get(0)).readLines() + assert lines.any { it.contains("hsa-miR-365b-3p") } + assert lines.any { it.contains("hsa-miR-7-5p") } + assert lines.any { it.contains("hsa-miR-103a-3p") } + } + } + } + } + + test("Does not contain hsa-miR-107, hsa-miR-365a-3p") { + + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = [[],file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/datatable_merge/small_mirtop_dataset.txt", checkIfExists: true)] + """ + } + } + + then { + assert process.success + assert snapshot(process.out).match() + + with(process.out.mirna_tsv) { + with(get(0)) { + assert get(0).endsWith(".tsv") + + // Check for the absence of specific miRNAs + def lines = path(get(0)).readLines() + assert !lines.any { it.contains("hsa-miR-107") } + assert !lines.any { 
it.contains("hsa-miR-365a-3p") } + } + } + } + } + +} diff --git a/modules/local/datatable_merge/tests/datatable_merge.nf.test.snap b/modules/local/datatable_merge/tests/datatable_merge.nf.test.snap new file mode 100644 index 00000000..7fce7ed9 --- /dev/null +++ b/modules/local/datatable_merge/tests/datatable_merge.nf.test.snap @@ -0,0 +1,48 @@ +{ + "Contains hsa-miR-365b-3p, hsa-miR-7-5p, hsa-miR-103a-3p": { + "content": [ + { + "0": [ + "mirna.tsv:md5,f59a6aeb15588c43c2977950a1b0a080" + ], + "1": [ + "versions.yml:md5,13bf3c8bbf1285dfc0ef547dcbb692b2" + ], + "mirna_tsv": [ + "mirna.tsv:md5,f59a6aeb15588c43c2977950a1b0a080" + ], + "versions": [ + "versions.yml:md5,13bf3c8bbf1285dfc0ef547dcbb692b2" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-30T12:57:47.129770995" + }, + "Does not contain hsa-miR-107, hsa-miR-365a-3p": { + "content": [ + { + "0": [ + "mirna.tsv:md5,f59a6aeb15588c43c2977950a1b0a080" + ], + "1": [ + "versions.yml:md5,13bf3c8bbf1285dfc0ef547dcbb692b2" + ], + "mirna_tsv": [ + "mirna.tsv:md5,f59a6aeb15588c43c2977950a1b0a080" + ], + "versions": [ + "versions.yml:md5,13bf3c8bbf1285dfc0ef547dcbb692b2" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-30T12:57:56.990602055" + } +} \ No newline at end of file diff --git a/modules/local/edger_qc.nf b/modules/local/edger_qc/main.nf similarity index 93% rename from modules/local/edger_qc.nf rename to modules/local/edger_qc/main.nf index 8c311457..2773df80 100644 --- a/modules/local/edger_qc.nf +++ b/modules/local/edger_qc/main.nf @@ -1,7 +1,7 @@ process EDGER_QC { label 'process_medium' - conda 'bioconda::bioconductor-limma=3.58.1 bioconda::bioconductor-edger=4.0.2 conda-forge::r-data.table=1.14.10 conda-forge::r-gplots=3.1.3 conda-forge::r-statmod=1.5.0' + conda 'bioconda::bioconductor-limma=3.58.1 bioconda::bioconductor-edger=4.0.16 conda-forge::r-data.table=1.14.10 conda-forge::r-gplots=3.1.3 
conda-forge::r-statmod=1.5.0' container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 'https://depot.galaxyproject.org/singularity/mulled-v2-419bd7f10b2b902489ac63bbaafc7db76f8e0ae1:f5ff7de321749bc7ae12f7e79a4b581497f4c8ce-0' : 'biocontainers/mulled-v2-419bd7f10b2b902489ac63bbaafc7db76f8e0ae1:f5ff7de321749bc7ae12f7e79a4b581497f4c8ce-0' }" diff --git a/modules/local/edger_qc/tests/edger_qc.nf.test b/modules/local/edger_qc/tests/edger_qc.nf.test new file mode 100644 index 00000000..d33f9c3b --- /dev/null +++ b/modules/local/edger_qc/tests/edger_qc.nf.test @@ -0,0 +1,73 @@ +nextflow_process { + + name "Test Process EDGER_QC" + script "../main.nf" + process "EDGER_QC" + + test("Should not produce MDS plot") { + + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = [file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Clone1_N1_mature.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Clone1_N1_mature_hairpin.sorted.idxstats") + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + // Snapshot only the stable files (.txt, .csv) and exclude PDFs + process.out.edger_files.get(0).findAll { !it.endsWith('pdf')}, + process.out.versions + ) + .match() } + ) + } + + } + + test("Should produce MDS plot") { + + when { + params { + outdir = "${outputDir}" + } + process { + """ + input[0] = [ + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Clone1_N1_mature.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Clone1_N1_mature_hairpin.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Clone1_N3_mature.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Clone1_N3_mature_hairpin.sorted.idxstats"), + 
file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Control_N1_mature.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Control_N1_mature_hairpin.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Control_N2_mature.sorted.idxstats"), + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/edger_qc/Control_N2_mature_hairpin.sorted.idxstats"), + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + // Snapshot only the stable files (.txt, .csv) and exclude PDFs + process.out.edger_files.get(0).findAll { !it.endsWith('pdf')}, + process.out.versions, + // Check MDS plot exists + file(process.out.edger_files[0].find { file(it).name == "hairpin_edgeR_MDS_plot.pdf" }).exists() + ) + .match() } + ) + } + } + +} diff --git a/modules/local/edger_qc/tests/edger_qc.nf.test.snap b/modules/local/edger_qc/tests/edger_qc.nf.test.snap new file mode 100644 index 00000000..4e1b8a65 --- /dev/null +++ b/modules/local/edger_qc/tests/edger_qc.nf.test.snap @@ -0,0 +1,57 @@ +{ + "Should not produce MDS plot": { + "content": [ + [ + "hairpin_counts.csv:md5,9a2c4c71862349eee5071cf08a81df52", + "hairpin_logtpm.csv:md5,590516d1c7447023933f055446d34552", + "hairpin_logtpm.txt:md5,5cbb1258c290d958910db677490596c0", + "hairpin_normalized_CPM.txt:md5,2f6685750d4c0aa1dc8150276f8a5a2d", + "hairpin_unmapped_read_counts.txt:md5,b3ca3b9f01dbdab1bdbd989769121794", + "mature_counts.csv:md5,17b953ef2fb4e58d83acc263f68755fd", + "mature_logtpm.csv:md5,b4654e4ec264243156b1ceab73503017", + "mature_logtpm.txt:md5,9cba6dd8336de7fe79be641285e92a73", + "mature_normalized_CPM.txt:md5,43db2854ec00e6afca25883b64ad67bd", + "mature_unmapped_read_counts.txt:md5,0e129ffe42aa32f96250a5071d3a7649" + ], + [ + "versions.yml:md5,2e5b1dd3ed5befd1d4c9812a3fcb768a" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + 
"timestamp": "2024-09-30T19:57:28.863043452" + }, + "Should produce MDS plot": { + "content": [ + [ + "hairpin_counts.csv:md5,4b0fa0e52a7b8b40bdc5930378430136", + "hairpin_edgeR_MDS_distance_matrix.txt:md5,f0eb20be2b7bae7775ef65e03139f5a9", + "hairpin_edgeR_MDS_plot_coordinates.txt:md5,2f1f865b11c4ee5253f80ebe9a1914ee", + "hairpin_log2CPM_sample_distances.txt:md5,20592bfa42e23827dfac02eab1e033ff", + "hairpin_logtpm.csv:md5,35a5449d3468995e8010907105922898", + "hairpin_logtpm.txt:md5,1de707003b6ed2c38372670d69eaf5fb", + "hairpin_normalized_CPM.txt:md5,d42e8eb89175107c5dfbfb2c7da98d37", + "hairpin_unmapped_read_counts.txt:md5,c587147fb1a5b6681c17eff2d4859022", + "mature_counts.csv:md5,f961a9d6749dbf0c84dfb8976e0b6516", + "mature_edgeR_MDS_distance_matrix.txt:md5,bfbf327feedbc2e7bbbd57020ae0594c", + "mature_edgeR_MDS_plot_coordinates.txt:md5,b89854153c61a348929ea3901a61bd56", + "mature_log2CPM_sample_distances.txt:md5,b4ed17084de4711e7fd4a12d221d65ec", + "mature_logtpm.csv:md5,850a8ed0e4559d338578f81dc849acf5", + "mature_logtpm.txt:md5,9087155e2f4bc7f85ced8ab8c02c77e6", + "mature_normalized_CPM.txt:md5,3bc348a1248f9597dfc9e8e465c3c8a8", + "mature_unmapped_read_counts.txt:md5,138cf290420edbf9721b9db861204c9c" + ], + [ + "versions.yml:md5,2e5b1dd3ed5befd1d4c9812a3fcb768a" + ], + true + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-30T19:59:15.428541578" + } +} \ No newline at end of file diff --git a/modules/local/filter_stats.nf b/modules/local/filter_stats.nf index 4c46f51d..2c51c35e 100644 --- a/modules/local/filter_stats.nf +++ b/modules/local/filter_stats.nf @@ -1,5 +1,6 @@ process FILTER_STATS { label 'process_medium' + tag "$meta.id" conda 'bowtie2=2.4.5' container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
@@ -7,12 +8,11 @@ process FILTER_STATS { 'biocontainers/bowtie2:2.4.5--py39hd2f7db1_2' }" input: - tuple val(meta), path(reads) - path stats_files + tuple val(meta), path(reads), path (stats_files) output: path "*_mqc.yaml" , emit: stats - tuple val(meta), path('*.filtered.fastq.gz'), emit: reads + tuple val(meta), path('*.filtered.fastq.gz'), emit: reads, optional: true path "versions.yml" , emit: versions when: @@ -20,17 +20,26 @@ process FILTER_STATS { script: """ - readnumber=\$(wc -l ${reads} | awk '{ print \$1/4 }') - cat ./filtered.${meta.id}_*.stats | \\ - tr '\n' ', ' | \\ + + if [[ ${reads} == *.gz ]]; then + readnumber=\$(zcat ${reads} | wc -l | awk '{ print \$1/4 }') + else + readnumber=\$(wc -l ${reads} | awk '{ print \$1/4 }') + fi + + cat ./*${meta.id}*.stats | \\ + tr '\\n' ', ' | \\ awk -v sample=${meta.id} -v readnumber=\$readnumber '{ print "id: \\"my_pca_section\\"\\nsection_name: \\"Contamination Filtering\\"\\ndescription: \\"This plot shows the amount of reads filtered by contaminant type.\\"\\nplot_type: \\"bargraph\\"\\npconfig:\\n id: \\"contamination_filter_plot\\"\\n title: \\"Contamination Plot\\"\\n ylab: \\"Number of reads\\"\\ndata:\\n "sample": {"\$0"\\"remaining reads\\": "readnumber"}" }' > ${meta.id}.contamination_mqc.yaml - gzip -c ${reads} > ${meta.id}.filtered.fastq.gz + + if [[ ${reads} == *.gz ]]; then + cp ${reads} ${meta.id}.filtered.fastq.gz + else + gzip -c ${reads} > ${meta.id}.filtered.fastq.gz + fi cat <<-END_VERSIONS > versions.yml "${task.process}": - cat: \$(cat --version | grep 'cat ' |sed 's/cat (GNU coreutils) //') - gzip: \$(gzip --version | grep "gzip" | sed 's/gzip //') - tr: \$(tr --version | grep 'tr ' |sed 's/tr (GNU coreutils) //') + BusyBox: \$(busybox | sed -n -E 's/.*v([[:digit:].]+)\\s\\(.*/\\1/p') END_VERSIONS """ } diff --git a/modules/local/mirdeep2_mapper.nf b/modules/local/mirdeep2_mapper.nf deleted file mode 100644 index 19a9c5dc..00000000 --- a/modules/local/mirdeep2_mapper.nf +++ /dev/null @@ 
-1,43 +0,0 @@ -def VERSION = '2.0.1' - -process MIRDEEP2_MAPPER { - label 'process_medium' - tag "$meta.id" - - conda 'bioconda::mirdeep2=2.0.1.3' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mirdeep2:2.0.1.3--hdfd78af_1' : - 'biocontainers/mirdeep2:2.0.1.3--hdfd78af_1' }" - - input: - tuple val(meta), path(reads) - path index - - output: - tuple path('*_collapsed.fa'), path('*reads_vs_refdb.arf'), emit: mirdeep2_inputs - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def index_base = index.toString().tokenize(' ')[0].tokenize('.')[0] - """ - mapper.pl \\ - $reads \\ - -e \\ - -h \\ - -i \\ - -j \\ - -m \\ - -p $index_base \\ - -s ${meta.id}_collapsed.fa \\ - -t ${meta.id}_reads_vs_refdb.arf \\ - -o 4 - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - mapper: \$(echo "$VERSION") - END_VERSIONS - """ -} diff --git a/modules/local/mirdeep2_prepare.nf b/modules/local/mirdeep2_prepare.nf deleted file mode 100644 index ce66b9f1..00000000 --- a/modules/local/mirdeep2_prepare.nf +++ /dev/null @@ -1,31 +0,0 @@ -process MIRDEEP2_PIGZ { - label 'process_low' - tag "$meta.id" - - // TODO maybe create a mulled container and uncompress within mirdeep2_mapper? - conda 'bioconda::bioconvert=1.1.1' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/bioconvert:1.1.1--pyhdfd78af_0' : - 'biocontainers/bioconvert:1.1.1--pyhdfd78af_0' }" - - input: - tuple val(meta), path(reads) - - output: - tuple val(meta), path("*.{fastq,fq}"), emit: reads - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - pigz -f -d -p $task.cpus $reads - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) - END_VERSIONS - """ - -} diff --git a/modules/local/mirdeep2_run.nf b/modules/local/mirdeep2_run.nf deleted file mode 100644 index ba37a4ac..00000000 --- a/modules/local/mirdeep2_run.nf +++ /dev/null @@ -1,42 +0,0 @@ -def VERSION = '2.0.1' - -process MIRDEEP2_RUN { - label 'process_medium' - errorStrategy 'ignore' - - conda 'bioconda::mirdeep2=2.0.1.3' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mirdeep2:2.0.1.3--hdfd78af_1' : - 'biocontainers/mirdeep2:2.0.1.3--hdfd78af_1' }" - - input: - path(fasta) - tuple path(reads), path(arf) - path(hairpin) - path(mature) - - output: - path 'result*.{bed,csv,html}', emit: result - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - miRDeep2.pl \\ - $reads \\ - $fasta \\ - $arf \\ - $mature \\ - none \\ - $hairpin \\ - -d \\ - -z _${reads.simpleName} - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - mirdeep2: \$(echo "$VERSION") - END_VERSIONS - """ -} diff --git a/modules/local/mirtop_quant.nf b/modules/local/mirtop_quant.nf deleted file mode 100644 index ab38c93d..00000000 --- a/modules/local/mirtop_quant.nf +++ /dev/null @@ -1,42 +0,0 @@ -process MIRTOP_QUANT { - label 'process_medium' - - conda 'mirtop=0.4.25 bioconda::samtools=1.15.1 conda-base::r-base=4.1.1 conda-base::r-data.table=1.14.2' - container "${ workflow.containerEngine == 'singularity' && 
!task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mulled-v2-0c13ef770dd7cc5c76c2ce23ba6669234cf03385:63be019f50581cc5dfe4fc0f73ae50f2d4d661f7-0' : - 'biocontainers/mulled-v2-0c13ef770dd7cc5c76c2ce23ba6669234cf03385:63be019f50581cc5dfe4fc0f73ae50f2d4d661f7-0' }" - - input: - path ("bams/*") - path hairpin - path gtf - - output: - path "mirtop/mirtop.gff" , emit: mirtop_gff - path "mirtop/mirtop.tsv" , emit: mirtop_table - path "mirtop/mirtop_rawData.tsv", emit: mirtop_rawdata - path "mirtop/stats/*" , emit: logs - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def filter_species = params.mirgenedb ? params.mirgenedb_species : params.mirtrace_species - """ - #Cleanup the GTF if mirbase html form is broken - GTF="$gtf" - sed 's/&gt;/>/g' \$GTF | sed 's#<br>#\\n#g' | sed 's#</p>##g' | sed 's#<p>##g' | sed -e :a -e '/^\\n*\$/{\$d;N;};/\\n\$/ba' > \${GTF}_html_cleaned.gtf - mirtop gff --hairpin $hairpin --gtf \${GTF}_html_cleaned.gtf -o mirtop --sps $filter_species ./bams/* - mirtop counts --hairpin $hairpin --gtf \${GTF}_html_cleaned.gtf -o mirtop --sps $filter_species --add-extra --gff mirtop/mirtop.gff - mirtop export --format isomir --hairpin $hairpin --gtf \${GTF}_html_cleaned.gtf --sps $filter_species -o mirtop mirtop/mirtop.gff - mirtop stats mirtop/mirtop.gff --out mirtop/stats - mv mirtop/stats/mirtop_stats.log mirtop/stats/full_mirtop_stats.log - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') - END_VERSIONS - """ - -} diff --git a/modules/local/mirtrace.nf b/modules/local/mirtrace.nf deleted file mode 100644 index 87526016..00000000 --- a/modules/local/mirtrace.nf +++ /dev/null @@ -1,46 +0,0 @@ -process MIRTRACE_RUN { - label 'process_medium' - - conda 'bioconda::mirtrace=1.0.1' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mirtrace:1.0.1--hdfd78af_1' : - 'biocontainers/mirtrace:1.0.1--hdfd78af_1' }" - - input: - tuple val(adapter), val(ids), path(reads) - path(mirtrace_config) - - output: - path "mirtrace/*" , emit: mirtrace - path "versions.yml", emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - // mirtrace protocol defaults to 'params.protocol' if not set - def protocol = params.protocol == 'custom' ?
'' : "--protocol $params.protocol" - def java_mem = '' - if(task.memory){ - tmem = task.memory.toBytes() - java_mem = "-Xms${tmem} -Xmx${tmem}" - } - - """ - export mirtracejar=\$(dirname \$(which mirtrace)) - - java $java_mem -jar \$mirtracejar/mirtrace.jar --mirtrace-wrapper-name mirtrace qc \\ - --species $params.mirtrace_species \\ - $protocol \\ - --config $mirtrace_config \\ - --write-fasta \\ - --output-dir mirtrace \\ - --force - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - mirtrace: \$(echo \$(mirtrace -v)) - END_VERSIONS - """ - -} diff --git a/modules/local/parse_fasta_mirna.nf b/modules/local/parse_fasta_mirna.nf index 60665251..b474e1c7 100644 --- a/modules/local/parse_fasta_mirna.nf +++ b/modules/local/parse_fasta_mirna.nf @@ -1,13 +1,14 @@ process PARSE_FASTA_MIRNA { label 'process_medium' - conda 'bioconda::seqkit=2.6.1' + conda 'bioconda::seqkit=2.8.2' container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 'https://depot.galaxyproject.org/singularity/seqkit:2.6.1--h9ee0642_0' : 'biocontainers/seqkit:2.6.1--h9ee0642_0' }" input: tuple val(meta2), path(fasta) + val filter_species output: tuple val(meta2), path('*_igenome.fa'), emit: parsed_fasta @@ -17,7 +18,6 @@ process PARSE_FASTA_MIRNA { task.ext.when == null || task.ext.when script: - def filter_species = params.mirgenedb ? params.mirgenedb_species : params.mirtrace_species """ # Uncompress FASTA reference files if necessary FASTA="$fasta" diff --git a/modules/local/seqcluster_collapse.nf b/modules/local/seqcluster_collapse.nf deleted file mode 100644 index 4379654c..00000000 --- a/modules/local/seqcluster_collapse.nf +++ /dev/null @@ -1,33 +0,0 @@ -process SEQCLUSTER_SEQUENCES { - label 'process_medium' - tag "$meta.id" - - conda 'bioconda::seqcluster=1.2.9' - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/seqcluster:1.2.9--pyh5e36f6f_0' : - 'biocontainers/seqcluster:1.2.9--pyh5e36f6f_0' }" - - input: - tuple val(meta), path(reads) - - output: - tuple val(meta), path("final/*.fastq.gz"), emit: collapsed - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - """ - seqcluster collapse -f $reads -m 1 --min_size 15 -o collapsed - gzip collapsed/*_trimmed.fastq - mkdir final - mv collapsed/*.fastq.gz final/. - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - seqcluster: \$(echo \$(seqcluster --version 2>&1) | sed 's/^.*seqcluster //') - END_VERSIONS - """ - -} diff --git a/modules/nf-core/bioawk/bioawk.diff b/modules/nf-core/bioawk/bioawk.diff new file mode 100644 index 00000000..bd9ed322 --- /dev/null +++ b/modules/nf-core/bioawk/bioawk.diff @@ -0,0 +1,25 @@ +Changes in module 'nf-core/bioawk' +--- modules/nf-core/bioawk/main.nf ++++ modules/nf-core/bioawk/main.nf +@@ -11,7 +11,7 @@ + tuple val(meta), path(input) + + output: +- tuple val(meta), path("*.gz"), emit: output ++ tuple val(meta), path("*.fasta"), emit: output + path "versions.yml" , emit: versions + + when: +@@ -26,9 +26,7 @@ + bioawk \\ + $args \\ + $input \\ +- > ${prefix} +- +- gzip ${prefix} ++ > ${prefix}.fasta + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + +************************************************************ diff --git a/modules/nf-core/bioawk/environment.yml b/modules/nf-core/bioawk/environment.yml new file mode 100644 index 00000000..527f6cd4 --- /dev/null +++ b/modules/nf-core/bioawk/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bioawk=1.0 diff --git a/modules/nf-core/bioawk/main.nf b/modules/nf-core/bioawk/main.nf new file mode 100644 index 00000000..3ae62108 --- /dev/null +++ b/modules/nf-core/bioawk/main.nf @@ -0,0 +1,36 @@ +process BIOAWK { + tag "$meta.id" + label 'process_single' + + conda 
"${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bioawk:1.0--h5bf99c6_6': + 'biocontainers/bioawk:1.0--h5bf99c6_6' }" + + input: + tuple val(meta), path(input) + + output: + tuple val(meta), path("*.fasta"), emit: output + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' // args is used for the main arguments of the tool + prefix = task.ext.prefix ?: "${meta.id}" + if ("${input}" == "${prefix}") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" + def VERSION = '1.0' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. + """ + bioawk \\ + $args \\ + $input \\ + > ${prefix}.fasta + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bioawk: $VERSION + END_VERSIONS + """ +} diff --git a/modules/nf-core/bioawk/meta.yml b/modules/nf-core/bioawk/meta.yml new file mode 100644 index 00000000..c691ac0c --- /dev/null +++ b/modules/nf-core/bioawk/meta.yml @@ -0,0 +1,51 @@ +name: "bioawk" +description: Bioawk is an extension to Brian Kernighan's awk, adding the support of + several common biological data formats. +keywords: + - bioawk + - fastq + - fasta + - sam + - file manipulation + - awk +tools: + - "bioawk": + description: "BWK awk modified for biological data" + homepage: "https://github.com/lh3/bioawk" + documentation: "https://github.com/lh3/bioawk" + tool_dev_url: "https://github.com/lh3/bioawk" + licence: ["Free software license (https://github.com/lh3/bioawk/blob/master/README.awk#L1)"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: Input sequence biological sequence file (optionally gzipped) to + be manipulated via program specified in `$args`. + pattern: "*.{bed,gff,sam,vcf,fastq,fasta,tab,bed.gz,gff.gz,sam.gz,vcf.gz,fastq.gz,fasta.gz,tab.gz}" +output: + - output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.gz": + type: file + description: | + Manipulated and gzipped version of input sequence file following program specified in `args`. + File name will be what is specified in `$prefix`. Do not include `.gz` suffix in `$prefix`! Output files` will be gzipped for you! + pattern: "*.gz" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@jfy133" +maintainers: + - "@jfy133" diff --git a/modules/nf-core/bioawk/tests/main.nf.test b/modules/nf-core/bioawk/tests/main.nf.test new file mode 100644 index 00000000..270ff1ef --- /dev/null +++ b/modules/nf-core/bioawk/tests/main.nf.test @@ -0,0 +1,35 @@ + +nextflow_process { + + name "Test Process BIOAWK" + script "../main.nf" + process "BIOAWK" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "bioawk" + + test("test-bioawk") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/bioawk/tests/main.nf.test.snap b/modules/nf-core/bioawk/tests/main.nf.test.snap new file mode 100644 index 00000000..fa9b5930 --- /dev/null +++ b/modules/nf-core/bioawk/tests/main.nf.test.snap @@ -0,0 +1,37 @@ +{ + "test-bioawk": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + 
"sample_1.fa.gz:md5,b558dd15d8940373a032a827d490e693" + ] + ], + "1": [ + "versions.yml:md5,5fe88e58a71f10551df56518c35ba91a" + ], + "output": [ + [ + { + "id": "test", + "single_end": false + }, + "sample_1.fa.gz:md5,b558dd15d8940373a032a827d490e693" + ] + ], + "versions": [ + "versions.yml:md5,5fe88e58a71f10551df56518c35ba91a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-28T10:24:46.397249" + } +} \ No newline at end of file diff --git a/modules/nf-core/bioawk/tests/nextflow.config b/modules/nf-core/bioawk/tests/nextflow.config new file mode 100644 index 00000000..5ef017d9 --- /dev/null +++ b/modules/nf-core/bioawk/tests/nextflow.config @@ -0,0 +1,6 @@ +process { + withName: BIOAWK { + ext.args = "-c fastx \'{print \">\" \$name ORS length(\$seq)}\'" + ext.prefix = "sample_1.fa" + } +} diff --git a/modules/nf-core/blat/environment.yml b/modules/nf-core/blat/environment.yml new file mode 100644 index 00000000..2a85c078 --- /dev/null +++ b/modules/nf-core/blat/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::blat=36 diff --git a/modules/nf-core/blat/main.nf b/modules/nf-core/blat/main.nf new file mode 100644 index 00000000..ad7b7207 --- /dev/null +++ b/modules/nf-core/blat/main.nf @@ -0,0 +1,62 @@ +process BLAT { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/blat:36--0': + 'biocontainers/blat:36--0' }" + + input: + tuple val(meta) , path(query) + tuple val(meta2), path(subject) + + output: + tuple val(meta), path("*.psl"), emit: psl + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def unzip = query.toString().endsWith(".gz") + + """ + in=$query + if $unzip + then + gunzip -cdf $query > ${prefix}.fasta + in=${prefix}.fasta + fi + + blat \\ + $args \\ + $subject \\ + \$in \\ + ${prefix}.psl + + if $unzip + then + rm ${prefix}.fasta + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + blat: \$(echo \$(blat 2>&1) | sed 's/^.*BLAT v. //; s/ fast.*\$//') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.psl + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + blat: \$(echo \$(blat 2>&1) | sed 's/^.*BLAT v. //; s/ fast.*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/blat/meta.yml b/modules/nf-core/blat/meta.yml new file mode 100644 index 00000000..70a92c9b --- /dev/null +++ b/modules/nf-core/blat/meta.yml @@ -0,0 +1,55 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json +name: "blat" +description: Queries a sequence subject +keywords: + - blat + - sequence + - search +tools: + - "blat": + description: "BLAT is a bioinformatics software tool which performs rapid mRNA/DNA + and cross-species protein alignments." + homepage: "https://kentinformatics.com/" + documentation: "https://kentinformatics.com/documentation" + doi: "10.1101/gr.229202" + licence: ["Free for academic, nonprofit and personal use"] + identifier: biotools:blat +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - query: + type: file + description: Sequence file + pattern: "*.{fasta,fasta.gz,fa,fa.gz,nib,2bit}" + - - meta2: + type: map + description: | + Groovy Map containing subject information + e.g. `[ id:'test', single_end:false ]` + - subject: + type: file + description: Sequence file + pattern: "*.{fa,nib,2bit}" +output: + - psl: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - "*.psl": + type: file + description: Search results + pattern: "*.{psl}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@d-jch" +maintainers: + - "@d-jch" diff --git a/modules/nf-core/blat/tests/main.nf.test b/modules/nf-core/blat/tests/main.nf.test new file mode 100644 index 00000000..8b07e5cf --- /dev/null +++ b/modules/nf-core/blat/tests/main.nf.test @@ -0,0 +1,75 @@ + +nextflow_process { + + name "Test Process BLAT" + script "../main.nf" + process "BLAT" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "blat" + tag "seqtk/seq" + + setup { + run("SEQTK_SEQ") { + script "../../seqtk/seq/main.nf" + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + + """ + } + } + } + + test("test-blat") { + + when { + process { + """ + input[0] = SEQTK_SEQ.out.fastx + input[1] = [ + [ id:'sarscov2' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test-blat-stub") { + options '-stub' + when { + process { + """ + input[0] = SEQTK_SEQ.out.fastx + input[1] = [ + [ id:'sarscov2' ], + file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/blat/tests/main.nf.test.snap b/modules/nf-core/blat/tests/main.nf.test.snap new file mode 100644 index 00000000..d46a3320 --- /dev/null +++ b/modules/nf-core/blat/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "test-blat": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.psl:md5,6e2e5b3be48c84877f3c54b32bb9ec33" + ] + ], + "1": [ + "versions.yml:md5,d9cde833b3f9cf6d359ef0f8a119380a" + ], + "psl": [ + [ + { + "id": "test", + "single_end": false + }, + "test.psl:md5,6e2e5b3be48c84877f3c54b32bb9ec33" + ] + ], + "versions": [ + "versions.yml:md5,d9cde833b3f9cf6d359ef0f8a119380a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-06T20:38:03.56409" + }, + "test-blat-stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.psl:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,d9cde833b3f9cf6d359ef0f8a119380a" + ], + "psl": [ + [ + { + "id": "test", + "single_end": false + }, + "test.psl:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,d9cde833b3f9cf6d359ef0f8a119380a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-06T20:38:09.736595" + } +} \ No newline at end of file diff --git a/modules/nf-core/blat/tests/nextflow.config b/modules/nf-core/blat/tests/nextflow.config new file mode 100644 index 00000000..58bc3f25 --- /dev/null +++ b/modules/nf-core/blat/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: SEQTK_SEQ { + ext.args = '-A' + } +} \ No newline at end of file diff --git a/modules/nf-core/bowtie/align/environment.yml b/modules/nf-core/bowtie/align/environment.yml new file mode 100644 index 
00000000..4434c7e7 --- /dev/null +++ b/modules/nf-core/bowtie/align/environment.yml @@ -0,0 +1,6 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bowtie=1.3.0 + - bioconda::samtools=1.16.1 diff --git a/modules/nf-core/bowtie/align/main.nf b/modules/nf-core/bowtie/align/main.nf new file mode 100644 index 00000000..5e72b02a --- /dev/null +++ b/modules/nf-core/bowtie/align/main.nf @@ -0,0 +1,77 @@ +process BOWTIE_ALIGN { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:c84c7c55c45af231883d9ff4fe706ac44c479c36-0' : + 'biocontainers/mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3:c84c7c55c45af231883d9ff4fe706ac44c479c36-0' }" + + input: + tuple val(meta), path(reads) + tuple val(meta2), path(index) + val (save_unaligned) + + output: + tuple val(meta), path('*.bam') , emit: bam + tuple val(meta), path('*.out') , emit: log + tuple val(meta), path('*fastq.gz') , emit: fastq, optional : true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def unaligned = save_unaligned ? "--un ${prefix}.unmapped.fastq" : '' + def endedness = meta.single_end ? 
"$reads" : "-1 ${reads[0]} -2 ${reads[1]}" + """ + INDEX=\$(find -L ./ -name "*.3.ebwt" | sed 's/\\.3.ebwt\$//') + bowtie \\ + --threads $task.cpus \\ + --sam \\ + -x \$INDEX \\ + -q \\ + $unaligned \\ + $args \\ + $endedness \\ + 2> >(tee ${prefix}.out >&2) \\ + | samtools view $args2 -@ $task.cpus -bS -o ${prefix}.bam - + + if [ -f ${prefix}.unmapped.fastq ]; then + gzip ${prefix}.unmapped.fastq + fi + if [ -f ${prefix}.unmapped_1.fastq ]; then + gzip ${prefix}.unmapped_1.fastq + gzip ${prefix}.unmapped_2.fastq + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + def unaligned = save_unaligned ? + meta.single_end ? "echo '' | gzip > ${prefix}.unmapped.fastq.gz" : + "echo '' | gzip > ${prefix}.unmapped_1.fastq.gz; echo '' | gzip > ${prefix}.unmapped_2.fastq.gz" + : '' + """ + touch ${prefix}.bam + touch ${prefix}.out + $unaligned + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS + """ + + +} diff --git a/modules/nf-core/bowtie/align/meta.yml b/modules/nf-core/bowtie/align/meta.yml new file mode 100644 index 00000000..7b346802 --- /dev/null +++ b/modules/nf-core/bowtie/align/meta.yml @@ -0,0 +1,80 @@ +name: bowtie_align +description: Align reads to a reference genome using bowtie +keywords: + - align + - map + - fastq + - fasta + - genome + - reference +tools: + - bowtie: + description: | + bowtie is a software package for mapping DNA sequences against + a large reference genome, such as the human genome. 
+ homepage: http://bowtie-bio.sourceforge.net/index.shtml + documentation: http://bowtie-bio.sourceforge.net/manual.shtml + arxiv: arXiv:1303.3997 + licence: ["Artistic-2.0"] + identifier: biotools:bowtie +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. + - - meta2: + type: map + description: | + Groovy Map containing genome information + e.g. [ id:'sarscov2' ] + - index: + type: file + description: Bowtie genome index files + pattern: "*.ebwt" + - - save_unaligned: + type: boolean + description: Whether to save fastq files containing the reads which did not + align. +output: + - bam: + - meta: + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" + - "*.bam": + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" + - log: + - meta: + type: file + description: Log file + pattern: "*.log" + - "*.out": + type: file + description: Log file + pattern: "*.log" + - fastq: + - meta: + type: file + description: Unaligned FastQ files + pattern: "*.fastq.gz" + - "*fastq.gz": + type: file + description: Unaligned FastQ files + pattern: "*.fastq.gz" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@kevinmenden" +maintainers: + - "@kevinmenden" diff --git a/modules/nf-core/bowtie/align/tests/main.nf.test b/modules/nf-core/bowtie/align/tests/main.nf.test new file mode 100644 index 00000000..3403ae22 --- /dev/null +++ b/modules/nf-core/bowtie/align/tests/main.nf.test @@ -0,0 +1,129 @@ +nextflow_process { + + name "Test Process BOWTIE_ALIGN" + script "../main.nf" + process "BOWTIE_ALIGN" + + tag "modules" + tag "modules_nfcore" + tag "bowtie" + tag "bowtie/align" + tag "bowtie/build" + + + setup { + 
run("BOWTIE_BUILD") { + script "../../../bowtie/build/main.nf" + process { + """ + input[0] = [[ id:'sarscov2' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] + """ + } + } + } + + test("sarscov2 - single_end") { + + when { + process { + """ + input[0] = [ [id:"test", single_end:true], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE_BUILD.out.index + input[2] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.versions, + process.out.bam.collect { bam(it[1]).getReadsMD5() }, + process.out.fastq, + process.out.log + ).match() } + ) + } + + } + + test("sarscov2 - single_end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [id:"test", single_end:true], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE_BUILD.out.index + input[2] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - paired_end") { + + when { + process { + """ + input[0] = [ [id:"test", single_end:false], + [file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ] + input[1] = BOWTIE_BUILD.out.index + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.versions, + process.out.bam.collect { bam(it[1]).getReads(2) }, + process.out.log + ).match() } + ) + } + + } + + test("sarscov2 - paired_end - stub") { + + options "-stub" + when { + process { + """ + input[0] = [ [id:"test", single_end:false], + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE_BUILD.out.index + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/bowtie/align/tests/main.nf.test.snap b/modules/nf-core/bowtie/align/tests/main.nf.test.snap new file mode 100644 index 00000000..de95bb81 --- /dev/null +++ b/modules/nf-core/bowtie/align/tests/main.nf.test.snap @@ -0,0 +1,192 @@ +{ + "sarscov2 - single_end": { + "content": [ + [ + "versions.yml:md5,96e36b0b99c80da0be8239d03db30ecc" + ], + [ + "7bdcfc6f54ae6e8f4570395cc85db9a3" + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.unmapped.fastq.gz:md5,5729a694abd09657da3b9101861090c4" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.out:md5,4b9140ceadb8a18ae9330885370f8a0b" + ] + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-26T09:25:24.60746041" + }, + "sarscov2 - single_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.out:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.unmapped.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "3": [ + "versions.yml:md5,96e36b0b99c80da0be8239d03db30ecc" + ], + "bam": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastq": [ + [ + { + "id": "test", + "single_end": true + }, + "test.unmapped.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.out:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,96e36b0b99c80da0be8239d03db30ecc" + ] + } + ], + "meta": { + 
"nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-25T10:00:28.666281812" + }, + "sarscov2 - paired_end": { + "content": [ + [ + "versions.yml:md5,96e36b0b99c80da0be8239d03db30ecc" + ], + [ + [ + "ATGTGTACATTGGCGACCCTGCTCAATTACCTGCACCACGCACATTGCTAACTAAGGGCACACTAGAACCAGAATATTTCAATTCAGTGTGTAGACTTATGAAAACTATAGGTCCAGACATGTTCCTCGGAACTTGTCGGCGTTGTCCTG", + "ACGCACATTGCTAACTAAGGGCACACTAGAACCAGAATATTTCAATTCAGTGTGTAGACTTATGAAAACTATAGGTCCAGACATGTTCCTCGGAACTTGTCGGCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTGGTTTATGA" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.out:md5,5e13272d112cef8faeedcdbd7c602de0" + ] + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-26T11:57:56.604464368" + }, + "sarscov2 - paired_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.out:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,96e36b0b99c80da0be8239d03db30ecc" + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastq": [ + + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.out:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,96e36b0b99c80da0be8239d03db30ecc" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-25T10:01:02.043164876" + } +} \ No newline at end of file diff --git a/modules/nf-core/bowtie/align/tests/tags.yml b/modules/nf-core/bowtie/align/tests/tags.yml new file mode 100644 index 00000000..a5753d58 --- /dev/null +++ b/modules/nf-core/bowtie/align/tests/tags.yml @@ -0,0 +1,2 @@ +bowtie/align: + - "modules/nf-core/bowtie/align/**" diff --git a/modules/nf-core/bowtie/build/environment.yml 
b/modules/nf-core/bowtie/build/environment.yml new file mode 100644 index 00000000..ab5a8422 --- /dev/null +++ b/modules/nf-core/bowtie/build/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bowtie=1.3.0 diff --git a/modules/nf-core/bowtie/build/main.nf b/modules/nf-core/bowtie/build/main.nf new file mode 100644 index 00000000..d5b4c690 --- /dev/null +++ b/modules/nf-core/bowtie/build/main.nf @@ -0,0 +1,50 @@ +process BOWTIE_BUILD { + tag "${meta.id}" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/bowtie:1.3.0--py38hed8969a_1' : + 'biocontainers/bowtie:1.3.0--py38hed8969a_1' }" + + input: + tuple val(meta), path(fasta) + + output: + tuple val(meta), path('bowtie') , emit: index + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir -p bowtie + bowtie-build --threads $task.cpus $fasta bowtie/${prefix} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir -p bowtie + touch bowtie/${prefix}.1.ebwt + touch bowtie/${prefix}.2.ebwt + touch bowtie/${prefix}.3.ebwt + touch bowtie/${prefix}.4.ebwt + touch bowtie/${prefix}.rev.1.ebwt + touch bowtie/${prefix}.rev.2.ebwt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie: \$(echo \$(bowtie --version 2>&1) | sed 's/^.*bowtie-align-s version //; s/ .*\$//') + END_VERSIONS + """ + +} diff --git a/modules/nf-core/bowtie/build/meta.yml b/modules/nf-core/bowtie/build/meta.yml new file mode 100644 index 00000000..a878a5b7 --- /dev/null +++ 
b/modules/nf-core/bowtie/build/meta.yml @@ -0,0 +1,48 @@ +name: bowtie_build +description: Create bowtie index for reference genome +keywords: + - index + - fasta + - genome + - reference +tools: + - bowtie: + description: | + bowtie is a software package for mapping DNA sequences against + a large reference genome, such as the human genome. + homepage: http://bowtie-bio.sourceforge.net/index.shtml + documentation: http://bowtie-bio.sourceforge.net/manual.shtml + arxiv: arXiv:1303.3997 + licence: ["Artistic-2.0"] + identifier: biotools:bowtie +input: + - - meta: + type: map + description: | + Groovy Map containing information about the genome fasta + e.g. [ id:'test' ] + - fasta: + type: file + description: Input genome fasta file +output: + - index: + - meta: + type: map + description: | + Groovy Map containing nformation about the genome fasta + e.g. [ id:'test' ] + - bowtie: + type: file + description: Folder containing bowtie genome index files + pattern: "*.ebwt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@kevinmenden" + - "@drpatelh" +maintainers: + - "@kevinmenden" + - "@drpatelh" diff --git a/modules/nf-core/bowtie/build/tests/main.nf.test b/modules/nf-core/bowtie/build/tests/main.nf.test new file mode 100644 index 00000000..25fb3dad --- /dev/null +++ b/modules/nf-core/bowtie/build/tests/main.nf.test @@ -0,0 +1,57 @@ +nextflow_process { + + name "Test Process BOWTIE_BUILD" + script "../main.nf" + process "BOWTIE_BUILD" + + tag "modules" + tag "modules_nfcore" + tag "bowtie" + tag "bowtie/build" + + test("sarscov2 - fasta") { + + when { + process { + """ + input[0] = [ + [id: 'sarscov2'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - fasta - stub") { + + options 
"-stub" + + when { + process { + """ + input[0] = [[id: 'sarscov2'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/bowtie/build/tests/main.nf.test.snap b/modules/nf-core/bowtie/build/tests/main.nf.test.snap new file mode 100644 index 00000000..e8061756 --- /dev/null +++ b/modules/nf-core/bowtie/build/tests/main.nf.test.snap @@ -0,0 +1,96 @@ +{ + "sarscov2 - fasta - stub": { + "content": [ + { + "0": [ + [ + { + "id": "sarscov2" + }, + [ + "sarscov2.1.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.2.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.3.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.4.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.rev.1.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.rev.2.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,afbd066e1dd5ae4a30b21c49149ea09a" + ], + "index": [ + [ + { + "id": "sarscov2" + }, + [ + "sarscov2.1.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.2.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.3.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.4.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.rev.1.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sarscov2.rev.2.ebwt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,afbd066e1dd5ae4a30b21c49149ea09a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-18T08:38:14.852528155" + }, + "sarscov2 - fasta": { + "content": [ + { + "0": [ + [ + { + "id": "sarscov2" + }, + [ + "sarscov2.1.ebwt:md5,d9b76ecf9fd0413240173273b38d8199", + "sarscov2.2.ebwt:md5,02b44af9f94c62ecd3c583048e25d4cf", + "sarscov2.3.ebwt:md5,4ed93abba181d8dfab2e303e33114777", + 
"sarscov2.4.ebwt:md5,c25be5f8b0378abf7a58c8a880b87626", + "sarscov2.rev.1.ebwt:md5,b37aaf11853e65a3b13561f27a912b06", + "sarscov2.rev.2.ebwt:md5,9e6b0c4c1ddb99ae71ff8a4fe5ec6459" + ] + ] + ], + "1": [ + "versions.yml:md5,afbd066e1dd5ae4a30b21c49149ea09a" + ], + "index": [ + [ + { + "id": "sarscov2" + }, + [ + "sarscov2.1.ebwt:md5,d9b76ecf9fd0413240173273b38d8199", + "sarscov2.2.ebwt:md5,02b44af9f94c62ecd3c583048e25d4cf", + "sarscov2.3.ebwt:md5,4ed93abba181d8dfab2e303e33114777", + "sarscov2.4.ebwt:md5,c25be5f8b0378abf7a58c8a880b87626", + "sarscov2.rev.1.ebwt:md5,b37aaf11853e65a3b13561f27a912b06", + "sarscov2.rev.2.ebwt:md5,9e6b0c4c1ddb99ae71ff8a4fe5ec6459" + ] + ] + ], + "versions": [ + "versions.yml:md5,afbd066e1dd5ae4a30b21c49149ea09a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-18T08:37:53.65689025" + } +} \ No newline at end of file diff --git a/modules/nf-core/bowtie/build/tests/tags.yml b/modules/nf-core/bowtie/build/tests/tags.yml new file mode 100644 index 00000000..1ccfa30c --- /dev/null +++ b/modules/nf-core/bowtie/build/tests/tags.yml @@ -0,0 +1,2 @@ +bowtie/build: + - "modules/nf-core/bowtie/build/**" diff --git a/modules/nf-core/bowtie2/align/environment.yml b/modules/nf-core/bowtie2/align/environment.yml new file mode 100644 index 00000000..9090f218 --- /dev/null +++ b/modules/nf-core/bowtie2/align/environment.yml @@ -0,0 +1,7 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::bowtie2=2.5.2 + - bioconda::samtools=1.18 + - conda-forge::pigz=2.6 diff --git a/modules/nf-core/bowtie2/align/main.nf b/modules/nf-core/bowtie2/align/main.nf new file mode 100644 index 00000000..809525ad --- /dev/null +++ b/modules/nf-core/bowtie2/align/main.nf @@ -0,0 +1,117 @@ +process BOWTIE2_ALIGN { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:f70b31a2db15c023d641c32f433fb02cd04df5a6-0' : + 'biocontainers/mulled-v2-ac74a7f02cebcfcc07d8e8d1d750af9c83b4d45a:f70b31a2db15c023d641c32f433fb02cd04df5a6-0' }" + + input: + tuple val(meta) , path(reads) + tuple val(meta2), path(index) + tuple val(meta3), path(fasta) + val save_unaligned + val sort_bam + + output: + tuple val(meta), path("*.sam") , emit: sam , optional:true + tuple val(meta), path("*.bam") , emit: bam , optional:true + tuple val(meta), path("*.cram") , emit: cram , optional:true + tuple val(meta), path("*.csi") , emit: csi , optional:true + tuple val(meta), path("*.crai") , emit: crai , optional:true + tuple val(meta), path("*.log") , emit: log + tuple val(meta), path("*fastq.gz") , emit: fastq , optional:true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: "" + def args2 = task.ext.args2 ?: "" + def prefix = task.ext.prefix ?: "${meta.id}" + + def unaligned = "" + def reads_args = "" + if (meta.single_end) { + unaligned = save_unaligned ? "--un-gz ${prefix}.unmapped.fastq.gz" : "" + reads_args = "-U ${reads}" + } else { + unaligned = save_unaligned ? "--un-conc-gz ${prefix}.unmapped.fastq.gz" : "" + reads_args = "-1 ${reads[0]} -2 ${reads[1]}" + } + + def samtools_command = sort_bam ? 'sort' : 'view' + def extension_pattern = /(--output-fmt|-O)+\s+(\S+)/ + def extension_matcher = (args2 =~ extension_pattern) + def extension = extension_matcher.getCount() > 0 ? extension_matcher[0][2].toLowerCase() : "bam" + def reference = fasta && extension=="cram" ? 
"--reference ${fasta}" : "" + if (!fasta && extension=="cram") error "Fasta reference is required for CRAM output" + + """ + INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\\.rev.1.bt2\$//"` + [ -z "\$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\\.rev.1.bt2l\$//"` + [ -z "\$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1 + + bowtie2 \\ + -x \$INDEX \\ + $reads_args \\ + --threads $task.cpus \\ + $unaligned \\ + $args \\ + 2> >(tee ${prefix}.bowtie2.log >&2) \\ + | samtools $samtools_command $args2 --threads $task.cpus ${reference} -o ${prefix}.${extension} - + + if [ -f ${prefix}.unmapped.fastq.1.gz ]; then + mv ${prefix}.unmapped.fastq.1.gz ${prefix}.unmapped_1.fastq.gz + fi + + if [ -f ${prefix}.unmapped.fastq.2.gz ]; then + mv ${prefix}.unmapped.fastq.2.gz ${prefix}.unmapped_2.fastq.gz + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS + """ + + stub: + def args2 = task.ext.args2 ?: "" + def prefix = task.ext.prefix ?: "${meta.id}" + def extension_pattern = /(--output-fmt|-O)+\s+(\S+)/ + def extension = (args2 ==~ extension_pattern) ? (args2 =~ extension_pattern)[0][2].toLowerCase() : "bam" + def create_unmapped = "" + if (meta.single_end) { + create_unmapped = save_unaligned ? "touch ${prefix}.unmapped.fastq.gz" : "" + } else { + create_unmapped = save_unaligned ? "touch ${prefix}.unmapped_1.fastq.gz && touch ${prefix}.unmapped_2.fastq.gz" : "" + } + def reference = fasta && extension=="cram" ? 
"--reference ${fasta}" : "" + if (!fasta && extension=="cram") error "Fasta reference is required for CRAM output" + + def create_index = "" + if (extension == "cram") { + create_index = "touch ${prefix}.crai" + } else if (extension == "bam") { + create_index = "touch ${prefix}.csi" + } + + """ + touch ${prefix}.${extension} + ${create_index} + touch ${prefix}.bowtie2.log + ${create_unmapped} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) + END_VERSIONS + """ + +} diff --git a/modules/nf-core/bowtie2/align/meta.yml b/modules/nf-core/bowtie2/align/meta.yml new file mode 100644 index 00000000..f841f781 --- /dev/null +++ b/modules/nf-core/bowtie2/align/meta.yml @@ -0,0 +1,132 @@ +name: bowtie2_align +description: Align reads to a reference genome using bowtie2 +keywords: + - align + - map + - fasta + - fastq + - genome + - reference +tools: + - bowtie2: + description: | + Bowtie 2 is an ultrafast and memory-efficient tool for aligning + sequencing reads to long reference sequences. + homepage: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml + documentation: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml + doi: 10.1038/nmeth.1923 + licence: ["GPL-3.0-or-later"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'test', single_end:false ] + - index: + type: file + description: Bowtie2 genome index files + pattern: "*.ebwt" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Bowtie2 genome fasta file + pattern: "*.fasta" + - - save_unaligned: + type: boolean + description: | + Save reads that do not map to the reference (true) or discard them (false) + (default: false) + - - sort_bam: + type: boolean + description: use samtools sort (true) or samtools view (false) + pattern: "true or false" +output: + - sam: + - meta: + type: file + description: Output SAM file containing read alignments + pattern: "*.sam" + - "*.sam": + type: file + description: Output SAM file containing read alignments + pattern: "*.sam" + - bam: + - meta: + type: file + description: Output BAM file containing read alignments + pattern: "*.bam" + - "*.bam": + type: file + description: Output BAM file containing read alignments + pattern: "*.bam" + - cram: + - meta: + type: file + description: Output CRAM file containing read alignments + pattern: "*.cram" + - "*.cram": + type: file + description: Output CRAM file containing read alignments + pattern: "*.cram" + - csi: + - meta: + type: file + description: Output SAM/BAM index for large inputs + pattern: "*.csi" + - "*.csi": + type: file + description: Output SAM/BAM index for large inputs + pattern: "*.csi" + - crai: + - meta: + type: file + description: Output CRAM index + pattern: "*.crai" + - "*.crai": + type: file + description: Output CRAM index + pattern: "*.crai" + - log: + - meta: + type: file + description: Aligment log + pattern: "*.log" + - "*.log": + type: file + description: Aligment log + pattern: "*.log" + - fastq: + - meta: + type: file + description: Unaligned FastQ files + pattern: "*.fastq.gz" + - "*fastq.gz": + type: file + description: Unaligned FastQ files + pattern: "*.fastq.gz" + - versions: + - versions.yml: + 
type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@joseespinosa" + - "@drpatelh" +maintainers: + - "@joseespinosa" + - "@drpatelh" diff --git a/modules/nf-core/bowtie2/align/tests/cram_crai.config b/modules/nf-core/bowtie2/align/tests/cram_crai.config new file mode 100644 index 00000000..03f1d5e5 --- /dev/null +++ b/modules/nf-core/bowtie2/align/tests/cram_crai.config @@ -0,0 +1,5 @@ +process { + withName: BOWTIE2_ALIGN { + ext.args2 = '--output-fmt cram --write-index' + } +} diff --git a/modules/nf-core/bowtie2/align/tests/large_index.config b/modules/nf-core/bowtie2/align/tests/large_index.config new file mode 100644 index 00000000..fdc1c59d --- /dev/null +++ b/modules/nf-core/bowtie2/align/tests/large_index.config @@ -0,0 +1,5 @@ +process { + withName: BOWTIE2_BUILD { + ext.args = '--large-index' + } +} \ No newline at end of file diff --git a/modules/nf-core/bowtie2/align/tests/main.nf.test b/modules/nf-core/bowtie2/align/tests/main.nf.test new file mode 100644 index 00000000..0de5950f --- /dev/null +++ b/modules/nf-core/bowtie2/align/tests/main.nf.test @@ -0,0 +1,623 @@ +nextflow_process { + + name "Test Process BOWTIE2_ALIGN" + script "../main.nf" + process "BOWTIE2_ALIGN" + tag "modules" + tag "modules_nfcore" + tag "bowtie2" + tag "bowtie2/build" + tag "bowtie2/align" + + test("sarscov2 - fastq, index, fasta, false, false - bam") { + + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastq, index, fasta, false, false - sam") { + + config "./sam.config" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.sam[0][1]).readLines()[0..4], + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastq, index, fasta, false, false - sam2") { + + config "./sam2.config" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.sam[0][1]).readLines()[0..4], + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastq, index, fasta, false, true - bam") { + + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - [fastq1, fastq2], index, fasta, false, false - bam") { + + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = 
BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - [fastq1, fastq2], index, fasta, false, true - bam") { + + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastq, large_index, fasta, false, false - bam") { + + config "./large_index.config" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - [fastq1, fastq2], large_index, fasta, false, false - bam") { + + config "./large_index.config" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - [fastq1, fastq2], index, fasta, true, false - bam") { + + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: 
true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastq, index, fasta, true, false - bam") { + + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + process.out.log, + process.out.fastq, + process.out.versions + ).match() } + + ) + } + + } + + test("sarscov2 - [fastq1, fastq2], index, fasta, true, true - cram") { + + config "./cram_crai.config" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ 
id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = true //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.cram[0][1]).name, + file(process.out.crai[0][1]).name + ).match() } + ) + } + + } + + test("sarscov2 - [fastq1, fastq2], index, fasta, false, false - stub") { + + options "-stub" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + file(process.out.csi[0][1]).name, + file(process.out.log[0][1]).name, 
+ process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastq, index, fasta, true, false - stub") { + + options "-stub" + setup { + run("BOWTIE2_BUILD") { + script "../../build/main.nf" + process { + """ + input[0] = [ + [ id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + input[1] = BOWTIE2_BUILD.out.index + input[2] = [[ id:'test'], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = false //save_unaligned + input[4] = false //sort + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.bam[0][1]).name, + file(process.out.csi[0][1]).name, + file(process.out.log[0][1]).name, + process.out.fastq, + process.out.versions + ).match() } + ) + } + + } + +} diff --git a/modules/nf-core/bowtie2/align/tests/main.nf.test.snap b/modules/nf-core/bowtie2/align/tests/main.nf.test.snap new file mode 100644 index 00000000..028e7da6 --- /dev/null +++ b/modules/nf-core/bowtie2/align/tests/main.nf.test.snap @@ -0,0 +1,311 @@ +{ + "sarscov2 - [fastq1, fastq2], large_index, fasta, false, false - bam": { + "content": [ + "test.bam", + [ + [ + { + "id": "test", + "single_end": false + }, + "test.bowtie2.log:md5,bd89ce1b28c93bf822bae391ffcedd19" + ] + ], + [ + + ], + [ + "versions.yml:md5,01d18ab035146ea790e9a0f70adb758f" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-03-18T13:19:25.337323" + }, + "sarscov2 - fastq, index, fasta, false, false - sam2": { + "content": [ + [ + 
"ERR5069949.2151832\t16\tMT192765.1\t17453\t42\t150M\t*\t0\t0\tACGCACATTGCTAACTAAGGGCACACTAGAACCAGAATATTTCAATTCAGTGTGTAGACTTATGAAAACTATAGGTCCAGACATGTTCCTCGGAACTTGTCGGCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTGGTTTATGA\tAAAA versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + END_VERSIONS + """ + + stub: + """ + mkdir bowtie2 + touch bowtie2/${fasta.baseName}.{1..4}.bt2 + touch bowtie2/${fasta.baseName}.rev.{1,2}.bt2 + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/bowtie2/build/meta.yml b/modules/nf-core/bowtie2/build/meta.yml new file mode 100644 index 00000000..2729a92e --- /dev/null +++ b/modules/nf-core/bowtie2/build/meta.yml @@ -0,0 +1,49 @@ +name: bowtie2_build +description: Builds bowtie index for reference genome +keywords: + - build + - index + - fasta + - genome + - reference +tools: + - bowtie2: + description: | + Bowtie 2 is an ultrafast and memory-efficient tool for aligning + sequencing reads to long reference sequences. + homepage: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml + documentation: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml + doi: 10.1038/nmeth.1923 + licence: ["GPL-3.0-or-later"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Input genome fasta file +output: + - index: + - meta: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'test', single_end:false ] + - bowtie2: + type: file + description: Bowtie2 genome index files + pattern: "*.bt2" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@joseespinosa" + - "@drpatelh" +maintainers: + - "@joseespinosa" + - "@drpatelh" diff --git a/modules/nf-core/bowtie2/build/tests/main.nf.test b/modules/nf-core/bowtie2/build/tests/main.nf.test new file mode 100644 index 00000000..16376025 --- /dev/null +++ b/modules/nf-core/bowtie2/build/tests/main.nf.test @@ -0,0 +1,31 @@ +nextflow_process { + + name "Test Process BOWTIE2_BUILD" + script "modules/nf-core/bowtie2/build/main.nf" + process "BOWTIE2_BUILD" + tag "modules" + tag "modules_nfcore" + tag "bowtie2" + tag "bowtie2/build" + + test("Should run without failures") { + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + ] + """ + } + } + + then { + assert process.success + assert snapshot(process.out).match() + } + + } + +} diff --git a/modules/nf-core/bowtie2/build/tests/main.nf.test.snap b/modules/nf-core/bowtie2/build/tests/main.nf.test.snap new file mode 100644 index 00000000..6875e021 --- /dev/null +++ b/modules/nf-core/bowtie2/build/tests/main.nf.test.snap @@ -0,0 +1,45 @@ +{ + "Should run without failures": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + [ + "genome.1.bt2:md5,cbe3d0bbea55bc57c99b4bfa25b5fbdf", + "genome.2.bt2:md5,47b153cd1319abc88dda532462651fcf", + "genome.3.bt2:md5,4ed93abba181d8dfab2e303e33114777", + "genome.4.bt2:md5,c25be5f8b0378abf7a58c8a880b87626", + "genome.rev.1.bt2:md5,52be6950579598a990570fbcf5372184", + "genome.rev.2.bt2:md5,e3b4ef343dea4dd571642010a7d09597" + ] + ] + ], + "1": [ + "versions.yml:md5,1df11e9b82891527271c889c880d3974" + ], + "index": [ + [ + { + "id": "test" + }, + [ + "genome.1.bt2:md5,cbe3d0bbea55bc57c99b4bfa25b5fbdf", + 
"genome.2.bt2:md5,47b153cd1319abc88dda532462651fcf", + "genome.3.bt2:md5,4ed93abba181d8dfab2e303e33114777", + "genome.4.bt2:md5,c25be5f8b0378abf7a58c8a880b87626", + "genome.rev.1.bt2:md5,52be6950579598a990570fbcf5372184", + "genome.rev.2.bt2:md5,e3b4ef343dea4dd571642010a7d09597" + ] + ] + ], + "versions": [ + "versions.yml:md5,1df11e9b82891527271c889c880d3974" + ] + } + ], + "timestamp": "2023-11-23T11:51:01.107681997" + } +} \ No newline at end of file diff --git a/modules/nf-core/bowtie2/build/tests/tags.yml b/modules/nf-core/bowtie2/build/tests/tags.yml new file mode 100644 index 00000000..81aa61da --- /dev/null +++ b/modules/nf-core/bowtie2/build/tests/tags.yml @@ -0,0 +1,2 @@ +bowtie2/build: + - modules/nf-core/bowtie2/build/** diff --git a/modules/nf-core/cat/cat/environment.yml b/modules/nf-core/cat/cat/environment.yml deleted file mode 100644 index 17a04ef2..00000000 --- a/modules/nf-core/cat/cat/environment.yml +++ /dev/null @@ -1,7 +0,0 @@ -name: cat_cat -channels: - - conda-forge - - bioconda - - defaults -dependencies: - - conda-forge::pigz=2.3.4 diff --git a/modules/nf-core/cat/cat/main.nf b/modules/nf-core/cat/cat/main.nf deleted file mode 100644 index adbdbd7b..00000000 --- a/modules/nf-core/cat/cat/main.nf +++ /dev/null @@ -1,79 +0,0 @@ -process CAT_CAT { - tag "$meta.id" - label 'process_low' - - conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/pigz:2.3.4' : - 'biocontainers/pigz:2.3.4' }" - - input: - tuple val(meta), path(files_in) - - output: - tuple val(meta), path("${prefix}"), emit: file_out - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def args = task.ext.args ?: '' - def args2 = task.ext.args2 ?: '' - def file_list = files_in.collect { it.toString() } - - // choose appropriate concatenation tool depending on input and output format - - // | input | output | command1 | command2 | - // |-----------|------------|----------|----------| - // | gzipped | gzipped | cat | | - // | ungzipped | ungzipped | cat | | - // | gzipped | ungzipped | zcat | | - // | ungzipped | gzipped | cat | pigz | - - // Use input file ending as default - prefix = task.ext.prefix ?: "${meta.id}${getFileSuffix(file_list[0])}" - out_zip = prefix.endsWith('.gz') - in_zip = file_list[0].endsWith('.gz') - command1 = (in_zip && !out_zip) ? 'zcat' : 'cat' - command2 = (!in_zip && out_zip) ? "| pigz -c -p $task.cpus $args2" : '' - if(file_list.contains(prefix.trim())) { - error "The name of the input file can't be the same as for the output prefix in the " + - "module CAT_CAT (currently `$prefix`). Please choose a different one." - } - """ - $command1 \\ - $args \\ - ${file_list.join(' ')} \\ - $command2 \\ - > ${prefix} - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) - END_VERSIONS - """ - - stub: - def file_list = files_in.collect { it.toString() } - prefix = task.ext.prefix ?: "${meta.id}${file_list[0].substring(file_list[0].lastIndexOf('.'))}" - if(file_list.contains(prefix.trim())) { - error "The name of the input file can't be the same as for the output prefix in the " + - "module CAT_CAT (currently `$prefix`). Please choose a different one." 
- } - """ - touch $prefix - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' ) - END_VERSIONS - """ -} - -// for .gz files also include the second to last extension if it is present. E.g., .fasta.gz -def getFileSuffix(filename) { - def match = filename =~ /^.*?((\.\w{1,5})?(\.\w{1,5}\.gz$))/ - return match ? match[0][1] : filename.substring(filename.lastIndexOf('.')) -} - diff --git a/modules/nf-core/cat/cat/meta.yml b/modules/nf-core/cat/cat/meta.yml deleted file mode 100644 index 00a8db0b..00000000 --- a/modules/nf-core/cat/cat/meta.yml +++ /dev/null @@ -1,36 +0,0 @@ -name: cat_cat -description: A module for concatenation of gzipped or uncompressed files -keywords: - - concatenate - - gzip - - cat -tools: - - cat: - description: Just concatenation - documentation: https://man7.org/linux/man-pages/man1/cat.1.html - licence: ["GPL-3.0-or-later"] -input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - files_in: - type: file - description: List of compressed / uncompressed files - pattern: "*" -output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - file_out: - type: file - description: Concatenated file. 
Will be gzipped if file_out ends with ".gz" - pattern: "${file_out}" -authors: - - "@erikrikarddaniel" - - "@FriederikeHanssen" -maintainers: - - "@erikrikarddaniel" - - "@FriederikeHanssen" diff --git a/modules/nf-core/cat/cat/tests/main.nf.test b/modules/nf-core/cat/cat/tests/main.nf.test deleted file mode 100644 index fcee2d19..00000000 --- a/modules/nf-core/cat/cat/tests/main.nf.test +++ /dev/null @@ -1,178 +0,0 @@ -nextflow_process { - - name "Test Process CAT_CAT" - script "../main.nf" - process "CAT_CAT" - tag "modules" - tag "modules_nfcore" - tag "cat" - tag "cat/cat" - - test("test_cat_name_conflict") { - when { - params { - outdir = "${outputDir}" - } - process { - """ - input[0] = - [ - [ id:'genome', single_end:true ], - [ - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.sizes', checkIfExists: true) - ] - ] - """ - } - } - then { - assertAll( - { assert !process.success }, - { assert process.stdout.toString().contains("The name of the input file can't be the same as for the output prefix") } - ) - } - } - - test("test_cat_unzipped_unzipped") { - when { - params { - outdir = "${outputDir}" - } - process { - """ - input[0] = - [ - [ id:'test', single_end:true ], - [ - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.sizes', checkIfExists: true) - ] - ] - """ - } - } - then { - assertAll( - { assert process.success }, - { assert snapshot(process.out).match() } - ) - } - } - - - test("test_cat_zipped_zipped") { - when { - params { - outdir = "${outputDir}" - } - process { - """ - input[0] = - [ - [ id:'test', single_end:true ], - [ - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.gff3.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/alignment/last/contigs.genome.maf.gz', checkIfExists: true) - ] - ] - """ - } - } - then { - def lines = path(process.out.file_out.get(0).get(1)).linesGzip - assertAll( - { assert process.success }, - { assert snapshot(lines[0..5]).match("test_cat_zipped_zipped_lines") }, - { assert snapshot(lines.size()).match("test_cat_zipped_zipped_size")} - ) - } - } - - test("test_cat_zipped_unzipped") { - config './nextflow_zipped_unzipped.config' - - when { - params { - outdir = "${outputDir}" - } - process { - """ - input[0] = - [ - [ id:'test', single_end:true ], - [ - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.gff3.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/alignment/last/contigs.genome.maf.gz', checkIfExists: true) - ] - ] - """ - } - } - - then { - assertAll( - { assert process.success }, - { assert snapshot(process.out).match() } - ) - } - - } - - test("test_cat_unzipped_zipped") { - config './nextflow_unzipped_zipped.config' - when { - params { - outdir = "${outputDir}" - } - process { - """ - input[0] = - [ - [ id:'test', single_end:true ], - [ - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.sizes', checkIfExists: true) - ] - ] - """ - } - } - then { - def lines = path(process.out.file_out.get(0).get(1)).linesGzip - assertAll( - { assert process.success }, - { assert snapshot(lines[0..5]).match("test_cat_unzipped_zipped_lines") }, - { assert snapshot(lines.size()).match("test_cat_unzipped_zipped_size")} - ) - } - } - - test("test_cat_one_file_unzipped_zipped") { - config './nextflow_unzipped_zipped.config' - when { - params { - outdir = "${outputDir}" - } - process { - """ - input[0] = - [ - [ id:'test', single_end:true ], - [ - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: 
true) - ] - ] - """ - } - } - then { - def lines = path(process.out.file_out.get(0).get(1)).linesGzip - assertAll( - { assert process.success }, - { assert snapshot(lines[0..5]).match("test_cat_one_file_unzipped_zipped_lines") }, - { assert snapshot(lines.size()).match("test_cat_one_file_unzipped_zipped_size")} - ) - } - } -} diff --git a/modules/nf-core/cat/cat/tests/main.nf.test.snap b/modules/nf-core/cat/cat/tests/main.nf.test.snap deleted file mode 100644 index 423571ba..00000000 --- a/modules/nf-core/cat/cat/tests/main.nf.test.snap +++ /dev/null @@ -1,121 +0,0 @@ -{ - "test_cat_unzipped_zipped_size": { - "content": [ - 375 - ], - "timestamp": "2023-10-16T14:33:08.049445686" - }, - "test_cat_unzipped_unzipped": { - "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": true - }, - "test.fasta:md5,f44b33a0e441ad58b2d3700270e2dbe2" - ] - ], - "1": [ - "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" - ], - "file_out": [ - [ - { - "id": "test", - "single_end": true - }, - "test.fasta:md5,f44b33a0e441ad58b2d3700270e2dbe2" - ] - ], - "versions": [ - "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" - ] - } - ], - "timestamp": "2023-10-16T14:32:18.500464399" - }, - "test_cat_zipped_unzipped": { - "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": true - }, - "cat.txt:md5,c439d3b60e7bc03e8802a451a0d9a5d9" - ] - ], - "1": [ - "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" - ], - "file_out": [ - [ - { - "id": "test", - "single_end": true - }, - "cat.txt:md5,c439d3b60e7bc03e8802a451a0d9a5d9" - ] - ], - "versions": [ - "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" - ] - } - ], - "timestamp": "2023-10-16T14:32:49.642741302" - }, - "test_cat_zipped_zipped_lines": { - "content": [ - [ - "MT192765.1\tGenbank\ttranscript\t259\t29667\t.\t+\t.\tID=unknown_transcript_1;geneID=orf1ab;gene_name=orf1ab", - "MT192765.1\tGenbank\tgene\t259\t21548\t.\t+\t.\tParent=unknown_transcript_1", - 
"MT192765.1\tGenbank\tCDS\t259\t13461\t.\t+\t0\tParent=unknown_transcript_1;exception=\"ribosomal slippage\";gbkey=CDS;gene=orf1ab;note=\"pp1ab;translated=by -1 ribosomal frameshift\";product=\"orf1ab polyprotein\";protein_id=QIK50426.1", - "MT192765.1\tGenbank\tCDS\t13461\t21548\t.\t+\t0\tParent=unknown_transcript_1;exception=\"ribosomal slippage\";gbkey=CDS;gene=orf1ab;note=\"pp1ab;translated=by -1 ribosomal frameshift\";product=\"orf1ab polyprotein\";protein_id=QIK50426.1", - "MT192765.1\tGenbank\tCDS\t21556\t25377\t.\t+\t0\tParent=unknown_transcript_1;gbkey=CDS;gene=S;note=\"structural protein\";product=\"surface glycoprotein\";protein_id=QIK50427.1", - "MT192765.1\tGenbank\tgene\t21556\t25377\t.\t+\t.\tParent=unknown_transcript_1" - ] - ], - "timestamp": "2023-10-16T14:32:33.629048645" - }, - "test_cat_unzipped_zipped_lines": { - "content": [ - [ - ">MT192765.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome", - "GTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGT", - "GTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAG", - "TAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTTGTCCGG", - "GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTT", - "ACAGGTTCGCGACGTGCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAG" - ] - ], - "timestamp": "2023-10-16T14:33:08.038830506" - }, - "test_cat_one_file_unzipped_zipped_lines": { - "content": [ - [ - ">MT192765.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome", - "GTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGT", - "GTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAG", - "TAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTTGTCCGG", - 
"GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTT", - "ACAGGTTCGCGACGTGCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAG" - ] - ], - "timestamp": "2023-10-16T14:33:21.39642399" - }, - "test_cat_zipped_zipped_size": { - "content": [ - 78 - ], - "timestamp": "2023-10-16T14:32:33.641869244" - }, - "test_cat_one_file_unzipped_zipped_size": { - "content": [ - 374 - ], - "timestamp": "2023-10-16T14:33:21.4094373" - } -} \ No newline at end of file diff --git a/modules/nf-core/cat/cat/tests/nextflow_unzipped_zipped.config b/modules/nf-core/cat/cat/tests/nextflow_unzipped_zipped.config deleted file mode 100644 index ec26b0fd..00000000 --- a/modules/nf-core/cat/cat/tests/nextflow_unzipped_zipped.config +++ /dev/null @@ -1,6 +0,0 @@ - -process { - withName: CAT_CAT { - ext.prefix = 'cat.txt.gz' - } -} diff --git a/modules/nf-core/cat/cat/tests/nextflow_zipped_unzipped.config b/modules/nf-core/cat/cat/tests/nextflow_zipped_unzipped.config deleted file mode 100644 index fbc79783..00000000 --- a/modules/nf-core/cat/cat/tests/nextflow_zipped_unzipped.config +++ /dev/null @@ -1,8 +0,0 @@ - -process { - - withName: CAT_CAT { - ext.prefix = 'cat.txt' - } - -} diff --git a/modules/nf-core/cat/cat/tests/tags.yml b/modules/nf-core/cat/cat/tests/tags.yml deleted file mode 100644 index 37b578f5..00000000 --- a/modules/nf-core/cat/cat/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -cat/cat: - - modules/nf-core/cat/cat/** diff --git a/modules/nf-core/cat/fastq/environment.yml b/modules/nf-core/cat/fastq/environment.yml index 8c69b121..c7eb9bd1 100644 --- a/modules/nf-core/cat/fastq/environment.yml +++ b/modules/nf-core/cat/fastq/environment.yml @@ -1,7 +1,5 @@ -name: cat_fastq channels: - conda-forge - bioconda - - defaults dependencies: - conda-forge::coreutils=8.30 diff --git a/modules/nf-core/cat/fastq/main.nf b/modules/nf-core/cat/fastq/main.nf index f132b2ad..b68e5f91 100644 --- a/modules/nf-core/cat/fastq/main.nf +++ 
b/modules/nf-core/cat/fastq/main.nf @@ -53,9 +53,9 @@ process CAT_FASTQ { def prefix = task.ext.prefix ?: "${meta.id}" def readList = reads instanceof List ? reads.collect{ it.toString() } : [reads.toString()] if (meta.single_end) { - if (readList.size > 1) { + if (readList.size >= 1) { """ - touch ${prefix}.merged.fastq.gz + echo '' | gzip > ${prefix}.merged.fastq.gz cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -64,10 +64,10 @@ process CAT_FASTQ { """ } } else { - if (readList.size > 2) { + if (readList.size >= 2) { """ - touch ${prefix}_1.merged.fastq.gz - touch ${prefix}_2.merged.fastq.gz + echo '' | gzip > ${prefix}_1.merged.fastq.gz + echo '' | gzip > ${prefix}_2.merged.fastq.gz cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/cat/fastq/meta.yml b/modules/nf-core/cat/fastq/meta.yml index db4ac3c7..91ff2fb5 100644 --- a/modules/nf-core/cat/fastq/meta.yml +++ b/modules/nf-core/cat/fastq/meta.yml @@ -10,30 +10,33 @@ tools: The cat utility reads files sequentially, writing them to the standard output. documentation: https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files to be concatenated. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files to be concatenated. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - reads: - type: file - description: Merged fastq file - pattern: "*.{merged.fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.merged.fastq.gz": + type: file + description: Merged fastq file + pattern: "*.{merged.fastq.gz}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/cat/fastq/tests/main.nf.test b/modules/nf-core/cat/fastq/tests/main.nf.test index dab2e14c..f88a78b6 100644 --- a/modules/nf-core/cat/fastq/tests/main.nf.test +++ b/modules/nf-core/cat/fastq/tests/main.nf.test @@ -1,3 +1,5 @@ +// NOTE The version snaps may not be consistant +// https://github.com/nf-core/modules/pull/4087#issuecomment-1767948035 nextflow_process { name "Test Process CAT_FASTQ" @@ -11,9 +13,6 @@ nextflow_process { test("test_cat_fastq_single_end") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -36,9 +35,6 @@ nextflow_process { test("test_cat_fastq_paired_end") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -63,9 +59,6 @@ nextflow_process { test("test_cat_fastq_single_end_same_name") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -88,9 +81,6 @@ nextflow_process { test("test_cat_fastq_paired_end_same_name") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -115,9 +105,129 @@ nextflow_process { test("test_cat_fastq_single_end_single_file") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_single_end - stub") { + + options "-stub" + + 
when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_paired_end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_2.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_single_end_same_name - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_paired_end_same_name - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ]) + """ } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_single_end_single_file - stub") { + + options "-stub" + + when { process { """ input[0] = Channel.of([ diff --git a/modules/nf-core/cat/fastq/tests/main.nf.test.snap b/modules/nf-core/cat/fastq/tests/main.nf.test.snap index 43dfe28f..aec119a9 100644 --- a/modules/nf-core/cat/fastq/tests/main.nf.test.snap +++ b/modules/nf-core/cat/fastq/tests/main.nf.test.snap @@ -28,6 +28,10 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-01-17T17:30:39.816981" }, "test_cat_fastq_single_end_same_name": { @@ -59,6 +63,10 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-01-17T17:32:35.229332" }, "test_cat_fastq_single_end_single_file": { @@ -90,6 +98,10 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-01-17T17:34:00.058829" }, "test_cat_fastq_paired_end_same_name": { @@ -127,8 +139,123 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-01-17T17:33:33.031555" }, + "test_cat_fastq_single_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + 
"versions": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T12:07:28.244999" + }, + "test_cat_fastq_paired_end_same_name - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "versions": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T12:07:57.070911" + }, + "test_cat_fastq_single_end_same_name - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T12:07:46.796254" + }, "test_cat_fastq_paired_end": { "content": [ { @@ -164,6 +291,86 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-01-17T17:32:02.270935" + }, + "test_cat_fastq_paired_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + 
"versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "versions": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T12:07:37.807553" + }, + "test_cat_fastq_single_end_single_file - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T12:14:51.861264" } } \ No newline at end of file diff --git a/modules/nf-core/csvtk/join/environment.yml b/modules/nf-core/csvtk/join/environment.yml new file mode 100644 index 00000000..ea951bdb --- /dev/null +++ b/modules/nf-core/csvtk/join/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::csvtk=0.30.0 diff --git a/modules/nf-core/csvtk/join/main.nf b/modules/nf-core/csvtk/join/main.nf new file mode 100644 index 00000000..5f3afeea --- /dev/null +++ b/modules/nf-core/csvtk/join/main.nf @@ -0,0 +1,49 @@ +process CSVTK_JOIN { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/csvtk:0.30.0--h9ee0642_0': + 'biocontainers/csvtk:0.30.0--h9ee0642_0' }" + + input: + tuple val(meta), path(csv) + + output: + tuple val(meta), path("${prefix}.${out_extension}"), emit: csv + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" + out_extension = args.contains('--out-delimiter "\t"') || args.contains('-D "\t"') || args.contains("-D \$'\t'") ? "tsv" : "csv" + """ + csvtk \\ + join \\ + $args \\ + --num-cpus $task.cpus \\ + --out-file ${prefix}.${out_extension} \\ + $csv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + csvtk: \$(echo \$( csvtk version | sed -e "s/csvtk v//g" )) + END_VERSIONS + """ + + stub: + prefix = task.ext.prefix ?: "${meta.id}" + out_extension = args.contains('--out-delimiter "\t"') || args.contains('-D "\t"') || args.contains("-D \$'\t'") ? "tsv" : "csv" + """ + touch ${prefix}.${out_extension} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + csvtk: \$(echo \$( csvtk version | sed -e "s/csvtk v//g" )) + END_VERSIONS + """ +} diff --git a/modules/nf-core/csvtk/join/meta.yml b/modules/nf-core/csvtk/join/meta.yml new file mode 100644 index 00000000..d8671b17 --- /dev/null +++ b/modules/nf-core/csvtk/join/meta.yml @@ -0,0 +1,45 @@ +name: csvtk_join +description: Join two or more CSV (or TSV) tables by selected fields into a single + table +keywords: + - join + - tsv + - csv +tools: + - csvtk: + description: A cross-platform, efficient, practical CSV/TSV toolkit + homepage: http://bioinf.shenwei.me/csvtk + documentation: http://bioinf.shenwei.me/csvtk + tool_dev_url: https://github.com/shenwei356/csvtk + licence: ["MIT"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - csv: + type: file + description: CSV/TSV formatted files + pattern: "*.{csv,tsv}" +output: + - csv: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${out_extension}: + type: file + description: Joined CSV/TSV file + pattern: "*.{csv,tsv}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "version.yml" +authors: + - "@anoronh4" +maintainers: + - "@anoronh4" diff --git a/modules/nf-core/csvtk/join/tests/main.nf.test b/modules/nf-core/csvtk/join/tests/main.nf.test new file mode 100644 index 00000000..3cf178c4 --- /dev/null +++ b/modules/nf-core/csvtk/join/tests/main.nf.test @@ -0,0 +1,64 @@ +nextflow_process { + + name "Test Process CSVTK_JOIN" + script "../main.nf" + process "CSVTK_JOIN" + + tag "modules" + tag "modules_nfcore" + tag "csvtk" + tag "csvtk/join" + + test("join - csv") { + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [ + file("https://github.com/nf-core/test-datasets/raw/bacass/bacass_hybrid.csv", checkIfExists: true), + file("https://github.com/nf-core/test-datasets/raw/bacass/bacass_short.csv", checkIfExists: true), + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("join - csv - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [ + file("https://github.com/nf-core/test-datasets/raw/bacass/bacass_hybrid.csv", checkIfExists: true), + file("https://github.com/nf-core/test-datasets/raw/bacass/bacass_short.csv", checkIfExists: true), + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/csvtk/join/tests/main.nf.test.snap b/modules/nf-core/csvtk/join/tests/main.nf.test.snap new file mode 100644 index 
00000000..b124788b --- /dev/null +++ b/modules/nf-core/csvtk/join/tests/main.nf.test.snap @@ -0,0 +1,60 @@ +{ + "join - csv": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.csv:md5,d0ad82ca096c7e05eb9f9a04194c9e30" + ] + ], + "1": [ + "versions.yml:md5,e76147e4eca968d23543e7007522f1d3" + ], + "csv": [ + [ + { + "id": "test" + }, + "test.csv:md5,d0ad82ca096c7e05eb9f9a04194c9e30" + ] + ], + "versions": [ + "versions.yml:md5,e76147e4eca968d23543e7007522f1d3" + ] + } + ], + "timestamp": "2024-05-21T15:45:44.045434" + }, + "join - csv - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.csv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,e76147e4eca968d23543e7007522f1d3" + ], + "csv": [ + [ + { + "id": "test" + }, + "test.csv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e76147e4eca968d23543e7007522f1d3" + ] + } + ], + "timestamp": "2024-05-21T15:45:55.59201" + } +} \ No newline at end of file diff --git a/modules/nf-core/csvtk/join/tests/nextflow.config b/modules/nf-core/csvtk/join/tests/nextflow.config new file mode 100644 index 00000000..1b14393a --- /dev/null +++ b/modules/nf-core/csvtk/join/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: CSVTK_JOIN { + ext.args = "--fields 'ID;ID' -p -e -d \"\t\" -D \",\"" + } +} diff --git a/modules/nf-core/csvtk/join/tests/tags.yml b/modules/nf-core/csvtk/join/tests/tags.yml new file mode 100644 index 00000000..6c3a0fa6 --- /dev/null +++ b/modules/nf-core/csvtk/join/tests/tags.yml @@ -0,0 +1,2 @@ +csvtk/join: + - "modules/nf-core/csvtk/join/**" diff --git a/modules/nf-core/fastp/environment.yml b/modules/nf-core/fastp/environment.yml index 70389e66..26d4aca5 100644 --- a/modules/nf-core/fastp/environment.yml +++ b/modules/nf-core/fastp/environment.yml @@ -1,7 +1,5 @@ -name: fastp channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::fastp=0.23.4 diff --git a/modules/nf-core/fastp/main.nf 
b/modules/nf-core/fastp/main.nf index 4fc19b74..e1b9f565 100644 --- a/modules/nf-core/fastp/main.nf +++ b/modules/nf-core/fastp/main.nf @@ -10,6 +10,7 @@ process FASTP { input: tuple val(meta), path(reads) path adapter_fasta + val discard_trimmed_pass val save_trimmed_fail val save_merged @@ -18,9 +19,9 @@ process FASTP { tuple val(meta), path('*.json') , emit: json tuple val(meta), path('*.html') , emit: html tuple val(meta), path('*.log') , emit: log - path "versions.yml" , emit: versions tuple val(meta), path('*.fail.fastq.gz') , optional:true, emit: reads_fail tuple val(meta), path('*.merged.fastq.gz'), optional:true, emit: reads_merged + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -30,6 +31,8 @@ process FASTP { def prefix = task.ext.prefix ?: "${meta.id}" def adapter_list = adapter_fasta ? "--adapter_fasta ${adapter_fasta}" : "" def fail_fastq = save_trimmed_fail && meta.single_end ? "--failed_out ${prefix}.fail.fastq.gz" : save_trimmed_fail && !meta.single_end ? "--failed_out ${prefix}.paired.fail.fastq.gz --unpaired1 ${prefix}_1.fail.fastq.gz --unpaired2 ${prefix}_2.fail.fastq.gz" : '' + def out_fq1 = discard_trimmed_pass ?: ( meta.single_end ? "--out1 ${prefix}.fastp.fastq.gz" : "--out1 ${prefix}_1.fastp.fastq.gz" ) + def out_fq2 = discard_trimmed_pass ?: "--out2 ${prefix}_2.fastp.fastq.gz" // Added soft-links to original fastqs for consistent naming in MultiQC // Use single ended for interleaved. Add --interleaved_in in config. 
if ( task.ext.args?.contains('--interleaved_in') ) { @@ -59,7 +62,7 @@ process FASTP { fastp \\ --in1 ${prefix}.fastq.gz \\ - --out1 ${prefix}.fastp.fastq.gz \\ + $out_fq1 \\ --thread $task.cpus \\ --json ${prefix}.fastp.json \\ --html ${prefix}.fastp.html \\ @@ -81,8 +84,8 @@ process FASTP { fastp \\ --in1 ${prefix}_1.fastq.gz \\ --in2 ${prefix}_2.fastq.gz \\ - --out1 ${prefix}_1.fastp.fastq.gz \\ - --out2 ${prefix}_2.fastp.fastq.gz \\ + $out_fq1 \\ + $out_fq2 \\ --json ${prefix}.fastp.json \\ --html ${prefix}.fastp.html \\ $adapter_list \\ @@ -103,14 +106,16 @@ process FASTP { stub: def prefix = task.ext.prefix ?: "${meta.id}" def is_single_output = task.ext.args?.contains('--interleaved_in') || meta.single_end - def touch_reads = is_single_output ? "${prefix}.fastp.fastq.gz" : "${prefix}_1.fastp.fastq.gz ${prefix}_2.fastp.fastq.gz" - def touch_merged = (!is_single_output && save_merged) ? "touch ${prefix}.merged.fastq.gz" : "" + def touch_reads = (discard_trimmed_pass) ? "" : (is_single_output) ? "echo '' | gzip > ${prefix}.fastp.fastq.gz" : "echo '' | gzip > ${prefix}_1.fastp.fastq.gz ; echo '' | gzip > ${prefix}_2.fastp.fastq.gz" + def touch_merged = (!is_single_output && save_merged) ? "echo '' | gzip > ${prefix}.merged.fastq.gz" : "" + def touch_fail_fastq = (!save_trimmed_fail) ? "" : meta.single_end ? 
"echo '' | gzip > ${prefix}.fail.fastq.gz" : "echo '' | gzip > ${prefix}.paired.fail.fastq.gz ; echo '' | gzip > ${prefix}_1.fail.fastq.gz ; echo '' | gzip > ${prefix}_2.fail.fastq.gz" """ - touch $touch_reads + $touch_reads + $touch_fail_fastq + $touch_merged touch "${prefix}.fastp.json" touch "${prefix}.fastp.html" touch "${prefix}.fastp.log" - $touch_merged cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/fastp/meta.yml b/modules/nf-core/fastp/meta.yml index c22a16ab..159404d0 100644 --- a/modules/nf-core/fastp/meta.yml +++ b/modules/nf-core/fastp/meta.yml @@ -11,62 +11,100 @@ tools: documentation: https://github.com/OpenGene/fastp doi: 10.1093/bioinformatics/bty560 licence: ["MIT"] + identifier: biotools:fastp input: - - meta: - type: map - description: | - Groovy Map containing sample information. Use 'single_end: true' to specify single ended or interleaved FASTQs. Use 'single_end: false' for paired-end reads. - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. If you wish to run interleaved paired-end data, supply as single-end data - but with `--interleaved_in` in your `modules.conf`'s `ext.args` for the module. - - adapter_fasta: - type: file - description: File in FASTA format containing possible adapters to remove. - pattern: "*.{fasta,fna,fas,fa}" - - save_trimmed_fail: - type: boolean - description: Specify true to save files that failed to pass trimming thresholds ending in `*.fail.fastq.gz` - - save_merged: - type: boolean - description: Specify true to save all merged reads to the a file ending in `*.merged.fastq.gz` + - - meta: + type: map + description: | + Groovy Map containing sample information. Use 'single_end: true' to specify single ended or interleaved FASTQs. Use 'single_end: false' for paired-end reads. + e.g. 
[ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. If you wish to run interleaved paired-end data, supply as single-end data + but with `--interleaved_in` in your `modules.conf`'s `ext.args` for the module. + - - adapter_fasta: + type: file + description: File in FASTA format containing possible adapters to remove. + pattern: "*.{fasta,fna,fas,fa}" + - - discard_trimmed_pass: + type: boolean + description: Specify true to not write any reads that pass trimming thresholds. + | This can be used to use fastp for the output report only. + - - save_trimmed_fail: + type: boolean + description: Specify true to save files that failed to pass trimming thresholds + ending in `*.fail.fastq.gz` + - - save_merged: + type: boolean + description: Specify true to save all merged reads to a file ending in `*.merged.fastq.gz` output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - reads: - type: file - description: The trimmed/modified/unmerged fastq reads - pattern: "*fastp.fastq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fastp.fastq.gz": + type: file + description: The trimmed/modified/unmerged fastq reads + pattern: "*fastp.fastq.gz" - json: - type: file - description: Results in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: Results in JSON format + pattern: "*.json" - html: - type: file - description: Results in HTML format - pattern: "*.html" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.html": + type: file + description: Results in HTML format + pattern: "*.html" - log: - type: file - description: fastq log file - pattern: "*.log" - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.log": + type: file + description: fastq log file + pattern: "*.log" - reads_fail: - type: file - description: Reads the failed the preprocessing - pattern: "*fail.fastq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fail.fastq.gz": + type: file + description: Reads the failed the preprocessing + pattern: "*fail.fastq.gz" - reads_merged: - type: file - description: Reads that were successfully merged - pattern: "*.{merged.fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.merged.fastq.gz": + type: file + description: Reads that were successfully merged + pattern: "*.{merged.fastq.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@kevinmenden" diff --git a/modules/nf-core/fastp/tests/main.nf.test b/modules/nf-core/fastp/tests/main.nf.test index 6f1f4897..30dbb8aa 100644 --- a/modules/nf-core/fastp/tests/main.nf.test +++ b/modules/nf-core/fastp/tests/main.nf.test @@ -10,221 +10,290 @@ nextflow_process { test("test_fastp_single_end") { when { - params { - outdir = "$outputDir" - } + process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:true ], [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } then { - def html_text = [ "Q20 bases:12.922000 K (92.984097%)", - "single end (151 cycles)" ] - def log_text = [ "Q20 bases: 12922(92.9841%)", - "reads passed filter: 99" ] - def read_lines = ["@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1)).linesGzip.contains(read_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { assert snapshot(process.out.json).match("test_fastp_single_end_json") }, - { log_text.each { log_part -> - { assert 
path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_single_end-_match") - }, - { assert snapshot(process.out.versions).match("versions_single_end") } + { assert path(process.out.html.get(0).get(1)).getText().contains("single end (151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 99") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } ) } } - test("test_fastp_single_end-stub") { - - options '-stub' + test("test_fastp_paired_end") { when { - params { - outdir = "$outputDir" - } + process { """ adapter_fasta = [] + save_trimmed_pass = true save_trimmed_fail = false save_merged = false input[0] = Channel.of([ - [ id:'test', single_end:true ], - [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter 
percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("Q30 bases: 12281(88.3716%)") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + test("fastp test_fastp_interleaved") { + + config './nextflow.interleaved.config' + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = false + input[3] = false + input[4] = false + """ + } + } + + then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_single_end-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_single_end_stub") } + { assert path(process.out.html.get(0).get(1)).getText().contains("paired end (151 cycles + 151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 162") }, + { assert process.out.reads_fail == [] }, + { assert process.out.reads_merged == [] }, + { assert snapshot( + process.out.reads, + process.out.json, + process.out.versions).match() } ) } } - test("test_fastp_paired_end") { + test("test_fastp_single_end_trim_fail") { when { - params { - outdir = "$outputDir" + + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = false + input[3] = true + input[4] = false + """ } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("single end (151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 99") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + + test("test_fastp_paired_end_trim_fail") { + + config './nextflow.save_failed.config' + when { process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ]) + input[1] = [] + input[2] = false + input[3] = true + input[4] = false + """ + } + } + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 162") }, + { assert snapshot( + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.json, + process.out.versions).match() } + ) + } + } + + test("test_fastp_paired_end_merged") { + + when { + process { + """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] 
]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = true """ } } then { - def html_text = [ "Q20 bases:25.719000 K (93.033098%)", - "The input has little adapter percentage (~0.000000%), probably it's trimmed before."] - def log_text = [ "No adapter detected for read1", - "Q30 bases: 12281(88.3716%)"] - def json_text = ['"passed_filter_reads": 198'] - def read1_lines = ["@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end") } + { assert 
path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("total reads: 75") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() }, ) } } - test("test_fastp_paired_end-stub") { - - options '-stub' + test("test_fastp_paired_end_merged_adapterlist") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] + ]) + input[1] = Channel.of([ file(params.modules_testdata_base_path + 'delete_me/fastp/adapters.fasta', checkIfExists: true) ]) + input[2] = false + input[3] = false + input[4] = true + """ } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("

") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("total bases: 13683") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + + test("test_fastp_single_end_qc_only") { + + when { process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false + input[0] = Channel.of([ + [ id:'test', single_end:true ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = true + input[3] = false + input[4] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("single end (151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 99") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads, + process.out.reads_fail, + process.out.reads_fail, + process.out.reads_merged, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + test("test_fastp_paired_end_qc_only") { + + when { + process { + """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = true + input[3] = false + input[4] = false """ } } @@ -232,114 +301,99 @@ nextflow_process { then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - 
process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end-stub") } + { assert path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("Q30 bases: 12281(88.3716%)") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads, + process.out.reads_fail, + process.out.reads_fail, + process.out.reads_merged, + process.out.reads_merged, + process.out.versions).match() } ) } } - test("fastp test_fastp_interleaved") { + test("test_fastp_single_end - stub") { + + options "-stub" - config './nextflow.interleaved.config' when { - params { - outdir = "$outputDir" + + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = false + input[3] = false + input[4] = false + """ } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_fastp_paired_end - stub") { + + options "-stub" + + when { + process { """ adapter_fasta = [] + save_trimmed_pass = true save_trimmed_fail = false save_merged = false input[0] = Channel.of([ - [ id:'test', single_end:true ], // meta map - [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) ] + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } then { - def html_text = [ "Q20 bases:25.719000 K (93.033098%)", - "paired end (151 cycles + 151 cycles)"] - def log_text = [ "Q20 bases: 12922(92.9841%)", - "reads passed filter: 162"] - def read_lines = [ "@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1)).linesGzip.contains(read_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { assert snapshot(process.out.json).match("fastp test_fastp_interleaved_json") }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_interleaved-_match") - }, - { assert snapshot(process.out.versions).match("versions_interleaved") } + { assert snapshot(process.out).match() } ) } } - test("fastp test_fastp_interleaved-stub") { + test("fastp - stub test_fastp_interleaved") { - options '-stub' + options "-stub" config 
'./nextflow.interleaved.config' when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:true ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } @@ -347,277 +401,112 @@ nextflow_process { then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_interleaved-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_interleaved-stub") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_single_end_trim_fail") { + test("test_fastp_single_end_trim_fail - stub") { + + options "-stub" when { - params { - outdir = "$outputDir" - } + process { """ - adapter_fasta = [] - save_trimmed_fail = true - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:true ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = true + input[4] = false """ } } then { - def html_text = [ "Q20 bases:12.922000 K (92.984097%)", - "single end (151 cycles)"] - def log_text = [ "Q20 bases: 12922(92.9841%)", - "reads passed filter: 
99" ] - def read_lines = [ "@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1)).linesGzip.contains(read_line) } - } - }, - { failed_read_lines.each { failed_read_line -> - { assert path(process.out.reads_fail.get(0).get(1)).linesGzip.contains(failed_read_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { assert snapshot(process.out.json).match("test_fastp_single_end_trim_fail_json") }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { assert snapshot(process.out.versions).match("versions_single_end_trim_fail") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_trim_fail") { + test("test_fastp_paired_end_trim_fail - stub") { + + options "-stub" config './nextflow.save_failed.config' when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = true - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = true + input[4] = false """ } } then { - def html_text = [ "Q20 bases:25.719000 K (93.033098%)", - "The input has little adapter percentage (~0.000000%), probably it's trimmed before."] - def log_text = [ 
"No adapter detected for read1", - "Q30 bases: 12281(88.3716%)"] - def json_text = ['"passed_filter_reads": 162'] - def read1_lines = ["@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { failed_read2_lines.each { failed_read2_line -> - { assert path(process.out.reads_fail.get(0).get(1).get(2)).linesGzip.contains(failed_read2_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { assert snapshot(process.out.versions).match("versions_paired_end_trim_fail") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_merged") { + test("test_fastp_paired_end_merged - stub") { + + options "-stub" when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = true input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + 
input[1] = [] + input[2] = false + input[3] = false + input[4] = true """ } } then { - def html_text = [ "
"] - def log_text = [ "Merged and filtered:", - "total reads: 75", - "total bases: 13683"] - def json_text = ['"merged_and_filtered": {', '"total_reads": 75', '"total_bases": 13683'] - def read1_lines = [ "@ERR5069949.1066259 NS500628:121:HK3MMAFX2:1:11312:18369:8333/1", - "CCTTATGACAGCAAGAACTGTGTATGATGATGGTGCTAGGAGAGTGTGGACACTTATGAATGTCTTGACACTCGTTTATAAAGTTTATTATGGTAATGCTTTAGATCAAGCCATTTCCATGTGGGCTCTTATAATCTCTGTTACTTC", - "AAAAAEAEEAEEEEEEEEEEEEEEEEAEEEEAEEEEEEEEAEEEEEEEEEEEEEEEEE/EAEEEEEE/6EEEEEEEEEEAEEAEEE/EE/AEEAEEEEEAEEEA/EEAAEAE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { read_merged_lines.each { read_merged_line -> - { assert path(process.out.reads_merged.get(0).get(1)).linesGzip.contains(read_merged_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end_merged_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end_merged") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_merged-stub") { + 
test("test_fastp_paired_end_merged_adapterlist - stub") { - options '-stub' + options "-stub" when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = true - input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = Channel.of([ file(params.modules_testdata_base_path + 'delete_me/fastp/adapters.fasta', checkIfExists: true) ]) + input[2] = false + input[3] = false + input[4] = true """ } } @@ -625,101 +514,63 @@ nextflow_process { then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end_merged-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end_merged_stub") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_merged_adapterlist") { + test("test_fastp_single_end_qc_only - stub") { + + options "-stub" when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = Channel.of([ file(params.modules_testdata_base_path + 'delete_me/fastp/adapters.fasta', checkIfExists: true) ]) - save_trimmed_fail = false - save_merged = true + input[0] = Channel.of([ + [ id:'test', single_end:true ], + [ 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = true + input[3] = false + input[4] = false + """ + } + } + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_fastp_paired_end_qc_only - stub") { + + options "-stub" + + when { + process { + """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = true + input[3] = false + input[4] = false """ } } then { - def html_text = [ "
"] - def log_text = [ "Merged and filtered:", - "total reads: 75", - "total bases: 13683"] - def json_text = ['"merged_and_filtered": {', '"total_reads": 75', '"total_bases": 13683',"--adapter_fasta"] - def read1_lines = ["@ERR5069949.1066259 NS500628:121:HK3MMAFX2:1:11312:18369:8333/1", - "CCTTATGACAGCAAGAACTGTGTATGATGATGGTGCTAGGAGAGTGTGGACACTTATGAATGTCTTGACACTCGTTTATAAAGTTTATTATGGTAATGCTTTAGATCAAGCCATTTCCATGTGGGCTCTTATAATCTCTGTTACTTC", - "AAAAAEAEEAEEEEEEEEEEEEEEEEAEEEEAEEEEEEEEAEEEEEEEEEEEEEEEEE/EAEEEEEE/6EEEEEEEEEEAEEAEEE/EE/AEEAEEEEEAEEEA/EEAAEAE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { read_merged_lines.each { read_merged_line -> - { assert path(process.out.reads_merged.get(0).get(1)).linesGzip.contains(read_merged_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { assert snapshot(process.out.versions).match("versions_paired_end_merged_adapterlist") } + { assert snapshot(process.out).match() } ) } } -} +} \ No newline at end of file diff --git a/modules/nf-core/fastp/tests/main.nf.test.snap b/modules/nf-core/fastp/tests/main.nf.test.snap index 3e876288..54be7e45 100644 --- a/modules/nf-core/fastp/tests/main.nf.test.snap +++ b/modules/nf-core/fastp/tests/main.nf.test.snap @@ -1,55 +1,178 @@ { - "fastp test_fastp_interleaved_json": { + "test_fastp_single_end_qc_only - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": true - }, - "test.fastp.json:md5,b24e0624df5cc0b11cd5ba21b726fb22" + { + "0": [ + + ], + "1": [ + [ + { + 
"id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] - ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-18T16:19:15.063001" + "timestamp": "2024-07-05T14:31:10.841098" }, - "test_fastp_paired_end_merged-for_stub_match": { + "test_fastp_paired_end": { "content": [ [ [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" - ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "test.merged.fastq.gz", - "{id=test, single_end=false}" + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", + "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" + ] + ] + ], + [ + + ], + [ + + ], + [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:10:13.467574" + "timestamp": 
"2024-07-05T13:43:28.665779" }, - "versions_interleaved": { + "test_fastp_paired_end_merged_adapterlist": { "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,5914ca3f21ce162123a824e33e8564f6" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", + "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" + ] + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:56:24.615634793" + "timestamp": "2024-07-05T13:44:18.210375" }, - "test_fastp_single_end_json": { + "test_fastp_single_end_qc_only": { "content": [ [ [ @@ -57,274 +180,1152 @@ "id": "test", "single_end": true }, - "test.fastp.json:md5,c852d7a6dba5819e4ac8d9673bedcacc" + "test.fastp.json:md5,5cc5f01e449309e0e689ed6f51a2294a" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-18T16:18:43.526412" - }, - "versions_paired_end": { - "content": [ + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:55:42.333545689" + "timestamp": "2024-07-05T13:44:27.380974" }, - "test_fastp_paired_end_match": { + "test_fastp_paired_end_trim_fail": { "content": [ [ [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" - ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=false}" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-01T12:03:06.431833729" - }, - "test_fastp_interleaved-_match": { - "content": [ + { + "id": "test", + "single_end": 
false + }, + [ + "test_1.fastp.fastq.gz:md5,6ff32a64c5188b9a9192be1398c262c7", + "test_2.fastp.fastq.gz:md5,db0cb7c9977e94ac2b4b446ebd017a8a" + ] + ] + ], [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-18T16:19:15.111894" - }, - "test_fastp_paired_end_merged_match": { - "content": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,409b687c734cedd7a1fec14d316e1366", + "test_1.fail.fastq.gz:md5,4f273cf3159c13f79e8ffae12f5661f6", + "test_2.fail.fastq.gz:md5,f97b9edefb5649aab661fbc9e71fc995" + ] + ] + ], + [ + + ], [ [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" - ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "test.merged.fastq.gz", - "{id=test, single_end=false}" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-01T12:08:44.496251446" - }, - "versions_single_end_stub": { - "content": [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,4c3268ddb50ea5b33125984776aa3519" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:55:27.354051299" + "timestamp": "2024-07-05T13:43:58.749589" }, - "versions_interleaved-stub": { + "fastp - stub test_fastp_interleaved": { "content": [ - [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + 
"single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:56:46.535528418" + "timestamp": "2024-07-05T13:50:00.270029" }, - "versions_single_end_trim_fail": { + "test_fastp_single_end - stub": { "content": [ - [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:59:03.724591407" + "timestamp": "2024-07-05T13:49:42.502789" }, - "test_fastp_paired_end-for_stub_match": { + "test_fastp_paired_end_merged_adapterlist - stub": { "content": [ - [ - [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=false}" - ] + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + 
"single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:07:15.398827" + "timestamp": "2024-07-05T13:54:53.458252" }, - "versions_paired_end-stub": { + "test_fastp_paired_end_merged - stub": { "content": [ - [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:56:06.50017282" + "timestamp": "2024-07-05T13:50:27.689379" }, - "versions_single_end": { + "test_fastp_paired_end_merged": { "content": [ [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-01T11:55:07.67921647" - }, - "versions_paired_end_merged_stub": { - "content": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,b712fd68ed0322f4bec49ff2a5237fcc" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", + "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" + ] + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:59:47.350653154" + "timestamp": "2024-07-05T13:44:08.68476" }, - "test_fastp_interleaved-for_stub_match": { + "test_fastp_paired_end - stub": { "content": [ - [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + 
"test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:08:06.127974" + "timestamp": "2024-07-05T13:49:51.679221" }, - "versions_paired_end_trim_fail": { + "test_fastp_single_end": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,c852d7a6dba5819e4ac8d9673bedcacc" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7" + ] + ], + [ + + ], + [ + + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - 
"timestamp": "2024-02-01T11:59:18.140484878" + "timestamp": "2024-07-05T13:43:18.834322" }, - "test_fastp_single_end-for_stub_match": { + "test_fastp_single_end_trim_fail - stub": { "content": [ - [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_fail": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:06:00.244202" + "timestamp": "2024-07-05T14:05:36.898142" }, - 
"test_fastp_single_end-_match": { + "test_fastp_paired_end_trim_fail - stub": { "content": [ - [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_1.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_1.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + 
"test_2.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-18T16:18:43.580336" + "timestamp": "2024-07-05T14:05:49.212847" }, - "versions_paired_end_merged_adapterlist": { + "fastp test_fastp_interleaved": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,217d62dc13a23e92513a1bd8e1bcea39" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,b24e0624df5cc0b11cd5ba21b726fb22" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T12:05:37.845370554" + "timestamp": "2024-07-05T13:43:38.910832" }, - "versions_paired_end_merged": { + "test_fastp_single_end_trim_fail": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,9a7ee180f000e8d00c7fb67f06293eb5" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fail.fastq.gz:md5,3e4aaadb66a5b8fc9b881bf39c227abd" + ] + ], + [ + + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:59:32.860543858" + "timestamp": "2024-07-05T13:43:48.22378" }, - "test_fastp_single_end_trim_fail_json": { + "test_fastp_paired_end_qc_only": { "content": [ [ [ { "id": "test", - "single_end": true + "single_end": false }, - "test.fastp.json:md5,9a7ee180f000e8d00c7fb67f06293eb5" + "test.fastp.json:md5,623064a45912dac6f2b64e3f2e9901df" ] + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + 
"versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T13:44:36.334938" + }, + "test_fastp_paired_end_qc_only - stub": { + "content": [ + { + "0": [ + + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:08:41.942317" + "timestamp": "2024-07-05T14:31:27.096468" } } \ No newline at end of file diff --git a/modules/nf-core/fastqc/environment.yml b/modules/nf-core/fastqc/environment.yml index 1787b38a..691d4c76 100644 --- a/modules/nf-core/fastqc/environment.yml +++ b/modules/nf-core/fastqc/environment.yml @@ -1,7 +1,5 @@ -name: fastqc channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::fastqc=0.12.1 diff --git a/modules/nf-core/fastqc/main.nf b/modules/nf-core/fastqc/main.nf index d79f1c86..d8989f48 100644 --- a/modules/nf-core/fastqc/main.nf +++ b/modules/nf-core/fastqc/main.nf @@ -26,7 +26,10 
@@ process FASTQC { def rename_to = old_new_pairs*.join(' ').join(' ') def renamed_files = old_new_pairs.collect{ old_name, new_name -> new_name }.join(' ') - def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') + // The total amount of RAM allocated by FastQC is equal to the number of threads defined (--threads) times the amount of RAM defined (--memory) + // https://github.com/s-andrews/FastQC/blob/1faeea0412093224d7f6a07f777fad60a5650795/fastqc#L211-L222 + // Dividing task.memory by task.cpus keeps the total within the amount of RAM requested in the label + def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') / task.cpus // FastQC memory value allowed range (100 - 10000) def fastqc_memory = memory_in_mb > 10000 ? 10000 : (memory_in_mb < 100 ? 100 : memory_in_mb) diff --git a/modules/nf-core/fastqc/meta.yml b/modules/nf-core/fastqc/meta.yml index ee5507e0..4827da7a 100644 --- a/modules/nf-core/fastqc/meta.yml +++ b/modules/nf-core/fastqc/meta.yml @@ -16,35 +16,44 @@ tools: homepage: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ documentation: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/ licence: ["GPL-2.0-only"] + identifier: biotools:fastqc input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g.
[ id:'test', single_end:false ] - html: - type: file - description: FastQC report - pattern: "*_{fastqc.html}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.html": + type: file + description: FastQC report + pattern: "*_{fastqc.html}" - zip: - type: file - description: FastQC report archive - pattern: "*_{fastqc.zip}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.zip": + type: file + description: FastQC report archive + pattern: "*_{fastqc.zip}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@grst" diff --git a/modules/nf-core/fastqc/tests/main.nf.test b/modules/nf-core/fastqc/tests/main.nf.test index 70edae4d..e9d79a07 100644 --- a/modules/nf-core/fastqc/tests/main.nf.test +++ b/modules/nf-core/fastqc/tests/main.nf.test @@ -23,17 +23,14 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - // NOTE The report contains the date inside it, which means that the md5sum is stable per day, but not longer than that. So you can't md5sum it. - // looks like this:
Mon 2 Oct 2023 test.gz
- // https://github.com/nf-core/modules/pull/3903#issuecomment-1743620039 - - { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_single") } + { assert process.success }, + // NOTE The report contains the date inside it, which means that the md5sum is stable per day, but not longer than that. So you can't md5sum it. + // looks like this:
Mon 2 Oct 2023 test.gz
+ // https://github.com/nf-core/modules/pull/3903#issuecomment-1743620039 + { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -54,16 +51,14 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, - { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, - { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, - { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, - { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_paired") } + { assert process.success }, + { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, + { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, + { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, + { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, + { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -83,13 +78,11 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_interleaved") } + { assert process.success }, + { assert process.out.html[0][1] ==~ 
".*/test_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -109,13 +102,11 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_bam") } + { assert process.success }, + { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -138,22 +129,20 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, - { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, - { assert process.out.html[0][1][2] ==~ ".*/test_3_fastqc.html" }, - { assert process.out.html[0][1][3] ==~ ".*/test_4_fastqc.html" }, - { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, - { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, - { assert process.out.zip[0][1][2] ==~ ".*/test_3_fastqc.zip" }, - { assert process.out.zip[0][1][3] ==~ ".*/test_4_fastqc.zip" }, - { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][2]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][3]).text.contains("File typeConventional base calls") }, - - { assert 
snapshot(process.out.versions).match("fastqc_versions_multiple") } + { assert process.success }, + { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, + { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, + { assert process.out.html[0][1][2] ==~ ".*/test_3_fastqc.html" }, + { assert process.out.html[0][1][3] ==~ ".*/test_4_fastqc.html" }, + { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, + { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, + { assert process.out.zip[0][1][2] ==~ ".*/test_3_fastqc.zip" }, + { assert process.out.zip[0][1][3] ==~ ".*/test_4_fastqc.zip" }, + { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][2]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][3]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -173,21 +162,18 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1] ==~ ".*/mysample_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/mysample_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_custom_prefix") } + { assert process.success }, + { assert process.out.html[0][1] ==~ ".*/mysample_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/mysample_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } test("sarscov2 single-end [fastq] - stub") { - options "-stub" - + options "-stub" when { process { """ @@ -201,12 +187,123 @@ nextflow_process { then { assertAll ( - { assert process.success }, - { assert 
snapshot(process.out.html.collect { file(it[1]).getName() } + - process.out.zip.collect { file(it[1]).getName() } + - process.out.versions ).match("fastqc_stub") } + { assert process.success }, + { assert snapshot(process.out).match() } ) } } + test("sarscov2 paired-end [fastq] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 interleaved [fastq] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 paired-end [bam] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 multiple [fastq] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_2.fastq.gz', checkIfExists: true) ] + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 custom_prefix - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'mysample', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } } diff --git a/modules/nf-core/fastqc/tests/main.nf.test.snap b/modules/nf-core/fastqc/tests/main.nf.test.snap index 86f7c311..d5db3092 100644 --- a/modules/nf-core/fastqc/tests/main.nf.test.snap +++ b/modules/nf-core/fastqc/tests/main.nf.test.snap @@ -1,88 +1,392 @@ { - "fastqc_versions_interleaved": { + "sarscov2 custom_prefix": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:40:07.293713" + "timestamp": "2024-07-22T11:02:16.374038" }, - "fastqc_stub": { + "sarscov2 single-end [fastq] - stub": { "content": [ - [ - "test.html", - "test.zip", - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": true + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:02:24.993809" + }, + "sarscov2 custom_prefix - stub": { + "content": [ + { + "0": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:31:01.425198" + "timestamp": "2024-07-22T11:03:10.93942" }, - "fastqc_versions_multiple": { + "sarscov2 interleaved [fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:40:55.797907" + "timestamp": "2024-07-22T11:01:42.355718" }, - "fastqc_versions_bam": { + "sarscov2 paired-end [bam]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:40:26.795862" + "timestamp": "2024-07-22T11:01:53.276274" }, - "fastqc_versions_single": { + "sarscov2 multiple 
[fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:39:27.043675" + "timestamp": "2024-07-22T11:02:05.527626" }, - "fastqc_versions_paired": { + "sarscov2 paired-end [fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:01:31.188871" + }, + "sarscov2 paired-end [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:02:34.273566" + }, + "sarscov2 multiple [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": 
false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:39:47.584191" + "timestamp": "2024-07-22T11:03:02.304411" }, - "fastqc_versions_custom_prefix": { + "sarscov2 single-end [fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:01:19.095607" + }, + "sarscov2 interleaved [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:02:44.640184" + }, + "sarscov2 paired-end [bam] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:41:14.576531" + "timestamp": "2024-07-22T11:02:53.550742" } } \ No newline at end of file diff --git a/modules/nf-core/gawk/environment.yml b/modules/nf-core/gawk/environment.yml new file mode 100644 index 00000000..315f6dc6 --- /dev/null +++ b/modules/nf-core/gawk/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - conda-forge::gawk=5.3.0 diff --git a/modules/nf-core/gawk/main.nf b/modules/nf-core/gawk/main.nf new file mode 100644 index 00000000..ca468929 --- /dev/null +++ b/modules/nf-core/gawk/main.nf @@ -0,0 +1,55 @@ +process GAWK { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/gawk:5.3.0' : + 'biocontainers/gawk:5.3.0' }" + + input: + tuple val(meta), path(input) + path(program_file) + + output: + tuple val(meta), path("${prefix}.${suffix}"), emit: output + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' // args is used for the main arguments of the tool + def args2 = task.ext.args2 ?: '' // args2 is used to specify a program when no program file has been given + prefix = task.ext.prefix ?: "${meta.id}" + suffix = task.ext.suffix ?: "${input.getExtension()}" + + program = program_file ? "-f ${program_file}" : "${args2}" + + """ + awk \\ + ${args} \\ + ${program} \\ + ${input} \\ + > ${prefix}.${suffix} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + gawk: \$(awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//') + END_VERSIONS + """ + + stub: + prefix = task.ext.prefix ?: "${meta.id}" + suffix = task.ext.suffix ?: "${input.getExtension()}" + def create_cmd = suffix.endsWith("gz") ? 
"echo '' | gzip >" : "touch" + + """ + ${create_cmd} ${prefix}.${suffix} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + gawk: \$(awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/gawk/meta.yml b/modules/nf-core/gawk/meta.yml new file mode 100644 index 00000000..05170082 --- /dev/null +++ b/modules/nf-core/gawk/meta.yml @@ -0,0 +1,56 @@ +name: "gawk" +description: | + If you are like many computer users, you would frequently like to make changes in various text files + wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. + The job is easy with awk, especially the GNU implementation gawk. +keywords: + - gawk + - awk + - txt + - text + - file parsing +tools: + - "gawk": + description: "GNU awk" + homepage: "https://www.gnu.org/software/gawk/" + documentation: "https://www.gnu.org/software/gawk/manual/" + tool_dev_url: "https://www.gnu.org/prep/ftp.html" + licence: ["GPL v3"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: The input file - Specify the logic that needs to be executed on + this file on the `ext.args2` or in the program file + pattern: "*" + - - program_file: + type: file + description: Optional file containing logic for awk to execute. If you don't + wish to use a file, you can use `ext.args2` to specify the logic. + pattern: "*" +output: + - output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.${suffix}: + type: file + description: The output file - specify the name of this file using `ext.prefix` + and the extension using `ext.suffix` + pattern: "*" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@nvnieuwk" +maintainers: + - "@nvnieuwk" diff --git a/modules/nf-core/gawk/tests/main.nf.test b/modules/nf-core/gawk/tests/main.nf.test new file mode 100644 index 00000000..fce82ca9 --- /dev/null +++ b/modules/nf-core/gawk/tests/main.nf.test @@ -0,0 +1,56 @@ +nextflow_process { + + name "Test Process GAWK" + script "../main.nf" + process "GAWK" + + tag "modules" + tag "modules_nfcore" + tag "gawk" + + test("convert fasta to bed") { + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("convert fasta to bed with program file") { + config "./nextflow_with_program_file.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) + ] + input[1] = Channel.of('BEGIN {FS="\t"}; {print \$1 FS "0" FS \$2}').collectFile(name:"program.txt") + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } +} \ No newline at end of file diff --git a/modules/nf-core/gawk/tests/main.nf.test.snap b/modules/nf-core/gawk/tests/main.nf.test.snap new file mode 100644 index 00000000..4f3a759c --- /dev/null +++ b/modules/nf-core/gawk/tests/main.nf.test.snap @@ -0,0 +1,68 @@ +{ + "convert fasta to bed with program file": { + "content": [ + { + "0": [ + [ + { 
+ "id": "test" + }, + "test.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7" + ] + ], + "1": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ], + "output": [ + [ + { + "id": "test" + }, + "test.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7" + ] + ], + "versions": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.03.0" + }, + "timestamp": "2024-05-17T15:20:02.495430346" + }, + "convert fasta to bed": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7" + ] + ], + "1": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ], + "output": [ + [ + { + "id": "test" + }, + "test.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7" + ] + ], + "versions": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.03.0" + }, + "timestamp": "2024-05-17T15:19:53.291809648" + } +} \ No newline at end of file diff --git a/modules/nf-core/gawk/tests/nextflow.config b/modules/nf-core/gawk/tests/nextflow.config new file mode 100644 index 00000000..6e5d43a3 --- /dev/null +++ b/modules/nf-core/gawk/tests/nextflow.config @@ -0,0 +1,6 @@ +process { + withName: GAWK { + ext.suffix = "bed" + ext.args2 = '\'BEGIN {FS="\t"}; {print \$1 FS "0" FS \$2}\'' + } +} diff --git a/modules/nf-core/gawk/tests/nextflow_with_program_file.config b/modules/nf-core/gawk/tests/nextflow_with_program_file.config new file mode 100644 index 00000000..693ad419 --- /dev/null +++ b/modules/nf-core/gawk/tests/nextflow_with_program_file.config @@ -0,0 +1,5 @@ +process { + withName: GAWK { + ext.suffix = "bed" + } +} diff --git a/modules/nf-core/gawk/tests/tags.yml b/modules/nf-core/gawk/tests/tags.yml new file mode 100644 index 00000000..72e4531d --- /dev/null +++ b/modules/nf-core/gawk/tests/tags.yml @@ -0,0 +1,2 @@ +gawk: + - "modules/nf-core/gawk/**" diff --git a/modules/nf-core/mirdeep2/mapper/environment.yml 
b/modules/nf-core/mirdeep2/mapper/environment.yml new file mode 100644 index 00000000..fafc6663 --- /dev/null +++ b/modules/nf-core/mirdeep2/mapper/environment.yml @@ -0,0 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirdeep2=2.0.1.2" diff --git a/modules/nf-core/mirdeep2/mapper/main.nf b/modules/nf-core/mirdeep2/mapper/main.nf new file mode 100644 index 00000000..d52820a3 --- /dev/null +++ b/modules/nf-core/mirdeep2/mapper/main.nf @@ -0,0 +1,53 @@ +process MIRDEEP2_MAPPER { + tag "$meta.id" + label 'process_medium' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mirdeep2:2.0.1.2--0': + 'biocontainers/mirdeep2:2.0.1.2--0' }" + + input: + tuple val(meta), path(reads) + tuple val(meta2), path(index, stageAs: '*') + + output: + tuple val(meta), path('*.fa'), path('*.arf'), emit: outputs + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def VERSION = '2.0.1' + + """ + mapper.pl \\ + ${reads} \\ + $args \\ + -p ${index}/${meta2.id} \\ + -s ${prefix}_collapsed.fa \\ + -t ${prefix}_reads_collapsed_vs_${meta2.id}_genome.arf + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirdeep2: \$(echo "$VERSION") + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def VERSION = '2.0.1' + """ + touch ${prefix}.fa + touch ${prefix}reads_vs_refdb.arf + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirdeep2: \$(echo "$VERSION") + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirdeep2/mapper/meta.yml b/modules/nf-core/mirdeep2/mapper/meta.yml new file mode 100644 
index 00000000..a482c480 --- /dev/null +++ b/modules/nf-core/mirdeep2/mapper/meta.yml @@ -0,0 +1,59 @@ +name: "mirdeep2_mapper" +description: | + miRDeep2 Mapper is a tool that prepares deep sequencing reads for downstream miRNA detection by collapsing reads, mapping them to a genome, and outputting the required files for miRNA discovery. +keywords: + - mirdeep2 + - mapper + - RNA sequencing +tools: + - "mirdeep2": + description: | + miRDeep2 Mapper (`mapper.pl`) is part of the miRDeep2 suite. It collapses identical reads, maps them to a reference genome, and outputs both collapsed FASTA and ARF files for downstream miRNA detection and analysis. + homepage: "https://www.mdc-berlin.de/content/mirdeep2-documentation" + documentation: "https://www.mdc-berlin.de/content/mirdeep2-documentation" + tool_dev_url: "https://github.com/rajewsky-lab/mirdeep2" + doi: "10.1093/nar/gkn491" + licence: ["GPL V3"] + identifier: biotools:mirdeep2 + +input: + - - meta: + type: map + description: Groovy Map containing sample information, e.g. `[ id:'sample1', + single_end:false ]` + - reads: + type: file + description: File containing the raw sequencing reads that need to be collapsed + and mapped to a reference genome. + pattern: "*.fa" + - - meta2: + type: map + description: Groovy Map containing information about the genome index. + - index: + type: file + description: Path to the genome index file used for mapping the reads to the + genome. + pattern: "*" +output: + - outputs: + - meta: + type: map + description: Groovy Map containing sample information, e.g. `[ id:'sample1', single_end:false ]` + - "*.fa": + type: file + description: Collapsed reads in FASTA format. + pattern: "*.fa" + - "*.arf": + type: file + description: Alignment Read Format (ARF) file containing the mapping of reads + to the genome. + pattern: "*.arf" + - versions: + - versions.yml: + type: file + description: File containing software versions for tracking. 
+ pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirdeep2/mapper/tests/main.nf.test b/modules/nf-core/mirdeep2/mapper/tests/main.nf.test new file mode 100644 index 00000000..62e3e615 --- /dev/null +++ b/modules/nf-core/mirdeep2/mapper/tests/main.nf.test @@ -0,0 +1,141 @@ + +nextflow_process { + + name "Test Process MIRDEEP2_MAPPER" + script "../main.nf" + process "MIRDEEP2_MAPPER" + + tag "modules" + tag "modules_nfcore" + tag "mirdeep2" + tag "bowtie/build" + tag "mirdeep2/mapper" + tag "seqkit/fq2fa" + tag "seqkit/replace" + + + setup { + run("BOWTIE_BUILD") { + script "../../../bowtie/build/main.nf" + process { + """ + input[0] = [ + [ id:'genome_cel_cluster' ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/cel_cluster.fa', checkIfExists: true) + ] + """ + } + } + + run("SEQKIT_FQ2FA") { + script "../../../seqkit/fq2fa/main.nf" + process { + """ + input[0] = [ + [ id:'small_Clone1_N1' ], // meta map + file('https://github.com/nf-core/test-datasets/raw/smrnaseq/testdata/trimmed/small_Clone1_N1.fastp.fastq.gz', checkIfExists: true) + ] + """ + } + } + + run("SEQKIT_REPLACE") { + script "../../../seqkit/replace/main.nf" + config "./nextflow.config" + process { + """ + input[0] = SEQKIT_FQ2FA.out.fasta + """ + } + } + + } + + test("mirdeep2 - mapper - fasta celegans") { + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test_reads', single_end:false ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/reads.fa', checkIfExists: true) + ] + input[1] = BOWTIE_BUILD.out.index + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.versions).match() }, + + // md5sum not stable - IDs change while sequences are the same + + // Assert TCACCGGGGGTACATCAGCTAA occurs once + { assert file(process.out.outputs[0][1]).readLines().findAll { 
it.contains("TCACCGGGGGTACATCAGCTAA") }.size() == 1 }, + + // Assert seq_347479_x287 occurs once + { assert file(process.out.outputs[0][1]).readLines().findAll { it.contains("seq_347479_x287") }.size() == 1 }, + + // Assert that specific content occurs 4 times + { assert file(process.out.outputs[0][2]).readLines().findAll { it.contains("21\t1\t21\ttcaccgggtgtaaatcagctt\tchrII:11534525-11540624\t21\t3535\t3555\ttcaccgggtgtaaatcagctt\t+\t0\tmmmmmmmmmmmmmmmmmmmmm") }.size() == 4 } + ) + } + + } + + test("mirdeep2 - mapper - fasta smrnaseq") { + config "./nextflow.config" + + when { + process { + """ + input[0] = SEQKIT_REPLACE.out.fastx + input[1] = BOWTIE_BUILD.out.index + """ + } + } + + then { + assertAll( + { assert process.success }, + + // Assert reads occurs once + { assert file(process.out.outputs[0][1]).readLines().findAll { it.contains("TACCTGAGGTAGCAGGTTGTATAGTTGGGG") }.size() == 1 }, + + // Assert ID occurs once + { assert file(process.out.outputs[0][1]).readLines().findAll { it.contains("seq_996152_x1") }.size() == 1 } + + ) + } + + } + + test("mirdeep2 - fasta - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test_reads', single_end:false ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/reads.fa', checkIfExists: true) + ] + input[1] = BOWTIE_BUILD.out.index + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/mirdeep2/mapper/tests/main.nf.test.snap b/modules/nf-core/mirdeep2/mapper/tests/main.nf.test.snap new file mode 100644 index 00000000..4c3697d9 --- /dev/null +++ b/modules/nf-core/mirdeep2/mapper/tests/main.nf.test.snap @@ -0,0 +1,51 @@ +{ + "mirdeep2 - fasta - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test_reads", + "single_end": false + }, + "test_reads.fa:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"test_readsreads_vs_refdb.arf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,33c794292d6772d67fa8001439394614" + ], + "outputs": [ + [ + { + "id": "test_reads", + "single_end": false + }, + "test_reads.fa:md5,d41d8cd98f00b204e9800998ecf8427e", + "test_readsreads_vs_refdb.arf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,33c794292d6772d67fa8001439394614" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T20:58:19.544297445" + }, + "mirdeep2 - mapper - fasta celegans": { + "content": [ + [ + "versions.yml:md5,33c794292d6772d67fa8001439394614" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-17T17:41:05.101661825" + } +} \ No newline at end of file diff --git a/modules/nf-core/mirdeep2/mapper/tests/nextflow.config b/modules/nf-core/mirdeep2/mapper/tests/nextflow.config new file mode 100644 index 00000000..ec097561 --- /dev/null +++ b/modules/nf-core/mirdeep2/mapper/tests/nextflow.config @@ -0,0 +1,11 @@ +process { + withName: 'MIRDEEP2_MAPPER' { + ext.args = "-c -j -k TCGTATGCCGTCTTCTGCTTGT -l 18 -m -v" + } + + withName: 'SEQKIT_REPLACE' { + ext.args = "-p '\s.+'" + ext.suffix = "fasta" + } + +} diff --git a/modules/nf-core/mirdeep2/mirdeep2/environment.yml b/modules/nf-core/mirdeep2/mirdeep2/environment.yml new file mode 100644 index 00000000..fafc6663 --- /dev/null +++ b/modules/nf-core/mirdeep2/mirdeep2/environment.yml @@ -0,0 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirdeep2=2.0.1.2" diff --git a/modules/nf-core/mirdeep2/mirdeep2/main.nf b/modules/nf-core/mirdeep2/mirdeep2/main.nf new file mode 100644 index 00000000..66c85968 --- /dev/null +++ b/modules/nf-core/mirdeep2/mirdeep2/main.nf @@ -0,0 +1,64 @@ +process MIRDEEP2_MIRDEEP2 { + tag 
"$meta.id" + label 'process_medium' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mirdeep2:2.0.1.2--0': + 'biocontainers/mirdeep2:2.0.1.2--0' }" + + input: + tuple val(meta), path(processed_reads), path(genome_mappings) + tuple val(meta2), path(fasta) + tuple val(meta3), path(mature), path(hairpin), path(mature_other_species) + + output: + tuple val(meta), path("result*.{bed,csv,html}") , emit: outputs + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def VERSION = '2.0.1' + def mature_species = mature ? "${mature}" : "none" + def mature_other = mature_other_species ? "${mature_other_species}": "none" + def precursors = hairpin ? "${hairpin}" : "none" + + """ + miRDeep2.pl \\ + $processed_reads \\ + $fasta \\ + $genome_mappings \\ + $mature_species \\ + $mature_other \\ + $precursors \\ + $args + + mv result_*.bed result_${prefix}.bed + mv result_*.csv result_${prefix}.csv + mv result_*.html result_${prefix}.html + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirdeep2: \$(echo "$VERSION") + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def VERSION = '2.0.1' + """ + touch result_${prefix}.html + touch result_${prefix}.bed + touch result_${prefix}.csv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirdeep2: \$(echo "$VERSION") + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirdeep2/mirdeep2/meta.yml b/modules/nf-core/mirdeep2/mirdeep2/meta.yml new file mode 100644 index 00000000..adf14101 --- /dev/null +++ b/modules/nf-core/mirdeep2/mirdeep2/meta.yml @@ -0,0 +1,76 @@ +name: "mirdeep2_mirdeep2" +description: | + miRDeep2 is a tool for identifying known and novel miRNAs in deep sequencing 
data by analyzing sequenced RNAs. It integrates the mapping of sequencing reads to the genome and predicts miRNA precursors and mature miRNAs. +keywords: + - mirdeep2 + - miRNA + - RNA sequencing +tools: + - "mirdeep2": + description: | + miRDeep2 is a tool that discovers microRNA genes by analyzing sequenced RNAs. + It includes three main scripts: `miRDeep2.pl`, `mapper.pl`, and `quantifier.pl` for comprehensive miRNA detection and quantification. + homepage: "https://www.mdc-berlin.de/content/mirdeep2-documentation" + documentation: "https://www.mdc-berlin.de/content/mirdeep2-documentation" + tool_dev_url: "https://github.com/rajewsky-lab/mirdeep2" + doi: "10.1093/nar/gkn491" + licence: ["GPL V3"] + identifier: biotools:mirdeep2 + +input: + - - meta: + type: map + description: Groovy Map containing sample information, e.g. `[ id:'sample1', + single_end:false ]` + - processed_reads: + type: file + description: FASTA file containing the processed sequencing reads. + pattern: "*.fa" + - genome_mappings: + type: file + description: ARF format file with mapped reads to the genome. + pattern: "*.arf" + - - meta2: + type: map + description: Groovy Map for genome FASTA file metadata, e.g. `[ id:'genome']` + - fasta: + type: file + description: FASTA file of the corresponding genome. + pattern: "*.fa" + - - meta3: + type: map + description: Groovy Map for miRNA metadata, e.g. `[ id:'mirbase', single_end:false + ]` + - mature: + type: file + description: FASTA file containing known mature miRNAs of the species being + analyzed. + pattern: "*.fa" + - hairpin: + type: file + description: FASTA file containing hairpin sequences (miRNA precursors). + pattern: "*.fa" + - mature_other_species: + type: file + description: FASTA file containing known mature miRNAs of other species. + pattern: "*.fa" +output: + - outputs: + - meta: + type: map + description: Groovy Map containing sample information e.g. 
`[ id:'sample1', + single_end:false ]` + - result*.{bed,csv,html}: + type: file + description: Output files, including BED, CSV, and HTML results files with an + overview of detected miRNAs. + pattern: "result*.{bed,csv,html}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirdeep2/mirdeep2/tests/main.nf.test b/modules/nf-core/mirdeep2/mirdeep2/tests/main.nf.test new file mode 100644 index 00000000..b7b73ec1 --- /dev/null +++ b/modules/nf-core/mirdeep2/mirdeep2/tests/main.nf.test @@ -0,0 +1,111 @@ +nextflow_process { + + name "Test Process MIRDEEP2_MIRDEEP2" + script "../main.nf" + process "MIRDEEP2_MIRDEEP2" + + tag "modules" + tag "modules_nfcore" + tag "mirdeep2" + tag "mirdeep2/mirdeep2" + tag "bowtie/build" + tag "mirdeep2/mapper" + + + setup { + run("BOWTIE_BUILD") { + script "../../../bowtie/build/main.nf" + process { + """ + input[0] = [ + [ id:'genome_cel_cluster' ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/cel_cluster.fa', checkIfExists: true) + ] + """ + } + } + + run("MIRDEEP2_MAPPER") { + script "../../../mirdeep2/mapper/main.nf" + config "./nextflow.config" + + process { + """ + input[0] = [ + [ id:'test_reads', single_end:false ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/reads.fa', checkIfExists: true) + ] + input[1] = BOWTIE_BUILD.out.index + """ + } + } + + } + + test("mirdeep2 - mirdeep2 - fa") { + + when { + process { + """ + input[0] = MIRDEEP2_MAPPER.out.outputs + input[1] = [ + [ id:'genome_cel_cluster' ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/cel_cluster.fa', checkIfExists: true) + ] + input[2] = [ + [ id:'hairpin_mature'], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/mature_ref_this_species.fa', checkIfExists: 
true), + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/precursors_ref_this_species.fa', checkIfExists: true), + [] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.versions, + path(process.out.outputs.get(0).get(1)[2]).readLines().last().contains(''), + process.out.outputs.get(0).get(1)[0], + path(process.out.outputs.get(0).get(1)[1]).readLines().first().contains('miRDeep2 score') + ).match() }, + // Assert .html + { assert path(process.out.outputs.get(0).get(1)[2]).readLines().last().contains('') } + ) + } + + } + + test("mirdeep - mirdeep2 - stub") { + + options "-stub" + + when { + process { + """ + input[0] = MIRDEEP2_MAPPER.out.outputs + input[1] = [ + [ id:'genome_cel_cluster' ], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/cel_cluster.fa', checkIfExists: true) + ] + input[2] = [ + [ id:'hairpin_mature'], // meta map + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/mature_ref_this_species.fa', checkIfExists: true), + file('https://github.com/rajewsky-lab/mirdeep2/raw/master/tutorial_dir/mature_ref_other_species.fa', checkIfExists: true), + [] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/mirdeep2/mirdeep2/tests/main.nf.test.snap b/modules/nf-core/mirdeep2/mirdeep2/tests/main.nf.test.snap new file mode 100644 index 00000000..f8ffcf01 --- /dev/null +++ b/modules/nf-core/mirdeep2/mirdeep2/tests/main.nf.test.snap @@ -0,0 +1,60 @@ +{ + "mirdeep - mirdeep2 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test_reads", + "single_end": false + }, + [ + "result_test_reads.bed:md5,d41d8cd98f00b204e9800998ecf8427e", + "result_test_reads.csv:md5,d41d8cd98f00b204e9800998ecf8427e", + "result_test_reads.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + 
"versions.yml:md5,8984ad2f1e8bdd148da051e2e6b569bf" + ], + "outputs": [ + [ + { + "id": "test_reads", + "single_end": false + }, + [ + "result_test_reads.bed:md5,d41d8cd98f00b204e9800998ecf8427e", + "result_test_reads.csv:md5,d41d8cd98f00b204e9800998ecf8427e", + "result_test_reads.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,8984ad2f1e8bdd148da051e2e6b569bf" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T21:04:53.304188615" + }, + "mirdeep2 - mirdeep2 - fa": { + "content": [ + [ + "versions.yml:md5,8984ad2f1e8bdd148da051e2e6b569bf" + ], + true, + "result_test_reads.bed:md5,ba5ef5782e40d7219ca064dd68865d74", + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-23T15:08:50.660562955" + } +} \ No newline at end of file diff --git a/modules/nf-core/mirdeep2/mirdeep2/tests/nextflow.config b/modules/nf-core/mirdeep2/mirdeep2/tests/nextflow.config new file mode 100644 index 00000000..6a33ae05 --- /dev/null +++ b/modules/nf-core/mirdeep2/mirdeep2/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'MIRDEEP2_MAPPER' { + ext.args = "-c -j -k TCGTATGCCGTCTTCTGCTTGT -l 18 -m -v" + } +} diff --git a/modules/nf-core/mirtop/counts/environment.yml b/modules/nf-core/mirtop/counts/environment.yml new file mode 100644 index 00000000..1f5deb37 --- /dev/null +++ b/modules/nf-core/mirtop/counts/environment.yml @@ -0,0 +1,13 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirtop=0.4.28" + - "bioconda::samtools=1.21" + - "conda-forge::python=3.11" + - "conda-forge::biopython=1.83" + - "bioconda::pysam=0.22.1" + - "bioconda::pybedtools=0.10.0" + - "conda-forge::pandas=2.2.2" diff --git a/modules/nf-core/mirtop/counts/main.nf b/modules/nf-core/mirtop/counts/main.nf new file mode 100644 
index 00000000..a4ca1889 --- /dev/null +++ b/modules/nf-core/mirtop/counts/main.nf @@ -0,0 +1,51 @@ +process MIRTOP_COUNTS { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "community.wave.seqera.io/library/mirtop_pybedtools_pysam_samtools_pruned:60b8208f3dbb2910" + + input: + tuple val(meta), path(mirtop_gff) + tuple val(meta2), path(hairpin) + tuple val(meta3), path(gtf), val(species) + + output: + tuple val(meta), path("counts/*.tsv"), emit: tsv + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mirtop \\ + counts \\ + $args \\ + --hairpin $hairpin \\ + --gtf $gtf \\ + --sps $species \\ + --gff $mirtop_gff \\ + -o counts + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir counts + touch counts/mirtop.tsv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirtop/counts/meta.yml b/modules/nf-core/mirtop/counts/meta.yml new file mode 100644 index 00000000..679df7b0 --- /dev/null +++ b/modules/nf-core/mirtop/counts/meta.yml @@ -0,0 +1,68 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json +name: "mirtop_counts" +description: mirtop counts generates a file with the minimal information about each + sequence and the count data in columns for each samples. 
+keywords: + - mirna + - isomir + - gff +tools: + - "mirtop": + description: "Small RNA-seq annotation" + homepage: "https://github.com/miRTop/mirtop" + documentation: "https://mirtop.readthedocs.io/en/latest/" + tool_dev_url: "https://github.com/miRTop/mirtop" + licence: ["MIT License"] + identifier: biotools:miRTop + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - mirtop_gff: + type: file + description: GFF file + pattern: "*.{gff}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - hairpin: + type: file + description: Hairpin file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - gtf: + type: file + description: GTF file + pattern: "*.{gtf}" + - species: + type: string + description: Species name of the GTF file +output: + - tsv: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - counts/*.tsv: + type: file + description: TSV file + pattern: "*.{tsv}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirtop/counts/tests/main.nf.test b/modules/nf-core/mirtop/counts/tests/main.nf.test new file mode 100644 index 00000000..5048c166 --- /dev/null +++ b/modules/nf-core/mirtop/counts/tests/main.nf.test @@ -0,0 +1,92 @@ +nextflow_process { + + name "Test Process MIRTOP_COUNTS" + script "../main.nf" + config "./nextflow.config" + process "MIRTOP_COUNTS" + + tag "modules" + tag "modules_nfcore" + tag "mirtop" + tag "mirtop/gff" + tag "mirtop/counts" + + setup { + run("MIRTOP_GFF") { + script "../../gff/main.nf" + process { + """ + input[0] = [ + [ id:'sample_sim_isomir_bam'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/sim_isomir_sort.bam", checkIfExists: true), + ] + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + } + + test("isomir - bam") { + + when { + process { + """ + input[0] = MIRTOP_GFF.out.gff + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("isomir - bam - stub") { + + 
options "-stub" + + when { + process { + """ + input[0] = MIRTOP_GFF.out.gff + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/mirtop/counts/tests/main.nf.test.snap b/modules/nf-core/mirtop/counts/tests/main.nf.test.snap new file mode 100644 index 00000000..104acf13 --- /dev/null +++ b/modules/nf-core/mirtop/counts/tests/main.nf.test.snap @@ -0,0 +1,68 @@ +{ + "isomir - bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "mirtop.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,a9b4901761f70f5a2c9aa3718dd361b8" + ], + "tsv": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "mirtop.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,a9b4901761f70f5a2c9aa3718dd361b8" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T15:05:22.556134542" + }, + "isomir - bam": { + "content": [ + { + "0": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam_mirtop.tsv:md5,43f8a525104c2d9b5a8937564c3a14f6" + ] + ], + "1": [ + "versions.yml:md5,a9b4901761f70f5a2c9aa3718dd361b8" + ], + "tsv": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam_mirtop.tsv:md5,43f8a525104c2d9b5a8937564c3a14f6" + ] + ], + "versions": [ + "versions.yml:md5,a9b4901761f70f5a2c9aa3718dd361b8" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T16:05:58.332523272" + } +} \ No newline at end of file diff --git 
a/modules/nf-core/mirtop/counts/tests/nextflow.config b/modules/nf-core/mirtop/counts/tests/nextflow.config new file mode 100644 index 00000000..83d77e20 --- /dev/null +++ b/modules/nf-core/mirtop/counts/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'MIRTOP_COUNTS' { + ext.args = '--add-extra' + } +} diff --git a/modules/nf-core/mirtop/export/environment.yml b/modules/nf-core/mirtop/export/environment.yml new file mode 100644 index 00000000..17572707 --- /dev/null +++ b/modules/nf-core/mirtop/export/environment.yml @@ -0,0 +1,11 @@ +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirtop=0.4.28" + - "bioconda::samtools=1.21" + - "conda-forge::python=3.11" + - "conda-forge::biopython=1.83" + - "bioconda::pysam=0.22.1" + - "bioconda::pybedtools=0.10.0" + - "conda-forge::pandas=2.2.2" diff --git a/modules/nf-core/mirtop/export/main.nf b/modules/nf-core/mirtop/export/main.nf new file mode 100644 index 00000000..b99333de --- /dev/null +++ b/modules/nf-core/mirtop/export/main.nf @@ -0,0 +1,55 @@ +process MIRTOP_EXPORT { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "community.wave.seqera.io/library/mirtop_pybedtools_pysam_samtools_pruned:60b8208f3dbb2910" + + input: + tuple val(meta), path(mirtop_gff) + tuple val(meta2), path(hairpin) + tuple val(meta3), path(gtf), val(species) + + output: + tuple val(meta), path("export/*_rawData.tsv") , emit: tsv, optional: true + tuple val(meta), path("export/*.fasta") , emit: fasta, optional: true + tuple val(meta), path("export/*.vcf*") , emit: vcf , optional: true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '--format isomir' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mirtop \\ + export \\ + $args \\ + --hairpin $hairpin\\ + --gtf $gtf \\ + --sps $species \\ + -o export \\ + $mirtop_gff + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + 
mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir export + touch export/${prefix}.fasta + touch export/${prefix}.vcf + touch export/${prefix}.tsv + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirtop/export/meta.yml b/modules/nf-core/mirtop/export/meta.yml new file mode 100644 index 00000000..25f66c3b --- /dev/null +++ b/modules/nf-core/mirtop/export/meta.yml @@ -0,0 +1,88 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json +name: "mirtop_export" +description: mirtop export generates files such as fasta, vcf or compatible with isomiRs + bioconductor package +keywords: + - mirna + - isomir + - gff +tools: + - "mirtop": + description: "Small RNA-seq annotation" + homepage: "https://github.com/miRTop/mirtop" + documentation: "https://mirtop.readthedocs.io/en/latest/" + tool_dev_url: "https://github.com/miRTop/mirtop" + licence: ["MIT License"] + identifier: biotools:miRTop + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - mirtop_gff: + type: file + description: GFF file + pattern: "*.{gff}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - hairpin: + type: file + description: Hairpin file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - gtf: + type: file + description: GTF file + pattern: "*.{gtf}" + - species: + type: string + description: Species name of the GTF file +output: + - tsv: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - export/*_rawData.tsv: + type: file + description: TSV file + pattern: "*.{tsv}" + - fasta: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - export/*.fasta: + type: file + description: FASTA file + pattern: "*.{fasta,fa}" + - vcf: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - export/*.vcf*: + type: file + description: VCF file + pattern: "*.{vcf,vcf.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirtop/export/tests/main.nf.test b/modules/nf-core/mirtop/export/tests/main.nf.test new file mode 100644 index 00000000..234caf57 --- /dev/null +++ b/modules/nf-core/mirtop/export/tests/main.nf.test @@ -0,0 +1,92 @@ +nextflow_process { + + name "Test Process MIRTOP_EXPORT" + script "../main.nf" + process "MIRTOP_EXPORT" + + tag "modules" + tag "modules_nfcore" + tag "mirtop" + tag "mirtop/gff" + tag "mirtop/export" + + setup { + run("MIRTOP_GFF") { + script "../../gff/main.nf" + process { + """ + input[0] = [ + [ id:'sample_sim_isomir_bam'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/sim_isomir_sort.bam", checkIfExists: true), + ] + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + 
file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + } + + test("isomir - bam") { + + when { + process { + """ + input[0] = MIRTOP_GFF.out.gff + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("isomir - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = MIRTOP_GFF.out.gff + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + +} diff --git a/modules/nf-core/mirtop/export/tests/main.nf.test.snap b/modules/nf-core/mirtop/export/tests/main.nf.test.snap new file mode 100644 index 00000000..ea272199 --- /dev/null +++ b/modules/nf-core/mirtop/export/tests/main.nf.test.snap @@ -0,0 +1,102 @@ +{ + "isomir - bam - stub": { + "content": [ + { + "0": [ + + ], + "1": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam.fasta:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam.vcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,f107a819b949304d703be77d60d97ae9" + ], + "fasta": [ + [ + { + "id": "sample_sim_isomir_bam" + 
}, + "sample_sim_isomir_bam.fasta:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tsv": [ + + ], + "vcf": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam.vcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,f107a819b949304d703be77d60d97ae9" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T14:54:17.653801164" + }, + "isomir - bam": { + "content": [ + { + "0": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam_mirtop_rawData.tsv:md5,efbcbe67716a4a56f89e538af2251dcc" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,f107a819b949304d703be77d60d97ae9" + ], + "fasta": [ + + ], + "tsv": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam_mirtop_rawData.tsv:md5,efbcbe67716a4a56f89e538af2251dcc" + ] + ], + "vcf": [ + + ], + "versions": [ + "versions.yml:md5,f107a819b949304d703be77d60d97ae9" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T16:06:37.626754369" + } +} \ No newline at end of file diff --git a/modules/nf-core/mirtop/gff/environment.yml b/modules/nf-core/mirtop/gff/environment.yml new file mode 100644 index 00000000..1f5deb37 --- /dev/null +++ b/modules/nf-core/mirtop/gff/environment.yml @@ -0,0 +1,13 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirtop=0.4.28" + - "bioconda::samtools=1.21" + - "conda-forge::python=3.11" + - "conda-forge::biopython=1.83" + - "bioconda::pysam=0.22.1" + - "bioconda::pybedtools=0.10.0" + - "conda-forge::pandas=2.2.2" diff --git a/modules/nf-core/mirtop/gff/main.nf b/modules/nf-core/mirtop/gff/main.nf new file mode 100644 index 00000000..aeb26bcc --- /dev/null +++ b/modules/nf-core/mirtop/gff/main.nf @@ -0,0 +1,53 @@ +process MIRTOP_GFF { + tag "$meta.id" + 
label 'process_single' + + conda "${moduleDir}/environment.yml" + container "community.wave.seqera.io/library/mirtop_pybedtools_pysam_samtools_pruned:60b8208f3dbb2910" + + input: + tuple val(meta), path(bam, arity:'1..*') + tuple val(meta2), path(hairpin) + tuple val(meta3), path(gtf), val(species) + + output: + tuple val(meta), path("mirtop/*mirtop.gff") , emit: gff + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mirtop \\ + gff \\ + $args \\ + --sps $species \\ + --hairpin $hairpin \\ + --gtf $gtf \\ + -o mirtop \\ + $bam + + mv mirtop/mirtop.gff mirtop/${prefix}_mirtop.gff + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir mirtop + touch mirtop/mirtop.gff + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirtop/gff/meta.yml b/modules/nf-core/mirtop/gff/meta.yml new file mode 100644 index 00000000..8e23f054 --- /dev/null +++ b/modules/nf-core/mirtop/gff/meta.yml @@ -0,0 +1,67 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json +name: "mirtop_gff" +description: mirtop gff generates the GFF3 adapter format to capture miRNA variations +keywords: + - mirna + - isomir + - gff +tools: + - "mirtop": + description: "Small RNA-seq annotation" + homepage: "https://github.com/miRTop/mirtop" + documentation: "https://mirtop.readthedocs.io/en/latest/" + tool_dev_url: "https://github.com/miRTop/mirtop" + licence: ["MIT License"] + identifier: biotools:miRTop + +input: + - - meta: + type: map + description: | + Groovy Map containing sample 
information + e.g. `[ id:'sample1', single_end:false ]` + - bam: + type: file + description: Sorted BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - hairpin: + type: file + description: Hairpin file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - gtf: + type: file + description: GTF file + pattern: "*.{gtf}" + - species: + type: string + description: Species name of the GTF file +output: + - gff: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - mirtop/*mirtop.gff: + type: file + description: GFF file + pattern: "*.{gff}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirtop/gff/tests/main.nf.test b/modules/nf-core/mirtop/gff/tests/main.nf.test new file mode 100644 index 00000000..85977b43 --- /dev/null +++ b/modules/nf-core/mirtop/gff/tests/main.nf.test @@ -0,0 +1,74 @@ +nextflow_process { + + name "Test Process MIRTOP_GFF" + script "../main.nf" + process "MIRTOP_GFF" + + tag "modules" + tag "modules_nfcore" + tag "mirtop" + tag "mirtop/gff" + + test("isomir - bam") { + + when { + process { + """ + input[0] = [ + [ id:'sample_sim_isomir_bam'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/sim_isomir_sort.bam", checkIfExists: true), + ] + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", 
checkIfExists: true), + "hsa"] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("isomir - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'sample_sim_isomir_bam'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/sim_isomir_sort.bam", checkIfExists: true), + ] + input[1] = [ + [ id:'hairpin'], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hairpin_mirtop.fa", checkIfExists: true), + ] + input[2] = [ + [ id:'hsa' ], // meta map + file("https://github.com/nf-core/test-datasets/raw/modules/data/delete_me/mirtop/hsa.gff3", checkIfExists: true), + "hsa"] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/mirtop/gff/tests/main.nf.test.snap b/modules/nf-core/mirtop/gff/tests/main.nf.test.snap new file mode 100644 index 00000000..0dddae2d --- /dev/null +++ b/modules/nf-core/mirtop/gff/tests/main.nf.test.snap @@ -0,0 +1,68 @@ +{ + "isomir - bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "mirtop.gff:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,4413a2adbdafd7b63e8db3c18bd73314" + ], + "gff": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "mirtop.gff:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,4413a2adbdafd7b63e8db3c18bd73314" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T14:31:04.976117723" + }, + "isomir - bam": { + "content": [ + { + "0": [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam_mirtop.gff:md5,da04f476c3fb8670e861fd8dd83418f9" + ] + ], + "1": [ + "versions.yml:md5,4413a2adbdafd7b63e8db3c18bd73314" + ], + "gff": [ + [ + { + "id": "sample_sim_isomir_bam" + 
}, + "sample_sim_isomir_bam_mirtop.gff:md5,da04f476c3fb8670e861fd8dd83418f9" + ] + ], + "versions": [ + "versions.yml:md5,4413a2adbdafd7b63e8db3c18bd73314" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T16:07:04.203181613" + } +} \ No newline at end of file diff --git a/modules/nf-core/mirtop/stats/environment.yml b/modules/nf-core/mirtop/stats/environment.yml new file mode 100644 index 00000000..1f5deb37 --- /dev/null +++ b/modules/nf-core/mirtop/stats/environment.yml @@ -0,0 +1,13 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirtop=0.4.28" + - "bioconda::samtools=1.21" + - "conda-forge::python=3.11" + - "conda-forge::biopython=1.83" + - "bioconda::pysam=0.22.1" + - "bioconda::pybedtools=0.10.0" + - "conda-forge::pandas=2.2.2" diff --git a/modules/nf-core/mirtop/stats/main.nf b/modules/nf-core/mirtop/stats/main.nf new file mode 100644 index 00000000..4742448b --- /dev/null +++ b/modules/nf-core/mirtop/stats/main.nf @@ -0,0 +1,49 @@ + +process MIRTOP_STATS { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "community.wave.seqera.io/library/mirtop_pybedtools_pysam_samtools_pruned:60b8208f3dbb2910" + + input: + tuple val(meta), path(mirtop_gff) + + output: + tuple val(meta), path("stats/*.txt") , emit: txt + tuple val(meta), path("stats/*_stats.log") , emit: log + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + mirtop \\ + stats \\ + $args \\ + --out stats \\ + $mirtop_gff + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = 
task.ext.prefix ?: "${meta.id}" + """ + mkdir stats + touch stats/${prefix}.txt + touch stats/${prefix}_stats.log + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtop: \$(echo \$(mirtop --version 2>&1) | sed 's/^.*mirtop //') + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirtop/stats/meta.yml b/modules/nf-core/mirtop/stats/meta.yml new file mode 100644 index 00000000..ebce7d99 --- /dev/null +++ b/modules/nf-core/mirtop/stats/meta.yml @@ -0,0 +1,57 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json +name: "mirtop_stats" +description: mirtop gff gets the number of isomiRs and miRNAs annotated in the GFF + file by isomiR category. +keywords: + - mirna + - isomir + - gff +tools: + - "mirtop": + description: "Small RNA-seq annotation" + homepage: "https://github.com/miRTop/mirtop" + documentation: "https://mirtop.readthedocs.io/en/latest/" + tool_dev_url: "https://github.com/miRTop/mirtop" + licence: ["MIT License"] + identifier: biotools:miRTop + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - mirtop_gff: + type: file + description: Mirtop GFF file obtained with mirtop_gff + pattern: "*.{gff}" +output: + - txt: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - stats/*.txt: + type: file + description: TXT file with stats + pattern: "*.{txt}" + - log: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - stats/*_stats.log: + type: file + description: log file with stats + pattern: "*.{log}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirtop/stats/tests/main.nf.test b/modules/nf-core/mirtop/stats/tests/main.nf.test new file mode 100644 index 00000000..e9d16cdc --- /dev/null +++ b/modules/nf-core/mirtop/stats/tests/main.nf.test @@ -0,0 +1,58 @@ + +nextflow_process { + + name "Test Process MIRTOP_STATS" + script "../main.nf" + process "MIRTOP_STATS" + + tag "modules" + tag "modules_nfcore" + tag "mirtop" + tag "mirtop/stats" + + test("isomir - bam") { + + when { + process { + """ + input[0] = [ + [ id:'mirtop_gff_sample1'], // meta map + file("https://github.com/miRTop/mirtop/raw/master/data/examples/gff/correct_file.gff", checkIfExists: true), + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("isomir - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'mirtop_gff_sample1'], // meta map + file("https://github.com/miRTop/mirtop/raw/master/data/examples/gff/correct_file.gff", checkIfExists: true), + ] """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/mirtop/stats/tests/main.nf.test.snap b/modules/nf-core/mirtop/stats/tests/main.nf.test.snap new file mode 100644 index 00000000..2904830d --- /dev/null +++ b/modules/nf-core/mirtop/stats/tests/main.nf.test.snap @@ -0,0 +1,100 @@ +{ + "isomir - bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_gff_sample1.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "mirtop_gff_sample1" + }, + 
"mirtop_gff_sample1_stats.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e894f8aff6fda2b94339ecd04fb25aed" + ], + "log": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_gff_sample1_stats.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "txt": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_gff_sample1.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e894f8aff6fda2b94339ecd04fb25aed" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T15:02:00.325615537" + }, + "isomir - bam": { + "content": [ + { + "0": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_stats.txt:md5,006f8767fe5afd6f66c83a28a1caba63" + ] + ], + "1": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_stats.log:md5,8fa28ad20bb1b1a91245f2c1e6613f85" + ] + ], + "2": [ + "versions.yml:md5,e894f8aff6fda2b94339ecd04fb25aed" + ], + "log": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_stats.log:md5,8fa28ad20bb1b1a91245f2c1e6613f85" + ] + ], + "txt": [ + [ + { + "id": "mirtop_gff_sample1" + }, + "mirtop_stats.txt:md5,006f8767fe5afd6f66c83a28a1caba63" + ] + ], + "versions": [ + "versions.yml:md5,e894f8aff6fda2b94339ecd04fb25aed" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-18T15:01:49.621322368" + } +} \ No newline at end of file diff --git a/modules/nf-core/mirtrace/qc/environment.yml b/modules/nf-core/mirtrace/qc/environment.yml new file mode 100644 index 00000000..c83822c4 --- /dev/null +++ b/modules/nf-core/mirtrace/qc/environment.yml @@ -0,0 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::mirtrace=1.0.1" diff --git a/modules/nf-core/mirtrace/qc/main.nf b/modules/nf-core/mirtrace/qc/main.nf new file mode 100644 index 00000000..5893c0d7 --- 
/dev/null +++ b/modules/nf-core/mirtrace/qc/main.nf @@ -0,0 +1,64 @@ +process MIRTRACE_QC { + tag "$meta.id" + label 'process_medium' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/mirtrace:1.0.1--0': + 'biocontainers/mirtrace:1.0.1--0' }" + + input: + tuple val(meta), path(reads), path(mirtrace_config) + val(mirtrace_species) + + output: + tuple val(meta), path ("*.html") , emit: html + tuple val(meta), path ("*.json") , emit: json + tuple val(meta), path ("*.tsv") , emit: tsv + tuple val(meta), path ("qc_passed_reads.all.collapsed/*.{fa,fasta}") , emit: all_fa + tuple val(meta), path ("qc_passed_reads.rnatype_unknown.collapsed/*.{fa,fasta}") , emit: rnatype_unknown_fa + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def mirtrace_mode = mirtrace_config ? "--config ${mirtrace_config}": "${reads}" + + """ + mirtrace qc \\ + --species ${mirtrace_species} \\ + --write-fasta \\ + --output-dir . 
\\ + --force \\ + ${mirtrace_mode} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtrace: \$(echo \$(mirtrace -v)) + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.fa + touch ${prefix}.html + touch ${prefix}.json + touch ${prefix}.tsv + + mkdir -p qc_passed_reads.all.collapsed + mkdir -p qc_passed_reads.rnatype_unknown.collapsed + + touch qc_passed_reads.all.collapsed/${prefix}.fa + touch qc_passed_reads.rnatype_unknown.collapsed/${prefix}.fa + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + mirtrace: \$(echo \$(mirtrace -v)) + END_VERSIONS + """ +} diff --git a/modules/nf-core/mirtrace/qc/meta.yml b/modules/nf-core/mirtrace/qc/meta.yml new file mode 100644 index 00000000..e83ab389 --- /dev/null +++ b/modules/nf-core/mirtrace/qc/meta.yml @@ -0,0 +1,109 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json +name: "mirtrace_qc" +description: "A tool for quality control and tracing taxonomic origins of microRNA + sequencing data" +keywords: + - microRNA + - smrnaseq + - QC +tools: + - "mirtrace": + description: "miRTrace is a new quality control and taxonomic tracing tool developed + specifically for small RNA sequencing data (sRNA-Seq). Each sample is characterized + by profiling sequencing quality, read length, sequencing depth and miRNA complexity + and also the amounts of miRNAs versus undesirable sequences (derived from tRNAs, + rRNAs and sequencing artifacts). In addition to these routine quality control + (QC) analyses, miRTrace can accurately and sensitively resolve taxonomic origins + of small RNA-Seq data based on the composition of clade-specific miRNAs. This + feature can be used to detect cross-clade contaminations in typical lab settings. 
+ It can also be applied for more specific applications in forensics, food quality + control and clinical diagnosis, for instance tracing the origins of meat products + or detecting parasitic microRNAs in host serum." + homepage: "https://github.com/friedlanderlab/mirtrace/tree/master" + documentation: "https://github.com/friedlanderlab/mirtrace/blob/master/release-bundle-includes/doc/manual/mirtrace_manual.pdf" + tool_dev_url: "https://github.com/friedlanderlab/mirtrace/tree/master" + doi: "10.1186/s13059-018-1588-9" + licence: ["GPL v2"] + identifier: biotools:miRTrace + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - reads: + type: file + description: microRNA sequencing data + pattern: "*.{fastq,fastq.gz}" + - mirtrace_config: + type: file + description: (Optional) CSV with list of FASTQ files to process with one entry + per row. No headers. Each row consists of the following columns "FASTQ file + path, id, adapter, PHRED-ASCII-offset". + - - mirtrace_species: + type: string + description: Target species in microRNA sequencing data (miRbase encoding, e.g. + “hsa” for Homo sapiens) +output: + - html: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - "*.html": + type: file + description: HTML file + pattern: "*.{html}" + - json: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - "*.json": + type: file + description: JSON file + pattern: "*.{json}" + - tsv: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - "*.tsv": + type: file + description: TSV file + pattern: "*.{tsv}" + - all_fa: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - qc_passed_reads.all.collapsed/*.{fa,fasta}: + type: file + description: QC-passed reads in FASTA file. Identical reads are collapsed. Entries + are sorted by abundance. + pattern: "*.{fa,fasta}" + - rnatype_unknown_fa: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - qc_passed_reads.rnatype_unknown.collapsed/*.{fa,fasta}: + type: file + description: Unknown RNA type QC-passed reads in FASTA file. Identical reads + are collapsed. Entries are sorted by abundance. + pattern: "*.{fa,fasta}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/mirtrace/qc/tests/main.nf.test b/modules/nf-core/mirtrace/qc/tests/main.nf.test new file mode 100644 index 00000000..d0ef1c7b --- /dev/null +++ b/modules/nf-core/mirtrace/qc/tests/main.nf.test @@ -0,0 +1,129 @@ +nextflow_process { + + name "Test Process MIRTRACE_QC" + script "../main.nf" + process "MIRTRACE_QC" + + tag "modules" + tag "modules_nfcore" + tag "mirtrace" + tag "mirtrace/qc" + + test("human - fastq") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test_rnaseq_1.fastq.gz', checkIfExists: true) + ], + [] + ] + input[1] = "hsa" + """ + } + } + + then { + assertAll( + { assert process.success }, + + // Check HTML + { assert path(process.out.html.get(0).get(1)).text.contains("This file is part of miRTrace.")} , + + // Check JSON + { assert path(process.out.json.get(0).get(1)).json.results[0].stats.uniqueQCPassedSeqsCount == 912 }, + + // Check TSV + { assert snapshot(process.out.tsv).match("tsv") }, + + // Check FASTA files + { assert snapshot(process.out.rnatype_unknown_fa).match("rnatype_unknown_fa") }, + { assert 
snapshot(process.out.all_fa).match("all_fa") }, + + // Check versions + { assert snapshot(process.out.versions).match("versions") } + + ) + } + + } + + test("human - fastq - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test_rnaseq_1.fastq.gz', checkIfExists: true), + ], + [] + ] + input[1] = "hsa" + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("human - fastq - optional config") { + + when { + process { + """ + ch_reads = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test_rnaseq_1.fastq.gz', checkIfExists: true) + ] + ]) + + ch_mirtrace_config = Channel.of( + "./test_rnaseq_1.fastq.gz,test_rnaseq_1,TGGAATTCTCGGGTGCCAAGG,33") + .collectFile(name: "mirtrace_config.txt", newLine: true) + + input[0] = ch_reads + .combine(ch_mirtrace_config) + + input[1] = "hsa" + + """ + } + } + + then { + assertAll( + { assert process.success }, + + // Check HTML + { assert path(process.out.html.get(0).get(1)).text.contains("This file is part of miRTrace.")} , + + // Check JSON + { assert path(process.out.json.get(0).get(1)).json.results[0].stats.uniqueQCPassedSeqsCount == 912 }, + + // Check TSV + { assert snapshot(process.out.tsv).match("tsv_config") }, + + // Check FASTA files + { assert snapshot(process.out.rnatype_unknown_fa).match("rnatype_unknown_fa_config") }, + { assert snapshot(process.out.all_fa).match("all_fa_config") }, + + ) + } + + } + +} diff --git a/modules/nf-core/mirtrace/qc/tests/main.nf.test.snap b/modules/nf-core/mirtrace/qc/tests/main.nf.test.snap new file mode 100644 index 00000000..3f5593b7 --- /dev/null +++ b/modules/nf-core/mirtrace/qc/tests/main.nf.test.snap @@ -0,0 +1,242 @@ +{ + "all_fa_config": { + "content": [ + [ + [ + { + "id": 
"test" + }, + "test_rnaseq_1.fasta:md5,125963c9ee39a49b3a680903e28e7c9d" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-10T18:35:10.022991825" + }, + "all_fa": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_rnaseq_1.fasta:md5,0181296141177654088bbc2a96b29560" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-10T18:33:02.971744888" + }, + "versions": { + "content": [ + [ + "versions.yml:md5,b50529beb497fc9882232140b636d9ce" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-02T14:00:50.712738593" + }, + "tsv_config": { + "content": [ + [ + [ + { + "id": "test" + }, + [ + "mirtrace-stats-contamination_basic.tsv:md5,ac69ca6d2a709854f1048b635d06e927", + "mirtrace-stats-contamination_detailed.tsv:md5,ef80997ac12662c64cbcf5fe9851e786", + "mirtrace-stats-length.tsv:md5,630b3a845953321ae4cf2fc4a4943ab5", + "mirtrace-stats-mirna-complexity.tsv:md5,ab2a7600a2daa5c1797eea13d0abc2f0", + "mirtrace-stats-phred.tsv:md5,44eaeae26ec629e71fb31e56bfb5a548", + "mirtrace-stats-qcstatus.tsv:md5,623cf3a0c5e363488966844feb0dd978", + "mirtrace-stats-rnatype.tsv:md5,ed8d3a76247a1432a365def87c3f4c67" + ] + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-10T18:35:09.99291948" + }, + "tsv": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "mirtrace-stats-contamination_basic.tsv:md5,ac69ca6d2a709854f1048b635d06e927", + "mirtrace-stats-contamination_detailed.tsv:md5,ef80997ac12662c64cbcf5fe9851e786", + "mirtrace-stats-length.tsv:md5,54ecc0698e8e83d5b1c979b2ee3b1512", + "mirtrace-stats-mirna-complexity.tsv:md5,ab2a7600a2daa5c1797eea13d0abc2f0", + "mirtrace-stats-phred.tsv:md5,44eaeae26ec629e71fb31e56bfb5a548", + "mirtrace-stats-qcstatus.tsv:md5,623cf3a0c5e363488966844feb0dd978", + 
"mirtrace-stats-rnatype.tsv:md5,ed8d3a76247a1432a365def87c3f4c67" + ] + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-10T18:33:02.856967195" + }, + "rnatype_unknown_fa_config": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_rnaseq_1.fasta:md5,125963c9ee39a49b3a680903e28e7c9d" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-10T18:35:10.010723637" + }, + "human - fastq - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + "versions.yml:md5,b50529beb497fc9882232140b636d9ce" + ], + "all_fa": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "rnatype_unknown_fa": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tsv": [ + [ + { + "id": "test", + "single_end": false + }, + "test.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,b50529beb497fc9882232140b636d9ce" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-02T13:43:42.209949119" + }, 
+ "rnatype_unknown_fa": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_rnaseq_1.fasta:md5,0181296141177654088bbc2a96b29560" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-10T18:33:02.917439516" + } +} \ No newline at end of file diff --git a/modules/nf-core/multiqc/environment.yml b/modules/nf-core/multiqc/environment.yml index ca39fb67..6f5b867b 100644 --- a/modules/nf-core/multiqc/environment.yml +++ b/modules/nf-core/multiqc/environment.yml @@ -1,7 +1,5 @@ -name: multiqc channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::multiqc=1.21 + - bioconda::multiqc=1.25.1 diff --git a/modules/nf-core/multiqc/main.nf b/modules/nf-core/multiqc/main.nf index 47ac352f..cc0643e1 100644 --- a/modules/nf-core/multiqc/main.nf +++ b/modules/nf-core/multiqc/main.nf @@ -3,14 +3,16 @@ process MULTIQC { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/multiqc:1.21--pyhdfd78af_0' : - 'biocontainers/multiqc:1.21--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/multiqc:1.25.1--pyhdfd78af_0' : + 'biocontainers/multiqc:1.25.1--pyhdfd78af_0' }" input: path multiqc_files, stageAs: "?/*" path(multiqc_config) path(extra_multiqc_config) path(multiqc_logo) + path(replace_names) + path(sample_names) output: path "*multiqc_report.html", emit: report @@ -23,16 +25,22 @@ process MULTIQC { script: def args = task.ext.args ?: '' + def prefix = task.ext.prefix ? "--filename ${task.ext.prefix}.html" : '' def config = multiqc_config ? "--config $multiqc_config" : '' def extra_config = extra_multiqc_config ? "--config $extra_multiqc_config" : '' - def logo = multiqc_logo ? /--cl-config 'custom_logo: "${multiqc_logo}"'/ : '' + def logo = multiqc_logo ? "--cl-config 'custom_logo: \"${multiqc_logo}\"'" : '' + def replace = replace_names ? 
"--replace-names ${replace_names}" : '' + def samples = sample_names ? "--sample-names ${sample_names}" : '' """ multiqc \\ --force \\ $args \\ $config \\ + $prefix \\ $extra_config \\ $logo \\ + $replace \\ + $samples \\ . cat <<-END_VERSIONS > versions.yml @@ -44,7 +52,7 @@ process MULTIQC { stub: """ mkdir multiqc_data - touch multiqc_plots + mkdir multiqc_plots touch multiqc_report.html cat <<-END_VERSIONS > versions.yml diff --git a/modules/nf-core/multiqc/meta.yml b/modules/nf-core/multiqc/meta.yml index 45a9bc35..b16c1879 100644 --- a/modules/nf-core/multiqc/meta.yml +++ b/modules/nf-core/multiqc/meta.yml @@ -1,5 +1,6 @@ name: multiqc -description: Aggregate results from bioinformatics analyses across many samples into a single report +description: Aggregate results from bioinformatics analyses across many samples into + a single report keywords: - QC - bioinformatics tools @@ -12,40 +13,59 @@ tools: homepage: https://multiqc.info/ documentation: https://multiqc.info/docs/ licence: ["GPL-3.0-or-later"] + identifier: biotools:multiqc input: - - multiqc_files: - type: file - description: | - List of reports / files recognised by MultiQC, for example the html and zip output of FastQC - - multiqc_config: - type: file - description: Optional config yml for MultiQC - pattern: "*.{yml,yaml}" - - extra_multiqc_config: - type: file - description: Second optional config yml for MultiQC. Will override common sections in multiqc_config. - pattern: "*.{yml,yaml}" - - multiqc_logo: - type: file - description: Optional logo file for MultiQC - pattern: "*.{png}" + - - multiqc_files: + type: file + description: | + List of reports / files recognised by MultiQC, for example the html and zip output of FastQC + - - multiqc_config: + type: file + description: Optional config yml for MultiQC + pattern: "*.{yml,yaml}" + - - extra_multiqc_config: + type: file + description: Second optional config yml for MultiQC. Will override common sections + in multiqc_config. 
+ pattern: "*.{yml,yaml}" + - - multiqc_logo: + type: file + description: Optional logo file for MultiQC + pattern: "*.{png}" + - - replace_names: + type: file + description: | + Optional two-column sample renaming file. First column a set of + patterns, second column a set of corresponding replacements. Passed via + MultiQC's `--replace-names` option. + pattern: "*.{tsv}" + - - sample_names: + type: file + description: | + Optional TSV file with headers, passed to the MultiQC --sample_names + argument. + pattern: "*.{tsv}" output: - report: - type: file - description: MultiQC report file - pattern: "multiqc_report.html" + - "*multiqc_report.html": + type: file + description: MultiQC report file + pattern: "multiqc_report.html" - data: - type: directory - description: MultiQC data dir - pattern: "multiqc_data" + - "*_data": + type: directory + description: MultiQC data dir + pattern: "multiqc_data" - plots: - type: file - description: Plots created by MultiQC - pattern: "*_data" + - "*_plots": + type: file + description: Plots created by MultiQC + pattern: "*_data" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@bunop" diff --git a/modules/nf-core/multiqc/tests/main.nf.test b/modules/nf-core/multiqc/tests/main.nf.test index f1c4242e..33316a7d 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test +++ b/modules/nf-core/multiqc/tests/main.nf.test @@ -8,6 +8,8 @@ nextflow_process { tag "modules_nfcore" tag "multiqc" + config "./nextflow.config" + test("sarscov2 single-end [fastqc]") { when { @@ -17,6 +19,8 @@ nextflow_process { input[1] = [] input[2] = [] input[3] = [] + input[4] = [] + input[5] = [] """ } } @@ -41,6 +45,8 @@ nextflow_process { input[1] = Channel.of(file("https://github.com/nf-core/tools/raw/dev/nf_core/pipeline-template/assets/multiqc_config.yml", checkIfExists: 
true)) input[2] = [] input[3] = [] + input[4] = [] + input[5] = [] """ } } @@ -66,6 +72,8 @@ nextflow_process { input[1] = [] input[2] = [] input[3] = [] + input[4] = [] + input[5] = [] """ } } diff --git a/modules/nf-core/multiqc/tests/main.nf.test.snap b/modules/nf-core/multiqc/tests/main.nf.test.snap index bfebd802..2fcbb5ff 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test.snap +++ b/modules/nf-core/multiqc/tests/main.nf.test.snap @@ -2,14 +2,14 @@ "multiqc_versions_single": { "content": [ [ - "versions.yml:md5,21f35ee29416b9b3073c28733efe4b7d" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-29T08:48:55.657331" + "timestamp": "2024-10-02T17:51:46.317523" }, "multiqc_stub": { "content": [ @@ -17,25 +17,25 @@ "multiqc_report.html", "multiqc_data", "multiqc_plots", - "versions.yml:md5,21f35ee29416b9b3073c28733efe4b7d" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-29T08:49:49.071937" + "timestamp": "2024-10-02T17:52:20.680978" }, "multiqc_versions_config": { "content": [ [ - "versions.yml:md5,21f35ee29416b9b3073c28733efe4b7d" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-29T08:49:25.457567" + "timestamp": "2024-10-02T17:52:09.185842" } } \ No newline at end of file diff --git a/modules/nf-core/multiqc/tests/nextflow.config b/modules/nf-core/multiqc/tests/nextflow.config new file mode 100644 index 00000000..c537a6a3 --- /dev/null +++ b/modules/nf-core/multiqc/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'MULTIQC' { + ext.prefix = null + } +} diff --git a/modules/nf-core/samtools/flagstat/environment.yml 
b/modules/nf-core/samtools/flagstat/environment.yml index bd57cb54..62054fc9 100644 --- a/modules/nf-core/samtools/flagstat/environment.yml +++ b/modules/nf-core/samtools/flagstat/environment.yml @@ -1,8 +1,8 @@ -name: samtools_flagstat +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.19.2 - - bioconda::htslib=1.19.1 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/flagstat/main.nf b/modules/nf-core/samtools/flagstat/main.nf index eb5f5252..4a499727 100644 --- a/modules/nf-core/samtools/flagstat/main.nf +++ b/modules/nf-core/samtools/flagstat/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_FLAGSTAT { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' : - 'biocontainers/samtools:1.19.2--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(bam), path(bai) diff --git a/modules/nf-core/samtools/flagstat/meta.yml b/modules/nf-core/samtools/flagstat/meta.yml index 97991358..cdc4c254 100644 --- a/modules/nf-core/samtools/flagstat/meta.yml +++ b/modules/nf-core/samtools/flagstat/meta.yml @@ -1,5 +1,6 @@ name: samtools_flagstat -description: Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG type +description: Counts the number of alignments in a BAM/CRAM/SAM file for each FLAG + type keywords: - stats - mapping @@ -17,34 +18,37 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - bai: - type: file - description: Index for BAM/CRAM/SAM file - pattern: "*.{bai,crai,sai}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - bai: + type: file + description: Index for BAM/CRAM/SAM file + pattern: "*.{bai,crai,sai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - flagstat: - type: file - description: File containing samtools flagstat output - pattern: "*.{flagstat}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.flagstat": + type: file + description: File containing samtools flagstat output + pattern: "*.{flagstat}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" maintainers: diff --git a/modules/nf-core/samtools/flagstat/tests/main.nf.test b/modules/nf-core/samtools/flagstat/tests/main.nf.test index 24c3c04b..3b648a37 100644 --- a/modules/nf-core/samtools/flagstat/tests/main.nf.test +++ b/modules/nf-core/samtools/flagstat/tests/main.nf.test @@ -11,9 +11,30 @@ nextflow_process { test("BAM") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) + ]) + """ } + } + + then { + assertAll ( + { assert 
process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("BAM - stub") { + + options "-stub" + + when { process { """ input[0] = Channel.of([ @@ -28,8 +49,7 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.flagstat).match("flagstat") }, - { assert snapshot(process.out.versions).match("versions") } + { assert snapshot(process.out).match() } ) } } diff --git a/modules/nf-core/samtools/flagstat/tests/main.nf.test.snap b/modules/nf-core/samtools/flagstat/tests/main.nf.test.snap index a76fc27e..04c3852b 100644 --- a/modules/nf-core/samtools/flagstat/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/flagstat/tests/main.nf.test.snap @@ -1,32 +1,72 @@ { - "flagstat": { + "BAM - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": false - }, - "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,108a155f2d4a99f50bf3176904208d27" + ], + "flagstat": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,108a155f2d4a99f50bf3176904208d27" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T18:31:37.783927" + "timestamp": "2024-09-16T08:02:58.866491759" }, - "versions": { + "BAM": { "content": [ - [ - "versions.yml:md5,fd0030ce49ab3a92091ad80260226452" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" + ] + ], + "1": [ + "versions.yml:md5,108a155f2d4a99f50bf3176904208d27" + ], + "flagstat": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" + ] + ], + "versions": [ + 
"versions.yml:md5,108a155f2d4a99f50bf3176904208d27" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:11:44.299617452" + "timestamp": "2024-09-16T08:02:47.383332837" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/idxstats/environment.yml b/modules/nf-core/samtools/idxstats/environment.yml index 174973b8..62054fc9 100644 --- a/modules/nf-core/samtools/idxstats/environment.yml +++ b/modules/nf-core/samtools/idxstats/environment.yml @@ -1,8 +1,8 @@ -name: samtools_idxstats +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.19.2 - - bioconda::htslib=1.19.1 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/idxstats/main.nf b/modules/nf-core/samtools/idxstats/main.nf index a544026f..c4b5a0a3 100644 --- a/modules/nf-core/samtools/idxstats/main.nf +++ b/modules/nf-core/samtools/idxstats/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_IDXSTATS { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' : - 'biocontainers/samtools:1.19.2--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(bam), path(bai) diff --git a/modules/nf-core/samtools/idxstats/meta.yml b/modules/nf-core/samtools/idxstats/meta.yml index 344e92a3..f0a6bcb2 100644 --- a/modules/nf-core/samtools/idxstats/meta.yml +++ b/modules/nf-core/samtools/idxstats/meta.yml @@ -18,34 +18,37 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - bai: - type: file - description: Index for BAM/CRAM/SAM file - pattern: "*.{bai,crai,sai}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - bai: + type: file + description: Index for BAM/CRAM/SAM file + pattern: "*.{bai,crai,sai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - idxstats: - type: file - description: File containing samtools idxstats output - pattern: "*.{idxstats}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.idxstats": + type: file + description: File containing samtools idxstats output + pattern: "*.{idxstats}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" maintainers: diff --git a/modules/nf-core/samtools/idxstats/tests/main.nf.test b/modules/nf-core/samtools/idxstats/tests/main.nf.test index a2dcb27c..5fd1fc78 100644 --- a/modules/nf-core/samtools/idxstats/tests/main.nf.test +++ b/modules/nf-core/samtools/idxstats/tests/main.nf.test @@ -11,9 +11,6 @@ nextflow_process { test("bam") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -28,9 +25,29 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.idxstats).match("idxstats") }, - { assert snapshot(process.out.versions).match("versions") } + { assert snapshot(process.out).match() } ) } } -} + + test("bam - stub") { + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + }} diff --git a/modules/nf-core/samtools/idxstats/tests/main.nf.test.snap b/modules/nf-core/samtools/idxstats/tests/main.nf.test.snap index a7050bdc..2cc89a3b 100644 --- a/modules/nf-core/samtools/idxstats/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/idxstats/tests/main.nf.test.snap @@ -1,32 +1,72 @@ { - "versions": { + "bam - stub": { "content": [ - [ - "versions.yml:md5,613dde56f108418039ffcdeeddba397a" 
- ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,c8d7394830c3c1e5be150589571534fb" + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,c8d7394830c3c1e5be150589571534fb" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:16:50.147462763" + "timestamp": "2024-09-16T08:11:56.466856235" }, - "idxstats": { + "bam": { "content": [ - [ - [ - { - "id": "test", - "single_end": false - }, - "test.idxstats:md5,df60a8c8d6621100d05178c93fb053a2" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,df60a8c8d6621100d05178c93fb053a2" + ] + ], + "1": [ + "versions.yml:md5,c8d7394830c3c1e5be150589571534fb" + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,df60a8c8d6621100d05178c93fb053a2" + ] + ], + "versions": [ + "versions.yml:md5,c8d7394830c3c1e5be150589571534fb" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T18:36:41.561026" + "timestamp": "2024-09-16T08:11:46.311550359" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/index/environment.yml b/modules/nf-core/samtools/index/environment.yml index a5e50649..62054fc9 100644 --- a/modules/nf-core/samtools/index/environment.yml +++ b/modules/nf-core/samtools/index/environment.yml @@ -1,8 +1,8 @@ -name: samtools_index +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.19.2 - - bioconda::htslib=1.19.1 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git 
a/modules/nf-core/samtools/index/main.nf b/modules/nf-core/samtools/index/main.nf index dc14f98d..31175610 100644 --- a/modules/nf-core/samtools/index/main.nf +++ b/modules/nf-core/samtools/index/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_INDEX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' : - 'biocontainers/samtools:1.19.2--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input) @@ -35,10 +35,11 @@ process SAMTOOLS_INDEX { """ stub: + def args = task.ext.args ?: '' + def extension = file(input).getExtension() == 'cram' ? + "crai" : args.contains("-c") ? "csi" : "bai" """ - touch ${input}.bai - touch ${input}.crai - touch ${input}.csi + touch ${input}.${extension} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/samtools/index/meta.yml b/modules/nf-core/samtools/index/meta.yml index 01a4ee03..db8df0d5 100644 --- a/modules/nf-core/samtools/index/meta.yml +++ b/modules/nf-core/samtools/index/meta.yml @@ -15,38 +15,52 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: input file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - bai: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - crai: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - csi: - type: file - description: CSI index file - pattern: "*.{csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: CSI index file + pattern: "*.{csi}" + - crai: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/samtools/index/tests/main.nf.test b/modules/nf-core/samtools/index/tests/main.nf.test index bb7756d1..ca34fb5c 100644 --- a/modules/nf-core/samtools/index/tests/main.nf.test +++ b/modules/nf-core/samtools/index/tests/main.nf.test @@ -9,11 +9,7 @@ nextflow_process { tag "samtools/index" test("bai") { - when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -27,18 +23,13 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.bai).match("bai") }, - { assert snapshot(process.out.versions).match("bai_versions") } + { assert snapshot(process.out).match() } ) } } test("crai") { - when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -52,20 +43,83 @@ nextflow_process { then { assertAll ( { 
assert process.success }, - { assert snapshot(process.out.crai).match("crai") }, - { assert snapshot(process.out.versions).match("crai_versions") } + { assert snapshot(process.out).match() } ) } } test("csi") { - config "./csi.nextflow.config" when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.csi[0][1]).name, + process.out.versions + ).match() } + ) + } + } + + test("bai - stub") { + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("crai - stub") { + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram', checkIfExists: true) + ]) + """ } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("csi - stub") { + options "-stub" + config "./csi.nextflow.config" + + when { process { """ input[0] = Channel.of([ @@ -79,8 +133,7 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert path(process.out.csi.get(0).get(1)).exists() }, - { assert snapshot(process.out.versions).match("csi_versions") } + { assert snapshot(process.out).match() } ) } } diff --git a/modules/nf-core/samtools/index/tests/main.nf.test.snap 
b/modules/nf-core/samtools/index/tests/main.nf.test.snap index 3dc8e7de..72d65e81 100644 --- a/modules/nf-core/samtools/index/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/index/tests/main.nf.test.snap @@ -1,74 +1,250 @@ { - "crai_versions": { + "csi - stub": { "content": [ - [ - "versions.yml:md5,cc4370091670b64bba7c7206403ffb3e" - ] + { + "0": [ + + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + + ], + "crai": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:12:00.324667957" + "timestamp": "2024-09-16T08:21:25.261127166" }, - "csi_versions": { + "crai - stub": { "content": [ - [ - "versions.yml:md5,cc4370091670b64bba7c7206403ffb3e" - ] + { + "0": [ + + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:12:07.885103162" + "timestamp": "2024-09-16T08:21:12.653194876" }, - "crai": { + "bai - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": false 
- }, - "test.paired_end.recalibrated.sorted.cram.crai:md5,14bc3bd5c89cacc8f4541f9062429029" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T18:41:38.446424" + "timestamp": "2024-09-16T08:21:01.854932651" }, - "bai": { + "csi": { "content": [ + "test.paired_end.sorted.bam.csi", [ - [ - { - "id": "test", - "single_end": false - }, - "test.paired_end.sorted.bam.bai:md5,704c10dd1326482448ca3073fdebc2f4" - ] + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T18:40:46.579747" + "timestamp": "2024-09-16T08:20:51.485364222" }, - "bai_versions": { + "crai": { "content": [ - [ - "versions.yml:md5,cc4370091670b64bba7c7206403ffb3e" - ] + { + "0": [ + + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,14bc3bd5c89cacc8f4541f9062429029" + ] + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,14bc3bd5c89cacc8f4541f9062429029" + ] + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": 
"2024-09-16T08:20:40.518873972" + }, + "bai": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,704c10dd1326482448ca3073fdebc2f4" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,704c10dd1326482448ca3073fdebc2f4" + ] + ], + "crai": [ + + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:11:51.641425452" + "timestamp": "2024-09-16T08:20:21.184050361" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/sort/environment.yml b/modules/nf-core/samtools/sort/environment.yml index 4d898e48..62054fc9 100644 --- a/modules/nf-core/samtools/sort/environment.yml +++ b/modules/nf-core/samtools/sort/environment.yml @@ -1,8 +1,8 @@ -name: samtools_sort +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.19.2 - - bioconda::htslib=1.19.1 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/sort/main.nf b/modules/nf-core/samtools/sort/main.nf index fc374f98..caf3c61a 100644 --- a/modules/nf-core/samtools/sort/main.nf +++ b/modules/nf-core/samtools/sort/main.nf @@ -4,19 +4,19 @@ process SAMTOOLS_SORT { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' : - 'biocontainers/samtools:1.19.2--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta) , path(bam) tuple val(meta2), path(fasta) output: - tuple val(meta), path("*.bam"), emit: bam, optional: true - tuple val(meta), path("*.cram"), emit: cram, optional: true - tuple val(meta), path("*.crai"), emit: crai, optional: true - tuple val(meta), path("*.csi"), emit: csi, optional: true - path "versions.yml" , emit: versions + tuple val(meta), path("*.bam"), emit: bam, optional: true + tuple val(meta), path("*.cram"), emit: cram, optional: true + tuple val(meta), path("*.crai"), emit: crai, optional: true + tuple val(meta), path("*.csi"), emit: csi, optional: true + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when @@ -32,7 +32,6 @@ process SAMTOOLS_SORT { """ samtools cat \\ - --threads $task.cpus \\ ${bam} \\ | \\ samtools sort \\ @@ -50,10 +49,20 @@ process SAMTOOLS_SORT { """ stub: + def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" + def extension = args.contains("--output-fmt sam") ? "sam" : + args.contains("--output-fmt cram") ? 
"cram" : + "bam" """ - touch ${prefix}.bam - touch ${prefix}.bam.csi + touch ${prefix}.${extension} + if [ "${extension}" == "bam" ]; + then + touch ${prefix}.${extension}.csi + elif [ "${extension}" == "cram" ]; + then + touch ${prefix}.${extension}.crai + fi cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/samtools/sort/meta.yml b/modules/nf-core/samtools/sort/meta.yml index 341a7d0e..a9dbec5a 100644 --- a/modules/nf-core/samtools/sort/meta.yml +++ b/modules/nf-core/samtools/sort/meta.yml @@ -15,52 +15,73 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file(s) - pattern: "*.{bam,cram,sam}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Reference genome FASTA file - pattern: "*.{fa,fasta,fna}" - optional: true + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM/CRAM/SAM file(s) + pattern: "*.{bam,cram,sam}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Reference genome FASTA file + pattern: "*.{fa,fasta,fna}" + optional: true output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: Sorted BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.bam": + type: file + description: Sorted BAM file + pattern: "*.{bam}" - cram: - type: file - description: Sorted CRAM file - pattern: "*.{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cram": + type: file + description: Sorted CRAM file + pattern: "*.{cram}" - crai: - type: file - description: CRAM index file (optional) - pattern: "*.crai" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: CRAM index file (optional) + pattern: "*.crai" - csi: - type: file - description: BAM index file (optional) - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: BAM index file (optional) + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/samtools/sort/tests/main.nf.test b/modules/nf-core/samtools/sort/tests/main.nf.test index 8360e2b1..b05e6691 100644 --- a/modules/nf-core/samtools/sort/tests/main.nf.test +++ b/modules/nf-core/samtools/sort/tests/main.nf.test @@ -30,13 +30,83 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out).match() } + { assert snapshot( + process.out.bam, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match()} + ) + } + } + + test("multiple bam") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true) + ] + ]) + input[1] = Channel.of([ + [ id:'fasta' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + process.out.bam, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match()} ) } } test("cram") { + config "./nextflow_cram.config" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'fasta' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + process.out.cram.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.crai.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match()} + ) + } + } + + test("bam - stub") { + + options "-stub" config "./nextflow.config" when { @@ -62,24 +132,51 @@ nextflow_process { } } - test("bam_stub") { + test("multiple bam - stub") { config "./nextflow.config" - options "-stub" when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true) + ] + ]) + input[1] = Channel.of([ + [ id:'fasta' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ]) + """ } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("cram - stub") { + + options "-stub" + config "./nextflow_cram.config" + + when { process { """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.bam', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true) ]) input[1] = Channel.of([ [ id:'fasta' ], // meta map - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ]) """ } @@ -88,8 +185,7 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(file(process.out.bam[0][1]).name).match("bam_stub_bam") }, - { assert snapshot(process.out.versions).match("bam_stub_versions") } + { assert snapshot(process.out).match() } ) } } diff --git 
a/modules/nf-core/samtools/sort/tests/main.nf.test.snap b/modules/nf-core/samtools/sort/tests/main.nf.test.snap index 38477656..469891fe 100644 --- a/modules/nf-core/samtools/sort/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/sort/tests/main.nf.test.snap @@ -1,5 +1,35 @@ { "cram": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.cram" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.cram.crai" + ] + ], + [ + "versions.yml:md5,2659b187d681241451539d4c53500b9f" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T08:49:58.207549273" + }, + "bam - stub": { "content": [ { "0": [ @@ -8,7 +38,7 @@ "id": "test", "single_end": false }, - "test.sorted.bam:md5,bc0b7c25da26384a006ed84cc9e4da23" + "test.sorted.bam:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "1": [ @@ -23,11 +53,11 @@ "id": "test", "single_end": false }, - "test.sorted.bam.csi:md5,8d4e836c2fed6c0bf874d5e8cdba5831" + "test.sorted.bam.csi:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "4": [ - "versions.yml:md5,e6d43fefc9a8bff91c2ce6e3a1716eca" + "versions.yml:md5,2659b187d681241451539d4c53500b9f" ], "bam": [ [ @@ -35,7 +65,7 @@ "id": "test", "single_end": false }, - "test.sorted.bam:md5,bc0b7c25da26384a006ed84cc9e4da23" + "test.sorted.bam:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "crai": [ @@ -50,43 +80,116 @@ "id": "test", "single_end": false }, - "test.sorted.bam.csi:md5,8d4e836c2fed6c0bf874d5e8cdba5831" + "test.sorted.bam.csi:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], "versions": [ - "versions.yml:md5,e6d43fefc9a8bff91c2ce6e3a1716eca" + "versions.yml:md5,2659b187d681241451539d4c53500b9f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-03-04T15:08:00.830294" + "timestamp": "2024-09-16T08:50:08.630951018" }, - "bam_stub_bam": { + "cram - stub": { "content": [ - "test.sorted.bam" + { + "0": [ + + ], + "1": [ + [ 
+ { + "id": "test", + "single_end": false + }, + "test.sorted.cram:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + "versions.yml:md5,2659b187d681241451539d4c53500b9f" + ], + "bam": [ + + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.cram:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,2659b187d681241451539d4c53500b9f" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T19:21:04.364044" + "timestamp": "2024-09-16T08:50:19.061912443" }, - "bam_stub_versions": { + "multiple bam": { "content": [ [ - "versions.yml:md5,e6d43fefc9a8bff91c2ce6e3a1716eca" + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.bam:md5,8a16ba90c7d294cbb4c33ac0f7127a12" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.bam.csi" + ] + ], + [ + "versions.yml:md5,2659b187d681241451539d4c53500b9f" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.09.0" }, - "timestamp": "2024-02-13T16:15:00.20800281" + "timestamp": "2024-10-08T11:59:55.479443" }, - "bam": { + "multiple bam - stub": { "content": [ { "0": [ @@ -95,7 +198,7 @@ "id": "test", "single_end": false }, - "test.sorted.bam:md5,bc0b7c25da26384a006ed84cc9e4da23" + "test.sorted.bam:md5,8a16ba90c7d294cbb4c33ac0f7127a12" ] ], "1": [ @@ -110,11 +213,11 @@ "id": "test", "single_end": false }, - "test.sorted.bam.csi:md5,8d4e836c2fed6c0bf874d5e8cdba5831" + "test.sorted.bam.csi:md5,d185916eaff9afeb4d0aeab3310371f9" ] ], "4": [ - "versions.yml:md5,e6d43fefc9a8bff91c2ce6e3a1716eca" + 
"versions.yml:md5,2659b187d681241451539d4c53500b9f" ], "bam": [ [ @@ -122,7 +225,7 @@ "id": "test", "single_end": false }, - "test.sorted.bam:md5,bc0b7c25da26384a006ed84cc9e4da23" + "test.sorted.bam:md5,8a16ba90c7d294cbb4c33ac0f7127a12" ] ], "crai": [ @@ -137,18 +240,48 @@ "id": "test", "single_end": false }, - "test.sorted.bam.csi:md5,8d4e836c2fed6c0bf874d5e8cdba5831" + "test.sorted.bam.csi:md5,d185916eaff9afeb4d0aeab3310371f9" ] ], "versions": [ - "versions.yml:md5,e6d43fefc9a8bff91c2ce6e3a1716eca" + "versions.yml:md5,2659b187d681241451539d4c53500b9f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:36:13.781404" + }, + "bam": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.bam:md5,34aa85e86abefe637f7a4a9887f016fc" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.sorted.bam.csi" + ] + ], + [ + "versions.yml:md5,2659b187d681241451539d4c53500b9f" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" }, - "timestamp": "2024-03-04T15:07:48.773803" + "timestamp": "2024-10-08T11:59:46.372244" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/sort/tests/nextflow_cram.config b/modules/nf-core/samtools/sort/tests/nextflow_cram.config new file mode 100644 index 00000000..3a8c0188 --- /dev/null +++ b/modules/nf-core/samtools/sort/tests/nextflow_cram.config @@ -0,0 +1,8 @@ +process { + + withName: SAMTOOLS_SORT { + ext.prefix = { "${meta.id}.sorted" } + ext.args = "--write-index --output-fmt cram" + } + +} diff --git a/modules/nf-core/samtools/stats/environment.yml b/modules/nf-core/samtools/stats/environment.yml index 67bb0ca4..62054fc9 100644 --- a/modules/nf-core/samtools/stats/environment.yml +++ b/modules/nf-core/samtools/stats/environment.yml @@ -1,8 +1,8 @@ -name: samtools_stats +--- +# yaml-language-server: 
$schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.19.2 - - bioconda::htslib=1.19.1 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/stats/main.nf b/modules/nf-core/samtools/stats/main.nf index 52b00f4b..493525a9 100644 --- a/modules/nf-core/samtools/stats/main.nf +++ b/modules/nf-core/samtools/stats/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_STATS { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.19.2--h50ea8bc_0' : - 'biocontainers/samtools:1.19.2--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input), path(input_index) diff --git a/modules/nf-core/samtools/stats/meta.yml b/modules/nf-core/samtools/stats/meta.yml index 735ff812..77b020f7 100644 --- a/modules/nf-core/samtools/stats/meta.yml +++ b/modules/nf-core/samtools/stats/meta.yml @@ -16,43 +16,46 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Reference file the CRAM was created with (optional) - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Reference file the CRAM was created with (optional) + pattern: "*.{fasta,fa}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - stats: - type: file - description: File containing samtools stats output - pattern: "*.{stats}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.stats": + type: file + description: File containing samtools stats output + pattern: "*.{stats}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@FriederikeHanssen" diff --git a/modules/nf-core/samtools/stats/tests/main.nf.test b/modules/nf-core/samtools/stats/tests/main.nf.test index e3d5cb14..5bc89309 100644 --- a/modules/nf-core/samtools/stats/tests/main.nf.test +++ b/modules/nf-core/samtools/stats/tests/main.nf.test @@ -3,6 +3,7 @@ nextflow_process { name "Test Process SAMTOOLS_STATS" script "../main.nf" process "SAMTOOLS_STATS" + tag "modules" tag "modules_nfcore" tag "samtools" @@ -11,9 +12,6 @@ nextflow_process { test("bam") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -37,9 +35,59 @@ nextflow_process { test("cram") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ]) + """ } + } + + then { + assertAll( + {assert process.success}, + {assert snapshot(process.out).match()} + ) + } + } + + test("bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) + ]) + input[1] = [[],[]] + """ + } + } + + then { + assertAll( + {assert process.success}, + {assert snapshot(process.out).match()} + ) + } + } + + test("cram - stub") { + + options "-stub" + + when { process { """ input[0] = Channel.of([ @@ -49,7 +97,7 @@ nextflow_process { ]) input[1] = Channel.of([ [ id:'genome' ], // meta map - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ]) """ } diff --git a/modules/nf-core/samtools/stats/tests/main.nf.test.snap b/modules/nf-core/samtools/stats/tests/main.nf.test.snap index 1b7c9ba4..df507be7 100644 --- a/modules/nf-core/samtools/stats/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/stats/tests/main.nf.test.snap @@ -8,11 +8,11 @@ "id": "test", "single_end": false }, - "test.stats:md5,01812900aa4027532906c5d431114233" + "test.stats:md5,a27fe55e49a341f92379bb20a65c6a06" ] ], "1": [ - 
"versions.yml:md5,0514ceb1769b2a88843e08c1f82624a9" + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ], "stats": [ [ @@ -20,19 +20,89 @@ "id": "test", "single_end": false }, - "test.stats:md5,01812900aa4027532906c5d431114233" + "test.stats:md5,a27fe55e49a341f92379bb20a65c6a06" ] ], "versions": [ - "versions.yml:md5,0514ceb1769b2a88843e08c1f82624a9" + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:15:25.562429714" + "timestamp": "2024-09-16T09:29:16.767396182" + }, + "bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T09:29:29.721580274" + }, + "cram - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T09:29:53.567964304" }, "bam": { "content": [ @@ -43,11 +113,11 @@ "id": "test", "single_end": false }, - "test.stats:md5,5d8681bf541199898c042bf400391d59" + "test.stats:md5,d53a2584376d78942839e9933a34d11b" ] ], "1": [ - "versions.yml:md5,0514ceb1769b2a88843e08c1f82624a9" + 
"versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ], "stats": [ [ @@ -55,18 +125,18 @@ "id": "test", "single_end": false }, - "test.stats:md5,5d8681bf541199898c042bf400391d59" + "test.stats:md5,d53a2584376d78942839e9933a34d11b" ] ], "versions": [ - "versions.yml:md5,0514ceb1769b2a88843e08c1f82624a9" + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:15:07.857611509" + "timestamp": "2024-09-16T09:28:50.73610604" } } \ No newline at end of file diff --git a/modules/nf-core/seqcluster/collapse/environment.yml b/modules/nf-core/seqcluster/collapse/environment.yml new file mode 100644 index 00000000..1263856a --- /dev/null +++ b/modules/nf-core/seqcluster/collapse/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::seqcluster=1.2.9" diff --git a/modules/nf-core/seqcluster/collapse/main.nf b/modules/nf-core/seqcluster/collapse/main.nf new file mode 100644 index 00000000..8c3f2256 --- /dev/null +++ b/modules/nf-core/seqcluster/collapse/main.nf @@ -0,0 +1,51 @@ +process SEQCLUSTER_COLLAPSE { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/seqcluster:1.2.9--pyh5e36f6f_0': + 'biocontainers/seqcluster:1.2.9--pyh5e36f6f_0' }" + + input: + tuple val(meta), path(fastq) + + output: + tuple val(meta), path("*.fastq.gz") , emit: fastq + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + if ("$fastq" == "${prefix}.fastq.gz") error "Input and output names are the same, set prefix in module configuration to disambiguate!" 
+ """ + seqcluster \\ + collapse \\ + $args \\ + -f $fastq \\ + -o collapsed + + gzip collapsed/*_trimmed.fastq + mv collapsed/*_trimmed.fastq.gz ${prefix}.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqcluster: \$(echo \$(seqcluster --version 2>&1) | sed 's/^.*seqcluster //') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + """ + echo "" | gzip > ${prefix}.fastq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqcluster: \$(echo \$(seqcluster --version 2>&1) | sed 's/^.*seqcluster //') + END_VERSIONS + """ +} diff --git a/modules/nf-core/seqcluster/collapse/meta.yml b/modules/nf-core/seqcluster/collapse/meta.yml new file mode 100644 index 00000000..e3a6f7e3 --- /dev/null +++ b/modules/nf-core/seqcluster/collapse/meta.yml @@ -0,0 +1,49 @@ +name: "seqcluster_collapse" +description: Seqcluster collapse reduces computational complexity by collapsing identical + sequences in a FASTQ file. +keywords: + - smrnaseq + - cluster + - mirna +tools: + - "seqcluster": + description: "Small RNA analysis from NGS data. Seqcluster generates a list of + clusters of small RNA sequences, their genome location, their annotation and + the abundance in all the sample of the project." + homepage: "https://github.com/lpantano/seqcluster" + documentation: "https://github.com/lpantano/seqcluster" + tool_dev_url: "https://github.com/lpantano/seqcluster" + doi: "10.1093/bioinformatics/btr527" + licence: ["MIT"] + identifier: biotools:seqcluster + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - fastq: + type: file + description: FASTQ file + pattern: "*.{fastq.gz}" +output: + - fastq: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + - "*.fastq.gz": + type: file + description: FASTQ file + pattern: "*.{fastq.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/modules/nf-core/seqcluster/collapse/tests/main.nf.test b/modules/nf-core/seqcluster/collapse/tests/main.nf.test new file mode 100644 index 00000000..ffca4e92 --- /dev/null +++ b/modules/nf-core/seqcluster/collapse/tests/main.nf.test @@ -0,0 +1,58 @@ +nextflow_process { + + name "Test Process SEQCLUSTER_COLLAPSE" + script "../main.nf" + process "SEQCLUSTER_COLLAPSE" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "seqcluster" + tag "seqcluster/collapse" + + test("human - fastq") { + when { + process { + """ + input[0] = [ + [ id:'test_1', single_end:false ], // meta map + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/testdata/trimmed/small_Clone1_N1.fastp.fastq.gz", checkIfExists: true), + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("human - fastq - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test_1', single_end:false ], // meta map + file("https://github.com/nf-core/test-datasets/raw/smrnaseq/testdata/trimmed/small_Clone1_N1.fastp.fastq.gz", checkIfExists: true), + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/seqcluster/collapse/tests/main.nf.test.snap b/modules/nf-core/seqcluster/collapse/tests/main.nf.test.snap new file mode 100644 index 00000000..6f29d70e --- /dev/null +++ b/modules/nf-core/seqcluster/collapse/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "human - fastq": { + "content": [ + { + "0": [ + [ + { + "id": "test_1", + "single_end": false + }, + 
"test_1_seqcluster.fastq.gz:md5,21c736de10e306f14ec296eaeb38ef45" + ] + ], + "1": [ + "versions.yml:md5,3eac6df59cc79fecd4d39b164e25a61b" + ], + "fastq": [ + [ + { + "id": "test_1", + "single_end": false + }, + "test_1_seqcluster.fastq.gz:md5,21c736de10e306f14ec296eaeb38ef45" + ] + ], + "versions": [ + "versions.yml:md5,3eac6df59cc79fecd4d39b164e25a61b" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-09T19:48:39.444011681" + }, + "human - fastq - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test_1", + "single_end": false + }, + "test_1_seqcluster.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,3eac6df59cc79fecd4d39b164e25a61b" + ], + "fastq": [ + [ + { + "id": "test_1", + "single_end": false + }, + "test_1_seqcluster.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,3eac6df59cc79fecd4d39b164e25a61b" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-09T19:41:17.429861126" + } +} \ No newline at end of file diff --git a/modules/nf-core/seqcluster/collapse/tests/nextflow.config b/modules/nf-core/seqcluster/collapse/tests/nextflow.config new file mode 100644 index 00000000..eb86c1b2 --- /dev/null +++ b/modules/nf-core/seqcluster/collapse/tests/nextflow.config @@ -0,0 +1,6 @@ +process { + withName: SEQCLUSTER_COLLAPSE { + ext.args = "-m 1 --min_size 15" + ext.prefix = {"${meta.id}_seqcluster"} + } +} diff --git a/modules/nf-core/seqkit/fq2fa/environment.yml b/modules/nf-core/seqkit/fq2fa/environment.yml new file mode 100644 index 00000000..41f3e7de --- /dev/null +++ b/modules/nf-core/seqkit/fq2fa/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::seqkit=2.8.1 diff --git a/modules/nf-core/seqkit/fq2fa/main.nf b/modules/nf-core/seqkit/fq2fa/main.nf new file mode 100644 index 00000000..77462ad0 --- /dev/null +++ 
b/modules/nf-core/seqkit/fq2fa/main.nf @@ -0,0 +1,48 @@ +process SEQKIT_FQ2FA { + tag "$meta.id" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/seqkit:2.8.1--h9ee0642_0' : + 'biocontainers/seqkit:2.8.1--h9ee0642_0' }" + + input: + tuple val(meta), path(fastq) + + output: + tuple val(meta), path("*.fa.gz"), emit: fasta + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + + """ + seqkit \\ + fq2fa \\ + $args \\ + -j $task.cpus \\ + -o ${prefix}.fa.gz \\ + $fastq + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit | sed '3!d; s/Version: //' ) + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + echo "" | gzip > ${prefix}.fa.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit | sed '3!d; s/Version: //' ) + END_VERSIONS + """ +} diff --git a/modules/nf-core/seqkit/fq2fa/meta.yml b/modules/nf-core/seqkit/fq2fa/meta.yml new file mode 100644 index 00000000..2241fda9 --- /dev/null +++ b/modules/nf-core/seqkit/fq2fa/meta.yml @@ -0,0 +1,44 @@ +name: "seqkit_fq2fa" +description: Convert FASTQ to FASTA format +keywords: + - fastq + - fasta + - convert +tools: + - "seqkit": + description: "Cross-platform and ultrafast toolkit for FASTA/Q file manipulation, + written by Wei Shen." + homepage: "https://github.com/shenwei356/seqkit" + documentation: "https://bioinf.shenwei.me/seqkit/" + doi: "10.1371/journal.pone.0163962" + licence: ["MIT"] + identifier: biotools:seqkit + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'test', single_end:false ]` + - fastq: + type: file + description: Sequence file in fastq format + pattern: "*.{fastq,fq}.gz" +output: + - fasta: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - "*.fa.gz": + type: file + description: Sequence file in fasta format + pattern: "*.{fasta,fa}.gz" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@d-jch" diff --git a/modules/nf-core/seqkit/fq2fa/tests/main.nf.test b/modules/nf-core/seqkit/fq2fa/tests/main.nf.test new file mode 100644 index 00000000..08f399e7 --- /dev/null +++ b/modules/nf-core/seqkit/fq2fa/tests/main.nf.test @@ -0,0 +1,56 @@ +nextflow_process { + + name "Test Process SEQKIT_FQ2FA" + script "../main.nf" + process "SEQKIT_FQ2FA" + + tag "modules" + tag "modules_nfcore" + tag "seqkit" + tag "seqkit/fq2fa" + + test("sarscov2 - bam") { + + when { + process { + """ + input[0] = [[ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [[ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/seqkit/fq2fa/tests/main.nf.test.snap b/modules/nf-core/seqkit/fq2fa/tests/main.nf.test.snap new file mode 100644 index 00000000..b10ff751 --- /dev/null +++ b/modules/nf-core/seqkit/fq2fa/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "sarscov2 - bam - stub": { + 
"content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,70efc6839fd6443ee9116c082a730f72" + ], + "fasta": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,70efc6839fd6443ee9116c082a730f72" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-13T08:56:21.234724552" + }, + "sarscov2 - bam": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,f0c5c9110ce19e9ebbc9a6b6baf9e105" + ] + ], + "1": [ + "versions.yml:md5,70efc6839fd6443ee9116c082a730f72" + ], + "fasta": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,f0c5c9110ce19e9ebbc9a6b6baf9e105" + ] + ], + "versions": [ + "versions.yml:md5,70efc6839fd6443ee9116c082a730f72" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-13T08:55:54.648865102" + } +} \ No newline at end of file diff --git a/modules/nf-core/seqkit/fq2fa/tests/tags.yml b/modules/nf-core/seqkit/fq2fa/tests/tags.yml new file mode 100644 index 00000000..004f102d --- /dev/null +++ b/modules/nf-core/seqkit/fq2fa/tests/tags.yml @@ -0,0 +1,2 @@ +seqkit/fq2fa: + - "modules/nf-core/seqkit/fq2fa/**" diff --git a/modules/nf-core/seqkit/grep/environment.yml b/modules/nf-core/seqkit/grep/environment.yml new file mode 100644 index 00000000..41f3e7de --- /dev/null +++ b/modules/nf-core/seqkit/grep/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::seqkit=2.8.1 diff --git a/modules/nf-core/seqkit/grep/main.nf b/modules/nf-core/seqkit/grep/main.nf new file mode 100644 index 00000000..361e1620 --- /dev/null +++ b/modules/nf-core/seqkit/grep/main.nf @@ -0,0 +1,58 @@ +process SEQKIT_GREP { + tag "$meta.id" + label 'process_low' + + + conda 
"${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/seqkit:2.8.1--h9ee0642_0': + 'biocontainers/seqkit:2.8.1--h9ee0642_0' }" + + input: + tuple val(meta), path(sequence) + path pattern + + output: + tuple val(meta), path("*.{fa,fq}") , emit: filter + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + // fasta or fastq. Exact pattern match .fasta or .fa suffix with optional .gz (gzip) suffix + def suffix = task.ext.suffix ?: "${sequence}" ==~ /(.*f[astn]*a(.gz)?$)/ ? "fa" : "fq" + def pattern_file = pattern ? "-f ${pattern}" : "" + + """ + seqkit \\ + grep \\ + $args \\ + --threads $task.cpus \\ + ${pattern_file} \\ + ${sequence} \\ + -o ${prefix}.${suffix} \\ + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit version | sed 's/seqkit v//' ) + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + // fasta or fastq. Exact pattern match .fasta or .fa suffix with optional .gz (gzip) suffix + def suffix = task.ext.suffix ?: "${sequence}" ==~ /(.*f[astn]*a(.gz)?$)/ ? 
"fa" : "fq" + + """ + echo "" | gzip > ${prefix}.${suffix}.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit version | sed 's/seqkit v//' ) + END_VERSIONS + """ +} diff --git a/modules/nf-core/seqkit/grep/meta.yml b/modules/nf-core/seqkit/grep/meta.yml new file mode 100644 index 00000000..309f8197 --- /dev/null +++ b/modules/nf-core/seqkit/grep/meta.yml @@ -0,0 +1,56 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json +name: "seqkit_grep" +description: Select sequences from a large file based on name/ID +keywords: + - filter + - seqkit + - subseq + - grep +tools: + - "seqkit": + description: Cross-platform and ultrafast toolkit for FASTA/Q file manipulation, + written by Wei Shen. + homepage: https://bioinf.shenwei.me/seqkit/usage/ + documentation: https://bioinf.shenwei.me/seqkit/usage/ + tool_dev_url: https://github.com/shenwei356/seqkit/ + doi: "10.1371/journal.pone.0163962" + licence: ["MIT"] + identifier: biotools:seqkit +input: + - - meta: + type: map + description: > + Groovy Map containing sample information e.g. [ id:'test', single_end:false + ] + - sequence: + type: file + description: > + Fasta or fastq file containing sequences to be filtered + pattern: "*.{fa,fna,faa,fasta,fq,fastq}[.gz]" + - - pattern: + type: file + description: > + pattern file (one record per line). If no pattern is given, a string can be + specificied within the args using '-p pattern_string' + pattern: "*.{txt,tsv}" +output: + - filter: + - meta: + type: map + description: > + Groovy Map containing sample information e.g. 
[ id:'test', single_end:false + ] + - "*.{fa,fq}.gz": + type: file + description: > + Fasta or fastq file containing the filtered sequences + pattern: "*.{fa,fq}[.gz]" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@Joon-Klaps" +maintainers: + - "@Joon-Klaps" diff --git a/modules/nf-core/seqkit/grep/seqkit-grep.diff b/modules/nf-core/seqkit/grep/seqkit-grep.diff new file mode 100644 index 00000000..16db0fbc --- /dev/null +++ b/modules/nf-core/seqkit/grep/seqkit-grep.diff @@ -0,0 +1,23 @@ +Changes in module 'nf-core/seqkit/grep' +--- modules/nf-core/seqkit/grep/main.nf ++++ modules/nf-core/seqkit/grep/main.nf +@@ -13,7 +13,7 @@ + path pattern + + output: +- tuple val(meta), path("*.{fa,fq}.gz") , emit: filter ++ tuple val(meta), path("*.{fa,fq}") , emit: filter + path "versions.yml" , emit: versions + + when: +@@ -33,7 +33,7 @@ + --threads $task.cpus \\ + ${pattern_file} \\ + ${sequence} \\ +- -o ${prefix}.${suffix}.gz \\ ++ -o ${prefix}.${suffix} \\ + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + +************************************************************ diff --git a/modules/nf-core/seqkit/grep/tests/main.nf.test b/modules/nf-core/seqkit/grep/tests/main.nf.test new file mode 100644 index 00000000..93e4c07f --- /dev/null +++ b/modules/nf-core/seqkit/grep/tests/main.nf.test @@ -0,0 +1,80 @@ +nextflow_process { + + name "Test Process SEQKIT_GREP" + script "../main.nf" + process "SEQKIT_GREP" + + tag "modules" + tag "modules_nfcore" + tag "seqkit" + tag "seqkit/grep" + + test("with_file") { + + when { + process { + """ + input[0] = [[ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ] + input[1] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.header', checkIfExists: true) + """ + } + } + + then { + assertAll( + { 
assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("with_pattern") { + config "./nextflow.config" + when { + process { + """ + input[0] = [[ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("with_file - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [[ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ] + input[1] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.header', checkIfExists: true) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/seqkit/grep/tests/main.nf.test.snap b/modules/nf-core/seqkit/grep/tests/main.nf.test.snap new file mode 100644 index 00000000..2ab6f4b8 --- /dev/null +++ b/modules/nf-core/seqkit/grep/tests/main.nf.test.snap @@ -0,0 +1,107 @@ +{ + "with_file": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,69bd44ef67566a76d6cbb8aa4a25ae35" + ] + ], + "1": [ + "versions.yml:md5,2b6d2bf7e727f835a915128c179cebfa" + ], + "filter": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,69bd44ef67566a76d6cbb8aa4a25ae35" + ] + ], + "versions": [ + "versions.yml:md5,2b6d2bf7e727f835a915128c179cebfa" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-08T10:33:47.542049206" + }, + "with_pattern": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,69bd44ef67566a76d6cbb8aa4a25ae35" + ] + ], + "1": [ + 
"versions.yml:md5,2b6d2bf7e727f835a915128c179cebfa" + ], + "filter": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,69bd44ef67566a76d6cbb8aa4a25ae35" + ] + ], + "versions": [ + "versions.yml:md5,2b6d2bf7e727f835a915128c179cebfa" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-08T10:38:28.77443193" + }, + "with_file - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,2b6d2bf7e727f835a915128c179cebfa" + ], + "filter": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fa.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2b6d2bf7e727f835a915128c179cebfa" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-08T10:34:12.85678016" + } +} \ No newline at end of file diff --git a/modules/nf-core/seqkit/grep/tests/nextflow.config b/modules/nf-core/seqkit/grep/tests/nextflow.config new file mode 100644 index 00000000..cd3aa8b4 --- /dev/null +++ b/modules/nf-core/seqkit/grep/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: SEQKIT_GREP { + ext.args = "-p chr21" + } +} diff --git a/modules/nf-core/seqkit/grep/tests/tags.yml b/modules/nf-core/seqkit/grep/tests/tags.yml new file mode 100644 index 00000000..ada66a88 --- /dev/null +++ b/modules/nf-core/seqkit/grep/tests/tags.yml @@ -0,0 +1,2 @@ +seqkit/grep: + - "modules/nf-core/seqkit/grep/**" diff --git a/modules/nf-core/seqkit/replace/environment.yml b/modules/nf-core/seqkit/replace/environment.yml new file mode 100644 index 00000000..41f3e7de --- /dev/null +++ b/modules/nf-core/seqkit/replace/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::seqkit=2.8.1 diff --git a/modules/nf-core/seqkit/replace/main.nf b/modules/nf-core/seqkit/replace/main.nf new file mode 100644 index 
00000000..70811c8b --- /dev/null +++ b/modules/nf-core/seqkit/replace/main.nf @@ -0,0 +1,59 @@ +process SEQKIT_REPLACE { + tag "$meta.id" + label 'process_low' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/seqkit:2.8.1--h9ee0642_0': + 'biocontainers/seqkit:2.8.1--h9ee0642_0' }" + + input: + tuple val(meta), path(fastx) + + output: + tuple val(meta), path("*.fast*"), emit: fastx + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def extension = "fastq" + if ("$fastx" ==~ /.+\.fasta|.+\.fasta.gz|.+\.fa|.+\.fa.gz|.+\.fas|.+\.fas.gz|.+\.fna|.+\.fna.gz/) { + extension = "fasta" + } + def endswith = task.ext.suffix ?: "${extension}.gz" + """ + seqkit \\ + replace \\ + ${args} \\ + --threads ${task.cpus} \\ + -i ${fastx} \\ + -o ${prefix}.${endswith} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit version | sed 's/seqkit v//' ) + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + def extension = "fastq" + if ("$fastx" ==~ /.+\.fasta|.+\.fasta.gz|.+\.fa|.+\.fa.gz|.+\.fas|.+\.fas.gz|.+\.fna|.+\.fna.gz/) { + extension = "fasta" + } + def endswith = task.ext.suffix ?: "${extension}.gz" + + """ + echo "" | gzip > ${prefix}.${endswith} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + seqkit: \$( seqkit version | sed 's/seqkit v//' ) + END_VERSIONS + """ + +} diff --git a/modules/nf-core/seqkit/replace/meta.yml b/modules/nf-core/seqkit/replace/meta.yml new file mode 100644 index 00000000..1be01079 --- /dev/null +++ b/modules/nf-core/seqkit/replace/meta.yml @@ -0,0 +1,47 @@ +name: seqkit_replace +description: Use seqkit to find/replace strings within sequences and sequence headers +keywords: + - seqkit + - replace + - sequence + - 
sequence headers + - fasta +tools: + - seqkit: + description: Cross-platform and ultrafast toolkit for FASTA/Q file manipulation, + written by Wei Shen. + homepage: https://bioinf.shenwei.me/seqkit/usage/ + documentation: https://bioinf.shenwei.me/seqkit/usage/ + tool_dev_url: https://github.com/shenwei356/seqkit/ + doi: "10.1371/journal.pone.0163962" + identifier: biotools:seqkit +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fastx: + type: file + description: fasta/q file + pattern: "*.{fasta,fastq,fa,fq,fas,fna,faa}*" +output: + - fastx: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fast*": + type: file + description: fasta/q file with replaced values + pattern: "*.{fasta,fastq,fa,fq,fas,fna,faa}*" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@mjcipriano" +maintainers: + - "@mjcipriano" diff --git a/modules/nf-core/seqkit/replace/tests/main.nf.test b/modules/nf-core/seqkit/replace/tests/main.nf.test new file mode 100644 index 00000000..759974c1 --- /dev/null +++ b/modules/nf-core/seqkit/replace/tests/main.nf.test @@ -0,0 +1,81 @@ +nextflow_process { + + name "Test Process SEQKIT_REPLACE" + script "../main.nf" + process "SEQKIT_REPLACE" + + tag "modules" + tag "modules_nfcore" + tag "seqkit" + tag "seqkit/replace" + + test("sarscov2 - fasta - replace") { + + config "./replace.config" + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - fasta - uncomp") { + + config "./uncomp.config" + + when { + process { + """ + input[0] = [ [ id:'test' 
], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - fasta - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/seqkit/replace/tests/main.nf.test.snap b/modules/nf-core/seqkit/replace/tests/main.nf.test.snap new file mode 100644 index 00000000..24e1887f --- /dev/null +++ b/modules/nf-core/seqkit/replace/tests/main.nf.test.snap @@ -0,0 +1,101 @@ +{ + "sarscov2 - fasta - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.fasta.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,d0b955de076997af3989d2ce5b5417b6" + ], + "fastx": [ + [ + { + "id": "test" + }, + "test.fasta.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,d0b955de076997af3989d2ce5b5417b6" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-08T11:10:12.100214525" + }, + "sarscov2 - fasta - replace": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.fasta.gz:md5,b1518908253a4997fcad98270751112e" + ] + ], + "1": [ + "versions.yml:md5,d0b955de076997af3989d2ce5b5417b6" + ], + "fastx": [ + [ + { + "id": "test" + }, + "test.fasta.gz:md5,b1518908253a4997fcad98270751112e" + ] + ], + "versions": [ + "versions.yml:md5,d0b955de076997af3989d2ce5b5417b6" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-07T16:23:57.895160549" + }, + "sarscov2 - fasta - uncomp": { + "content": [ + { + "0": [ 
+ [ + { + "id": "test" + }, + "test..fasta:md5,05d3294a62c72f5489f067c1da3c2f6c" + ] + ], + "1": [ + "versions.yml:md5,d0b955de076997af3989d2ce5b5417b6" + ], + "fastx": [ + [ + { + "id": "test" + }, + "test..fasta:md5,05d3294a62c72f5489f067c1da3c2f6c" + ] + ], + "versions": [ + "versions.yml:md5,d0b955de076997af3989d2ce5b5417b6" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-07T16:24:09.142463316" + } +} \ No newline at end of file diff --git a/modules/nf-core/seqkit/replace/tests/replace.config b/modules/nf-core/seqkit/replace/tests/replace.config new file mode 100644 index 00000000..8766447c --- /dev/null +++ b/modules/nf-core/seqkit/replace/tests/replace.config @@ -0,0 +1,5 @@ + process { + withName: 'SEQKIT_REPLACE' { + ext.args = "-s -p 'A' -r 'N'" + } + } diff --git a/modules/nf-core/seqkit/replace/tests/tags.yml b/modules/nf-core/seqkit/replace/tests/tags.yml new file mode 100644 index 00000000..b42ee48d --- /dev/null +++ b/modules/nf-core/seqkit/replace/tests/tags.yml @@ -0,0 +1,2 @@ +seqkit/replace: + - "modules/nf-core/seqkit/replace/**" diff --git a/modules/nf-core/seqkit/replace/tests/uncomp.config b/modules/nf-core/seqkit/replace/tests/uncomp.config new file mode 100644 index 00000000..dbd892b5 --- /dev/null +++ b/modules/nf-core/seqkit/replace/tests/uncomp.config @@ -0,0 +1,6 @@ + process { + withName: 'SEQKIT_REPLACE' { + ext.args = "-s -p 'T' -r 'N'" + ext.suffix = ".fasta" + } + } diff --git a/modules/nf-core/umicollapse/environment.yml b/modules/nf-core/umicollapse/environment.yml index 8dbc65dc..3847980d 100644 --- a/modules/nf-core/umicollapse/environment.yml +++ b/modules/nf-core/umicollapse/environment.yml @@ -1,7 +1,5 @@ -name: umicollapse channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::umicollapse=1.0.0 diff --git a/modules/nf-core/umicollapse/meta.yml b/modules/nf-core/umicollapse/meta.yml index c1361f9a..8b366c24 100644 --- a/modules/nf-core/umicollapse/meta.yml 
+++ b/modules/nf-core/umicollapse/meta.yml @@ -1,58 +1,76 @@ ---- name: "umicollapse" -description: Deduplicate reads based on the mapping co-ordinate and the UMI attached to the read. +description: Deduplicate reads based on the mapping co-ordinate and the UMI attached + to the read. keywords: - umicollapse - deduplication - genomics tools: - "umicollapse": - description: "UMICollapse contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs)." + description: "UMICollapse contains tools for dealing with Unique Molecular Identifiers + (UMIs)/Random Molecular Tags (RMTs)." homepage: "https://github.com/Daniel-Liu-c0deb0t/UMICollapse" documentation: "https://github.com/Daniel-Liu-c0deb0t/UMICollapse" tool_dev_url: "https://github.com/Daniel-Liu-c0deb0t/UMICollapse" doi: "10.7717/peerj.8275" licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: | - BAM file containing reads to be deduplicated via UMIs. - pattern: "*.{bam}" - - bai: - type: file - description: | - BAM index files corresponding to the input BAM file. Optionally can be skipped using [] when using FastQ input. - pattern: "*.{bai}" - - mode: - type: string - description: | - Selects the mode of Umicollapse - either fastq or bam need to be provided. - pattern: "{fastq,bam}" - + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: Input bam file + pattern: "*.bam" + - bai: + type: file + description: | + BAM index files corresponding to the input BAM file. Optionally can be skipped using [] when using FastQ input. + pattern: "*.{bai}" + - - mode: + type: string + description: | + Selects the mode of Umicollapse - either fastq or bam need to be provided. 
+ pattern: "{fastq,bam}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: BAM file with deduplicated UMIs. - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: BAM file with deduplicated UMIs. + pattern: "*.{bam}" + - fastq: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*dedup*fastq.gz": + type: file + description: FASTQ file with deduplicated UMIs. + pattern: "*dedup*fastq.gz" - log: - type: file - description: A log file with the deduplication statistics. - pattern: "*_{UMICollapse.log}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_UMICollapse.log": + type: file + description: A log file with the deduplication statistics. 
+ pattern: "*_{UMICollapse.log}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@CharlotteAnne" - "@chris-cheshire" diff --git a/modules/nf-core/umicollapse/tests/main.nf.test b/modules/nf-core/umicollapse/tests/main.nf.test index 2dec45b2..cc28359a 100644 --- a/modules/nf-core/umicollapse/tests/main.nf.test +++ b/modules/nf-core/umicollapse/tests/main.nf.test @@ -22,8 +22,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:true ], // meta map [ - file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ] """ @@ -34,7 +34,7 @@ nextflow_process { script "../../bwa/index/main.nf" process{ """ - input[0] = [[ id:'sarscov2'],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + input[0] = [[ id:'sarscov2'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] """ } } @@ -44,7 +44,7 @@ nextflow_process { """ input[0] = UMITOOLS_EXTRACT.out.reads input[1] = BWA_INDEX.out.index - input[2] = [[ id:'sarscov2'],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + input[2] = [[ id:'sarscov2'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] input[3] = true """ } @@ -90,8 +90,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true), - 
file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ] """ @@ -104,7 +104,7 @@ nextflow_process { """ input[0] = [ [ id:'sarscov2'], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] """ } @@ -115,7 +115,7 @@ nextflow_process { """ input[0] = UMITOOLS_EXTRACT.out.reads input[1] = BWA_INDEX.out.index - input[2] = [[ id:'sarscov2'],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + input[2] = [[ id:'sarscov2'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] input[3] = true """ } @@ -159,7 +159,7 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], // meta map - file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), [] ] input[1] = 'fastq' @@ -188,8 +188,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ] """ @@ -202,7 +202,7 @@ nextflow_process { """ input[0] = [ [ id:'sarscov2'], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] """ } @@ -213,7 +213,7 @@ nextflow_process { """ input[0] = UMITOOLS_EXTRACT.out.reads input[1] = BWA_INDEX.out.index - input[2] = [[ id:'sarscov2'],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + input[2] = [[ id:'sarscov2'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] input[3] = true """ } diff --git a/modules/nf-core/umicollapse/tests/main.nf.test.snap b/modules/nf-core/umicollapse/tests/main.nf.test.snap index 861e9ca6..3f393eac 100644 --- a/modules/nf-core/umicollapse/tests/main.nf.test.snap +++ b/modules/nf-core/umicollapse/tests/main.nf.test.snap @@ -7,7 +7,7 @@ "id": "test", "single_end": true }, - "test.dedup.bam:md5,05c5331185263cbee6f508c0669be864" + "test.dedup.bam:md5,89e844724f73fae9e7100506d0be5775" ] ], [ @@ -18,7 +18,7 @@ "nf-test": "0.8.4", "nextflow": "23.10.1" }, - "timestamp": "2024-03-14T13:41:23.869211282" + "timestamp": "2024-05-20T08:47:11.402203361" }, "umicollapse fastq tests": { "content": [ @@ -108,7 +108,7 @@ "id": "test", "single_end": false }, - "test.dedup.bam:md5,f4f05467cb456309fe22851d8b4d4387" + "test.dedup.bam:md5,3e2ae4701e3d2ca074ea878a314a3e4f" ] ], [ @@ -119,6 +119,6 @@ "nf-test": "0.8.4", "nextflow": "23.10.1" }, - "timestamp": "2024-03-14T13:41:54.486079388" + "timestamp": "2024-05-20T08:47:30.028323337" } } \ No newline at end of file diff --git a/modules/nf-core/umitools/extract/environment.yml b/modules/nf-core/umitools/extract/environment.yml index aab452d1..9f9e03c4 100644 --- a/modules/nf-core/umitools/extract/environment.yml +++ b/modules/nf-core/umitools/extract/environment.yml @@ -1,7 +1,5 @@ -name: umitools_extract channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::umi_tools=1.1.5 diff --git a/modules/nf-core/umitools/extract/main.nf b/modules/nf-core/umitools/extract/main.nf index 
8719e5f6..b97900e0 100644 --- a/modules/nf-core/umitools/extract/main.nf +++ b/modules/nf-core/umitools/extract/main.nf @@ -53,4 +53,22 @@ process UMITOOLS_EXTRACT { END_VERSIONS """ } + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + if (meta.single_end) { + output_command = "echo '' | gzip > ${prefix}.umi_extract.fastq.gz" + } else { + output_command = "echo '' | gzip > ${prefix}.umi_extract_1.fastq.gz ;" + output_command += "echo '' | gzip > ${prefix}.umi_extract_2.fastq.gz" + } + """ + touch ${prefix}.umi_extract.log + ${output_command} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + umitools: \$( umi_tools --version | sed '/version:/!d; s/.*: //' ) + END_VERSIONS + """ } diff --git a/modules/nf-core/umitools/extract/meta.yml b/modules/nf-core/umitools/extract/meta.yml index 7695b271..648ffbd2 100644 --- a/modules/nf-core/umitools/extract/meta.yml +++ b/modules/nf-core/umitools/extract/meta.yml @@ -1,5 +1,6 @@ name: umitools_extract -description: Extracts UMI barcode from a read and add it to the read name, leaving any sample barcode in place +description: Extracts UMI barcode from a read and add it to the read name, leaving + any sample barcode in place keywords: - UMI - barcode @@ -8,38 +9,49 @@ keywords: tools: - umi_tools: description: > - UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes + UMI-tools contains tools for dealing with Unique Molecular Identifiers (UMIs)/Random + Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes documentation: https://umi-tools.readthedocs.io/en/latest/ license: "MIT" + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: list - description: | - List of input FASTQ files whose UMIs will be extracted. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - reads: + type: list + description: | + List of input FASTQ files whose UMIs will be extracted. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - reads: - type: file - description: > - Extracted FASTQ files. | For single-end reads, pattern is \${prefix}.umi_extract.fastq.gz. | For paired-end reads, pattern is \${prefix}.umi_extract_{1,2}.fastq.gz. - pattern: "*.{fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fastq.gz": + type: file + description: > + Extracted FASTQ files. | For single-end reads, pattern is \${prefix}.umi_extract.fastq.gz. + | For paired-end reads, pattern is \${prefix}.umi_extract_{1,2}.fastq.gz. + pattern: "*.{fastq.gz}" - log: - type: file - description: Logfile for umi_tools - pattern: "*.{log}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.log": + type: file + description: Logfile for umi_tools + pattern: "*.{log}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@grst" diff --git a/modules/nf-core/umitools/extract/tests/main.nf.test b/modules/nf-core/umitools/extract/tests/main.nf.test index 2a8eba15..bb8a0658 100644 --- a/modules/nf-core/umitools/extract/tests/main.nf.test +++ b/modules/nf-core/umitools/extract/tests/main.nf.test @@ -9,7 +9,7 @@ nextflow_process { tag "umitools" tag "umitools/extract" - test("Should run without failures") { + test("single end") { when { process { @@ -24,7 +24,82 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.versions).match("versions") } + { assert snapshot( + process.out.reads.collect { it.collect { it instanceof Map ? it : file(it).name }}, + process.out.log.collect { it.collect { it instanceof Map ? 
it : file(it).name }}, + process.out.versions + ).match() } + ) + } + } + + test("single end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/fastq/test_1.fastq.gz", checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("pair end") { + + when { + process { + """ + input[0] = [ [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/fastq/test_1.fastq.gz", checkIfExists: true), + file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/fastq/test_2.fastq.gz", checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.reads[0][1][0]).name, + file(process.out.reads[0][1][1]).name, + process.out.log.collect { it.collect { it instanceof Map ? 
it : file(it).name }}, + process.out.versions + ).match() } + ) + } + } + + test("pair end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/fastq/test_1.fastq.gz", checkIfExists: true), + file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/fastq/test_2.fastq.gz", checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } ) } } diff --git a/modules/nf-core/umitools/extract/tests/main.nf.test.snap b/modules/nf-core/umitools/extract/tests/main.nf.test.snap index bf82701d..b1159054 100644 --- a/modules/nf-core/umitools/extract/tests/main.nf.test.snap +++ b/modules/nf-core/umitools/extract/tests/main.nf.test.snap @@ -1,14 +1,167 @@ { - "versions": { + "pair end - stub": { "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.umi_extract_1.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.umi_extract_2.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.umi_extract.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,568d243174c081a0301e74ed42e59b48" + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.umi_extract.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.umi_extract_1.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.umi_extract_2.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "versions": [ + "versions.yml:md5,568d243174c081a0301e74ed42e59b48" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-02T15:05:20.008312" + }, + "single end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, 
+ "test.umi_extract.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.umi_extract.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,568d243174c081a0301e74ed42e59b48" + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.umi_extract.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.umi_extract.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,568d243174c081a0301e74ed42e59b48" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-02T15:04:12.145999" + }, + "pair end": { + "content": [ + "test.umi_extract_1.fastq.gz", + "test.umi_extract_2.fastq.gz", + [ + [ + { + "id": "test", + "single_end": false + }, + "test.umi_extract.log" + ] + ], + [ + "versions.yml:md5,568d243174c081a0301e74ed42e59b48" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-02T15:21:09.578031" + }, + "single end": { + "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.umi_extract.fastq.gz" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.umi_extract.log" + ] + ], [ "versions.yml:md5,568d243174c081a0301e74ed42e59b48" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-16T10:01:33.326046137" + "timestamp": "2024-07-02T15:03:52.464606" } } \ No newline at end of file diff --git a/modules/nf-core/untarfiles/environment.yml b/modules/nf-core/untar/environment.yml similarity index 50% rename from modules/nf-core/untarfiles/environment.yml rename to modules/nf-core/untar/environment.yml index e479f80d..c7794856 100644 --- a/modules/nf-core/untarfiles/environment.yml +++ b/modules/nf-core/untar/environment.yml @@ -1,9 +1,7 @@ -name: untarfiles channels: - conda-forge - bioconda - - 
defaults dependencies: - - conda-forge::sed=4.7 - - bioconda::grep=3.4 + - conda-forge::grep=3.11 + - conda-forge::sed=4.8 - conda-forge::tar=1.34 diff --git a/modules/nf-core/untar/main.nf b/modules/nf-core/untar/main.nf new file mode 100644 index 00000000..9bd8f554 --- /dev/null +++ b/modules/nf-core/untar/main.nf @@ -0,0 +1,84 @@ +process UNTAR { + tag "$archive" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/ubuntu:22.04' : + 'nf-core/ubuntu:22.04' }" + + input: + tuple val(meta), path(archive) + + output: + tuple val(meta), path("$prefix"), emit: untar + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' + prefix = task.ext.prefix ?: ( meta.id ? "${meta.id}" : archive.baseName.toString().replaceFirst(/\.tar$/, "")) + + """ + mkdir $prefix + + ## Ensures --strip-components only applied when top level of tar contents is a directory + ## If just files or multiple directories, place all in prefix + if [[ \$(tar -taf ${archive} | grep -o -P "^.*?\\/" | uniq | wc -l) -eq 1 ]]; then + tar \\ + -C $prefix --strip-components 1 \\ + -xavf \\ + $args \\ + $archive \\ + $args2 + else + tar \\ + -C $prefix \\ + -xavf \\ + $args \\ + $archive \\ + $args2 + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//') + END_VERSIONS + """ + + stub: + prefix = task.ext.prefix ?: ( meta.id ? 
"${meta.id}" : archive.toString().replaceFirst(/\.[^\.]+(.gz)?$/, "")) + """ + mkdir ${prefix} + ## Dry-run untaring the archive to get the files and place all in prefix + if [[ \$(tar -taf ${archive} | grep -o -P "^.*?\\/" | uniq | wc -l) -eq 1 ]]; then + for i in `tar -tf ${archive}`; + do + if [[ \$(echo "\${i}" | grep -E "/\$") == "" ]]; + then + touch \${i} + else + mkdir -p \${i} + fi + done + else + for i in `tar -tf ${archive}`; + do + if [[ \$(echo "\${i}" | grep -E "/\$") == "" ]]; + then + touch ${prefix}/\${i} + else + mkdir -p ${prefix}/\${i} + fi + done + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/untar/meta.yml b/modules/nf-core/untar/meta.yml new file mode 100644 index 00000000..290346b3 --- /dev/null +++ b/modules/nf-core/untar/meta.yml @@ -0,0 +1,49 @@ +name: untar +description: Extract files. +keywords: + - untar + - uncompress + - extract +tools: + - untar: + description: | + Extract tar.gz files. + documentation: https://www.gnu.org/software/tar/manual/ + licence: ["GPL-3.0-or-later"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - archive: + type: file + description: File to be untar + pattern: "*.{tar}.{gz}" +output: + - untar: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - $prefix: + type: directory + description: Directory containing contents of archive + pattern: "*/" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@joseespinosa" + - "@drpatelh" + - "@matthdsm" + - "@jfy133" +maintainers: + - "@joseespinosa" + - "@drpatelh" + - "@matthdsm" + - "@jfy133" diff --git a/modules/nf-core/untar/tests/main.nf.test b/modules/nf-core/untar/tests/main.nf.test new file mode 100644 index 00000000..c957517a --- /dev/null +++ b/modules/nf-core/untar/tests/main.nf.test @@ -0,0 +1,85 @@ +nextflow_process { + + name "Test Process UNTAR" + script "../main.nf" + process "UNTAR" + tag "modules" + tag "modules_nfcore" + tag "untar" + + test("test_untar") { + + when { + process { + """ + input[0] = [ [], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/db/kraken2.tar.gz', checkIfExists: true) ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() }, + ) + } + } + + test("test_untar_onlyfiles") { + + when { + process { + """ + input[0] = [ [], file(params.modules_testdata_base_path + 'generic/tar/hello.tar.gz', checkIfExists: true) ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() }, + ) + } + } + + test("test_untar - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/db/kraken2.tar.gz', checkIfExists: true) ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() }, + ) + } + } + + test("test_untar_onlyfiles - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [], file(params.modules_testdata_base_path + 'generic/tar/hello.tar.gz', checkIfExists: true) ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert 
snapshot(process.out).match() }, + ) + } + } +} diff --git a/modules/nf-core/untar/tests/main.nf.test.snap b/modules/nf-core/untar/tests/main.nf.test.snap new file mode 100644 index 00000000..ceb91b79 --- /dev/null +++ b/modules/nf-core/untar/tests/main.nf.test.snap @@ -0,0 +1,158 @@ +{ + "test_untar_onlyfiles": { + "content": [ + { + "0": [ + [ + [ + + ], + [ + "hello.txt:md5,e59ff97941044f85df5297e1c302d260" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ + [ + [ + + ], + [ + "hello.txt:md5,e59ff97941044f85df5297e1c302d260" + ] + ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-10T12:04:28.231047" + }, + "test_untar_onlyfiles - stub": { + "content": [ + { + "0": [ + [ + [ + + ], + [ + "hello.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ + [ + [ + + ], + [ + "hello.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-10T12:04:45.773103" + }, + "test_untar - stub": { + "content": [ + { + "0": [ + [ + [ + + ], + [ + "hash.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "opts.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "taxo.k2d:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ + [ + [ + + ], + [ + "hash.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "opts.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "taxo.k2d:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-10T12:04:36.777441" + }, + 
"test_untar": { + "content": [ + { + "0": [ + [ + [ + + ], + [ + "hash.k2d:md5,8b8598468f54a7087c203ad0190555d9", + "opts.k2d:md5,a033d00cf6759407010b21700938f543", + "taxo.k2d:md5,094d5891cdccf2f1468088855c214b2c" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ + [ + [ + + ], + [ + "hash.k2d:md5,8b8598468f54a7087c203ad0190555d9", + "opts.k2d:md5,a033d00cf6759407010b21700938f543", + "taxo.k2d:md5,094d5891cdccf2f1468088855c214b2c" + ] + ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-10T12:04:19.377674" + } +} \ No newline at end of file diff --git a/modules/nf-core/untar/tests/tags.yml b/modules/nf-core/untar/tests/tags.yml new file mode 100644 index 00000000..feb6f15c --- /dev/null +++ b/modules/nf-core/untar/tests/tags.yml @@ -0,0 +1,2 @@ +untar: + - modules/nf-core/untar/** diff --git a/modules/nf-core/untarfiles/main.nf b/modules/nf-core/untarfiles/main.nf deleted file mode 100644 index de27e67c..00000000 --- a/modules/nf-core/untarfiles/main.nf +++ /dev/null @@ -1,52 +0,0 @@ -process UNTARFILES { - tag "$archive" - label 'process_single' - - conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ubuntu:20.04' : - 'nf-core/ubuntu:20.04' }" - - input: - tuple val(meta), path(archive) - - output: - tuple val(meta), path("${prefix}/**") , emit: files - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def args = task.ext.args ?: '' - def args2 = task.ext.args2 ?: '' - prefix = task.ext.prefix ?: ( meta.id ? 
"${meta.id}" : archive.baseName.toString().replaceFirst(/\.tar$/, "")) - - """ - mkdir $prefix - - tar \\ - -C $prefix \\ - -xavf \\ - $args \\ - $archive \\ - $args2 - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//') - END_VERSIONS - """ - - stub: - prefix = task.ext.prefix ?: "${meta.id}" - """ - mkdir $prefix - touch ${prefix}/file.txt - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//') - END_VERSIONS - """ -} diff --git a/modules/nf-core/untarfiles/meta.yml b/modules/nf-core/untarfiles/meta.yml deleted file mode 100644 index 38108826..00000000 --- a/modules/nf-core/untarfiles/meta.yml +++ /dev/null @@ -1,48 +0,0 @@ -name: untarfiles -description: Extract files. -keywords: - - untar - - uncompress - - files -tools: - - untar: - description: | - Extract tar.gz files. - documentation: https://www.gnu.org/software/tar/manual/ - licence: ["GPL-3.0-or-later"] -input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - archive: - type: file - description: File to be untar - pattern: "*.{tar}.{gz}" -output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - files: - type: string - description: A list containing references to individual archive files - pattern: "*/**" - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" -authors: - - "@joseespinosa" - - "@drpatelh" - - "@matthdsm" - - "@jfy133" - - "@pinin4fjords" -maintainers: - - "@joseespinosa" - - "@drpatelh" - - "@matthdsm" - - "@jfy133" - - "@pinin4fjords" diff --git a/nextflow.config b/nextflow.config index 147461f1..6c7472db 100644 --- a/nextflow.config +++ b/nextflow.config @@ -10,28 +10,25 @@ params { // Input options - input = null - // Workflow flags - protocol = 'illumina' - // References genome = null igenomes_base = 's3://ngi-igenomes/igenomes' igenomes_ignore = false mirna_gtf = null - mature = "https://mirbase.org/download/mature.fa" - hairpin = "https://mirbase.org/download/hairpin.fa" + mature = "https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/mature.fa" + hairpin = "https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hairpin.fa" mirgenedb = false mirgenedb_mature = null mirgenedb_hairpin = null mirgenedb_gff = null mirgenedb_species = null - save_aligned = false - save_aligned_mirna_quant = true bowtie_index = null + // General pipeline configuration + save_intermediates = false + // UMI handling with_umi = false // skips umitools extract in FASTQ_FASTQC_UMITOOLS_FASTP subworkflow. Needs to be true for fastq mode in collapsing reads @@ -45,12 +42,12 @@ params { // Trimming options clip_r1 = null three_prime_clip_r1 = null - three_prime_adapter = 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCA' + three_prime_adapter = 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCA' //is set to the Illumina TruSeq single index adapter sequence to ensure that the auto-detect functionality of FASTP is disabled. 
trim_fastq = true fastp_min_length = 17 fastp_known_mirna_adapters = "$projectDir/assets/known_adapters.fa" save_trimmed_fail = false - save_merged = true + save_merged = false skip_fastqc = false skip_multiqc = false skip_mirdeep = false @@ -75,8 +72,6 @@ params { genome = null igenomes_base = 's3://ngi-igenomes/igenomes/' igenomes_ignore = false - - // MultiQC options multiqc_config = null multiqc_title = null @@ -85,16 +80,18 @@ params { multiqc_methods_description = null // Boilerplate options - outdir = null - publish_dir_mode = 'copy' - email = null - email_on_fail = null - plaintext_email = false - monochrome_logs = false - hook_url = null - help = false - version = false - + outdir = null + publish_dir_mode = 'copy' + email = null + email_on_fail = null + plaintext_email = false + monochrome_logs = false + hook_url = null + help = false + help_full = false + show_hidden = false + version = false + pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' // Config options config_profile_name = null config_profile_description = null @@ -103,126 +100,105 @@ params { config_profile_contact = null config_profile_url = null - - // Max resource options - // Defaults only, expecting to be overwritten - max_memory = '128.GB' - max_cpus = 16 - max_time = '240.h' - // Schema validation default options - validationFailUnrecognisedParams = false - validationLenientMode = false - validationSchemaIgnoreParams = 'genomes,igenomes_base' - validationShowHiddenParams = false - validate_params = true - + validate_params = true } // Load base.config by default for all pipelines includeConfig 'conf/base.config' -// Load nf-core custom profiles from different Institutions -try { - includeConfig "${params.custom_config_base}/nfcore_custom.config" -} catch (Exception e) { - System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config") -} - -// Load nf-core/smrnaseq custom profiles from 
different institutions. -// Warning: Uncomment only if a pipeline-specific institutional config already exists on nf-core/configs! -// try { -// includeConfig "${params.custom_config_base}/pipeline/smrnaseq.config" -// } catch (Exception e) { -// System.err.println("WARNING: Could not load nf-core/config/smrnaseq profiles: ${params.custom_config_base}/pipeline/smrnaseq.config") -// } - profiles { debug { - dumpHashes = true - process.beforeScript = 'echo $HOSTNAME' - cleanup = false + dumpHashes = true + process.beforeScript = 'echo $HOSTNAME' nextflow.enable.configProcessNamesValidation = true } conda { - conda.enabled = true - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - channels = ['conda-forge', 'bioconda', 'defaults'] - apptainer.enabled = false + conda.enabled = true + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + conda.channels = ['conda-forge', 'bioconda'] + apptainer.enabled = false } mamba { - conda.enabled = true - conda.useMamba = true - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + conda.enabled = true + conda.useMamba = true + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } docker { - docker.enabled = true - conda.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false - docker.runOptions = '-u $(id -u):$(id -g)' + docker.enabled = true + conda.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false + docker.runOptions = '-u $(id -u):$(id -g)' 
} arm { - docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' + docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64' } singularity { - singularity.enabled = true - singularity.autoMounts = true - conda.enabled = false - docker.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + singularity.enabled = true + singularity.autoMounts = true + conda.enabled = false + docker.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } podman { - podman.enabled = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - shifter.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + podman.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + shifter.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } shifter { - shifter.enabled = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - podman.enabled = false - charliecloud.enabled = false - apptainer.enabled = false + shifter.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + charliecloud.enabled = false + apptainer.enabled = false } charliecloud { - charliecloud.enabled = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - apptainer.enabled = false + charliecloud.enabled = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + apptainer.enabled = false } apptainer { - apptainer.enabled = true - apptainer.autoMounts = true - conda.enabled = false - docker.enabled = false - singularity.enabled = false - podman.enabled = false - shifter.enabled = false - charliecloud.enabled = false 
+ apptainer.enabled = true + apptainer.autoMounts = true + conda.enabled = false + docker.enabled = false + singularity.enabled = false + podman.enabled = false + shifter.enabled = false + charliecloud.enabled = false + } + wave { + apptainer.ociAutoPull = true + singularity.ociAutoPull = true + wave.enabled = true + wave.freeze = true + wave.strategy = 'conda,container' } gitpod { @@ -231,33 +207,44 @@ profiles { executor.memory = 60.GB } - test { includeConfig 'conf/test.config' } - test_umi { includeConfig 'conf/test_umi.config' } - test_no_genome { includeConfig 'conf/test_no_genome.config' } - test_full { includeConfig 'conf/test_full.config' } - test_index { includeConfig 'conf/test_index.config' } -} -// Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile -// Will not be used unless Apptainer / Docker / Podman / Singularity are enabled -// Set to your registry if you have a mirror of containers -apptainer.registry = 'quay.io' -docker.registry = 'quay.io' -podman.registry = 'quay.io' -singularity.registry = 'quay.io' -// Nextflow plugins -plugins { - id 'nf-validation@1.1.3' // Validation of pipeline parameters and creation of an input channel from a sample sheet + test { includeConfig 'conf/test.config' } + test_umi { includeConfig 'conf/test_umi.config' } + test_full { includeConfig 'conf/test_full.config' } + test_full_filter_contamination { includeConfig 'conf/test_full_filter_contamination.config' } + test_technical_repeats { includeConfig 'conf/test_technical_repeats.config' } + test_mirgenedb { includeConfig 'conf/test_mirgenedb.config' } + test_contamination { includeConfig 'conf/test_contamination.config' } + test_contamination_tech_reps { includeConfig 'conf/test_contamination_tech_reps.config' } + test_skipfastp { includeConfig 'conf/test_skipfastp.config' } + test_nextflex { includeConfig 'conf/test_nextflex.config' } + ci { includeConfig 'conf/ci.config' } + + + //Protocol specific profiles + cats { 
includeConfig 'conf/protocol_cats.config' } + illumina { includeConfig 'conf/protocol_illumina.config' } + qiaseq { includeConfig 'conf/protocol_qiaseq.config' } + nextflex { includeConfig 'conf/protocol_nextflex.config' } } -// Load igenomes.config if required -if (!params.igenomes_ignore) { - includeConfig 'conf/igenomes.config' -} else { - params.genomes = [:] -} +// Load nf-core custom profiles from different Institutions +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" + +// Load nf-core/smrnaseq custom profiles from different institutions. +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/smrnaseq.config" : "/dev/null" +// Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile +// Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled +// Set to your registry if you have a mirror of containers +apptainer.registry = 'quay.io' +docker.registry = 'quay.io' +podman.registry = 'quay.io' +singularity.registry = 'quay.io' +charliecloud.registry = 'quay.io' +// Load igenomes.config if required +includeConfig !params.igenomes_ignore ? 'conf/igenomes.config' : 'conf/igenomes_ignored.config' // Export these variables to prevent local Python/R libraries from conflicting with those in the container // The JULIA depot path has been adjusted to a fixed path `/usr/local/share/julia` that needs to be used for packages in the container. // See https://apeltzer.github.io/post/03-julia-lang-nextflow/ for details on that. Once we have a common agreement on where to keep Julia packages, this is adjustable. 
@@ -269,16 +256,24 @@ env {
     JULIA_DEPOT_PATH = "/usr/local/share/julia"
 }
 
-// Capture exit codes from upstream processes when piping
-process.shell = ['/bin/bash', '-euo', 'pipefail']
-
-// Set default registry for Docker and Podman independent of -profile
-// Will not be used unless Docker / Podman are enabled
-// Set to your registry if you have a mirror of containers
-apptainer.registry = 'quay.io'
-docker.registry = 'quay.io'
-podman.registry = 'quay.io'
-singularity.registry = 'quay.io'
+// Set bash options
+process.shell = """\
+bash
+
+set -e # Exit if a tool returns a non-zero status/exit code
+set -u # Treat unset variables and parameters as an error
+set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute
+set -C # No clobber - prevent output redirection from overwriting files.
+"""
+// Set bash options
+process.shell = """\
+bash
+
+set -e # Exit if a tool returns a non-zero status/exit code
+set -u # Treat unset variables and parameters as an error
+set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute
+set -C # No clobber - prevent output redirection from overwriting files.
+"""
 
 // Disable process selector warnings by default. Use debug profile to enable warnings.
nextflow.enable.configProcessNamesValidation = false @@ -307,44 +302,46 @@ manifest { homePage = 'https://github.com/nf-core/smrnaseq' description = """Small RNA-Seq Best Practice Analysis Pipeline.""" mainScript = 'main.nf' - nextflowVersion = '!>=23.04.0' - version = '2.3.1' + nextflowVersion = '!>=24.04.2' + version = '2.4.0' doi = '10.5281/zenodo.3456879' } -// Load modules.config for DSL2 module specific options -includeConfig 'conf/modules.config' +// Nextflow plugins +plugins { + id 'nf-schema@2.1.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet +} -// Function to ensure that resource requirements don't go beyond -// a maximum limit -def check_max(obj, type) { - if (type == 'memory') { - try { - if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1) - return params.max_memory as nextflow.util.MemoryUnit - else - return obj - } catch (all) { - println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj" - return obj - } - } else if (type == 'time') { - try { - if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1) - return params.max_time as nextflow.util.Duration - else - return obj - } catch (all) { - println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj" - return obj - } - } else if (type == 'cpus') { - try { - return Math.min( obj, params.max_cpus as int ) - } catch (all) { - println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! 
Using default value: $obj" - return obj - } +validation { + defaultIgnoreParams = ["genomes","igenomes_base"] + help { + enabled = true + command = "nextflow run $manifest.name -profile --input samplesheet.csv --outdir " + fullParameter = "help_full" + showHiddenParameter = "show_hidden" + beforeText = """ +-\033[2m----------------------------------------------------\033[0m- + \033[0;32m,--.\033[0;30m/\033[0;32m,-.\033[0m +\033[0;34m ___ __ __ __ ___ \033[0;32m/,-._.--~\'\033[0m +\033[0;34m |\\ | |__ __ / ` / \\ |__) |__ \033[0;33m} {\033[0m +\033[0;34m | \\| | \\__, \\__/ | \\ |___ \033[0;32m\\`-._,-`-,\033[0m + \033[0;32m`._,._,\'\033[0m +\033[0;35m ${manifest.name} ${manifest.version}\033[0m +-\033[2m----------------------------------------------------\033[0m- +""" + afterText = """${manifest.doi ? "* The pipeline\n" : ""}${manifest.doi.tokenize(",").collect { " https://doi.org/${it.trim().replace('https://doi.org/','')}"}.join("\n")}${manifest.doi ? "\n" : ""} +* The nf-core framework + https://doi.org/10.1038/s41587-020-0439-x + +* Software dependencies + https://github.com/${manifest.name}/blob/master/CITATIONS.md +""" + } + summary { + beforeText = validation.help.beforeText + afterText = validation.help.afterText } - } + +// Load modules.config for DSL2 module specific options +includeConfig 'conf/modules.config' diff --git a/nextflow_schema.json b/nextflow_schema.json index 42680715..dc5a3c3f 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -1,10 +1,10 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/nf-core/smrnaseq/master/nextflow_schema.json", "title": "nf-core/smrnaseq pipeline parameters", "description": "Small RNA-Seq Best Practice Analysis Pipeline.", "type": "object", - "definitions": { + "$defs": { "input_output_options": { "title": "Input/output options", "type": "object", @@ -23,14 +23,6 @@ "help_text": "You will need 
to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/smrnaseq/usage#samplesheet-input).", "fa_icon": "fas fa-file-csv" }, - "protocol": { - "type": "string", - "default": "illumina", - "fa_icon": "fas fa-vial", - "description": "Protocol for constructing smRNA-seq libraries.", - "help_text": "Presets for trimming parameters and 3' adapter sequence with a specified protocol.\n\n| Protocol | Library Prep Kit | Trimming Parameter | 3' Adapter Sequence |\n| :------------ | :-------------------------------------- | :-------------------------------------- | :--------------------- |\n| illumina | Illumina TruSeq Small RNA | `clip_r1 = 0` `three_prime_clip_r1 = 0` | `TGGAATTCTCGGGTGCCAAGG` |\n| nextflex | BIOO SCIENTIFIC NEXTFLEX Small RNA-Seq | `clip_r1 = 4` `three_prime_clip_r1 = 4` | `TGGAATTCTCGGGTGCCAAGG` |\n| qiaseq | QIAGEN QIAseq miRNA | `clip_r1 = 0` `three_prime_clip_r1 = 0` | `AACTGTAGGCACCATCAAT` |\n| cats | Diagenode CATS Small RNA-seq | `clip_r1 = 3` `three_prime_clip_r1 = 0` | `AAAAAAAAAAA` + `GATCGGAAGAGCACACGTCTG` (only polyA is used for trimming) |\n| custom | user defined | user defined | user defined |\n\n> NB: When running `--protocol custom` the user ***must define the 3' Adapter Sequence***.\n> If trimming parameters aren't provided the pipeline will deafult to `clip_R1 = 0` and `three_prime_clip_R1 = 0` (i.e. no extra clipping).", - "enum": ["illumina", "nextflex", "qiaseq", "cats", "custom"] - }, "outdir": { "type": "string", "format": "directory-path", @@ -48,6 +40,11 @@ "type": "string", "description": "MultiQC report title. Printed as page header, used for filename if not otherwise specified.", "fa_icon": "fas fa-file-signature" + }, + "save_intermediates": { + "type": "boolean", + "description": "Save all intermediate files (e.g. 
fastq, bams) of all steps of the pipeline to output directory", + "fa_icon": "fas fa-save" } } }, @@ -155,7 +152,7 @@ "description": "Path to FASTA file with mature miRNAs.", "fa_icon": "fas fa-wheelchair", "help_text": "Typically this will be the `mature.fa` file from miRBase. Can be given either as a plain text `.fa` file or a compressed `.gz` file.\n\nDefaults to the current miRBase release URL, from which the file will be downloaded.", - "default": "https://mirbase.org/download/mature.fa" + "default": "https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/mature.fa" }, "mirgenedb_mature": { "type": "string", @@ -167,7 +164,7 @@ "description": "Path to FASTA file with miRNAs precursors.", "fa_icon": "fab fa-cuttlefish", "help_text": "Typically this will be the `mature.fa` file from miRBase. Can be given either as a plain text `.fa` file or a compressed `.gz` file.\n\nDefaults to the current miRBase release URL, from which the file will be downloaded.", - "default": "https://mirbase.org/download/hairpin.fa" + "default": "https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hairpin.fa" }, "mirgenedb_hairpin": { "type": "string", @@ -186,19 +183,6 @@ "help_text": "Saving generated references means that you can use them for future pipeline runs, reducing processing times.", "fa_icon": "fas fa-save" }, - "save_aligned": { - "type": "boolean", - "fa_icon": "fas fa-save", - "help_text": "Save aligned reads of initial bowtie mapping.", - "description": "Save aligned reads of initial mapping in bam format." - }, - "save_aligned_mirna_quant": { - "type": "boolean", - "fa_icon": "fas fa-save", - "default": true, - "help_text": "Save aligned reads of the bowtie runs in BOWTIE_MAP_MATURE, BOWTIE_MAP_HAIRPIN, and BOWTIE_MAP_SEQCLUSTER.", - "description": "Save aligned reads of miRNA quant subworkflow in bam format." 
- }, "igenomes_ignore": { "type": "boolean", "description": "Do not load the iGenomes reference config.", @@ -259,7 +243,7 @@ "exists": true, "mimetype": "text/plain", "default": "${projectDir}/assets/known_adapters.fa", - "description": "FastA with known miRNA adapter sequences for adapter trimming", + "description": "Fasta with known miRNA adapter sequences for adapter trimming", "fa_icon": "far fa-question-circle" }, "min_trimmed_reads": { @@ -271,7 +255,7 @@ "save_merged": { "type": "boolean", "description": "Save merged reads.", - "default": true + "default": false }, "phred_offset": { "type": "integer", @@ -398,41 +382,6 @@ } } }, - "max_job_request_options": { - "title": "Max job request options", - "type": "object", - "fa_icon": "fab fa-acquisitions-incorporated", - "description": "Set the top limit for requested resources for any single job.", - "help_text": "If you are running on a smaller system, a pipeline step requesting more resources than are available may cause the Nextflow to stop the run with an error. These options allow you to cap the maximum resources requested by any single job so that the pipeline will run on your system.\n\nNote that you can not _increase_ the resources requested by any job using these options. For that you will need your own configuration file. See [the nf-core website](https://nf-co.re/usage/configuration) for details.", - "properties": { - "max_cpus": { - "type": "integer", - "description": "Maximum number of CPUs that can be requested for any single job.", - "default": 16, - "fa_icon": "fas fa-microchip", - "hidden": true, - "help_text": "Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. 
`--max_cpus 1`" - }, - "max_memory": { - "type": "string", - "description": "Maximum amount of memory that can be requested for any single job.", - "default": "128.GB", - "fa_icon": "fas fa-memory", - "pattern": "^\\d+(\\.\\d+)?\\.?\\s*(K|M|G|T)?B$", - "hidden": true, - "help_text": "Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. `--max_memory '8.GB'`" - }, - "max_time": { - "type": "string", - "description": "Maximum amount of time that can be requested for any single job.", - "default": "240.h", - "fa_icon": "far fa-clock", - "pattern": "^(\\d+\\.?\\s*(s|m|h|d|day)\\s*)+$", - "hidden": true, - "help_text": "Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. `--max_time '2.h'`" - } - } - }, "generic_options": { "title": "Generic options", "type": "object", @@ -440,12 +389,6 @@ "description": "Less common options for the pipeline, typically set in a config file.", "help_text": "These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\n\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.", "properties": { - "help": { - "type": "boolean", - "description": "Display help text.", - "fa_icon": "fas fa-question-circle", - "hidden": true - }, "version": { "type": "boolean", "description": "Display version and exit.", @@ -521,57 +464,46 @@ "fa_icon": "fas fa-check-square", "hidden": true }, - "validationShowHiddenParams": { - "type": "boolean", - "fa_icon": "far fa-eye-slash", - "description": "Show all params when using `--help`", - "hidden": true, - "help_text": "By default, parameters set as _hidden_ in the schema are not shown on the command line when a user runs with `--help`. Specifying this option will tell the pipeline to show all parameters." 
- }, - "validationFailUnrecognisedParams": { - "type": "boolean", - "fa_icon": "far fa-check-circle", - "description": "Validation of parameters fails when an unrecognised parameter is found.", - "hidden": true, - "help_text": "By default, when an unrecognised parameter is found, it returns a warinig." - }, - "validationLenientMode": { - "type": "boolean", + "pipelines_testdata_base_path": { + "type": "string", "fa_icon": "far fa-check-circle", - "description": "Validation of parameters in lenient more.", - "hidden": true, - "help_text": "Allows string values that are parseable as numbers or booleans. For further information see [JSONSchema docs](https://github.com/everit-org/json-schema#lenient-mode)." + "description": "Base URL or local path to location of pipeline test dataset files", + "default": "https://raw.githubusercontent.com/nf-core/test-datasets/", + "hidden": true } } } }, "allOf": [ { - "$ref": "#/definitions/input_output_options" + "$ref": "#/$defs/input_output_options" }, { - "$ref": "#/definitions/umi_options" + "$ref": "#/$defs/umi_options" }, { - "$ref": "#/definitions/reference_genome_options" + "$ref": "#/$defs/reference_genome_options" }, { - "$ref": "#/definitions/trimming_options" + "$ref": "#/$defs/trimming_options" }, { - "$ref": "#/definitions/contamination_filtering" + "$ref": "#/$defs/contamination_filtering" }, { - "$ref": "#/definitions/skipping_pipeline_steps" + "$ref": "#/$defs/skipping_pipeline_steps" }, { - "$ref": "#/definitions/institutional_config_options" + "$ref": "#/$defs/institutional_config_options" }, { - "$ref": "#/definitions/max_job_request_options" - }, - { - "$ref": "#/definitions/generic_options" + "$ref": "#/$defs/generic_options" + } + ], + "properties": { + "igenomes_base": { + "type": "string", + "default": "s3://ngi-igenomes/igenomes/" } - ] + } } diff --git a/nf-test.config b/nf-test.config new file mode 100644 index 00000000..dc97fc42 --- /dev/null +++ b/nf-test.config @@ -0,0 +1,15 @@ +config { + // location 
for all nf-tests + testsDir "tests" + + // nf-test directory including temporary files for each test + workDir ".nf-test" + + // location of library folder that is added automatically to the classpath + libDir "tests/lib/" + + // location of an optional nextflow.config file specific for executing tests + configFile "nextflow.config" + + options "-dump-channels" +} diff --git a/pyproject.toml b/pyproject.toml deleted file mode 100644 index 56110621..00000000 --- a/pyproject.toml +++ /dev/null @@ -1,15 +0,0 @@ -# Config file for Python. Mostly used to configure linting of bin/*.py with Ruff. -# Should be kept the same as nf-core/tools to avoid fighting with template synchronisation. -[tool.ruff] -line-length = 120 -target-version = "py38" -cache-dir = "~/.cache/ruff" - -[tool.ruff.lint] -select = ["I", "E1", "E4", "E7", "E9", "F", "UP", "N"] - -[tool.ruff.lint.isort] -known-first-party = ["nf_core"] - -[tool.ruff.lint.per-file-ignores] -"__init__.py" = ["E402", "F401"] diff --git a/subworkflows/local/contaminant_filter.nf b/subworkflows/local/contaminant_filter.nf deleted file mode 100644 index 02d89df7..00000000 --- a/subworkflows/local/contaminant_filter.nf +++ /dev/null @@ -1,128 +0,0 @@ -// -// Filter contamination by rrna, trna, cdna, ncma, pirna -// - -include { BLAT_MIRNA as BLAT_CDNA - BLAT_MIRNA as BLAT_NCRNA - BLAT_MIRNA as BLAT_PIRNA - BLAT_MIRNA as BLAT_OTHER } from '../../modules/local/blat_mirna' - -include { INDEX_CONTAMINANTS as INDEX_RRNA - INDEX_CONTAMINANTS as INDEX_TRNA - INDEX_CONTAMINANTS as INDEX_CDNA - INDEX_CONTAMINANTS as INDEX_NCRNA - INDEX_CONTAMINANTS as INDEX_PIRNA - INDEX_CONTAMINANTS as INDEX_OTHER } from '../../modules/local/bowtie_contaminants' - -include { BOWTIE_MAP_CONTAMINANTS as MAP_RRNA - BOWTIE_MAP_CONTAMINANTS as MAP_TRNA - BOWTIE_MAP_CONTAMINANTS as MAP_CDNA - BOWTIE_MAP_CONTAMINANTS as MAP_NCRNA - BOWTIE_MAP_CONTAMINANTS as MAP_PIRNA - BOWTIE_MAP_CONTAMINANTS as MAP_OTHER } from '../../modules/local/bowtie_map_contaminants' 
- -include { FILTER_STATS } from '../../modules/local/filter_stats' - -workflow CONTAMINANT_FILTER { - take: - mirna - rrna - trna - cdna - ncrna - pirna - other - reads // channel: [ val(meta), [ reads ] ] - - main: - - ch_versions = Channel.empty() - ch_filter_stats = Channel.empty() - ch_mqc_results = Channel.empty() - - rrna_reads = reads - - reads.set { rrna_reads } - - if (params.rrna) { - // Index DB and filter $reads emit: $rrna_reads - INDEX_RRNA ( rrna ) - ch_versions = ch_versions.mix(INDEX_RRNA.out.versions) - MAP_RRNA ( reads, INDEX_RRNA.out.index, 'rRNA' ) - ch_versions = ch_versions.mix(MAP_RRNA.out.versions) - ch_filter_stats = ch_filter_stats.mix(MAP_RRNA.out.stats.ifEmpty(null)) - MAP_RRNA.out.unmapped.set { rrna_reads } - } - - rrna_reads.set { trna_reads } - - if (params.trna) { - // Index DB and filter $rrna_reads emit: $trna_reads - INDEX_TRNA ( trna ) - ch_versions = ch_versions.mix(INDEX_TRNA.out.versions) - MAP_TRNA ( rrna_reads, INDEX_TRNA.out.index, 'tRNA') - ch_versions = ch_versions.mix(MAP_TRNA.out.versions) - ch_filter_stats = ch_filter_stats.mix(MAP_TRNA.out.stats.ifEmpty(null)) - MAP_TRNA.out.unmapped.set { trna_reads } - } - - trna_reads.set { cdna_reads } - - - if (params.cdna) { - BLAT_CDNA ( 'cdna', mirna, cdna ) - ch_versions = ch_versions.mix(BLAT_CDNA.out.versions) - INDEX_CDNA ( BLAT_CDNA.out.filtered_set ) - ch_versions = ch_versions.mix(INDEX_CDNA.out.versions) - MAP_CDNA ( trna_reads, INDEX_CDNA.out.index, 'cDNA' ) - ch_versions = ch_versions.mix(MAP_CDNA.out.versions) - ch_filter_stats = ch_filter_stats.mix(MAP_CDNA.out.stats.ifEmpty(null)) - MAP_CDNA.out.unmapped.set { cdna_reads } - } - - cdna_reads.set { ncrna_reads } - - if (params.ncrna) { - BLAT_NCRNA ( 'ncrna', mirna, ncrna ) - ch_versions = ch_versions.mix(BLAT_NCRNA.out.versions) - INDEX_NCRNA ( BLAT_NCRNA.out.filtered_set ) - ch_versions = ch_versions.mix(INDEX_NCRNA.out.versions) - MAP_NCRNA ( cdna_reads, INDEX_NCRNA.out.index, 'ncRNA' ) - ch_versions = 
ch_versions.mix(MAP_NCRNA.out.versions) - ch_filter_stats = ch_filter_stats.mix(MAP_NCRNA.out.stats.ifEmpty(null)) - MAP_NCRNA.out.unmapped.set { ncrna_reads } - } - - ncrna_reads.set { pirna_reads } - - if (params.pirna) { - BLAT_PIRNA ( 'other', mirna, pirna ) - ch_versions = ch_versions.mix(BLAT_PIRNA.out.versions) - INDEX_PIRNA ( BLAT_PIRNA.out.filtered_set ) - ch_versions = ch_versions.mix(INDEX_PIRNA.out.versions) - MAP_PIRNA ( ncrna_reads, INDEX_PIRNA.out.index, 'piRNA' ) - ch_versions = ch_versions.mix(MAP_PIRNA.out.versions) - ch_filter_stats = ch_filter_stats.mix(MAP_PIRNA.out.stats.ifEmpty(null)) - MAP_PIRNA.out.unmapped.set { pirna_reads } - } - - pirna_reads.set { other_cont_reads } - - if (other) { - BLAT_OTHER ( 'other', mirna, other) - ch_versions = ch_versions.mix(BLAT_OTHER.out.versions) - INDEX_OTHER ( BLAT_OTHER.out.filtered_set ) - ch_versions = ch_versions.mix(INDEX_OTHER.out.versions) - MAP_OTHER ( ncrna_reads, INDEX_OTHER.out.index, 'other' ) - ch_versions = ch_versions.mix(MAP_OTHER.out.versions) - ch_filter_stats = ch_filter_stats.mix(MAP_OTHER.out.stats.ifEmpty(null)) - MAP_OTHER.out.unmapped.set { other_cont_reads } - } - - FILTER_STATS ( other_cont_reads, ch_filter_stats.collect() ) - - emit: - filtered_reads = FILTER_STATS.out.reads - versions = ch_versions.mix(FILTER_STATS.out.versions) - filter_stats = FILTER_STATS.out.stats -} diff --git a/subworkflows/local/contaminant_filter/main.nf b/subworkflows/local/contaminant_filter/main.nf new file mode 100644 index 00000000..585acd3e --- /dev/null +++ b/subworkflows/local/contaminant_filter/main.nf @@ -0,0 +1,324 @@ +// +// Filter contamination by rrna, trna, cdna, ncma, pirna +// + +include { BLAT as BLAT_CDNA } from '../../../modules/nf-core/blat/main' +include { BLAT as BLAT_NCRNA } from '../../../modules/nf-core/blat/main' +include { BLAT as BLAT_PIRNA } from '../../../modules/nf-core/blat/main' +include { BLAT as BLAT_OTHER } from '../../../modules/nf-core/blat/main' + +include { GAWK 
as GAWK_CDNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as GAWK_NCRNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as GAWK_PIRNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as GAWK_OTHER } from '../../../modules/nf-core/gawk/main' + +include { SEQKIT_GREP as SEQKIT_GREP_CDNA } from '../../../modules/nf-core/seqkit/grep/main' +include { SEQKIT_GREP as SEQKIT_GREP_NCRNA } from '../../../modules/nf-core/seqkit/grep/main' +include { SEQKIT_GREP as SEQKIT_GREP_PIRNA } from '../../../modules/nf-core/seqkit/grep/main' +include { SEQKIT_GREP as SEQKIT_GREP_OTHER } from '../../../modules/nf-core/seqkit/grep/main' + +include { BOWTIE2_BUILD as INDEX_TRNA } from '../../../modules/nf-core/bowtie2/build/main' +include { BOWTIE2_BUILD as INDEX_CDNA } from '../../../modules/nf-core/bowtie2/build/main' +include { BOWTIE2_BUILD as INDEX_NCRNA } from '../../../modules/nf-core/bowtie2/build/main' +include { BOWTIE2_BUILD as INDEX_PIRNA } from '../../../modules/nf-core/bowtie2/build/main' +include { BOWTIE2_BUILD as INDEX_OTHER } from '../../../modules/nf-core/bowtie2/build/main' +include { BOWTIE2_BUILD as INDEX_RRNA } from '../../../modules/nf-core/bowtie2/build/main' + +include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_RRNA } from '../../../modules/nf-core/bowtie2/align/main' +include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_TRNA } from '../../../modules/nf-core/bowtie2/align/main' +include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_CDNA } from '../../../modules/nf-core/bowtie2/align/main' +include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_NCRNA } from '../../../modules/nf-core/bowtie2/align/main' +include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_PIRNA } from '../../../modules/nf-core/bowtie2/align/main' +include { BOWTIE2_ALIGN as BOWTIE2_ALIGN_OTHER } from '../../../modules/nf-core/bowtie2/align/main' + +include { GAWK as STATS_GAWK_RRNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as STATS_GAWK_TRNA } from '../../../modules/nf-core/gawk/main' +include { GAWK 
as STATS_GAWK_CDNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as STATS_GAWK_NCRNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as STATS_GAWK_PIRNA } from '../../../modules/nf-core/gawk/main' +include { GAWK as STATS_GAWK_OTHER } from '../../../modules/nf-core/gawk/main' + + +include { FILTER_STATS } from '../../../modules/local/filter_stats' + +workflow CONTAMINANT_FILTER { + take: + ch_reference_hairpin // channel: [ val(meta), path(fasta) ] + ch_rrna // channel: [ path(fasta) ] + ch_trna // channel: [ path(fasta) ] + ch_cdna // channel: [ val(meta), path(fasta) ] + ch_ncrna // channel: [ val(meta), path(fasta) ] + ch_pirna // channel: [ val(meta), path(fasta) ] + ch_other_contamination // channel: [ val(meta), path(fasta) ] + ch_reads_for_mirna // channel: [ val(meta), [ reads ] ] + + main: + + ch_versions = Channel.empty() + ch_filter_stats = Channel.empty() + ch_mqc_results = Channel.empty() + + ch_reads_for_mirna.set { rrna_reads } + + if (params.rrna) { + // Index DB and filter $reads emit: $rrna_reads + INDEX_RRNA ( ch_rrna ) + ch_versions = ch_versions.mix(INDEX_RRNA.out.versions) + + // Add meta.contaminant to input reads channel + ch_reads_for_mirna = ch_reads_for_mirna.map{meta, fastq -> return [[id: meta.id, contaminant: "rRNA", single_end: meta.single_end], fastq]} + + // Map which reads are rRNAs + BOWTIE2_ALIGN_RRNA(ch_reads_for_mirna, INDEX_RRNA.out.index.first(), [[],[]], true, false) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN_RRNA.out.versions) + + // Obtain how many hits were contaminants + ch_bowtie = BOWTIE2_ALIGN_RRNA.out.log + + STATS_GAWK_RRNA(ch_bowtie, []) + ch_versions = ch_versions.mix(STATS_GAWK_RRNA.out.versions) + + // Remove meta.contaminant and collect all contaminant stats in a single channel + ch_filter_stats = ch_filter_stats + .mix(STATS_GAWK_RRNA.out.output + .map{meta, stats -> return [[id:meta.id, single_end:meta.single_end], stats]} + .ifEmpty(null)) + + // Assign clean reads to new channel 
+ rrna_reads = BOWTIE2_ALIGN_RRNA.out.fastq + } + + trna_reads = rrna_reads + + if (params.trna) { + // Index DB and filter $rrna_reads emit: $trna_reads + INDEX_TRNA ( ch_trna ) + ch_versions = ch_versions.mix(INDEX_TRNA.out.versions) + + // Add meta.contaminant to input reads channel + rrna_reads = rrna_reads.map{meta, fastq -> return [[id:meta.id, contaminant: "tRNA", single_end:meta.single_end], fastq]} + + // Map which reads are tRNAs + BOWTIE2_ALIGN_TRNA(rrna_reads, INDEX_TRNA.out.index.first(), [[],[]], true, false) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN_TRNA.out.versions) + + // Obtain how many hits were contaminants + ch_bowtie = BOWTIE2_ALIGN_TRNA.out.log + + STATS_GAWK_TRNA(ch_bowtie, []) + ch_versions = ch_versions.mix(STATS_GAWK_TRNA.out.versions) + + // Remove meta.contaminant and collect all contaminant stats in a single channel + ch_filter_stats = ch_filter_stats + .mix(STATS_GAWK_TRNA.out.output + .map{meta, stats -> return [[id:meta.id, single_end:meta.single_end], stats]} + .ifEmpty(null)) + + // Assign clean reads to new channel + trna_reads = BOWTIE2_ALIGN_TRNA.out.fastq + } + + cdna_reads = trna_reads + + // Define how to filter significant BLAT hits + ch_program = Channel.value('BEGIN{FS="\t"}{if(\$11 < 1e-5) print \$2;}').collectFile(name:"program.txt") + + if (params.cdna) { + // Search which hairpin miRNAs are present in the cDNA data + BLAT_CDNA(ch_reference_hairpin, ch_cdna) + ch_versions = ch_versions.mix(BLAT_CDNA.out.versions) + + // Extract the significant hits + GAWK_CDNA(BLAT_CDNA.out.psl, ch_program) + ch_versions = ch_versions.mix(GAWK_CDNA.out.versions) + + // Get only unique elements of the list + ch_pattern = GAWK_CDNA.out.output + .map { meta, file -> file.text.readLines() } + .flatten() + .unique() + .collectFile(name: 'ch_hairpin_cDNA_unique.txt', newLine: true) + + // Remove the hairpin miRNAs from the cDNA data + SEQKIT_GREP_CDNA(ch_cdna, ch_pattern) + ch_versions = ch_versions.mix(SEQKIT_GREP_CDNA.out.versions) + 
+ // Previous original code: + INDEX_CDNA ( SEQKIT_GREP_CDNA.out.filter ) + ch_versions = ch_versions.mix(INDEX_CDNA.out.versions) + + // Add meta.contaminant to input reads channel + trna_reads = trna_reads.map{meta, fastq -> return [[id:meta.id, contaminant: "cDNA", single_end:meta.single_end], fastq]} + + // Map which reads are cDNA + BOWTIE2_ALIGN_CDNA(trna_reads, INDEX_CDNA.out.index.first(), [[],[]], true, false) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN_CDNA.out.versions) + + // Obtain how many hits were contaminants + STATS_GAWK_CDNA(BOWTIE2_ALIGN_CDNA.out.log, []) + ch_versions = ch_versions.mix(STATS_GAWK_CDNA.out.versions) + + // Remove meta.contaminant and collect all contaminant stats in a single channel + ch_filter_stats = ch_filter_stats + .mix(STATS_GAWK_CDNA.out.output + .map{meta, stats -> return [[id:meta.id, single_end:meta.single_end], stats]} + .ifEmpty(null)) + + // Assign clean reads to new channel + cdna_reads = BOWTIE2_ALIGN_CDNA.out.fastq + } + + ncrna_reads = cdna_reads + + if (params.ncrna) { + // Search which hairpin miRNAs are present in the ncRNA data + BLAT_NCRNA(ch_reference_hairpin, ch_ncrna) + ch_versions = ch_versions.mix(BLAT_NCRNA.out.versions) + + // Extract the significant hits + GAWK_NCRNA(BLAT_NCRNA.out.psl, ch_program) + ch_versions = ch_versions.mix(GAWK_NCRNA.out.versions) + + // Get only unique elements of the list + ch_pattern = GAWK_NCRNA.out.output + .map { meta, file -> file.text.readLines() } + .flatten() + .unique() + .collectFile(name: 'ch_hairpin_ncRNA_unique.txt', newLine: true) + + // Remove the hairpin miRNAs from the ncRNA data + SEQKIT_GREP_NCRNA(ch_ncrna, ch_pattern) + ch_versions = ch_versions.mix(SEQKIT_GREP_NCRNA.out.versions) + + // Previous original code: + INDEX_NCRNA ( SEQKIT_GREP_NCRNA.out.filter ) + ch_versions = ch_versions.mix(INDEX_NCRNA.out.versions) + + // Add meta.contaminant to input reads channel + cdna_reads = cdna_reads.map{meta, fastq -> return [[id:meta.id, contaminant: "ncRNA", 
single_end:meta.single_end], fastq]} + + // Map which reads are ncRNA + BOWTIE2_ALIGN_NCRNA(cdna_reads, INDEX_NCRNA.out.index.first(), [[],[]], true, false) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN_NCRNA.out.versions) + + // Obtain how many hits were contaminants + STATS_GAWK_NCRNA(BOWTIE2_ALIGN_NCRNA.out.log, []) + ch_versions = ch_versions.mix(STATS_GAWK_NCRNA.out.versions) + + // Remove meta.contaminant and collect all contaminant stats in a single channel + ch_filter_stats = ch_filter_stats + .mix(STATS_GAWK_NCRNA.out.output + .map{meta, stats -> return [[id:meta.id, single_end:meta.single_end], stats]} + .ifEmpty(null)) + + // Assign clean reads to new channel + ncrna_reads = BOWTIE2_ALIGN_NCRNA.out.fastq + } + + pirna_reads = ncrna_reads + + if (params.pirna) { + // Search which hairpin miRNAs are present in the piRNA data + BLAT_PIRNA(ch_reference_hairpin, ch_pirna) + ch_versions = ch_versions.mix(BLAT_PIRNA.out.versions) + + // Extract the significant hits + GAWK_PIRNA(BLAT_PIRNA.out.psl, ch_program) + ch_versions = ch_versions.mix(GAWK_PIRNA.out.versions) + + // Get only unique elements of the list + ch_pattern = GAWK_PIRNA.out.output + .map { meta, file -> file.text.readLines() } + .flatten() + .unique() + .collectFile(name: 'ch_hairpin_piRNA_unique.txt', newLine: true) + + // Remove the hairpin miRNAs from the piRNA data + SEQKIT_GREP_PIRNA(ch_pirna, ch_pattern) + ch_versions = ch_versions.mix(SEQKIT_GREP_PIRNA.out.versions) + + // Previous original code: + INDEX_PIRNA ( SEQKIT_GREP_PIRNA.out.filter ) + ch_versions = ch_versions.mix(INDEX_PIRNA.out.versions) + + // Add meta.contaminant to input reads channel + ncrna_reads = ncrna_reads.map{meta, fastq -> return [[id:meta.id, contaminant: "piRNA", single_end:meta.single_end], fastq]} + + // Map which reads are piRNA + BOWTIE2_ALIGN_PIRNA(ncrna_reads, INDEX_PIRNA.out.index.first(), [[],[]], true, false) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN_PIRNA.out.versions) + + // Obtain how many hits were 
contaminants + STATS_GAWK_PIRNA(BOWTIE2_ALIGN_PIRNA.out.log, []) + ch_versions = ch_versions.mix(STATS_GAWK_PIRNA.out.versions) + + // Remove meta.contaminant and collect all contaminant stats in a single channel + ch_filter_stats = ch_filter_stats + .mix(STATS_GAWK_PIRNA.out.output + .map{meta, stats -> return [[id:meta.id, single_end:meta.single_end], stats]} + .ifEmpty(null)) + + // Assign clean reads to new channel + pirna_reads = BOWTIE2_ALIGN_PIRNA.out.fastq + } + + other_cont_reads = pirna_reads + + if (params.other_contamination) { + // Search which hairpin miRNAs are present in the other data + BLAT_OTHER(ch_reference_hairpin, ch_other_contamination) + ch_versions = ch_versions.mix(BLAT_OTHER.out.versions) + + // Extract the significant hits + GAWK_OTHER(BLAT_OTHER.out.psl, ch_program) + ch_versions = ch_versions.mix(GAWK_OTHER.out.versions) + + // Get only unique elements of the list + ch_pattern = GAWK_OTHER.out.output + .map { meta, file -> file.text.readLines() } + .flatten() + .unique() + .collectFile(name: 'ch_hairpin_other_unique.txt', newLine: true) + + // Remove the hairpin miRNAs from the other data + SEQKIT_GREP_OTHER(ch_other_contamination, ch_pattern) + ch_versions = ch_versions.mix(SEQKIT_GREP_OTHER.out.versions) + + // Previous original code: + INDEX_OTHER ( SEQKIT_GREP_OTHER.out.filter ) + ch_versions = ch_versions.mix(INDEX_OTHER.out.versions) + + // Map which reads are other + BOWTIE2_ALIGN_OTHER(pirna_reads, INDEX_OTHER.out.index.first(), [[],[]], true, false) + ch_versions = ch_versions.mix(BOWTIE2_ALIGN_OTHER.out.versions) + + // Obtain how many hits were contaminants + STATS_GAWK_OTHER(BOWTIE2_ALIGN_OTHER.out.log, []) + ch_versions = ch_versions.mix(STATS_GAWK_OTHER.out.versions) + + // Remove meta.contaminant and collect all contaminant stats in a single channel + ch_filter_stats = ch_filter_stats + .mix(STATS_GAWK_OTHER.out.output + .map{meta, stats -> return [[id:meta.id, single_end:meta.single_end], stats]} + .ifEmpty(null)) + + 
// Assign clean reads to new channel + other_cont_reads = BOWTIE2_ALIGN_OTHER.out.fastq + } + + // Remove meta.contaminant from final set of reads + other_cont_reads = other_cont_reads + .map{meta, reads -> return [[id:meta.id, single_end:meta.single_end], reads]} + + // Create channel with reads and contaminants + ch_reads_contaminants = other_cont_reads.join(ch_filter_stats.groupTuple()) + + // Filter all contaminant stats and create MultiQC file + FILTER_STATS ( ch_reads_contaminants ) + FILTER_STATS.out.stats.dump(tag:"FILTER_STATS.out.stats") + + emit: + filtered_reads = other_cont_reads // channel: [ val(meta), path(fastq) ] + filter_stats = FILTER_STATS.out.stats // channel: [ path(stats) ] + versions = ch_versions.mix(FILTER_STATS.out.versions) // channel: [ versions.yml ] +} diff --git a/subworkflows/local/contaminant_filter/tests/contaminant_filter.nf.test b/subworkflows/local/contaminant_filter/tests/contaminant_filter.nf.test new file mode 100644 index 00000000..081d7079 --- /dev/null +++ b/subworkflows/local/contaminant_filter/tests/contaminant_filter.nf.test @@ -0,0 +1,96 @@ +nextflow_workflow { + + name "Test Workflow CONTAMINANT_FILTER" + script "../contaminant_filter.nf" + config "./nextflow.config" + workflow "CONTAMINANT_FILTER" + tag "subworkflows" + tag "subworkflows_local" + tag "subworkflows/contaminant_filter" + tag "contaminant_filter" + + test("Should remove other contaminants") { + + when { + params { + outdir = "${outputDir}" + } + workflow { + """ + input[0] = [file("https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hairpin.fa", checkIfExists: true)] + input[1] = [] + input[2] = [] + input[3] = [] + input[4] = [] + input[5] = [] + input[6] = [file("https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.ncrna.fa", checkIfExists: true)] + input[7] = Channel.of([['id':'Clone1_N1', 'single_end':true], 
file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/contaminant_filter/small_Clone1_N1.tRNA.filter.unmapped.contaminant.fastq", checkIfExists: true)]) + """ + } + } + + then { + assert workflow.success + assert path(workflow.out.filtered_reads.get(0).get(1)).linesGzip.contains("@M07660:69:000000000-KDJ4R:1:1102:18200:10888 1:N:0:ACAGTG") + assert workflow.out.filtered_reads + } + + } + + test("Should remove tRNA contaminants") { + + when { + params { + outdir = "${outputDir}" + } + workflow { + """ + input[0] = [file("https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hairpin.fa", checkIfExists: true)] + input[1] = [] + input[2] = [file("https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/hg19-tRNAs.fa")] + input[3] = [] + input[4] = [] + input[5] = [] + input[6] = [] + input[7] = Channel.of([['id':'Clone1_N1', 'single_end':true], file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/contaminant_filter/small_Clone1_N1.tRNA.filter.unmapped.contaminant.fastq", checkIfExists: true)]) + """ + } + } + + then { + assert workflow.success + assert path(workflow.out.filtered_reads.get(0).get(1)).linesGzip.contains("@M07660:69:000000000-KDJ4R:1:1102:18200:10888 1:N:0:ACAGTG") + assert workflow.out.filtered_reads + } + + } + + test("Should remove ncRNA contaminants") { + + when { + params { + outdir = "${outputDir}" + } + workflow { + """ + input[0] = [file("https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hairpin.fa", checkIfExists: true)] + input[1] = [] + input[2] = [] + input[3] = [] + input[4] = [file("https://huggingface.co/datasets/nf-core/smrnaseq/resolve/main/GRCh37/Homo_sapiens.GRCh37.ncrna.fa")] + input[5] = [] + input[6] = [] + input[7] = Channel.of([['id':'Clone1_N1', 'single_end':true], file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/contaminant_filter/small_Clone1_N1.tRNA.filter.unmapped.contaminant.fastq", checkIfExists: true)]) + """ + } + } + + then { + 
assert workflow.success + assert path(workflow.out.filtered_reads.get(0).get(1)).linesGzip.contains("@M07660:69:000000000-KDJ4R:1:1102:18200:10888 1:N:0:ACAGTG") + assert workflow.out.filtered_reads + } + + } + +} diff --git a/subworkflows/local/contaminant_filter/tests/nextflow.config b/subworkflows/local/contaminant_filter/tests/nextflow.config new file mode 100644 index 00000000..72863b87 --- /dev/null +++ b/subworkflows/local/contaminant_filter/tests/nextflow.config @@ -0,0 +1,16 @@ +process { + withName: 'CONTAMINANT_FILTER:BLAT.*' { + ext.args = '-out=blast8' + ext.prefix = {"${meta.id}_${meta2.id}"} + tag = {"${meta.id} ${meta2.id}"} + } + + withName: 'CONTAMINANT_FILTER:GAWK.*' { + ext.prefix = {"significant_hits_${meta.id}"} + } + + withName: 'CONTAMINANT_FILTER:SEQKIT_GREP.*' { + ext.prefix = {"filtered_${meta.id}"} + ext.args = '-v' + } +} diff --git a/subworkflows/local/genome_quant.nf b/subworkflows/local/genome_quant.nf index c56f8e5f..4e5181a4 100644 --- a/subworkflows/local/genome_quant.nf +++ b/subworkflows/local/genome_quant.nf @@ -2,29 +2,25 @@ // Quantify mirna with bowtie and mirtop // -include { BAM_SORT_STATS_SAMTOOLS } from '../nf-core/bam_sort_stats_samtools' -include { BOWTIE_MAP_SEQ as BOWTIE_MAP_GENOME } from '../../modules/local/bowtie_map_mirna' +include { BAM_SORT_STATS_SAMTOOLS } from '../nf-core/bam_sort_stats_samtools' +include { BOWTIE_ALIGN as BOWTIE_MAP_GENOME } from '../../modules/nf-core/bowtie/align/main' workflow GENOME_QUANT { take: - bowtie_index - fasta_formatted // fasta as generated by bowtie index step - reads // channel: [ val(meta), [ reads ] ] + ch_bowtie_index // channel: [ val(meta), path(directory_index) ] + ch_fasta // channel: [ val(meta), path(fasta) ] + ch_reads // channel: [ val(meta), [ reads ] ] main: ch_versions = Channel.empty() - BOWTIE_MAP_GENOME ( reads, bowtie_index.collect() ) + BOWTIE_MAP_GENOME ( ch_reads, ch_bowtie_index, true ) ch_versions = ch_versions.mix(BOWTIE_MAP_GENOME.out.versions) - 
ch_fasta_formatted_for_sort = fasta_formatted .map { file -> tuple(file.baseName, file) } - BAM_SORT_STATS_SAMTOOLS ( BOWTIE_MAP_GENOME.out.bam, ch_fasta_formatted_for_sort ) + BAM_SORT_STATS_SAMTOOLS ( BOWTIE_MAP_GENOME.out.bam, ch_fasta ) ch_versions = ch_versions.mix(BAM_SORT_STATS_SAMTOOLS.out.versions) emit: - fasta = fasta_formatted - index = bowtie_index - stats = BAM_SORT_STATS_SAMTOOLS.out.stats - - versions = ch_versions + stats = BAM_SORT_STATS_SAMTOOLS.out.stats // channel: [ val(meta), [ stats ] ] + versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/mirdeep2.nf b/subworkflows/local/mirdeep2.nf deleted file mode 100644 index f8098ba5..00000000 --- a/subworkflows/local/mirdeep2.nf +++ /dev/null @@ -1,31 +0,0 @@ -// -// Quantify mirna with bowtie and mirtop -// - -include { MIRDEEP2_PIGZ } from '../../modules/local/mirdeep2_prepare' -include { MIRDEEP2_MAPPER } from '../../modules/local/mirdeep2_mapper' -include { MIRDEEP2_RUN } from '../../modules/local/mirdeep2_run' - -workflow MIRDEEP2 { - take: - reads // channel: [ val(meta), [ reads ] ] - fasta - index - hairpin - mature - - main: - ch_versions = Channel.empty() - - MIRDEEP2_PIGZ ( reads ) - ch_versions = ch_versions.mix(MIRDEEP2_PIGZ.out.versions.first()) - - MIRDEEP2_MAPPER ( MIRDEEP2_PIGZ.out.reads, index ) - ch_versions = ch_versions.mix(MIRDEEP2_MAPPER.out.versions.first()) - - MIRDEEP2_RUN ( fasta, MIRDEEP2_MAPPER.out.mirdeep2_inputs, hairpin, mature ) - ch_versions = ch_versions.mix(MIRDEEP2_RUN.out.versions.first()) - - emit: - versions = ch_versions // channel: [ versions.yml ] -} diff --git a/subworkflows/local/mirna_quant.nf b/subworkflows/local/mirna_quant.nf index aea1c9a7..51a9b9a3 100644 --- a/subworkflows/local/mirna_quant.nf +++ b/subworkflows/local/mirna_quant.nf @@ -8,121 +8,129 @@ include { PARSE_FASTA_MIRNA as PARSE_MATURE include { FORMAT_FASTA_MIRNA as FORMAT_MATURE FORMAT_FASTA_MIRNA as FORMAT_HAIRPIN } from 
'../../modules/local/format_fasta_mirna' -include { INDEX_MIRNA as INDEX_MATURE - INDEX_MIRNA as INDEX_HAIRPIN } from '../../modules/local/bowtie_mirna' +include { BOWTIE_BUILD as INDEX_MATURE } from '../../modules/nf-core/bowtie/build/main' +include { BOWTIE_BUILD as INDEX_HAIRPIN } from '../../modules/nf-core/bowtie/build/main' -include { BOWTIE_MAP_SEQ as BOWTIE_MAP_MATURE - BOWTIE_MAP_SEQ as BOWTIE_MAP_HAIRPIN - BOWTIE_MAP_SEQ as BOWTIE_MAP_SEQCLUSTER } from '../../modules/local/bowtie_map_mirna' +include { BOWTIE_ALIGN as BOWTIE_MAP_MATURE + BOWTIE_ALIGN as BOWTIE_MAP_HAIRPIN + BOWTIE_ALIGN as BOWTIE_MAP_SEQCLUSTER } from '../../modules/nf-core/bowtie/align/main' include { BAM_SORT_STATS_SAMTOOLS as BAM_STATS_MATURE BAM_SORT_STATS_SAMTOOLS as BAM_STATS_HAIRPIN } from '../nf-core/bam_sort_stats_samtools' -include { SEQCLUSTER_SEQUENCES } from '../../modules/local/seqcluster_collapse.nf' -include { MIRTOP_QUANT } from '../../modules/local/mirtop_quant.nf' -include { TABLE_MERGE } from '../../modules/local/datatable_merge.nf' -include { EDGER_QC } from '../../modules/local/edger_qc.nf' +include { SEQCLUSTER_COLLAPSE } from '../../modules/nf-core/seqcluster/collapse/main' +include { DATATABLE_MERGE } from '../../modules/local/datatable_merge/main' +include { EDGER_QC } from '../../modules/local/edger_qc/main' +include { BAM_STATS_MIRNA_MIRTOP } from '../../subworkflows/nf-core/bam_stats_mirna_mirtop/main' +include { CSVTK_JOIN } from '../../modules/nf-core/csvtk/join/main' workflow MIRNA_QUANT { take: - mature // channel: [ val(meta), fasta file] - hairpin // channel: [ val(meta), fasta file] - gtf // channle: GTF file - reads // channel: [ val(meta), [ reads ] ] + ch_reference_mature // channel: [ val(meta), fasta file] + ch_reference_hairpin // channel: [ val(meta), fasta file] + ch_mirna_gtf // channel: [ val(meta), path(gtf) ] + ch_reads_for_mirna // channel: [ val(meta), [ reads ] ] + ch_mirtrace_species // channel: [ val(string) ] main: ch_versions = 
Channel.empty() + ch_parse_species_input = params.mirgenedb ? Channel.value(params.mirgenedb_species) : ch_mirtrace_species - PARSE_MATURE ( mature ).parsed_fasta.set { mirna_parsed } + PARSE_MATURE ( ch_reference_mature, ch_parse_species_input ) + ch_mirna_parsed = PARSE_MATURE.out.parsed_fasta ch_versions = ch_versions.mix(PARSE_MATURE.out.versions) - FORMAT_MATURE ( mirna_parsed ) + FORMAT_MATURE ( ch_mirna_parsed ) ch_versions = ch_versions.mix(FORMAT_MATURE.out.versions) - INDEX_MATURE ( FORMAT_MATURE.out.formatted_fasta ).index.set { mature_bowtie } + INDEX_MATURE ( FORMAT_MATURE.out.formatted_fasta ) + ch_mature_bowtie = INDEX_MATURE.out.index.first() ch_versions = ch_versions.mix(INDEX_MATURE.out.versions) - reads + ch_reads_mirna = ch_reads_for_mirna .map { add_suffix(it, "mature") } - .dump (tag:'msux') - .set { reads_mirna } - BOWTIE_MAP_MATURE ( reads_mirna, mature_bowtie.collect() ) + BOWTIE_MAP_MATURE ( ch_reads_mirna, ch_mature_bowtie, true ) ch_versions = ch_versions.mix(BOWTIE_MAP_MATURE.out.versions) - BOWTIE_MAP_MATURE.out.unmapped + ch_reads_hairpin = BOWTIE_MAP_MATURE.out.fastq .map { add_suffix(it, "hairpin") } - .dump (tag:'hsux') - .set { reads_hairpin } BAM_STATS_MATURE ( BOWTIE_MAP_MATURE.out.bam, FORMAT_MATURE.out.formatted_fasta ) ch_versions = ch_versions.mix(BAM_STATS_MATURE.out.versions) - PARSE_HAIRPIN ( hairpin ).parsed_fasta.set { hairpin_parsed } + PARSE_HAIRPIN ( ch_reference_hairpin, ch_parse_species_input ) + ch_hairpin_parsed = PARSE_HAIRPIN.out.parsed_fasta ch_versions = ch_versions.mix(PARSE_HAIRPIN.out.versions) - FORMAT_HAIRPIN ( hairpin_parsed ) + FORMAT_HAIRPIN ( ch_hairpin_parsed ) ch_versions = ch_versions.mix(FORMAT_HAIRPIN.out.versions) - INDEX_HAIRPIN ( FORMAT_HAIRPIN.out.formatted_fasta ).index.set { hairpin_bowtie } + INDEX_HAIRPIN ( FORMAT_HAIRPIN.out.formatted_fasta ) + hairpin_bowtie = INDEX_HAIRPIN.out.index.first() ch_versions = ch_versions.mix(INDEX_HAIRPIN.out.versions) - BOWTIE_MAP_HAIRPIN ( reads_hairpin, 
hairpin_bowtie.collect() ) + BOWTIE_MAP_HAIRPIN ( ch_reads_hairpin, hairpin_bowtie, true ) ch_versions = ch_versions.mix(BOWTIE_MAP_HAIRPIN.out.versions) BAM_STATS_HAIRPIN ( BOWTIE_MAP_HAIRPIN.out.bam, FORMAT_HAIRPIN.out.formatted_fasta ) ch_versions = ch_versions.mix(BAM_STATS_HAIRPIN.out.versions) - BAM_STATS_MATURE.out.idxstats.collect{it[1]} + ch_edger_input = BAM_STATS_MATURE.out.idxstats.collect{it[1]} .mix(BAM_STATS_HAIRPIN.out.idxstats.collect{it[1]}) - .dump(tag:'edger') .flatten() .collect() - .set { edger_input } - EDGER_QC ( edger_input ) + EDGER_QC ( ch_edger_input ) ch_versions.mix(EDGER_QC.out.versions) - reads - .map { add_suffix(it, "seqcluster") } - .dump (tag:'ssux') - .set { reads_seqcluster } + SEQCLUSTER_COLLAPSE ( ch_reads_for_mirna ) + ch_versions = ch_versions.mix(SEQCLUSTER_COLLAPSE.out.versions) - SEQCLUSTER_SEQUENCES ( reads_seqcluster ).collapsed.set { reads_collapsed } - ch_versions = ch_versions.mix(SEQCLUSTER_SEQUENCES.out.versions) + ch_reads_collapsed = SEQCLUSTER_COLLAPSE.out.fastq - BOWTIE_MAP_SEQCLUSTER ( reads_collapsed, hairpin_bowtie.collect() ) + BOWTIE_MAP_SEQCLUSTER ( ch_reads_collapsed, hairpin_bowtie, true ) ch_versions = ch_versions.mix(BOWTIE_MAP_SEQCLUSTER.out.versions) ch_mirtop_logs = Channel.empty() - if (params.mirtrace_species){ - MIRTOP_QUANT ( BOWTIE_MAP_SEQCLUSTER.out.bam.collect{it[1]}, FORMAT_HAIRPIN.out.formatted_fasta.collect{it[1]}, gtf ) - ch_mirtop_logs = MIRTOP_QUANT.out.logs - ch_versions = ch_versions.mix(MIRTOP_QUANT.out.versions) - - TABLE_MERGE ( MIRTOP_QUANT.out.mirtop_table ) - ch_versions = ch_versions.mix(TABLE_MERGE.out.versions) - } - BOWTIE_MAP_HAIRPIN.out.unmapped + + // nf-core/mirtop + + ch_mirna_gtf_species = ch_mirna_gtf.map{ meta,gtf -> gtf } + .combine(ch_mirtrace_species) + .map{ gtf, species -> [ [id:species.toString()], gtf, species ] } + .collect() + + BAM_STATS_MIRNA_MIRTOP(BOWTIE_MAP_SEQCLUSTER.out.bam, FORMAT_HAIRPIN.out.formatted_fasta, ch_mirna_gtf_species ) + + 
ch_mirtop_logs = BAM_STATS_MIRNA_MIRTOP.out.stats_log + ch_versions = ch_versions.mix(BAM_STATS_MIRNA_MIRTOP.out.versions) + + ch_tsvs = BAM_STATS_MIRNA_MIRTOP.out.counts + .collect{it[1]} + .map{it -> return [[id:"TSVs"], it]} + + CSVTK_JOIN ( ch_tsvs ) + ch_versions = ch_versions.mix(CSVTK_JOIN.out.versions) + + DATATABLE_MERGE ( CSVTK_JOIN.out.csv ) + ch_versions = ch_versions.mix(DATATABLE_MERGE.out.versions) + + ch_reads_genome = BOWTIE_MAP_HAIRPIN.out.fastq .map { add_suffix(it, "genome") } - .dump (tag:'gsux') - .set { reads_genome } emit: - fasta_mature = FORMAT_MATURE.out.formatted_fasta - fasta_hairpin = FORMAT_HAIRPIN.out.formatted_fasta - unmapped = reads_genome - mature_stats = BAM_STATS_MATURE.out.stats - hairpin_stats = BAM_STATS_HAIRPIN.out.stats - mirtop_logs = ch_mirtop_logs - - versions = ch_versions + fasta_mature = FORMAT_MATURE.out.formatted_fasta // channel: [ val(meta), path(fasta) ] + fasta_hairpin = FORMAT_HAIRPIN.out.formatted_fasta // channel: [ val(meta), path(fasta) ] + unmapped = ch_reads_genome // channel: [ val(meta), path(bam) ] + mature_stats = BAM_STATS_MATURE.out.stats // channel: [ val(meta), [ stats ] ] + hairpin_stats = BAM_STATS_HAIRPIN.out.stats // channel: [ val(meta), [ stats ] ] + mirtop_logs = ch_mirtop_logs // channel: [ val(meta), path(log) ] + versions = ch_versions // channel: [ versions.yml ] } def add_suffix(row, suffix) { - def meta = [:] - meta.id = "${row[0].id}_${suffix}" - def array = [] - array = [ meta, row[1] ] - return array + def meta_clone = row[0].clone() + meta_clone.id = "${row[0].id}_${suffix}" + return [ meta_clone, row[1] ] } diff --git a/subworkflows/local/mirtrace.nf b/subworkflows/local/mirtrace.nf deleted file mode 100644 index 528e4233..00000000 --- a/subworkflows/local/mirtrace.nf +++ /dev/null @@ -1,29 +0,0 @@ -// -// Quantify mirna with bowtie and mirtop -// - -include { MIRTRACE_RUN } from '../../modules/local/mirtrace' - -workflow MIRTRACE { - take: - reads // channel: [ val(adapterseq), 
[ val(ids) ], [ path(reads) ] ] - - main: - - //Staging the files as path() but then getting the filenames for the config file that mirtrace needs - //Directly using val(reads) as in previous versions is not reliable as staging between work directories is not 100% reliable if not explicitly defined via nextflow itself - //mirtrace is a bit peculiar in parsing these config files, so looked it up in the source how its done. this way should work - ch_mirtrace_config = - reads.map { adapter, ids, reads -> [adapter, ids,reads]} - .transpose() - .collectFile { adapter, id, path -> "./${path.getFileName().toString()},${id},${adapter},${params.phred_offset}\n" } // operations need a channel, so, should be outside the module - - MIRTRACE_RUN ( - reads, - ch_mirtrace_config - ) - - emit: - results = MIRTRACE_RUN.out.mirtrace - versions = MIRTRACE_RUN.out.versions -} diff --git a/subworkflows/local/prepare_genome/main.nf b/subworkflows/local/prepare_genome/main.nf new file mode 100644 index 00000000..2f3f34b7 --- /dev/null +++ b/subworkflows/local/prepare_genome/main.nf @@ -0,0 +1,149 @@ +// +// Uncompress and prepare reference genome files +// + +// nf-core modules +include { UNTAR as UNTAR_BOWTIE_INDEX } from '../../../modules/nf-core/untar' +include { BOWTIE_BUILD as INDEX_GENOME } from '../../../modules/nf-core/bowtie/build/main' +include { BIOAWK as CLEAN_FASTA } from '../../../modules/nf-core/bioawk/main' + +/* +======================================================================================== + FUNCTIONS +======================================================================================== +*/ +// +// Extract prefix from bowtie index files +// +def extractFirstIndexPrefix(files_path) { + def files = files_path.listFiles() + if (files == null || files.length == 0) { + throw new Exception("The provided bowtie_index path doesn't contain any files.") + } + def index_prefix = '' + for (file_path in files) { + def file_name = file_path.getName() + if 
(file_name.endsWith(".1.ebwt") && !file_name.endsWith(".rev.1.ebwt")) { + index_prefix = file_name.substring(0, file_name.lastIndexOf(".1.ebwt")) + break + } + } + if (index_prefix == '') { + throw new Exception("Unable to extract the prefix from the Bowtie index files. No file with the '.1.ebwt' extension was found. Please ensure that the correct files are in the specified path.") + } + return index_prefix +} + + +workflow PREPARE_GENOME { + take: + val_fasta // file: /path/to/genome.fasta + val_bowtie_index // file or directory: /path/to/bowtie/ or /path/to/bowtie.tar.gz + val_mirtrace_species // string: Species for miRTrace + val_rrna // string: Path to the rRNA fasta file to be used as contamination database. + val_trna // string: Path to the tRNA fasta file to be used as contamination database. + val_cdna // string: Path to the cDNA fasta file to be used as contamination database. + val_ncrna // string: Path to the ncRNA fasta file to be used as contamination database. + val_pirna // string: Path to the piRNA fasta file to be used as contamination database. + val_other_contamination // string: Path to the additional fasta file to be used as contamination database. + val_fastp_known_mirna_adapters // string: Path to Fasta with known miRNA adapter sequences for adapter trimming + val_mirna_gtf // string: Path to GFF/GTF file with coordinates positions of precursor and miRNAs + + main: + ch_versions = Channel.empty() + + // Parameter channel handling + ch_fasta = val_fasta ? Channel.fromPath(val_fasta, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : Channel.empty() + ch_bowtie_index = val_bowtie_index ? Channel.fromPath(val_bowtie_index, checkIfExists: true).map{ it -> [ [id:""], it ] }.collect() : Channel.empty() + + bool_mirtrace_species = val_mirtrace_species ? true : false + bool_has_fasta = val_fasta ? true : false + + ch_mirtrace_species = val_mirtrace_species ? 
Channel.value(val_mirtrace_species) : Channel.empty() + mirna_gtf_from_species = val_mirtrace_species ? (val_mirtrace_species == 'hsa' ? "https://raw.githubusercontent.com/nf-core/test-datasets/smrnaseq/reference/hsa.gff3" : "https://mirbase.org/download/CURRENT/genomes/${val_mirtrace_species}.gff3") : false + ch_mirna_gtf = val_mirna_gtf ? Channel.fromPath(val_mirna_gtf, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : ( mirna_gtf_from_species ? Channel.fromPath(mirna_gtf_from_species, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : Channel.empty() ) + ch_mirna_adapters = params.with_umi ? [] : Channel.fromPath(val_fastp_known_mirna_adapters, checkIfExists: true).collect() + + ch_rrna = val_rrna ? Channel.fromPath(val_rrna, checkIfExists: true).map{ it -> [ [id:'rRNA'], it ] }.collect() : Channel.empty() + ch_trna = val_trna ? Channel.fromPath(val_trna, checkIfExists: true).map{ it -> [ [id:'tRNA'], it ] }.collect() : Channel.empty() + ch_cdna = val_cdna ? Channel.fromPath(val_cdna, checkIfExists: true).map{ it -> [ [id:'cDNA'], it ] }.collect() : Channel.empty() + ch_ncrna = val_ncrna ? Channel.fromPath(val_ncrna, checkIfExists: true).map{ it -> [ [id:'ncRNA'], it ] }.collect() : Channel.empty() + ch_pirna = val_pirna ? Channel.fromPath(val_pirna, checkIfExists: true).map{ it -> [ [id:'piRNA'], it ] }.collect() : Channel.empty() + ch_other_contamination = val_other_contamination ? Channel.fromPath(val_other_contamination, checkIfExists: true).map{ it -> [ [id:'other'], it ] }.collect() : Channel.empty() + + // even if bowtie index is specified, there still needs to be a fasta. + // without fasta, no genome analysis. + if(val_fasta) { + // Clean fasta (replace non-ATCGs with Ns, remove whitespaces from headers) + // Note: CLEAN_FASTA runs even when a bowtie_index is provided, as cleaning doesn't affect it, making regeneration unnecessary. 
+ CLEAN_FASTA ( ch_fasta ) + ch_versions = ch_versions.mix(CLEAN_FASTA.out.versions) + ch_fasta = CLEAN_FASTA.out.output + + //Prepare bowtie index, unless specified + //This needs to be done here as the index is used by GENOME_QUANT + if(val_bowtie_index) { + if (val_bowtie_index.endsWith(".tar.gz")) { + UNTAR_BOWTIE_INDEX ( ch_bowtie_index ) + ch_bowtie_index = UNTAR_BOWTIE_INDEX.out.untar + .map{ meta, index_dir -> + def index_prefix = extractFirstIndexPrefix(index_dir) + [[id:index_prefix], index_dir] + } + ch_versions = ch_versions.mix(UNTAR_BOWTIE_INDEX.out.versions) + } else { + ch_bowtie_index = Channel.fromPath(val_bowtie_index, checkIfExists: true) + .map{it -> + def index_prefix = extractFirstIndexPrefix(it) + [[id:index_prefix], it] + } + } + + } else { + + // Index FASTA with nf-core Bowtie1 + INDEX_GENOME ( CLEAN_FASTA.out.output ) + ch_versions = ch_versions.mix(INDEX_GENOME.out.versions) + + // Set channels: clean fasta and its index + ch_bowtie_index = INDEX_GENOME.out.index.collect() + } + } + + //Config checks + // Check optional parameters + if (!params.mirgenedb && !val_mirtrace_species) { + exit 1, "Reference species for miRTrace is not defined via the --mirtrace_species parameter." + } + + // Genome options + if (!params.mirgenedb) { + ch_reference_mature = params.mature ? Channel.fromPath(params.mature, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : { exit 1, "Mature miRNA fasta file not found: ${params.mature}" } + ch_reference_hairpin = params.hairpin ? Channel.fromPath(params.hairpin, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : { exit 1, "Hairpin miRNA fasta file not found: ${params.hairpin}" } + } else { + if (!params.mirgenedb_species) { + exit 1, "MirGeneDB species not set, please specify via the --mirgenedb_species parameter" + } + ch_reference_mature = params.mirgenedb_mature ? 
Channel.fromPath(params.mirgenedb_mature, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : { exit 1, "Mature miRNA fasta file not found via --mirgenedb_mature: ${params.mirgenedb_mature}" } + ch_reference_hairpin = params.mirgenedb_hairpin ? Channel.fromPath(params.mirgenedb_hairpin, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : { exit 1, "Hairpin miRNA fasta file not found via --mirgenedb_hairpin: ${params.mirgenedb_hairpin}" } + ch_mirna_gtf = params.mirgenedb_gff ? Channel.fromPath(params.mirgenedb_gff, checkIfExists: true).map{ it -> [ [id:it.baseName], it ] }.collect() : { exit 1, "MirGeneDB gff file not found via --mirgenedb_gff: ${params.mirgenedb_gff}"} + } + + emit: + fasta = ch_fasta // channel: [ val(meta), path(fasta) ] + has_fasta = bool_has_fasta // boolean + bowtie_index = ch_bowtie_index // channel: [ val(meta), path(directory_index) ] + versions = ch_versions // channel: [ versions.yml ] + mirtrace_species = ch_mirtrace_species // channel: [ val(string) ] + has_mirtrace_species = bool_mirtrace_species // boolean + reference_mature = ch_reference_mature // channel: [ val(meta), path(fasta) ] + reference_hairpin = ch_reference_hairpin // channel: [ val(meta), path(fasta) ] + mirna_gtf = ch_mirna_gtf // channel: [ val(meta), path(gtf) ] + rrna = ch_rrna // channel: [ val(meta), path(fasta) ] + trna = ch_trna // channel: [ val(meta), path(fasta) ] + cdna = ch_cdna // channel: [ val(meta), path(fasta) ] + ncrna = ch_ncrna // channel: [ val(meta), path(fasta) ] + pirna = ch_pirna // channel: [ val(meta), path(fasta) ] + other_contamination = ch_other_contamination // channel: [ val(meta), path(fasta) ] + mirna_adapters = ch_mirna_adapters // channel: [ val(meta), path(fasta) ] +} diff --git a/subworkflows/local/utils_nfcore_smrnaseq_pipeline/main.nf b/subworkflows/local/utils_nfcore_smrnaseq_pipeline/main.nf index 9d728769..c1494948 100644 --- a/subworkflows/local/utils_nfcore_smrnaseq_pipeline/main.nf +++ 
b/subworkflows/local/utils_nfcore_smrnaseq_pipeline/main.nf @@ -8,38 +8,41 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -include { UTILS_NFVALIDATION_PLUGIN } from '../../nf-core/utils_nfvalidation_plugin' -include { paramsSummaryMap } from 'plugin/nf-validation' -include { fromSamplesheet } from 'plugin/nf-validation' -include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' +include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin' +include { paramsSummaryMap } from 'plugin/nf-schema' +include { samplesheetToList } from 'plugin/nf-schema' include { completionEmail } from '../../nf-core/utils_nfcore_pipeline' include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' -include { dashedLine } from '../../nf-core/utils_nfcore_pipeline' -include { nfCoreLogo } from '../../nf-core/utils_nfcore_pipeline' include { imNotification } from '../../nf-core/utils_nfcore_pipeline' include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' -include { workflowCitation } from '../../nf-core/utils_nfcore_pipeline' +include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW TO INITIALISE PIPELINE -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_INITIALISATION { take: - version // boolean: Display version and exit - help // boolean: Display help text - validate_params // boolean: Boolean whether to validate parameters 
against the schema at runtime - monochrome_logs // boolean: Do not use coloured log outputs - nextflow_cli_args // array: List of positional nextflow CLI args - outdir // string: The output directory where the results will be saved - input // string: Path to input samplesheet + version // boolean: Display version and exit + validate_params // boolean: Boolean whether to validate parameters against the schema at runtime + monochrome_logs // boolean: Do not use coloured log outputs + nextflow_cli_args // array: List of positional nextflow CLI args + outdir // string: The output directory where the results will be saved + input // string: Path to input samplesheet + val_three_prime_adapter // string: Sequencing adapter sequence to use for trimming + val_phred_offset // string: The PHRED quality offset to be used for any input fastq files main: - ch_versions = Channel.empty() + //Channel definitions + ch_versions = Channel.empty() + ch_three_prime_adapter = Channel.value(val_three_prime_adapter) + ch_phred_offset = Channel.value(val_phred_offset) // // Print version and exit if required and dump pipeline parameters to JSON file @@ -50,29 +53,25 @@ workflow PIPELINE_INITIALISATION { outdir, workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1 ) - //Detect Protocol setting, set this early before help so help shows proper adapters etc pp - formatProtocol(params,log) + // // Validate parameters and generate parameter summary to stdout // - pre_help_text = nfCoreLogo(monochrome_logs) - post_help_text = '\n' + workflowCitation() + '\n' + dashedLine(monochrome_logs) - def String workflow_command = "nextflow run ${workflow.manifest.name} -profile --input samplesheet.csv --outdir " - UTILS_NFVALIDATION_PLUGIN ( - help, - workflow_command, - pre_help_text, - post_help_text, + UTILS_NFSCHEMA_PLUGIN ( + workflow, validate_params, - "nextflow_schema.json" + null ) + + // // Check config provided to the pipeline // UTILS_NFCORE_PIPELINE ( nextflow_cli_args ) + // 
// Custom validation for pipeline parameters // @@ -81,8 +80,9 @@ workflow PIPELINE_INITIALISATION { // // Create channel from input file provided through params.input // - Channel - .fromSamplesheet("input") + + ch_samplesheet = Channel + .fromList(samplesheetToList(params.input, "${projectDir}/assets/schema_input.json")) .map { meta, fastq_1, fastq_2 -> if (!fastq_2) { @@ -92,24 +92,27 @@ workflow PIPELINE_INITIALISATION { } } .groupTuple() - .map { - validateInputSamplesheet(it) + .map { samplesheet -> + validateInputSamplesheet(samplesheet) } .map { meta, fastqs -> return [ meta, fastqs.flatten() ] } - .set { ch_samplesheet } emit: - samplesheet = ch_samplesheet - versions = ch_versions + samplesheet = ch_samplesheet // channel: sample fastqs parsed from --input + versions = ch_versions // channel: [ versions.yml ] + three_prime_adapter = ch_three_prime_adapter // channel: [ val(string) ] + phred_offset = ch_phred_offset // channel: [ val(string) ] } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW FOR PIPELINE COMPLETION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_COMPLETION { @@ -124,7 +127,6 @@ workflow PIPELINE_COMPLETION { multiqc_report // string: Path to MultiQC report main: - summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") // @@ -132,27 +134,72 @@ workflow PIPELINE_COMPLETION { // workflow.onComplete { if (email || email_on_fail) { - completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs, 
multiqc_report.toList()) + completionEmail( + summary_params, + email, + email_on_fail, + plaintext_email, + outdir, + monochrome_logs, + multiqc_report.toList() + ) } completionSummary(monochrome_logs) - if (hook_url) { imNotification(summary_params, hook_url) } } + + workflow.onError { + log.error "Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting" + } } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Check and validate pipeline parameters // def validateInputParameters() { genomeExistsError() + + if (!params.mirgenedb && !params.mirtrace_species) { + error("Reference species for miRTrace is not defined via the `--mirtrace_species` parameter.") + } + + if (!params.mirgenedb) { + // Validate mature miRNA fasta file + if (!params.mature) { + error("Mature miRNA fasta file not found. Please specify using the `--mature` parameter.") + } + // Validate hairpin miRNA fasta file + if (!params.hairpin) { + error("Hairpin miRNA fasta file not found. Please specify using the `--hairpin` parameter.") + } + } else { + // Validate MirGeneDB species + if (!params.mirgenedb_species) { + error("You specified to be using MirGeneDB, but the MirGeneDB species is not set. Please specify using the `--mirgenedb_species` parameter.") + } + // Validate MirGeneDB mature miRNA fasta file + if (!params.mirgenedb_mature) { + error("You specified to be using MirGeneDB, but the mature miRNA fasta file is not found. 
Please provide the file using the `--mirgenedb_mature` parameter.") + } + // Validate MirGeneDB hairpin miRNA fasta file + if (!params.mirgenedb_hairpin) { + error("You specified to be using MirGeneDB, but the hairpin miRNA fasta file is not found. Please provide the file using the `--mirgenedb_hairpin` parameter.") + } + // Validate MirGeneDB GFF file + if (!params.mirgenedb_gff) { + error("You specified to be using MirGeneDB, but the GFF file is not found. Please provide the file using the `--mirgenedb_gff` parameter.") + } + } } // Validate channels from input samplesheet @@ -161,11 +208,19 @@ def validateInputSamplesheet(input) { def (metas, fastqs) = input[1..2] // Check that multiple runs of the same sample are of the same datatype i.e. single-end / paired-end - def endedness_ok = metas.collect{ it.single_end }.unique().size == 1 + def endedness_ok = metas.collect{ meta -> meta.single_end }.unique().size == 1 if (!endedness_ok) { error("Please check input samplesheet -> Multiple runs of a sample must be of the same datatype i.e. single-end or paired-end: ${metas[0].id}") } + // Emit a warning if `single_end` is false + if (metas[0].single_end == false) { + log.warn "Sample ${metas[0].id} is detected as paired-end reads (fastq_1 and fastq_2). The pipeline only handles SE data. Samplesheets with fastq_1 and fastq_2 are supported but fastq_2 is removed." + // Remove fastq_2 from the list and keep only fastq_1 + fastqs = fastqs.collect { it.take(1) } + metas[0].single_end = true + } + return [ metas[0], fastqs ] } // @@ -193,31 +248,55 @@ def genomeExistsError() { error(error_string) } } - // // Generate methods description for MultiQC // def toolCitationText() { - // TODO nf-core: Optionally add in-text citation tools to this list. // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "Tool (Foo et al. 
2023)" : "", // Uncomment function in methodsDescriptionText to render in MultiQC report def citation_text = [ "Tools used in the workflow included:", - "FastQC (Andrews 2010),", - "MultiQC (Ewels et al. 2016)", + !params.skip_fastqc ? "FastQC (Andrews 2010)," : "", + !params.skip_multiqc ? "MultiQC (Ewels et al. 2016)," : "", + !params.skip_fastp ? "fastp (Chen et al. 2018)," : "", + !params.skip_mirdeep ? "MiRDeep2 (Friedländer et al. 2012)," : "", + params.filter_contamination ? "Contamination filtering tools: BLAT (Kent 2002), Bowtie2 (Langmead and Salzberg 2012)," : "", + params.mirtrace_species ? "miRTrace (Kang et al. 2018)," : "", + "UMI-tools (Smith et al. 2017),", + "Bowtie (Langmead et al. 2009),", + "SAMtools (Li et al. 2009),", + "EdgeR (Robinson et al. 2010),", + "Mirtop (Desvignes et al. 2019),", + "SeqKit (Shen et al. 2016),", + "UMICollapse (Liu 2020),", + "Seqcluster (Pantano et al. 2011)", "." - ].join(' ').trim() + ].join(' ').trim() return citation_text } def toolBibliographyText() { - // TODO nf-core: Optionally add bibliographic entries to this list. // Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "
  • Author (2023) Pub name, Journal, DOI
  • " : "", // Uncomment function in methodsDescriptionText to render in MultiQC report def reference_text = [ "
  • Andrews S, (2010) FastQC, URL: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
  • ", - "
  • Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: /10.1093/bioinformatics/btw354
  • " + "
  • Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: /10.1093/bioinformatics/btw354
  • ", + "
  • Smith, T., Heger, A., & Sudbery, I. (2017). UMI-tools: Modelling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. PeerJ, 5, e8275. doi: 10.7717/peerj.8275
  • ", + "
  • Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884–i890. doi: 10.1093/bioinformatics/bty560
  • ", + "
  • Kang, W., Eldfjell, Y., Fromm, B., et al. (2018). miRTrace reveals the organismal origins of microRNA sequencing data. Genome Biology, 19(1), 213. doi: 10.1186/s13059-018-1588-9
  • ", + "
  • Langmead, B., Trapnell, C., Pop, M., & Salzberg, S. L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10(3), R25. doi: 10.1186/gb-2009-10-3-r25
  • ", + "
  • Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357–359. doi: 10.1038/nmeth.1923
  • ", + "
  • Li, H., Handsaker, B., Wysoker, A., et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. doi: 10.1093/bioinformatics/btp352
  • ", + "
  • Robinson, M. D., McCarthy, D. J., & Smyth, G. K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139–140. doi: 10.1093/bioinformatics/btp616
  • ", + "
  • Desvignes, T., Loher, P., Eilbeck, K., et al. (2019). Unification of miRNA and isomiR research: the mirGFF3 format and the mirtop API. Bioinformatics, 36(3), 698–703. doi: 10.1093/bioinformatics/btz675
  • ", + "
  • Friedländer, M. R., Mackowiak, S. D., Li, N., Chen, W., & Rajewsky, N. (2012). miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Research, 40(1), 37–52. doi: 10.1093/nar/gkr688
  • ", + "
  • Shen, W., Le, S., Li, Y., & Hu, F. (2016). SeqKit: A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE, 11(10), e0163962. doi: 10.1371/journal.pone.0163962
  • ", + "
  • Liu, D. (2020). Algorithms for efficiently collapsing reads with Unique Molecular Identifiers. PeerJ, 8, e9583. doi: 10.7717/peerj.9583
  • ", + "
  • Kent, W. J. (2002). BLAT—the BLAST-like alignment tool. Genome Research, 12(4), 656–664. doi: 10.1101/gr.229202
  • ", + "
  • Pantano, L., Estivill, X., & Martí, E. (2011). A non-biased framework for the annotation and classification of the non-miRNA small RNA transcriptome. Bioinformatics, 27(22), 3202–3203. doi: 10.1093/bioinformatics/btr527
  • ", + "
  • Bioawk, URL: https://github.com/lh3/bioawk
  • ", + "
  • csvtk, URL: https://github.com/shenwei356/csvtk
"
        ].join(' ').trim()
     return reference_text
@@ -230,16 +309,25 @@ def methodsDescriptionText(mqc_methods_yaml) {
     meta["manifest_map"] = workflow.manifest.toMap()
     // Pipeline DOI
-    meta["doi_text"] = meta.manifest_map.doi ? "(doi: ${meta.manifest_map.doi})" : ""
-    meta["nodoi_text"] = meta.manifest_map.doi ? "": "
  • If available, make sure to update the text to include the Zenodo DOI of version of the pipeline used.
"
+    if (meta.manifest_map.doi) {
+        // Using a loop to handle multiple DOIs
+        // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers
+        // Removing ` ` since the manifest.doi is a string and not a proper list
+        def temp_doi_ref = ""
+        def manifest_doi = meta.manifest_map.doi.tokenize(",")
+        manifest_doi.each { doi_ref ->
+            temp_doi_ref += "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), "
+        }
+        meta["doi_text"] = temp_doi_ref.substring(0, temp_doi_ref.length() - 2)
+    } else meta["doi_text"] = ""
+    meta["nodoi_text"] = meta.manifest_map.doi ? "" : "
  • If available, make sure to update the text to include the Zenodo DOI of version of the pipeline used.
"
     // Tool references
     meta["tool_citations"] = ""
     meta["tool_bibliography"] = ""
-    // TODO nf-core: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
-    // meta["tool_citations"] = toolCitationText().replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".")
-    // meta["tool_bibliography"] = toolBibliographyText()
+    meta["tool_citations"] = toolCitationText().replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".")
+    meta["tool_bibliography"] = toolBibliographyText()
     def methods_text = mqc_methods_yaml.text
@@ -250,44 +338,3 @@ def methodsDescriptionText(mqc_methods_yaml) {
     return description_html.toString()
 }
-/*
-* Format the protocol
-* Given the protocol parameter (params.protocol),
-* this function formats the protocol such that it is fit for the respective
-* subworkflow
-*/
-def formatProtocol(params,log) {
-
-    switch(params.protocol){
-        case 'illumina':
-            params.putIfAbsent("clip_r1", 0);
-            params.putIfAbsent("three_prime_clip_r1",0);
-            params.putIfAbsent("three_prime_adapter", "TGGAATTCTCGGGTGCCAAGG");
-            break
-        case 'nextflex':
-            params.putIfAbsent("clip_r1", 4);
-            params.putIfAbsent("three_prime_clip_r1", 4);
-            params.putIfAbsent("three_prime_adapter", "TGGAATTCTCGGGTGCCAAGG");
-            break
-        case 'qiaseq':
-            params.putIfAbsent("clip_r1",0);
-            params.putIfAbsent("three_prime_clip_r1",0);
-            params.putIfAbsent("three_prime_adapter","AACTGTAGGCACCATCAAT");
-            break
-        case 'cats':
-            params.putIfAbsent("clip_r1",3);
-            params.putIfAbsent("three_prime_clip_r1", 0);
-            params.putIfAbsent("three_prime_adapter", "AAAAAAAA");
-            break
-        case 'custom':
-            params.putIfAbsent("clip_r1", params.clip_r1)
-            params.putIfAbsent("three_prime_clip_r1", params.three_prime_clip_r1)
-        default:
-            log.warn "Please make sure to specify all required clipping and trimming parameters, otherwise only adapter detection will be performed."
- } - - log.warn "Running with Protocol ${params.protocol}" - log.warn "Therefore using Adapter: ${params.three_prime_adapter}" - log.warn "Clipping ${params.clip_r1} bases from R1" - log.warn "And clipping ${params.three_prime_clip_r1} bases from 3' end" - } diff --git a/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test b/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test index 75b5b934..821a3cf5 100644 --- a/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test +++ b/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test @@ -19,9 +19,6 @@ nextflow_workflow { test("test_bam_sort_stats_samtools_single_end") { when { - params { - outdir = "$outputDir" - } workflow { """ input[0] = Channel.of([ @@ -41,9 +38,11 @@ nextflow_workflow { { assert workflow.success}, { assert workflow.out.bam.get(0).get(1) ==~ ".*.bam"}, { assert workflow.out.bai.get(0).get(1) ==~ ".*.bai"}, - { assert snapshot(workflow.out.stats).match("test_bam_sort_stats_samtools_single_end_stats") }, - { assert snapshot(workflow.out.flagstat).match("test_bam_sort_stats_samtools_single_end_flagstats") }, - { assert snapshot(workflow.out.idxstats).match("test_bam_sort_stats_samtools_single_end_idxstats") } + { assert snapshot( + workflow.out.flagstat, + workflow.out.idxstats, + workflow.out.stats, + workflow.out.versions).match() } ) } } @@ -51,9 +50,6 @@ nextflow_workflow { test("test_bam_sort_stats_samtools_paired_end") { when { - params { - outdir = "$outputDir" - } workflow { """ input[0] = Channel.of([ @@ -73,9 +69,65 @@ nextflow_workflow { { assert workflow.success}, { assert workflow.out.bam.get(0).get(1) ==~ ".*.bam"}, { assert workflow.out.bai.get(0).get(1) ==~ ".*.bai"}, - { assert snapshot(workflow.out.stats).match("test_bam_sort_stats_samtools_paired_end_stats") }, - { assert snapshot(workflow.out.flagstat).match("test_bam_sort_stats_samtools_paired_end_flagstats") }, - { assert 
snapshot(workflow.out.idxstats).match("test_bam_sort_stats_samtools_paired_end_idxstats") } + { assert snapshot( + workflow.out.flagstat, + workflow.out.idxstats, + workflow.out.stats, + workflow.out.versions).match() } + ) + } + } + + test("test_bam_sort_stats_samtools_single_end - stub") { + + options "-stub" + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.bam', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("test_bam_sort_stats_samtools_paired_end - stub") { + + options "-stub" + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.bam', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot(workflow.out).match() } ) } } diff --git a/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test.snap b/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test.snap index 6645a092..c3c9a049 100644 --- a/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test.snap +++ b/subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test.snap @@ -1,5 +1,5 @@ { - "test_bam_sort_stats_samtools_paired_end_flagstats": { + "test_bam_sort_stats_samtools_single_end": { "content": [ [ [ @@ -7,53 +7,42 @@ "id": "test", "single_end": false }, - "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" + 
"test.flagstat:md5,2191911d72575a2358b08b1df64ccb53" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2023-10-22T20:25:03.687121177" - }, - "test_bam_sort_stats_samtools_paired_end_idxstats": { - "content": [ + ], [ [ { "id": "test", "single_end": false }, - "test.idxstats:md5,df60a8c8d6621100d05178c93fb053a2" + "test.idxstats:md5,613e048487662c694aa4a2f73ca96a20" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2023-10-22T20:25:03.709648916" - }, - "test_bam_sort_stats_samtools_single_end_stats": { - "content": [ + ], [ [ { "id": "test", "single_end": false }, - "test.stats:md5,cb0bf2b79de52fdf0c61e80efcdb0bb4" + "test.stats:md5,2fe0f3a7a1f07906061c1dadb62e0d05" ] + ], + [ + "versions.yml:md5,032c89015461d597fcc5a5331b619d0a", + "versions.yml:md5,416c5e4a374c61167db999b0e400e3cf", + "versions.yml:md5,721391fd94c417808516480c9451c6fd", + "versions.yml:md5,9e12386b91a2977d23292754e3bcb522", + "versions.yml:md5,c294c162aeb09862cc5e55b602647452" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:44:38.553256801" + "timestamp": "2024-09-16T08:26:24.36986488" }, - "test_bam_sort_stats_samtools_paired_end_stats": { + "test_bam_sort_stats_samtools_paired_end": { "content": [ [ [ @@ -61,50 +50,281 @@ "id": "test", "single_end": false }, - "test.stats:md5,d7796222a087b9bb97f631f1c21b9c95" + "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2024-02-13T16:44:48.355870518" - }, - "test_bam_sort_stats_samtools_single_end_idxstats": { - "content": [ + ], [ [ { "id": "test", "single_end": false }, - "test.idxstats:md5,613e048487662c694aa4a2f73ca96a20" + "test.idxstats:md5,df60a8c8d6621100d05178c93fb053a2" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2024-01-18T17:10:02.84631" 
- }, - "test_bam_sort_stats_samtools_single_end_flagstats": { - "content": [ + ], [ [ { "id": "test", "single_end": false }, - "test.flagstat:md5,2191911d72575a2358b08b1df64ccb53" + "test.stats:md5,ba007b13981dad548358c7c957d41e12" ] + ], + [ + "versions.yml:md5,032c89015461d597fcc5a5331b619d0a", + "versions.yml:md5,416c5e4a374c61167db999b0e400e3cf", + "versions.yml:md5,721391fd94c417808516480c9451c6fd", + "versions.yml:md5,9e12386b91a2977d23292754e3bcb522", + "versions.yml:md5,c294c162aeb09862cc5e55b602647452" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T08:26:38.683996037" + }, + "test_bam_sort_stats_samtools_single_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + "versions.yml:md5,032c89015461d597fcc5a5331b619d0a", + "versions.yml:md5,416c5e4a374c61167db999b0e400e3cf", + "versions.yml:md5,721391fd94c417808516480c9451c6fd", + "versions.yml:md5,9e12386b91a2977d23292754e3bcb522", + "versions.yml:md5,c294c162aeb09862cc5e55b602647452" + ], + "bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "csi": [ + + ], + "flagstat": [ + [ + { + "id": "test", + "single_end": false + }, 
+ "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,032c89015461d597fcc5a5331b619d0a", + "versions.yml:md5,416c5e4a374c61167db999b0e400e3cf", + "versions.yml:md5,721391fd94c417808516480c9451c6fd", + "versions.yml:md5,9e12386b91a2977d23292754e3bcb522", + "versions.yml:md5,c294c162aeb09862cc5e55b602647452" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T08:07:18.896460047" + }, + "test_bam_sort_stats_samtools_paired_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + "versions.yml:md5,032c89015461d597fcc5a5331b619d0a", + "versions.yml:md5,416c5e4a374c61167db999b0e400e3cf", + "versions.yml:md5,721391fd94c417808516480c9451c6fd", + "versions.yml:md5,9e12386b91a2977d23292754e3bcb522", + "versions.yml:md5,c294c162aeb09862cc5e55b602647452" + ], + "bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "csi": [ + + ], + 
"flagstat": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,032c89015461d597fcc5a5331b619d0a", + "versions.yml:md5,416c5e4a374c61167db999b0e400e3cf", + "versions.yml:md5,721391fd94c417808516480c9451c6fd", + "versions.yml:md5,9e12386b91a2977d23292754e3bcb522", + "versions.yml:md5,c294c162aeb09862cc5e55b602647452" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-01-18T17:10:02.829756" + "timestamp": "2024-09-16T08:07:39.028688324" } } \ No newline at end of file diff --git a/subworkflows/nf-core/bam_stats_mirna_mirtop/main.nf b/subworkflows/nf-core/bam_stats_mirna_mirtop/main.nf new file mode 100644 index 00000000..ac9393f4 --- /dev/null +++ b/subworkflows/nf-core/bam_stats_mirna_mirtop/main.nf @@ -0,0 +1,37 @@ +include { MIRTOP_GFF } from '../../../modules/nf-core/mirtop/gff' +include { MIRTOP_COUNTS } from '../../../modules/nf-core/mirtop/counts' +include { MIRTOP_EXPORT } from '../../../modules/nf-core/mirtop/export' +include { MIRTOP_STATS } from '../../../modules/nf-core/mirtop/stats' + + +workflow BAM_STATS_MIRNA_MIRTOP { + + take: + ch_bam // channel: [ val(meta), [ bam ] ] + ch_hairpin // channel: [ val(meta), [ hairpin ] ] + ch_gtf_species // channel: [ val(meta), [ gtf ], val(species) ] + + main: + + ch_versions = Channel.empty() + + MIRTOP_GFF ( ch_bam, ch_hairpin, ch_gtf_species ) + ch_versions = ch_versions.mix(MIRTOP_GFF.out.versions) + + MIRTOP_COUNTS ( MIRTOP_GFF.out.gff, ch_hairpin, ch_gtf_species ) + ch_versions = ch_versions.mix(MIRTOP_COUNTS.out.versions) + + MIRTOP_EXPORT ( MIRTOP_GFF.out.gff, ch_hairpin, ch_gtf_species ) + ch_versions = 
ch_versions.mix(MIRTOP_EXPORT.out.versions)
+
+    MIRTOP_STATS ( MIRTOP_GFF.out.gff )
+    ch_versions = ch_versions.mix(MIRTOP_STATS.out.versions)
+
+    emit:
+    isomirs = MIRTOP_EXPORT.out.tsv // channel: [ val(meta), [ tsv ] ]
+    counts = MIRTOP_COUNTS.out.tsv // channel: [ val(meta), [ tsv ] ]
+    stats_txt = MIRTOP_STATS.out.txt // channel: [ val(meta), [ txt ] ]
+    stats_log = MIRTOP_STATS.out.log // channel: [ val(meta), [ log ] ]
+    versions = ch_versions // channel: [ versions.yml ]
+}
+
diff --git a/subworkflows/nf-core/bam_stats_mirna_mirtop/meta.yml b/subworkflows/nf-core/bam_stats_mirna_mirtop/meta.yml
new file mode 100644
index 00000000..e7cf4e73
--- /dev/null
+++ b/subworkflows/nf-core/bam_stats_mirna_mirtop/meta.yml
@@ -0,0 +1,61 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json
+name: "bam_stats_mirna_mirtop"
+description: mirtop is a command line tool to annotate miRNAs and isomiRs and compute general statistics using the mirGFF3 format.
+keywords: + - miRNA + - isomirs + - bam + - stats +components: + - mirtop/gff + - mirtop/counts + - mirtop/export + - mirtop/stats +input: + - ch_bam: + type: file + description: | + The input channel containing the BAM/CRAM/SAM files + Structure: [ val(meta), path(bam) ] + pattern: "*.{bam/cram/sam}" + - ch_hairpin: + type: file + description: | + Input channel containing the hairpin fasta file + Structure: [ val(meta), path(fasta) ] + pattern: "*.{fasta,fa}" + - ch_gtf_species: + type: file + description: | + Input channel containing the species gtf and the name of species in miRbase format + Structure: [ val(meta), path(gtf), val(species)] + pattern: "*.{gtf}" +output: + - rawdata_tsv: + type: file + description: | + Channel containing isomiRs compatible files + Structure: [ val(meta), path(tsv) ] + pattern: "*.tsv" + - stats_txt: + type: file + description: | + Channel containing TXT file with a table with different statistics for each type of isomiRs: total counts, average counts, total sequences. 
+        Structure: [ val(meta), path(txt) ]
+      pattern: "*.txt"
+  - stats_log:
+      type: file
+      description: |
+        Channel containing log files in JSON format with the same information as stats_txt
+        Structure: [ val(meta), path(log) ]
+      pattern: "*.log"
+  - versions:
+      type: file
+      description: |
+        File containing software versions
+        Structure: [ path(versions.yml) ]
+      pattern: "versions.yml"
+authors:
+  - "@atrigila"
+maintainers:
+  - "@atrigila"
diff --git a/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/main.nf.test b/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/main.nf.test
new file mode 100644
index 00000000..604c747b
--- /dev/null
+++ b/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/main.nf.test
@@ -0,0 +1,50 @@
+
+nextflow_workflow {
+
+    name "Test Subworkflow BAM_STATS_MIRNA_MIRTOP"
+    script "../main.nf"
+    workflow "BAM_STATS_MIRNA_MIRTOP"
+
+    tag "subworkflows"
+    tag "subworkflows_nfcore"
+    tag "subworkflows/bam_stats_mirna_mirtop"
+    tag "mirtop"
+    tag "mirtop/gff"
+    tag "mirtop/export"
+    tag "mirtop/stats"
+    tag "mirtop/counts"
+
+    test("isomir - bam") {
+
+        when {
+            workflow {
+                """
+                input[0] = [
+                    [ id:'sample_sim_isomir_bam'], // meta map
+                    file("https://github.com/nf-core/test-datasets/raw/smrnaseq/nf-test_data/mirtop/SRX8195117_SRR11631013_seqcluster.bam", checkIfExists: true),
+                ]
+                input[1] = [
+                    [ id:'hairpin'], // meta map
+                    file("https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hairpin.fa", checkIfExists: true),
+                ]
+                input[2] = [
+                    [ id:'hsa' ], // meta map
+                    file("https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hsa.gff3", checkIfExists: true),
+                    "hsa"]
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert workflow.success},
+                { assert snapshot(
+                    workflow.out.isomirs,
+                    workflow.out.stats_txt,
+                    workflow.out.stats_log,
+                    workflow.out.versions).match() },
+                { assert path("${workflow.out.counts[0][1]}").exists() }
+            )
+        }
+    }
+}
diff --git a/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/main.nf.test.snap
b/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/main.nf.test.snap new file mode 100644 index 00000000..513f11be --- /dev/null +++ b/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/main.nf.test.snap @@ -0,0 +1,41 @@ +{ + "isomir - bam": { + "content": [ + [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "sample_sim_isomir_bam_mirtop_rawData.tsv:md5,4a065e444c54b0e816352bf1640594dd" + ] + ], + [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "mirtop_stats.txt:md5,3db542a532cf3f3c8b4efda134bd6202" + ] + ], + [ + [ + { + "id": "sample_sim_isomir_bam" + }, + "mirtop_stats.log:md5,676d0cedfb4770f7862bc0601c2a7f15" + ] + ], + [ + "versions.yml:md5,60caed2a2383e82ed06867a75d5a50e2", + "versions.yml:md5,87d473a6cb931a2032357c3272b3249d", + "versions.yml:md5,91c635a35016586e75b620c7f33e6461", + "versions.yml:md5,b314282e0db6b00dc8acaffe182f2b80" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-19T21:09:04.079276367" + } +} \ No newline at end of file diff --git a/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/nextflow.config b/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/nextflow.config new file mode 100644 index 00000000..83d77e20 --- /dev/null +++ b/subworkflows/nf-core/bam_stats_mirna_mirtop/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'MIRTOP_COUNTS' { + ext.args = '--add-extra' + } +} diff --git a/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test b/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test index c8b21f28..76e7a40a 100644 --- a/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test +++ b/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test @@ -15,9 +15,6 @@ nextflow_workflow { test("test_bam_stats_samtools_single_end") { when { - params { - outdir = "$outputDir" - } workflow { """ input[0] = Channel.of([ @@ -36,9 +33,11 @@ nextflow_workflow { then { assertAll( { assert workflow.success}, - { assert 
snapshot(workflow.out.stats).match("test_bam_stats_samtools_single_end_stats") }, - { assert snapshot(workflow.out.flagstat).match("test_bam_stats_samtools_single_end_flagstats") }, - { assert snapshot(workflow.out.idxstats).match("test_bam_stats_samtools_single_end_idxstats") } + { assert snapshot( + workflow.out.flagstat, + workflow.out.idxstats, + workflow.out.stats, + workflow.out.versions).match() } ) } } @@ -46,9 +45,6 @@ nextflow_workflow { test("test_bam_stats_samtools_paired_end") { when { - params { - outdir = "$outputDir" - } workflow { """ input[0] = Channel.of([ @@ -67,9 +63,11 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, - { assert snapshot(workflow.out.stats).match("test_bam_stats_samtools_paired_end_stats") }, - { assert snapshot(workflow.out.flagstat).match("test_bam_stats_samtools_paired_end_flagstats") }, - { assert snapshot(workflow.out.idxstats).match("test_bam_stats_samtools_paired_end_idxstats") } + { assert snapshot( + workflow.out.flagstat, + workflow.out.idxstats, + workflow.out.stats, + workflow.out.versions).match() } ) } } @@ -77,9 +75,6 @@ nextflow_workflow { test("test_bam_stats_samtools_paired_end_cram") { when { - params { - outdir = "$outputDir" - } workflow { """ input[0] = Channel.of([ @@ -98,11 +93,96 @@ nextflow_workflow { then { assertAll( { assert workflow.success}, - { assert snapshot(workflow.out.stats).match("test_bam_stats_samtools_paired_end_cram_stats") }, - { assert snapshot(workflow.out.flagstat).match("test_bam_stats_samtools_paired_end_cram_flagstats") }, - { assert snapshot(workflow.out.idxstats).match("test_bam_stats_samtools_paired_end_cram_idxstats") } + { assert snapshot( + workflow.out.flagstat, + workflow.out.idxstats, + workflow.out.stats, + workflow.out.versions).match() } ) } } + test ("test_bam_stats_samtools_single_end - stub") { + + options "-stub" + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam.bai', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("test_bam_stats_samtools_paired_end - stub") { + + options "-stub" + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("test_bam_stats_samtools_paired_end_cram - stub") { + + options "-stub" + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot(workflow.out).match() } + ) + } + } } 
diff --git a/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test.snap b/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test.snap index bf0b0c69..8ca22526 100644 --- a/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test.snap +++ b/subworkflows/nf-core/bam_stats_samtools/tests/main.nf.test.snap @@ -1,59 +1,230 @@ { - "test_bam_stats_samtools_paired_end_cram_flagstats": { + "test_bam_stats_samtools_paired_end - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": false - }, - "test.flagstat:md5,a53f3d26e2e9851f7d528442bbfe9781" + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" + ], + "flagstat": [ + [ + { + "id": "test", + "single_end": true + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": true + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "stats": [ + [ + { + "id": "test", + "single_end": true + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2023-11-06T09:31:26.194017574" + "timestamp": "2024-09-16T08:08:35.660286921" }, - "test_bam_stats_samtools_paired_end_stats": { + "test_bam_stats_samtools_single_end - stub": { "content": 
[ - [ - [ - { - "id": "test", - "single_end": true - }, - "test.stats:md5,ddaf8f33fe9c1ebe9b06933213aec8ed" + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" + ], + "flagstat": [ + [ + { + "id": "test", + "single_end": true + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": true + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "stats": [ + [ + { + "id": "test", + "single_end": true + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:45:06.230091746" + "timestamp": "2024-09-16T08:08:24.220305512" }, - "test_bam_stats_samtools_paired_end_flagstats": { + "test_bam_stats_samtools_paired_end_cram - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": true - }, - "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" + ], + "flagstat": [ + [ + { + "id": "test", + "single_end": false + }, + "test.flagstat:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "idxstats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-01-18T17:17:27.717482" + "timestamp": "2024-09-16T08:08:54.206770141" }, - "test_bam_stats_samtools_single_end_flagstats": { + "test_bam_stats_samtools_single_end": { "content": [ [ [ @@ -63,52 +234,48 @@ }, "test.flagstat:md5,2191911d72575a2358b08b1df64ccb53" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2023-11-06T09:26:10.340046381" - }, - "test_bam_stats_samtools_paired_end_cram_idxstats": { - "content": [ + ], [ [ { "id": "test", - "single_end": false + "single_end": true }, - "test.idxstats:md5,e179601fa7b8ebce81ac3765206f6c15" + "test.idxstats:md5,613e048487662c694aa4a2f73ca96a20" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2023-11-06T09:31:26.207052003" - }, - "test_bam_stats_samtools_single_end_stats": { - "content": [ + ], [ [ { "id": "test", "single_end": true }, - "test.stats:md5,dc178e1a4956043aba8abc83e203521b" + "test.stats:md5,291bb2393ec947140d12d42c2795b222" ] + ], + [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + 
"versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:44:57.442208382" + "timestamp": "2024-09-16T08:07:49.731645858" }, - "test_bam_stats_samtools_paired_end_idxstats": { + "test_bam_stats_samtools_paired_end": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.flagstat:md5,4f7ffd1e6a5e85524d443209ac97d783" + ] + ], [ [ { @@ -117,33 +284,29 @@ }, "test.idxstats:md5,df60a8c8d6621100d05178c93fb053a2" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" - }, - "timestamp": "2024-01-18T17:17:27.726719" - }, - "test_bam_stats_samtools_single_end_idxstats": { - "content": [ + ], [ [ { "id": "test", "single_end": true }, - "test.idxstats:md5,613e048487662c694aa4a2f73ca96a20" + "test.stats:md5,8140d69cdedd77570ca1d7618a744e16" ] + ], + [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + "versions.yml:md5,cb889532237a2f3d813978ac14a12d51" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2023-11-06T09:26:10.349439801" + "timestamp": "2024-09-16T08:08:01.421996172" }, - "test_bam_stats_samtools_paired_end_cram_stats": { + "test_bam_stats_samtools_paired_end_cram": { "content": [ [ [ @@ -151,14 +314,37 @@ "id": "test", "single_end": false }, - "test.stats:md5,d3345c4887f4a9ea4f7f56405b495db0" + "test.flagstat:md5,a53f3d26e2e9851f7d528442bbfe9781" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.idxstats:md5,e179601fa7b8ebce81ac3765206f6c15" ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,1622856127bafd6cdbadee9cd64ec9b7" + ] + ], + [ + "versions.yml:md5,73c55059ed478cd2f9cd93dd3185da3a", + "versions.yml:md5,80d8653e01575b3c381d87073f672fb5", + 
"versions.yml:md5,cb889532237a2f3d813978ac14a12d51" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-13T16:45:14.997164209" + "timestamp": "2024-09-16T08:08:12.640915756" } } \ No newline at end of file diff --git a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/main.nf b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/main.nf index 764ce013..ab6cbb32 100644 --- a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/main.nf +++ b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/main.nf @@ -12,12 +12,18 @@ include { FASTP } from '../../../modules/nf-core/fastp/main' // import groovy.json.JsonSlurper -def getFastpReadsAfterFiltering(json_file) { +def getFastpReadsAfterFiltering(json_file, min_num_reads) { + + if ( workflow.stubRun ) { return min_num_reads } + def Map json = (Map) new JsonSlurper().parseText(json_file.text).get('summary') return json['after_filtering']['total_reads'].toLong() } def getFastpAdapterSequence(json_file){ + + if ( workflow.stubRun ) { return "" } + def Map json = (Map) new JsonSlurper().parseText(json_file.text) try{ adapter = json['adapter_cutting']['read1_adapter_sequence'] @@ -91,6 +97,7 @@ workflow FASTQ_FASTQC_UMITOOLS_FASTP { FASTP ( umi_reads, adapter_fasta, + false, // don't want to set discard_trimmed_pass, else there will be no reads output save_trimmed_fail, save_merged ) @@ -108,7 +115,7 @@ workflow FASTQ_FASTQC_UMITOOLS_FASTP { .out .reads .join(trim_json) - .map { meta, reads, json -> [ meta, reads, getFastpReadsAfterFiltering(json) ] } + .map { meta, reads, json -> [ meta, reads, getFastpReadsAfterFiltering(json, min_trimmed_reads.toLong()) ] } .set { ch_num_trimmed_reads } ch_num_trimmed_reads diff --git a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test index 961b5b4f..48ba5f48 100644 --- 
a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test +++ b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test @@ -4,7 +4,7 @@ nextflow_workflow { script "../main.nf" workflow "FASTQ_FASTQC_UMITOOLS_FASTP" config './nextflow.config' - + tag "subworkflows" tag "subworkflows_nfcore" tag "subworkflows/fastq_fastqc_umitools_fastp" @@ -31,7 +31,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -52,24 +52,23 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -91,7 +90,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end: false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -112,24 
+111,23 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, + { assert !workflow.out.fastqc_raw_html }, + { assert !workflow.out.fastqc_raw_zip }, + { assert !workflow.out.fastqc_trim_html }, + { assert !workflow.out.fastqc_trim_zip }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert !workflow.out.fastqc_raw_html }, - { assert !workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert !workflow.out.fastqc_trim_html }, - { assert !workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -151,7 +149,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -172,23 +170,22 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert 
workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -211,7 +208,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -232,24 +229,23 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -271,7 +267,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -292,24 +288,23 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert 
workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -331,7 +326,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -352,27 +347,24 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert !workflow.out.fastqc_trim_html }, + { assert !workflow.out.fastqc_trim_zip }, + { assert !workflow.out.trim_html }, + { assert !workflow.out.trim_log }, { assert snapshot( + // If we skip trimming then input is output, so not snapshotting + workflow.out.adapter_seq, workflow.out.reads.get(0).get(0), // Reads meta map - // Because the input file is passed to the output file, we have to do check the filename only - file(workflow.out.reads.get(0).get(1).get(0)).name, - file(workflow.out.reads.get(0).get(1).get(1)).name, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - 
workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert !workflow.out.trim_html }, - { assert !workflow.out.trim_log }, - { assert !workflow.out.fastqc_trim_html }, - { assert !workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -396,7 +388,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -417,24 +409,23 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -456,7 +447,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ 
-477,24 +468,23 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, - - { assert workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + ).match() + } ) } } @@ -517,7 +507,7 @@ nextflow_workflow { input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map - [ + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] @@ -538,24 +528,445 @@ nextflow_workflow { then { assertAll( { assert workflow.success }, + { assert workflow.out.fastqc_raw_html }, + { assert workflow.out.fastqc_raw_zip }, + { assert workflow.out.fastqc_trim_html }, + { assert workflow.out.fastqc_trim_zip }, + { assert workflow.out.trim_html }, + { assert workflow.out.trim_log }, { assert snapshot( + workflow.out.adapter_seq, workflow.out.reads, - workflow.out.umi_log, workflow.out.trim_json, + workflow.out.trim_read_count, workflow.out.trim_reads_fail, workflow.out.trim_reads_merged, - workflow.out.adapter_seq, - workflow.out.trim_read_count, + workflow.out.umi_log, workflow.out.versions - ).match() - }, + ).match() + } + ) + } + } - { assert 
workflow.out.fastqc_raw_html }, - { assert workflow.out.fastqc_raw_zip }, - { assert workflow.out.trim_html }, - { assert workflow.out.trim_log }, - { assert workflow.out.fastqc_trim_html }, - { assert workflow.out.fastqc_trim_zip } + test("sarscov2 paired-end [fastq] - stub") { + + options '-stub' + + when { + workflow { + """ + skip_fastqc = false + with_umi = false + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("skip_fastqc - stub") { + + options "-stub" + + when { + workflow { + """ + skip_fastqc = true + with_umi = false + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end: false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = 
adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("with_umi - stub") { + + options "-stub" + + when { + workflow { + """ + skip_fastqc = false + with_umi = true + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + + test("skip_umi_extract - stub") { + + options "-stub" + + when { + workflow { + """ + skip_fastqc = false + with_umi = true + skip_umi_extract = true + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] 
= save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("umi_discard_read = 2 - stub") { + + options "-stub" + + when { + workflow { + """ + skip_fastqc = false + with_umi = true + skip_umi_extract = true + umi_discard_read = 2 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("skip_trimming - stub") { + + options "-stub" + + when { + workflow { + """ + skip_fastqc = false + with_umi = false + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = true + adapter_fasta = [] + save_trimmed_fail = false + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = 
save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.adapter_seq, + workflow.out.fastqc_raw_html, + workflow.out.fastqc_raw_zip, + workflow.out.fastqc_trim_html, + workflow.out.fastqc_trim_zip, + workflow.out.trim_html, + workflow.out.trim_json, + workflow.out.trim_log, + workflow.out.trim_read_count, + workflow.out.trim_reads_fail, + workflow.out.trim_reads_merged, + workflow.out.umi_log, + workflow.out.versions).match() } + ) + } + } + + test("save_trimmed_fail - stub") { + + options "-stub" + + config './nextflow.save_trimmed.config' + + when { + workflow { + """ + skip_fastqc = false + with_umi = false + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = true + save_merged = false + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("save_merged - stub") { + + options "-stub" + + when { + workflow { + """ + skip_fastqc = false + with_umi = false + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = true + min_trimmed_reads = 1 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } + ) + } + } + + test("min_trimmed_reads = 26 - stub") { + // Subworkflow should stop after FASTP which trims down to 25 reads + + options "-stub" + + when { + workflow { + """ + skip_fastqc = false + with_umi = false + skip_umi_extract = false + umi_discard_read = 1 + skip_trimming = false + adapter_fasta = [] + save_trimmed_fail = false + save_merged = true + min_trimmed_reads = 26 + + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ]) + input[1] = skip_fastqc + input[2] = with_umi + input[3] = skip_umi_extract + input[4] = umi_discard_read + input[5] = skip_trimming + input[6] = adapter_fasta + input[7] = save_trimmed_fail + input[8] = save_merged + input[9] = min_trimmed_reads + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(workflow.out).match() } ) } } diff --git a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test.snap b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test.snap index 3e11d9ec..e7d1f51e 100644 --- a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test.snap +++ 
b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/main.nf.test.snap @@ -7,14 +7,8 @@ "id": "test", "single_end": false }, - [ - "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", - "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" - ] + "unspecified" ] - ], - [ - ], [ [ @@ -22,14 +16,11 @@ "id": "test", "single_end": false }, - "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + [ + "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", + "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" + ] ] - ], - [ - - ], - [ - ], [ [ @@ -37,7 +28,7 @@ "id": "test", "single_end": false }, - "unspecified" + "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" ] ], [ @@ -48,16 +39,25 @@ }, 198 ] + ], + [ + + ], + [ + + ], + [ + ], [ "versions.yml:md5,85bd0117e5778fff18e3920972a296ad" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T16:53:49.315194" + "timestamp": "2024-07-22T16:56:01.933832" }, "save_trimmed_fail": { "content": [ @@ -67,14 +67,8 @@ "id": "test", "single_end": false }, - [ - "test_1.fastp.fastq.gz:md5,6ff32a64c5188b9a9192be1398c262c7", - "test_2.fastp.fastq.gz:md5,db0cb7c9977e94ac2b4b446ebd017a8a" - ] + "unspecified" ] - ], - [ - ], [ [ @@ -82,7 +76,10 @@ "id": "test", "single_end": false }, - "test.fastp.json:md5,4c3268ddb50ea5b33125984776aa3519" + [ + "test_1.fastp.fastq.gz:md5,6ff32a64c5188b9a9192be1398c262c7", + "test_2.fastp.fastq.gz:md5,db0cb7c9977e94ac2b4b446ebd017a8a" + ] ] ], [ @@ -91,15 +88,8 @@ "id": "test", "single_end": false }, - [ - "test.paired.fail.fastq.gz:md5,409b687c734cedd7a1fec14d316e1366", - "test_1.fail.fastq.gz:md5,4f273cf3159c13f79e8ffae12f5661f6", - "test_2.fail.fastq.gz:md5,f97b9edefb5649aab661fbc9e71fc995" - ] + "test.fastp.json:md5,4c3268ddb50ea5b33125984776aa3519" ] - ], - [ - ], [ [ @@ -107,7 +97,7 @@ "id": "test", "single_end": false }, - "unspecified" + 162 ] ], [ @@ -116,8 +106,18 @@ "id": 
"test", "single_end": false }, - 162 + [ + "test.paired.fail.fastq.gz:md5,409b687c734cedd7a1fec14d316e1366", + "test_1.fail.fastq.gz:md5,4f273cf3159c13f79e8ffae12f5661f6", + "test_2.fail.fastq.gz:md5,f97b9edefb5649aab661fbc9e71fc995" + ] ] + ], + [ + + ], + [ + ], [ "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", @@ -126,10 +126,10 @@ ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T16:51:45.34934" + "timestamp": "2024-07-22T16:57:38.736" }, "skip_umi_extract": { "content": [ @@ -139,14 +139,8 @@ "id": "test", "single_end": false }, - [ - "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", - "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" - ] + "unspecified" ] - ], - [ - ], [ [ @@ -154,14 +148,11 @@ "id": "test", "single_end": false }, - "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + [ + "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", + "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" + ] ] - ], - [ - - ], - [ - ], [ [ @@ -169,7 +160,7 @@ "id": "test", "single_end": false }, - "unspecified" + "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" ] ], [ @@ -180,6 +171,15 @@ }, 198 ] + ], + [ + + ], + [ + + ], + [ + ], [ "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", @@ -188,13 +188,22 @@ ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T12:07:40.34249" + "timestamp": "2024-07-22T16:56:47.905105" }, "umi_discard_read = 2": { "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "unspecified" + ] + ], [ [ { @@ -208,7 +217,13 @@ ] ], [ - + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + ] ], [ [ @@ -216,7 +231,7 @@ "id": "test", "single_end": false }, - "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + 198 ] ], [ @@ -224,6 +239,251 @@ ], [ + ], + 
[ + + ], + [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:57:05.436744" + }, + "umi_discard_read = 2 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + 
] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:59:27.273892" + }, + "skip_trimming - stub": { + "content": [ + [ + ], [ [ @@ -231,7 +491,7 @@ "id": "test", "single_end": false }, - "unspecified" + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], [ @@ -240,20 +500,45 @@ "id": "test", "single_end": false }, - 198 + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], [ - "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", - "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + 
+ ], + [ "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T12:08:24.141938" + "timestamp": "2024-07-22T16:59:39.247758" }, "save_merged": { "content": [ @@ -263,14 +548,8 @@ "id": "test", "single_end": false }, - [ - "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", - "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" - ] + "unspecified" ] - ], - [ - ], [ [ @@ -278,11 +557,11 @@ "id": "test", "single_end": false }, - "test.fastp.json:md5,b712fd68ed0322f4bec49ff2a5237fcc" + [ + "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", + "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" + ] ] - ], - [ - ], [ [ @@ -290,7 +569,7 @@ "id": "test", "single_end": false }, - "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" + "test.fastp.json:md5,b712fd68ed0322f4bec49ff2a5237fcc" ] ], [ @@ -299,8 +578,11 @@ "id": "test", "single_end": false }, - "unspecified" + 75 ] + ], + [ + ], [ [ @@ -308,8 +590,11 @@ "id": "test", "single_end": false }, - 75 + "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" ] + ], + [ + ], [ "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", @@ -318,22 +603,20 @@ ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T12:10:18.546963" + "timestamp": "2024-07-22T16:57:57.472342" }, "skip_trimming": { "content": [ + [ + + ], { "id": "test", "single_end": false }, - "test_1.fastq.gz", - "test_2.fastq.gz", - [ - - ], [ ], @@ -354,72 +637,67 @@ ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-19T15:49:26.574759" + "timestamp": "2024-07-22T16:57:19.875543" }, - "sarscov2 paired-end [fastq]": { + "with_umi": { "content": [ [ [ { "id": "test", - "single_end": false + "single_end": true }, - [ - 
"test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", - "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" - ] + "" ] - ], - [ - ], [ [ { "id": "test", - "single_end": false + "single_end": true }, - "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + "test.fastp.fastq.gz:md5,ba8c6c3a7ce718d9a2c5857e2edf53bc" ] - ], - [ - - ], - [ - ], [ [ { "id": "test", - "single_end": false + "single_end": true }, - "unspecified" + "test.fastp.json:md5,d39c5c6d9a2e35fb60d26ced46569af6" ] ], [ [ { "id": "test", - "single_end": false + "single_end": true }, - 198 + 99 ] ], [ + + ], + [ + + ], + [ + "versions.yml:md5,01f264f78de3c6d893c449cc6d3cd721", "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T16:53:39.139038" + "timestamp": "2024-07-22T16:56:26.778625" }, "min_trimmed_reads = 26": { "content": [ @@ -429,14 +707,8 @@ "id": "test", "single_end": false }, - [ - "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", - "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" - ] + "unspecified" ] - ], - [ - ], [ [ @@ -444,11 +716,11 @@ "id": "test", "single_end": false }, - "test.fastp.json:md5,b712fd68ed0322f4bec49ff2a5237fcc" + [ + "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", + "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" + ] ] - ], - [ - ], [ [ @@ -456,7 +728,7 @@ "id": "test", "single_end": false }, - "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" + "test.fastp.json:md5,b712fd68ed0322f4bec49ff2a5237fcc" ] ], [ @@ -465,8 +737,11 @@ "id": "test", "single_end": false }, - "unspecified" + 75 ] + ], + [ + ], [ [ @@ -474,8 +749,11 @@ "id": "test", "single_end": false }, - 75 + "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" ] + ], + [ + ], [ 
"versions.yml:md5,85bd0117e5778fff18e3920972a296ad", @@ -484,66 +762,1646 @@ ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T11:52:23.849945" + "timestamp": "2024-07-22T16:58:16.36697" }, - "with_umi": { + "min_trimmed_reads = 26 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 26 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + 
"fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 26 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T17:00:16.524361" + }, + "with_umi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": true + }, + 1 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": true + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": true + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,01f264f78de3c6d893c449cc6d3cd721", + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.umi_extract.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": true + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": true + }, + "" + ] + ], + "fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": true + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + 
"single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": true + }, + 1 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + + ], + "umi_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.umi_extract.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,01f264f78de3c6d893c449cc6d3cd721", + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:58:56.42517" + }, + "skip_fastqc - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "11": [ + + ], + "12": [ + + ], + "13": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad" + ], + "2": [ + + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "fastqc_raw_html": [ + + ], + "fastqc_raw_zip": [ + + ], + "fastqc_trim_html": [ + + ], + "fastqc_trim_zip": [ + + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:58:41.207281" + }, + "save_merged - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + 
"versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + 
"trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T17:00:03.695409" + }, + "sarscov2 paired-end [fastq]": { "content": [ [ [ { "id": "test", - "single_end": true + "single_end": false }, - "test.fastp.fastq.gz:md5,ba8c6c3a7ce718d9a2c5857e2edf53bc" + "unspecified" ] ], [ [ { "id": "test", - "single_end": true + "single_end": false }, - "test.fastp.json:md5,d39c5c6d9a2e35fb60d26ced46569af6" + [ + "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", + "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" + ] ] - ], - [ - - ], - [ - ], [ [ { "id": "test", - "single_end": true + "single_end": false }, - "" + "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" ] ], [ [ { "id": "test", - "single_end": true + "single_end": false }, - 99 + 198 ] ], [ - "versions.yml:md5,01f264f78de3c6d893c449cc6d3cd721", + + ], + [ + + ], + [ + + ], + [ "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:55:50.614571" + }, + "sarscov2 paired-end [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + 
"test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": 
[ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:58:29.296468" + }, + "save_trimmed_fail - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + 
}, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_1.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "9": [ + + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "trim_reads_fail": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_1.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_reads_merged": [ + + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T16:59:51.615894" + }, + "skip_umi_extract - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "11": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + 
"6": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + + ], + "9": [ + + ], + "adapter_seq": [ + [ + { + "id": "test", + "single_end": false + }, + "" + ] + ], + "fastqc_raw_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_raw_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fastqc_trim_zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "trim_html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "trim_read_count": [ + [ + { + "id": "test", + "single_end": false + }, + 1 + ] + ], + "trim_reads_fail": [ + + ], + "trim_reads_merged": [ + + ], + "umi_log": [ + + ], + "versions": [ + "versions.yml:md5,85bd0117e5778fff18e3920972a296ad", + "versions.yml:md5,c50aa59475ab901bc6f9a2cf7b1a14e0", + "versions.yml:md5,f3dcaae948e8eed92b4a5557b4c6668e" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-03-18T17:31:09.193212" + "timestamp": 
"2024-07-22T16:59:12.592278" } } \ No newline at end of file diff --git a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.config b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.config index 12f7b257..0174cae5 100644 --- a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.config +++ b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.config @@ -7,5 +7,5 @@ process { withName: UMICOLLAPSE { ext.prefix = { "${meta.id}.dedup" } } - -} \ No newline at end of file + +} diff --git a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.save_trimmed.config b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.save_trimmed.config index 2430e9d5..21207add 100644 --- a/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.save_trimmed.config +++ b/subworkflows/nf-core/fastq_fastqc_umitools_fastp/tests/nextflow.save_trimmed.config @@ -3,4 +3,4 @@ process { withName: FASTP { ext.args = "-e 30" } -} \ No newline at end of file +} diff --git a/subworkflows/nf-core/fastq_find_mirna_mirdeep2/main.nf b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/main.nf new file mode 100644 index 00000000..f8c3da93 --- /dev/null +++ b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/main.nf @@ -0,0 +1,33 @@ +include { SEQKIT_FQ2FA } from '../../../modules/nf-core/seqkit/fq2fa/main' +include { SEQKIT_REPLACE } from '../../../modules/nf-core/seqkit/replace/main' +include { MIRDEEP2_MAPPER } from '../../../modules/nf-core/mirdeep2/mapper/main' +include { MIRDEEP2_MIRDEEP2 } from '../../../modules/nf-core/mirdeep2/mirdeep2/main' + +workflow FASTQ_FIND_MIRNA_MIRDEEP2 { + + take: + ch_reads // channel: [ val(meta), fastq ] + ch_genome_fasta // channel: [ val(meta), genome_fasta ] + ch_bowtie_index // channel: [ val(meta), index ] + ch_mirna_mature_hairpin // channel: [ val(meta), mature_mirna, hairpin_mirna ] + + main: + + ch_versions = Channel.empty() + + SEQKIT_FQ2FA ( ch_reads ) + ch_versions = 
ch_versions.mix(SEQKIT_FQ2FA.out.versions) + + SEQKIT_REPLACE ( SEQKIT_FQ2FA.out.fasta ) + ch_versions = ch_versions.mix(SEQKIT_REPLACE.out.versions) + + MIRDEEP2_MAPPER ( SEQKIT_REPLACE.out.fastx, ch_bowtie_index ) + ch_versions = ch_versions.mix(MIRDEEP2_MAPPER.out.versions) + + MIRDEEP2_MIRDEEP2 ( MIRDEEP2_MAPPER.out.outputs, ch_genome_fasta, ch_mirna_mature_hairpin ) + ch_versions = ch_versions.mix(MIRDEEP2_MIRDEEP2.out.versions) + + emit: + outputs = MIRDEEP2_MIRDEEP2.out.outputs // channel: [ val(meta), [ bed, csv, html ] ] + versions = ch_versions // channel: [ versions.yml ] +} diff --git a/subworkflows/nf-core/fastq_find_mirna_mirdeep2/meta.yml b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/meta.yml new file mode 100644 index 00000000..22a475b3 --- /dev/null +++ b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/meta.yml @@ -0,0 +1,51 @@ +name: "fastq_find_mirna_mirdeep2" +description: | + This subworkflow identifies miRNAs from FASTQ files using miRDeep2. The workflow converts FASTQ to FASTA, processes and replaces any whitespace in sequence IDs, builds a Bowtie index of the genome, and then maps reads using miRDeep2 mapper before identifying known and novel miRNAs. +keywords: + - miRNA + - FASTQ + - FASTA + - Bowtie + - miRDeep2 +components: + - seqkit/fq2fa + - seqkit/replace + - bowtie/build + - mirdeep2/mapper + - mirdeep2/mirdeep2 +input: + - ch_reads: + type: file + description: | + The input channel containing the FASTQ files to process and identify miRNAs. + Structure: [ val(meta), path(fastq) ] + pattern: "*.fastq.gz" + - ch_genome_fasta: + type: file + description: | + The input channel containing the genome FASTA files used to build the Bowtie index. + Structure: [ val(meta), path(fasta) ] + pattern: "*.fa" + - ch_mirna_mature_hairpin: + type: file + description: | + The input channel containing the mature and hairpin miRNA sequences for miRNA identification. 
+ Structure: [ val(meta), path(mature_fasta), path(hairpin_fasta) ] + pattern: "*.fa" +output: + - outputs: + type: file + description: | + The output channel containing the BED, CSV, and HTML files with the identified miRNAs. + Structure: [ val(meta), path(bed), path(csv), path(html) ] + pattern: "*.{bed,csv,html}" + - versions: + type: file + description: | + File containing software versions + Structure: [ path(versions.yml) ] + pattern: "versions.yml" +authors: + - "@atrigila" +maintainers: + - "@atrigila" diff --git a/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/main.nf.test b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/main.nf.test new file mode 100644 index 00000000..13c10e52 --- /dev/null +++ b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/main.nf.test @@ -0,0 +1,80 @@ +nextflow_workflow { + + name "Test Subworkflow FASTQ_FIND_MIRNA_MIRDEEP2" + script "../main.nf" + workflow "FASTQ_FIND_MIRNA_MIRDEEP2" + + tag "subworkflows" + tag "subworkflows_nfcore" + tag "subworkflows/fastq_find_mirna_mirdeep2" + tag "mirdeep2/mapper" + tag "mirdeep2/mirdeep2" + tag "seqkit/fq2fa" + tag "seqkit/replace" + tag "bowtie/build" + + + test("smrnaseq - fasta - single_end") { + config "./nextflow.config" + + setup { + run("SEQKIT_REPLACE") { + script "modules/nf-core/seqkit/replace/main.nf" + config "./nextflow.config" + + process { + """ + input[0] = [ + [ id:'genome' ], // meta map + file('https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa', checkIfExists: true) + ] + """ + } + } + + run("BOWTIE_BUILD") { + script "modules/nf-core/bowtie/build/main.nf" + process { + """ + input[0] = SEQKIT_REPLACE.out.fastx + """ + } + } + } + + when { + workflow { + """ + input[0] = [ + [ id:'small_Clone1_N1', single_end:false ], // meta map + file('https://github.com/nf-core/test-datasets/raw/smrnaseq/testdata/trimmed/small_Clone1_N1.fastp.fastq.gz', checkIfExists: true) + ] + + input[1] = SEQKIT_REPLACE.out.fastx + + input[2] = 
BOWTIE_BUILD.out.index + + input[3] = [ + [ id:'mirna_mature_hairpin'], // meta map + file('https://github.com/nf-core/test-datasets/raw/smrnaseq/MirGeneDB/mirgenedb_hsa_mature.fa', checkIfExists: true), + file('https://github.com/nf-core/test-datasets/raw/smrnaseq/MirGeneDB/mirgenedb_hsa_hairpin.fa', checkIfExists: true), + [] + ] + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot(workflow.out.versions, + path(workflow.out.outputs.get(0).get(1)[2]).readLines().last().contains(''), + workflow.out.outputs.get(0).get(1)[0], + path(workflow.out.outputs.get(0).get(1)[1]).readLines().first().contains('miRDeep2 score') + ).match()}, + // Assert .html + { assert path(workflow.out.outputs.get(0).get(1)[2]).readLines().last().contains('') } + ) + } + } +} diff --git a/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/main.nf.test.snap b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/main.nf.test.snap new file mode 100644 index 00000000..c48df3d7 --- /dev/null +++ b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/main.nf.test.snap @@ -0,0 +1,20 @@ +{ + "smrnaseq - fasta - single_end": { + "content": [ + [ + "versions.yml:md5,10138b74aed5b2658c26ddf80ff391d5", + "versions.yml:md5,631c0428c28d5355f0e3e9bd790bd77d", + "versions.yml:md5,706a3f609ec9d66162576d93a6f6a67b", + "versions.yml:md5,756eee52b4a45f7a9effe33b1cd3cb92" + ], + true, + "result_small_Clone1_N1.bed:md5,98a74ac6dd16ee876e9a3f54d2695c88", + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-23T14:56:03.274059331" + } +} \ No newline at end of file diff --git a/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/nextflow.config b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/nextflow.config new file mode 100644 index 00000000..ec097561 --- /dev/null +++ b/subworkflows/nf-core/fastq_find_mirna_mirdeep2/tests/nextflow.config @@ -0,0 +1,11 @@ +process { + withName: 'MIRDEEP2_MAPPER' { + ext.args = "-c -j -k 
TCGTATGCCGTCTTCTGCTTGT -l 18 -m -v" + } + + withName: 'SEQKIT_REPLACE' { + ext.args = "-p '\s.+'" + ext.suffix = "fasta" + } + +} diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf index ac31f28f..0fcbf7b3 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf @@ -2,18 +2,13 @@ // Subworkflow with functionality that may be useful for any Nextflow pipeline // -import org.yaml.snakeyaml.Yaml -import groovy.json.JsonOutput -import nextflow.extension.FilesEx - /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW DEFINITION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow UTILS_NEXTFLOW_PIPELINE { - take: print_version // boolean: print version dump_parameters // boolean: dump parameters @@ -26,7 +21,7 @@ workflow UTILS_NEXTFLOW_PIPELINE { // Print workflow version and exit on --version // if (print_version) { - log.info "${workflow.manifest.name} ${getWorkflowVersion()}" + log.info("${workflow.manifest.name} ${getWorkflowVersion()}") System.exit(0) } @@ -49,16 +44,16 @@ workflow UTILS_NEXTFLOW_PIPELINE { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Generate version string // def getWorkflowVersion() { - String version_string = "" + def version_string = "" as String if (workflow.manifest.version) { def prefix_v = 
workflow.manifest.version[0] != 'v' ? 'v' : '' version_string += "${prefix_v}${workflow.manifest.version}" @@ -76,13 +71,13 @@ def getWorkflowVersion() { // Dump pipeline parameters to a JSON file // def dumpParametersToJSON(outdir) { - def timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') - def filename = "params_${timestamp}.json" - def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") - def jsonStr = JsonOutput.toJson(params) - temp_pf.text = JsonOutput.prettyPrint(jsonStr) + def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') + def filename = "params_${timestamp}.json" + def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") + def jsonStr = groovy.json.JsonOutput.toJson(params) + temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) - FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") + nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") temp_pf.delete() } @@ -90,37 +85,40 @@ def dumpParametersToJSON(outdir) { // When running with -profile conda, warn if channels have not been set-up appropriately // def checkCondaChannels() { - Yaml parser = new Yaml() + def parser = new org.yaml.snakeyaml.Yaml() def channels = [] try { def config = parser.load("conda config --show channels".execute().text) channels = config.channels - } catch(NullPointerException | IOException e) { - log.warn "Could not verify conda channel configuration." - return + } + catch (NullPointerException e) { + log.warn("Could not verify conda channel configuration.") + return null + } + catch (IOException e) { + log.warn("Could not verify conda channel configuration.") + return null } // Check that all channels are present // This channel list is ordered by required channel priority. 
- def required_channels_in_order = ['conda-forge', 'bioconda', 'defaults'] + def required_channels_in_order = ['conda-forge', 'bioconda'] def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean // Check that they are in the right order - def channel_priority_violation = false - def n = required_channels_in_order.size() - for (int i = 0; i < n - 1; i++) { - channel_priority_violation |= !(channels.indexOf(required_channels_in_order[i]) < channels.indexOf(required_channels_in_order[i+1])) - } + def channel_priority_violation = required_channels_in_order != channels.findAll { ch -> ch in required_channels_in_order } if (channels_missing | channel_priority_violation) { - log.warn "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" + - " There is a problem with your Conda configuration!\n\n" + - " You will need to set-up the conda-forge and bioconda channels correctly.\n" + - " Please refer to https://bioconda.github.io/\n" + - " The observed channel order is \n" + - " ${channels}\n" + - " but the following channel order is required:\n" + - " ${required_channels_in_order}\n" + - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + log.warn """\ + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + There is a problem with your Conda configuration! + You will need to set-up the conda-forge and bioconda channels correctly. 
+ Please refer to https://bioconda.github.io/ + The observed channel order is + ${channels} + but the following channel order is required: + ${required_channels_in_order} + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + """.stripIndent(true) } } diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config b/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config index d0a926bf..a09572e5 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config +++ b/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config @@ -3,7 +3,7 @@ manifest { author = """nf-core""" homePage = 'https://127.0.0.1' description = """Dummy pipeline""" - nextflowVersion = '!>=23.04.0' + nextflowVersion = '!>=23.04.0' version = '9.9.9' doi = 'https://doi.org/10.5281/zenodo.5070524' } diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf index a8b55d6f..5cb7bafe 100644 --- a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf @@ -2,17 +2,13 @@ // Subworkflow with utility functions specific to the nf-core pipeline template // -import org.yaml.snakeyaml.Yaml -import nextflow.extension.FilesEx - /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW DEFINITION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow UTILS_NFCORE_PIPELINE { - take: nextflow_cli_args @@ -25,23 +21,20 @@ workflow UTILS_NFCORE_PIPELINE { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS 
-======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Warn if a -profile or Nextflow config has not been provided to run the pipeline // def checkConfigProvided() { - valid_config = true + def valid_config = true as Boolean if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) { - log.warn "[$workflow.manifest.name] You are attempting to run the pipeline without any custom configuration!\n\n" + - "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + - " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + - " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + - " (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" + - "Please refer to the quick start section and usage docs for the pipeline.\n " + log.warn( + "[${workflow.manifest.name}] You are attempting to run the pipeline without any custom configuration!\n\n" + "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + " (3) Using your own local custom config e.g. 
`-c /path/to/your/custom.config`\n\n" + "Please refer to the quick start section and usage docs for the pipeline.\n " + ) valid_config = false } return valid_config @@ -52,12 +45,14 @@ def checkConfigProvided() { // def checkProfileProvided(nextflow_cli_args) { if (workflow.profile.endsWith(',')) { - error "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + - "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + error( + "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + ) } if (nextflow_cli_args[0]) { - log.warn "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + - "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + log.warn( + "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. 
`-profile test, docker`.\n" + ) } } @@ -65,20 +60,22 @@ def checkProfileProvided(nextflow_cli_args) { // Citation string for pipeline // def workflowCitation() { - return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" + - "* The pipeline\n" + - " ${workflow.manifest.doi}\n\n" + - "* The nf-core framework\n" + - " https://doi.org/10.1038/s41587-020-0439-x\n\n" + - "* Software dependencies\n" + - " https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md" + def temp_doi_ref = "" + def manifest_doi = workflow.manifest.doi.tokenize(",") + // Handling multiple DOIs + // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers + // Removing ` ` since the manifest.doi is a string and not a proper list + manifest_doi.each { doi_ref -> + temp_doi_ref += " https://doi.org/${doi_ref.replace('https://doi.org/', '').replace(' ', '')}\n" + } + return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" + "* The pipeline\n" + temp_doi_ref + "\n" + "* The nf-core framework\n" + " https://doi.org/10.1038/s41587-020-0439-x\n\n" + "* Software dependencies\n" + " https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md" } // // Generate workflow version string // def getWorkflowVersion() { - String version_string = "" + def version_string = "" as String if (workflow.manifest.version) { def prefix_v = workflow.manifest.version[0] != 'v' ? 
'v' : '' version_string += "${prefix_v}${workflow.manifest.version}" @@ -96,8 +93,8 @@ def getWorkflowVersion() { // Get software versions for pipeline // def processVersionsFromYAML(yaml_file) { - Yaml yaml = new Yaml() - versions = yaml.load(yaml_file).collectEntries { k, v -> [ k.tokenize(':')[-1], v ] } + def yaml = new org.yaml.snakeyaml.Yaml() + def versions = yaml.load(yaml_file).collectEntries { k, v -> [k.tokenize(':')[-1], v] } return yaml.dumpAsMap(versions).trim() } @@ -107,8 +104,8 @@ def processVersionsFromYAML(yaml_file) { def workflowVersionToYAML() { return """ Workflow: - $workflow.manifest.name: ${getWorkflowVersion()} - Nextflow: $workflow.nextflow.version + ${workflow.manifest.name}: ${getWorkflowVersion()} + Nextflow: ${workflow.nextflow.version} """.stripIndent().trim() } @@ -116,11 +113,7 @@ def workflowVersionToYAML() { // Get channel of software versions used in pipeline in YAML format // def softwareVersionsToYAML(ch_versions) { - return ch_versions - .unique() - .map { processVersionsFromYAML(it) } - .unique() - .mix(Channel.of(workflowVersionToYAML())) + return ch_versions.unique().map { version -> processVersionsFromYAML(version) }.unique().mix(Channel.of(workflowVersionToYAML())) } // @@ -128,25 +121,31 @@ def softwareVersionsToYAML(ch_versions) { // def paramsSummaryMultiqc(summary_params) { def summary_section = '' - for (group in summary_params.keySet()) { - def group_params = summary_params.get(group) // This gets the parameters of that particular group - if (group_params) { - summary_section += "

    <p style=\"font-size:110%\"><b>$group</b></p>\n" - summary_section += "    <dl class=\"dl-horizontal\">\n" - for (param in group_params.keySet()) { - summary_section += "        <dt>$param</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n" + summary_params + .keySet() + .each { group -> + def group_params = summary_params.get(group) + // This gets the parameters of that particular group + if (group_params) { + summary_section += "    <p style=\"font-size:110%\"><b>${group}</b></p>\n" + summary_section += "    <dl class=\"dl-horizontal\">\n" + group_params + .keySet() + .sort() + .each { param -> + summary_section += "        <dt>${param}</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n" + } + summary_section += "    </dl>\n" } - summary_section += "    </dl>
    \n" } - } - String yaml_file_text = "id: '${workflow.manifest.name.replace('/','-')}-summary'\n" - yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n" - yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n" - yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n" - yaml_file_text += "plot_type: 'html'\n" - yaml_file_text += "data: |\n" - yaml_file_text += "${summary_section}" + def yaml_file_text = "id: '${workflow.manifest.name.replace('/', '-')}-summary'\n" as String + yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n" + yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n" + yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n" + yaml_file_text += "plot_type: 'html'\n" + yaml_file_text += "data: |\n" + yaml_file_text += "${summary_section}" return yaml_file_text } @@ -155,7 +154,7 @@ def paramsSummaryMultiqc(summary_params) { // nf-core logo // def nfCoreLogo(monochrome_logs=true) { - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map String.format( """\n ${dashedLine(monochrome_logs)} @@ -174,7 +173,7 @@ def nfCoreLogo(monochrome_logs=true) { // Return dashed line // def dashedLine(monochrome_logs=true) { - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map return "-${colors.dim}----------------------------------------------------${colors.reset}-" } @@ -182,7 +181,7 @@ def dashedLine(monochrome_logs=true) { // ANSII colours used for terminal logging // def logColours(monochrome_logs=true) { - Map colorcodes = [:] + def colorcodes = [:] as Map // Reset / Meta colorcodes['reset'] = monochrome_logs ? '' : "\033[0m" @@ -194,54 +193,54 @@ def logColours(monochrome_logs=true) { colorcodes['hidden'] = monochrome_logs ? 
'' : "\033[8m" // Regular Colors - colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m" - colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m" - colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m" - colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m" - colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m" - colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m" - colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m" - colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m" + colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m" + colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m" + colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m" + colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m" + colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m" + colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m" + colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m" + colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m" // Bold - colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m" - colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m" - colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m" - colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m" - colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m" - colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m" - colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m" - colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m" + colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m" + colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m" + colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m" + colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m" + colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m" + colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m" + colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m" + colorcodes['bwhite'] = monochrome_logs ? 
'' : "\033[1;37m" // Underline - colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" - colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" - colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" - colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" - colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" - colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" - colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" - colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" + colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" + colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" + colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" + colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" + colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" + colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" + colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" + colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" // High Intensity - colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" - colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" - colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" - colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" - colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" - colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" - colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" - colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m" + colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" + colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" + colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" + colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" + colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" + colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" + colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" + colorcodes['iwhite'] = monochrome_logs ? 
'' : "\033[0;97m" // Bold High Intensity - colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" - colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" - colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" - colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" - colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" - colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" - colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" - colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m" + colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" + colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" + colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" + colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" + colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" + colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" + colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" + colorcodes['biwhite'] = monochrome_logs ? 
'' : "\033[1;97m" return colorcodes } @@ -256,14 +255,15 @@ def attachMultiqcReport(multiqc_report) { mqc_report = multiqc_report.getVal() if (mqc_report.getClass() == ArrayList && mqc_report.size() >= 1) { if (mqc_report.size() > 1) { - log.warn "[$workflow.manifest.name] Found multiple reports from process 'MULTIQC', will use only one" + log.warn("[${workflow.manifest.name}] Found multiple reports from process 'MULTIQC', will use only one") } mqc_report = mqc_report[0] } } - } catch (all) { + } + catch (Exception all) { if (multiqc_report) { - log.warn "[$workflow.manifest.name] Could not attach MultiQC report to summary email" + log.warn("[${workflow.manifest.name}] Could not attach MultiQC report to summary email") } } return mqc_report @@ -275,26 +275,35 @@ def attachMultiqcReport(multiqc_report) { def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs=true, multiqc_report=null) { // Set up the e-mail variables - def subject = "[$workflow.manifest.name] Successful: $workflow.runName" + def subject = "[${workflow.manifest.name}] Successful: ${workflow.runName}" if (!workflow.success) { - subject = "[$workflow.manifest.name] FAILED: $workflow.runName" + subject = "[${workflow.manifest.name}] FAILED: ${workflow.runName}" } def summary = [:] - for (group in summary_params.keySet()) { - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] misc_fields['Date Started'] = workflow.start misc_fields['Date Completed'] = workflow.complete misc_fields['Pipeline script file path'] = workflow.scriptFile misc_fields['Pipeline script hash ID'] = workflow.scriptId - if (workflow.repository) misc_fields['Pipeline repository Git URL'] = workflow.repository - if (workflow.commitId) misc_fields['Pipeline repository Git Commit'] = workflow.commitId - if (workflow.revision) misc_fields['Pipeline Git branch/tag'] = workflow.revision - 
misc_fields['Nextflow Version'] = workflow.nextflow.version - misc_fields['Nextflow Build'] = workflow.nextflow.build + if (workflow.repository) { + misc_fields['Pipeline repository Git URL'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['Pipeline repository Git Commit'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['Pipeline Git branch/tag'] = workflow.revision + } + misc_fields['Nextflow Version'] = workflow.nextflow.version + misc_fields['Nextflow Build'] = workflow.nextflow.build misc_fields['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp def email_fields = [:] @@ -332,39 +341,41 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi // Render the sendmail template def max_multiqc_email_size = (params.containsKey('max_multiqc_email_size') ? params.max_multiqc_email_size : 0) as nextflow.util.MemoryUnit - def smail_fields = [ email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes() ] + def smail_fields = [email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes()] def sf = new File("${workflow.projectDir}/assets/sendmail_template.txt") def sendmail_template = engine.createTemplate(sf).make(smail_fields) def sendmail_html = sendmail_template.toString() // Send the HTML e-mail - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map if (email_address) { try { - if (plaintext_email) { throw GroovyException('Send plaintext e-mail, not HTML') } + if (plaintext_email) { +new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML') } // Try to send HTML e-mail using sendmail def sendmail_tf = new File(workflow.launchDir.toString(), ".sendmail_tmp.html") sendmail_tf.withWriter { w 
-> w << sendmail_html } - [ 'sendmail', '-t' ].execute() << sendmail_html - log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (sendmail)-" - } catch (all) { + ['sendmail', '-t'].execute() << sendmail_html + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (sendmail)-") + } + catch (Exception all) { // Catch failures and try with plaintext - def mail_cmd = [ 'mail', '-s', subject, '--content-type=text/html', email_address ] + def mail_cmd = ['mail', '-s', subject, '--content-type=text/html', email_address] mail_cmd.execute() << email_html - log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (mail)-" + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (mail)-") } } // Write summary e-mail HTML to a file def output_hf = new File(workflow.launchDir.toString(), ".pipeline_report.html") output_hf.withWriter { w -> w << email_html } - FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html"); + nextflow.extension.FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html") output_hf.delete() // Write summary e-mail TXT to a file def output_tf = new File(workflow.launchDir.toString(), ".pipeline_report.txt") output_tf.withWriter { w -> w << email_txt } - FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt"); + nextflow.extension.FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt") output_tf.delete() } @@ -372,15 +383,17 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi // Print pipeline summary on completion // def completionSummary(monochrome_logs=true) { - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map if (workflow.success) { if (workflow.stats.ignoredCount == 0) { 
- log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Pipeline completed successfully${colors.reset}-" - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-" + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Pipeline completed successfully${colors.reset}-") + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-") } - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-" + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.red} Pipeline completed with errors${colors.reset}-") } } @@ -389,21 +402,30 @@ def completionSummary(monochrome_logs=true) { // def imNotification(summary_params, hook_url) { def summary = [:] - for (group in summary_params.keySet()) { - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] - misc_fields['start'] = workflow.start - misc_fields['complete'] = workflow.complete - misc_fields['scriptfile'] = workflow.scriptFile - misc_fields['scriptid'] = workflow.scriptId - if (workflow.repository) misc_fields['repository'] = workflow.repository - if (workflow.commitId) misc_fields['commitid'] = workflow.commitId - if (workflow.revision) misc_fields['revision'] = workflow.revision - misc_fields['nxf_version'] = workflow.nextflow.version - misc_fields['nxf_build'] = workflow.nextflow.build - misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp + misc_fields['start'] = workflow.start + misc_fields['complete'] = workflow.complete + misc_fields['scriptfile'] = workflow.scriptFile + misc_fields['scriptid'] = workflow.scriptId + if (workflow.repository) { + misc_fields['repository'] = workflow.repository 
+ } + if (workflow.commitId) { + misc_fields['commitid'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['revision'] = workflow.revision + } + misc_fields['nxf_version'] = workflow.nextflow.version + misc_fields['nxf_build'] = workflow.nextflow.build + misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp def msg_fields = [:] msg_fields['version'] = getWorkflowVersion() @@ -428,13 +450,13 @@ def imNotification(summary_params, hook_url) { def json_message = json_template.toString() // POST - def post = new URL(hook_url).openConnection(); + def post = new URL(hook_url).openConnection() post.setRequestMethod("POST") post.setDoOutput(true) post.setRequestProperty("Content-Type", "application/json") - post.getOutputStream().write(json_message.getBytes("UTF-8")); - def postRC = post.getResponseCode(); - if (! postRC.equals(200)) { - log.warn(post.getErrorStream().getText()); + post.getOutputStream().write(json_message.getBytes("UTF-8")) + def postRC = post.getResponseCode() + if (!postRC.equals(200)) { + log.warn(post.getErrorStream().getText()) } } diff --git a/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/subworkflows/nf-core/utils_nfschema_plugin/main.nf new file mode 100644 index 00000000..4994303e --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -0,0 +1,46 @@ +// +// Subworkflow that uses the nf-schema plugin to validate parameters and render the parameter summary +// + +include { paramsSummaryLog } from 'plugin/nf-schema' +include { validateParameters } from 'plugin/nf-schema' + +workflow UTILS_NFSCHEMA_PLUGIN { + + take: + input_workflow // workflow: the workflow object used by nf-schema to get metadata from the workflow + validate_params // boolean: validate the parameters + parameters_schema // string: path to the parameters JSON schema. 
+ // this has to be the same as the schema given to `validation.parametersSchema` + // when this input is empty it will automatically use the configured schema or + // "${projectDir}/nextflow_schema.json" as default. This input should not be empty + // for meta pipelines + + main: + + // + // Print parameter summary to stdout. This will display the parameters + // that differ from the default given in the JSON schema + // + if(parameters_schema) { + log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) + } else { + log.info paramsSummaryLog(input_workflow) + } + + // + // Validate the parameters using nextflow_schema.json or the schema + // given via the validation.parametersSchema configuration option + // + if(validate_params) { + if(parameters_schema) { + validateParameters(parameters_schema:parameters_schema) + } else { + validateParameters() + } + } + + emit: + dummy_emit = true +} + diff --git a/subworkflows/nf-core/utils_nfschema_plugin/meta.yml b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml new file mode 100644 index 00000000..f7d9f028 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml @@ -0,0 +1,35 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json +name: "utils_nfschema_plugin" +description: Run nf-schema to validate parameters and create a summary of changed parameters +keywords: + - validation + - JSON schema + - plugin + - parameters + - summary +components: [] +input: + - input_workflow: + type: object + description: | + The workflow object of the used pipeline. + This object contains meta data used to create the params summary log + - validate_params: + type: boolean + description: Validate the parameters and error if invalid. + - parameters_schema: + type: string + description: | + Path to the parameters JSON schema. + This has to be the same as the schema given to the `validation.parametersSchema` config + option. 
When this input is empty it will automatically use the configured schema or + "${projectDir}/nextflow_schema.json" as default. The schema should not be given in this way + for meta pipelines. +output: + - dummy_emit: + type: boolean + description: Dummy emit to make nf-core subworkflows lint happy +authors: + - "@nvnieuwk" +maintainers: + - "@nvnieuwk" diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test new file mode 100644 index 00000000..842dc432 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -0,0 +1,117 @@ +nextflow_workflow { + + name "Test Subworkflow UTILS_NFSCHEMA_PLUGIN" + script "../main.nf" + workflow "UTILS_NFSCHEMA_PLUGIN" + + tag "subworkflows" + tag "subworkflows_nfcore" + tag "subworkflows/utils_nfschema_plugin" + tag "plugin/nf-schema" + + config "./nextflow.config" + + test("Should run nothing") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params") { + + when { + + params { + test_data = '' + outdir = 1 + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } + + test("Should run nothing - custom schema") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params - custom 
schema") { + + when { + + params { + test_data = '' + outdir = 1 + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } +} diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config new file mode 100644 index 00000000..0907ac58 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -0,0 +1,8 @@ +plugins { + id "nf-schema@2.1.0" +} + +validation { + parametersSchema = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + monochromeLogs = true +} \ No newline at end of file diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json similarity index 95% rename from subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json rename to subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json index 7626c1c9..331e0d2f 100644 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json @@ -1,10 +1,10 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/./master/nextflow_schema.json", "title": ". 
pipeline parameters", "description": "", "type": "object", - "definitions": { + "$defs": { "input_output_options": { "title": "Input/output options", "type": "object", @@ -87,10 +87,10 @@ }, "allOf": [ { - "$ref": "#/definitions/input_output_options" + "$ref": "#/$defs/input_output_options" }, { - "$ref": "#/definitions/generic_options" + "$ref": "#/$defs/generic_options" } ] } diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf b/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf deleted file mode 100644 index 2585b65d..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf +++ /dev/null @@ -1,62 +0,0 @@ -// -// Subworkflow that uses the nf-validation plugin to render help text and parameter summary -// - -/* -======================================================================================== - IMPORT NF-VALIDATION PLUGIN -======================================================================================== -*/ - -include { paramsHelp } from 'plugin/nf-validation' -include { paramsSummaryLog } from 'plugin/nf-validation' -include { validateParameters } from 'plugin/nf-validation' - -/* -======================================================================================== - SUBWORKFLOW DEFINITION -======================================================================================== -*/ - -workflow UTILS_NFVALIDATION_PLUGIN { - - take: - print_help // boolean: print help - workflow_command // string: default commmand used to run pipeline - pre_help_text // string: string to be printed before help text and summary log - post_help_text // string: string to be printed after help text and summary log - validate_params // boolean: validate parameters - schema_filename // path: JSON schema file, null to use default value - - main: - - log.debug "Using schema file: ${schema_filename}" - - // Default values for strings - pre_help_text = pre_help_text ?: '' - post_help_text = post_help_text ?: '' - workflow_command = 
workflow_command ?: '' - - // - // Print help message if needed - // - if (print_help) { - log.info pre_help_text + paramsHelp(workflow_command, parameters_schema: schema_filename) + post_help_text - System.exit(0) - } - - // - // Print parameter summary to stdout - // - log.info pre_help_text + paramsSummaryLog(workflow, parameters_schema: schema_filename) + post_help_text - - // - // Validate parameters relative to the parameter JSON schema - // - if (validate_params){ - validateParameters(parameters_schema: schema_filename) - } - - emit: - dummy_emit = true -} diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml b/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml deleted file mode 100644 index 3d4a6b04..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml +++ /dev/null @@ -1,44 +0,0 @@ -# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json -name: "UTILS_NFVALIDATION_PLUGIN" -description: Use nf-validation to initiate and validate a pipeline -keywords: - - utility - - pipeline - - initialise - - validation -components: [] -input: - - print_help: - type: boolean - description: | - Print help message and exit - - workflow_command: - type: string - description: | - The command to run the workflow e.g. "nextflow run main.nf" - - pre_help_text: - type: string - description: | - Text to print before the help message - - post_help_text: - type: string - description: | - Text to print after the help message - - validate_params: - type: boolean - description: | - Validate the parameters and error if invalid. - - schema_filename: - type: string - description: | - The filename of the schema to validate against. 
-output: - - dummy_emit: - type: boolean - description: | - Dummy emit to make nf-core subworkflows lint happy -authors: - - "@adamrtalbot" -maintainers: - - "@adamrtalbot" - - "@maxulysse" diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test deleted file mode 100644 index 5784a33f..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test +++ /dev/null @@ -1,200 +0,0 @@ -nextflow_workflow { - - name "Test Workflow UTILS_NFVALIDATION_PLUGIN" - script "../main.nf" - workflow "UTILS_NFVALIDATION_PLUGIN" - tag "subworkflows" - tag "subworkflows_nfcore" - tag "plugin/nf-validation" - tag "'plugin/nf-validation'" - tag "utils_nfvalidation_plugin" - tag "subworkflows/utils_nfvalidation_plugin" - - test("Should run nothing") { - - when { - - params { - monochrome_logs = true - test_data = '' - } - - workflow { - """ - help = false - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success } - ) - } - } - - test("Should run help") { - - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { 
assert workflow.stdout.any { it.contains('--outdir') } } - ) - } - } - - test("Should run help with command") { - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = "nextflow run noorg/doesntexist" - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('nextflow run noorg/doesntexist') } }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } } - ) - } - } - - test("Should run help with extra text") { - - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = "nextflow run noorg/doesntexist" - pre_help_text = "pre-help-text" - post_help_text = "post-help-text" - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('pre-help-text') } }, - { assert workflow.stdout.any { it.contains('nextflow run noorg/doesntexist') } }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } }, - { assert workflow.stdout.any { it.contains('post-help-text') } } - ) - } - } - - test("Should validate params") { - - when { - - params { - monochrome_logs = true - test_data = 
'' - outdir = 1 - } - workflow { - """ - help = false - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = true - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.failed }, - { assert workflow.stdout.any { it.contains('ERROR ~ ERROR: Validation of pipeline parameters failed!') } } - ) - } - } -} diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml b/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml deleted file mode 100644 index 60b1cfff..00000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nfvalidation_plugin: - - subworkflows/nf-core/utils_nfvalidation_plugin/** diff --git a/tests/lib/UTILS.groovy b/tests/lib/UTILS.groovy new file mode 100644 index 00000000..deacb586 --- /dev/null +++ b/tests/lib/UTILS.groovy @@ -0,0 +1,11 @@ +// Function to remove Nextflow version from software_versions.yml + +class UTILS { + public static String removeNextflowVersion(outputDir) { + def softwareVersions = path("$outputDir/pipeline_info/nf_core_pipeline_software_mqc_versions.yml").yaml + if (softwareVersions.containsKey("Workflow")) { + softwareVersions.Workflow.remove("Nextflow") + } + return softwareVersions + } +} diff --git a/tests/test_contamination_tech_reps.nf.test b/tests/test_contamination_tech_reps.nf.test new file mode 100644 index 00000000..02266078 --- /dev/null +++ b/tests/test_contamination_tech_reps.nf.test @@ -0,0 +1,106 @@ + +nextflow_pipeline { + + name "Test Workflow main.nf - test_contamination_tech_reps" + script "main.nf" + profile "test_contamination_tech_reps" + tag "test_contamination_tech_reps" + tag "contamination" + tag "pipeline" + + test("test_contamination_tech_reps") { + + when 
{ + params { + outdir = "$outputDir" + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(UTILS.removeNextflowVersion("$outputDir")).match("software_versions") }, + { assert workflow.trace.succeeded().size() == 100 }, + + { assert snapshot( + path("$outputDir/contaminant_filter/filter/Clone1_N1_trimmed.contamination_mqc.yaml").exists(), //TODO see if we can make these deterministic or why they are non-deterministic + path("$outputDir/contaminant_filter/filter/Clone1_N3_trimmed.contamination_mqc.yaml").exists(), + path("$outputDir/contaminant_filter/filter/Clone1_N1.contamination_mqc.yaml").exists() + ).match("contaminant_filter_filter") }, + + { assert snapshot( + path("$outputDir/mirna_quant/bam/mature/Clone1_N3_trimmed_mature.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N3_trimmed_mature.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N3_trimmed_mature.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_trimmed_mature.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_trimmed_mature.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_trimmed_mature.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_trimmed_mature_hairpin.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N3_trimmed_mature_hairpin.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_trimmed_mature_hairpin.sorted.stats").exists(), + 
path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_trimmed_mature_hairpin.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N3_trimmed_mature_hairpin.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N3_trimmed_mature_hairpin.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.idxstats").exists() + ).match("mirna_quant_bam") }, + + { assert snapshot( + path("$outputDir/mirna_quant/edger_qc/hairpin_edgeR_MDS_distance_matrix.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_edgeR_MDS_plot_coordinates.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_log2CPM_sample_distances.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_unmapped_read_counts.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_edgeR_MDS_plot_coordinates.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_edgeR_MDS_distance_matrix.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_log2CPM_sample_distances.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_unmapped_read_counts.txt").exists() + ).match("mirna_quant_edger_qc") }, + + { assert snapshot( + path("$outputDir/mirna_quant/mirtop/joined_samples_mirtop.tsv").exists(), + 
path("$outputDir/mirna_quant/mirtop/mirna.tsv").exists(), + ).match("mirna_quant_mirtop") }, + + { assert snapshot( + path("$outputDir/mirtrace/Clone1_N1/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-rnatype.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/Clone1_N1_trimmed/mirtrace-stats-rnatype.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/Clone1_N3_trimmed/mirtrace-stats-rnatype.tsv") + 
).match("mirtrace") } + ) + } + } + +} diff --git a/tests/test_contamination_tech_reps.nf.test.snap b/tests/test_contamination_tech_reps.nf.test.snap new file mode 100644 index 00000000..106bc8ed --- /dev/null +++ b/tests/test_contamination_tech_reps.nf.test.snap @@ -0,0 +1,120 @@ +{ + "mirtrace": { + "content": [ + true, + "mirtrace-stats-contamination_detailed.tsv:md5,141f6f46a4f9fcfac91303c125a2fc75", + "mirtrace-stats-contamination_basic.tsv:md5,34005177b3038da76236b5d8fb468825", + "mirtrace-stats-length.tsv:md5,7d835805aca881496e65f6f96aa60d7d", + "mirtrace-stats-mirna-complexity.tsv:md5,3605fb37445fe2773d6b606a2171fbf8", + "mirtrace-stats-phred.tsv:md5,0f84ab70c67a94bab6820cf97c63d10f", + "mirtrace-stats-qcstatus.tsv:md5,aeeebcd9a4305a57d0f13fb2e3dec4cb", + "mirtrace-stats-rnatype.tsv:md5,406fdb7bde295573d4d92b57a1c59767", + true, + "mirtrace-stats-contamination_detailed.tsv:md5,c1af2894292fcd3242fe27a4adcaef49", + "mirtrace-stats-contamination_basic.tsv:md5,307eceb1b067f014aaad9e09263cebfc", + "mirtrace-stats-length.tsv:md5,4bca762e07f887e245d3757834ecbf4c", + "mirtrace-stats-mirna-complexity.tsv:md5,854e5d79c69685dd72922cc7d2135564", + "mirtrace-stats-phred.tsv:md5,1e74ce76a5247ce6bce716de7f0d9377", + "mirtrace-stats-qcstatus.tsv:md5,319be3a39d281282fb139baed4c7cbbb", + "mirtrace-stats-rnatype.tsv:md5,d0323854e2950c8ff7454227ff22dfd1", + true, + "mirtrace-stats-contamination_detailed.tsv:md5,329c0800ed9e59553e74e121b927047a", + "mirtrace-stats-contamination_basic.tsv:md5,651b466f7a6c8f4abff9f9ec44be06fd", + "mirtrace-stats-length.tsv:md5,ac1d2301e09318eff98336459de0b2ca", + "mirtrace-stats-mirna-complexity.tsv:md5,27e7eb5cdb0c5fe3b80217383f1f0570", + "mirtrace-stats-phred.tsv:md5,2b34bed301cb94ff1d24e08a503b92ee", + "mirtrace-stats-qcstatus.tsv:md5,212b57cf93930422c2a0119dc5959b1d", + "mirtrace-stats-rnatype.tsv:md5,eb2d397090cd2c511590ee53470033a5" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": 
"2024-09-11T15:16:30.896336965" + }, + "software_versions": { + "content": [ + "{BLAT_CDNA={blat=36}, BLAT_NCRNA={blat=36}, BOWTIE2_ALIGN_CDNA={bowtie2=2.5.2, samtools=1.18, pigz=2.6}, BOWTIE2_ALIGN_NCRNA={bowtie2=2.5.2, samtools=1.18, pigz=2.6}, BOWTIE2_ALIGN_TRNA={bowtie2=2.5.2, samtools=1.18, pigz=2.6}, BOWTIE_MAP_HAIRPIN={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_MATURE={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_SEQCLUSTER={bowtie=1.3.0, samtools=1.16.1}, CAT_FASTQ={cat=8.3}, CSVTK_JOIN={csvtk=0.30.0}, DATATABLE_MERGE={r-base=3.6.2}, FASTP={fastp=0.23.4}, FILTER_STATS={BusyBox=1.32.1}, FORMAT_HAIRPIN={fastx_toolkit=0.0.14}, FORMAT_MATURE={fastx_toolkit=0.0.14}, GAWK_CDNA={gawk=5.3.0}, GAWK_NCRNA={gawk=5.3.0}, INDEX_CDNA={bowtie2=2.5.2}, INDEX_HAIRPIN={bowtie=1.3.0}, INDEX_MATURE={bowtie=1.3.0}, INDEX_NCRNA={bowtie2=2.5.2}, INDEX_TRNA={bowtie2=2.5.2}, MIRTOP_COUNTS={mirtop=0.4.28}, MIRTOP_EXPORT={mirtop=0.4.28}, MIRTOP_GFF={mirtop=0.4.28}, MIRTOP_STATS={mirtop=0.4.28}, MIRTRACE_QC={mirtrace=1.0.1}, PARSE_HAIRPIN={seqkit=2.6.1}, PARSE_MATURE={seqkit=2.6.1}, SAMTOOLS_FLAGSTAT={samtools=1.21}, SAMTOOLS_IDXSTATS={samtools=1.21}, SAMTOOLS_INDEX={samtools=1.21}, SAMTOOLS_SORT={samtools=1.21}, SAMTOOLS_STATS={samtools=1.21}, SEQCLUSTER_COLLAPSE={seqcluster=1.2.9}, SEQKIT_GREP_CDNA={seqkit=2.8.0}, SEQKIT_GREP_NCRNA={seqkit=2.8.0}, STATS_GAWK_CDNA={gawk=5.3.0}, STATS_GAWK_NCRNA={gawk=5.3.0}, STATS_GAWK_TRNA={gawk=5.3.0}, Workflow={nf-core/smrnaseq=v2.4.0}}" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:16:26.853242481" + }, + "mirna_quant_bam": { + "content": [ + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-01T20:06:04.974546479" + }, + "mirna_quant_edger_qc": { + "content": [ + true, + true, + true, + true, + true, + true, + true, + 
true, + true, + true, + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-01T20:06:05.025175321" + }, + "contaminant_filter_filter": { + "content": [ + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-01T20:06:04.920520728" + }, + "mirna_quant_mirtop": { + "content": [ + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-01T20:06:05.070939602" + } +} diff --git a/tests/test_mirgenedb.nf.test b/tests/test_mirgenedb.nf.test new file mode 100644 index 00000000..9433f837 --- /dev/null +++ b/tests/test_mirgenedb.nf.test @@ -0,0 +1,131 @@ + +nextflow_pipeline { + + name "Test Workflow main.nf - test_mirgenedb" + script "main.nf" + profile "test_mirgenedb" + tag "mirgenedb" + tag "pipeline" + + test("test_mirgenedb") { + + when { + params { + outdir = "$outputDir" + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(UTILS.removeNextflowVersion("$outputDir")).match("software_versions") }, + { assert workflow.trace.succeeded().size() == 90 }, + { assert workflow.trace.failed().size() == 1 }, + + { assert snapshot( + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone9_N1_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/Clone9_N1_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/mature/Control_N1_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/Control_N1_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/Clone9_N1_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/Control_N1_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.idxstats").exists(), + 
path("$outputDir/mirna_quant/bam/hairpin/Control_N1_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/Control_N1_mature_hairpin.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/hairpin/Clone9_N1_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/Clone9_N1_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Control_N1_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/Clone9_N1_mature_hairpin.sorted.flagstat") + ).match("mirna_quant_bam") }, + + { assert snapshot( + path("$outputDir/mirna_quant/edger_qc/hairpin_edgeR_MDS_distance_matrix.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_edgeR_MDS_plot_coordinates.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_log2CPM_sample_distances.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_edgeR_MDS_distance_matrix.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_unmapped_read_counts.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_edgeR_MDS_plot_coordinates.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_log2CPM_sample_distances.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.txt").exists(), + 
path("$outputDir/mirna_quant/edger_qc/mature_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_unmapped_read_counts.txt").exists() + ).match("mirna_quant_edger_qc") }, + + { assert snapshot( + path("$outputDir/genome_quant/bam/Clone9_N1_mature_hairpin_genome.sorted.stats"), + path("$outputDir/genome_quant/bam/Clone9_N1_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/Clone1_N1_mature_hairpin_genome.sorted.stats"), + path("$outputDir/genome_quant/bam/Control_N1_mature_hairpin_genome.sorted.stats"), + path("$outputDir/genome_quant/bam/Clone1_N1_mature_hairpin_genome.sorted.flagstat"), + path("$outputDir/genome_quant/bam/Clone9_N1_mature_hairpin_genome.sorted.flagstat"), + path("$outputDir/genome_quant/bam/Clone1_N1_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/Control_N1_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/Control_N1_mature_hairpin_genome.sorted.flagstat") + ).match("genome_quant_bam") }, + + { assert snapshot( + path("$outputDir/mirdeep2/result_Clone1_N1.csv").exists(), + path("$outputDir/mirdeep2/result_Control_N1.csv").exists(), + path("$outputDir/mirdeep2/result_Control_N1.bed").exists(), + path("$outputDir/mirdeep2/result_Control_N1.bed").exists(), + path("$outputDir/mirdeep2/result_Control_N1.html").exists(), + path("$outputDir/mirdeep2/result_Control_N1.html").exists() + ).match("mirdeep2") }, + + { assert snapshot( + path("$outputDir/multiqc/multiqc_report.html").exists() + ).match("multiqc") }, + + { assert snapshot( + path("$outputDir/multiqc/multiqc_data/fastqc-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastp_filtered_reads_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_overrepresented_sequences_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc-1_sequence_counts_plot.txt"), + 
path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_citations.txt"), + path("$outputDir/multiqc/multiqc_data/samtools-stats-dp.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-n-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_sequence_duplication_levels_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_sequence_quality_plot.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_general_stats.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-quality-plot_Read_1_After_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc-1_overrepresented_sequences_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_counts_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-quality-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/samtools_alignment_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-n-plot_Read_1_After_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_adapter_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt"), + 
path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_gc_content_plot_Counts.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_adapter_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-gc-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_sources.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-gc-plot_Read_1_After_filtering.txt") + ).match("multiqc_multiqc_data")} + ) + } + } + +} diff --git a/tests/test_mirgenedb.nf.test.snap b/tests/test_mirgenedb.nf.test.snap new file mode 100644 index 00000000..b276795e --- /dev/null +++ b/tests/test_mirgenedb.nf.test.snap @@ -0,0 +1,151 @@ +{ + "genome_quant_bam": { + "content": [ + "Clone9_N1_mature_hairpin_genome.sorted.stats:md5,6b1d2b924593096358494dda37b46770", + "Clone9_N1_mature_hairpin_genome.sorted.idxstats:md5,aa37c5da7c2b4505ce58c3a21f97121c", + "Clone1_N1_mature_hairpin_genome.sorted.stats:md5,3c5f51cd7136eed5e97847ad7b857d23", + "Control_N1_mature_hairpin_genome.sorted.stats:md5,a7f2dd17a34c8f0b669a774404247394", + "Clone1_N1_mature_hairpin_genome.sorted.flagstat:md5,5bb521c495f1c450835299b1eb88dc84", + "Clone9_N1_mature_hairpin_genome.sorted.flagstat:md5,6a8ad3be2ca0fa924fd32a04293d4ce4", + "Clone1_N1_mature_hairpin_genome.sorted.idxstats:md5,d92f9eae7657418858e6d2b69436f74f", + "Control_N1_mature_hairpin_genome.sorted.idxstats:md5,a11f543771cea6b383fb596f60e998c3", + "Control_N1_mature_hairpin_genome.sorted.flagstat:md5,df2a57ac3b36f5d40793d3105a4bb2d1" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:23:13.57712901" + }, + "software_versions": { + "content": [ + "{BOWTIE_MAP_GENOME={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_HAIRPIN={bowtie=1.3.0, samtools=1.16.1}, 
BOWTIE_MAP_MATURE={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_SEQCLUSTER={bowtie=1.3.0, samtools=1.16.1}, FASTP={fastp=0.23.4}, FASTQC_RAW={fastqc=0.12.1}, FASTQC_TRIM={fastqc=0.12.1}, FORMAT_HAIRPIN={fastx_toolkit=0.0.14}, FORMAT_MATURE={fastx_toolkit=0.0.14}, INDEX_HAIRPIN={bowtie=1.3.0}, INDEX_MATURE={bowtie=1.3.0}, MIRDEEP2_MAPPER={mirdeep2=2.0.1}, MIRDEEP2_MIRDEEP2={mirdeep2=2.0.1}, PARSE_HAIRPIN={seqkit=2.6.1}, PARSE_MATURE={seqkit=2.6.1}, SAMTOOLS_FLAGSTAT={samtools=1.21}, SAMTOOLS_IDXSTATS={samtools=1.21}, SAMTOOLS_INDEX={samtools=1.21}, SAMTOOLS_SORT={samtools=1.21}, SAMTOOLS_STATS={samtools=1.21}, SEQCLUSTER_COLLAPSE={seqcluster=1.2.9}, SEQKIT_FQ2FA={seqkit=2.8.0}, SEQKIT_REPLACE={seqkit=2.8.0}, Workflow={nf-core/smrnaseq=v2.4.0}}" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:23:13.407922799" + }, + "mirna_quant_bam": { + "content": [ + true, + true, + "Clone9_N1_mature.sorted.flagstat:md5,adf40ba27907b6ef726d6c5923a731b9", + "Clone9_N1_mature.sorted.idxstats:md5,8302f401476f5c8fee3333e1c742c05e", + "Control_N1_mature.sorted.flagstat:md5,f8df7690d20014518f47dc2fe39debec", + "Control_N1_mature.sorted.stats:md5,3c952d69ed81186976ede053c847e109", + "Clone9_N1_mature.sorted.stats:md5,873e4f40e377cc445ace1ac48354729d", + "Control_N1_mature.sorted.idxstats:md5,b7a382b1d0f5cba6cb94b3b5a6b18f84", + true, + "Control_N1_mature_hairpin.sorted.idxstats:md5,79dc5e82ff88e7379c893549224cd87f", + "Control_N1_mature_hairpin.sorted.flagstat:md5,1dc7b98f0014a99a20de7c09a6b95340", + "Clone9_N1_mature_hairpin.sorted.idxstats:md5,f3ed5bf23f73d41c42d3da0bf30f89ea", + "Clone9_N1_mature_hairpin.sorted.stats:md5,c306ef4c5b1e23a3d032b532cf916fc1", + true, + true, + true, + "Control_N1_mature_hairpin.sorted.stats:md5,3e82fa30bfafcab2e8fb2f247e591959", + "Clone9_N1_mature_hairpin.sorted.flagstat:md5,678f4f9e98c3e1fcc5af54e8dd06fbbc" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": 
"2024-10-08T23:23:13.503940378" + }, + "mirdeep2": { + "content": [ + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-26T18:15:04.45050483" + }, + "mirna_quant_edger_qc": { + "content": [ + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-08-30T20:30:51.020931866" + }, + "multiqc_multiqc_data": { + "content": [ + "fastqc-status-check-heatmap.txt:md5,3f897028847a4a3a5f666325cd732067", + "fastp_filtered_reads_plot.txt:md5,277be8c15684b9f09737bc13bcd3a055", + "fastqc_overrepresented_sequences_plot.txt:md5,7d7ecccd5d6c7fd63439785888c0c164", + true, + "fastqc-1_sequence_counts_plot.txt:md5,59faee895ea86c12a4124d417e3bbd63", + "fastqc-1_per_sequence_gc_content_plot_Percentages.txt:md5,0a4b4285f2c53dca216c107decc9921f", + "multiqc_citations.txt:md5,57db2426be011862828d18f767d25b57", + "samtools-stats-dp.txt:md5,45c0315bade3f07942ded1ead37c1489", + "fastqc_sequence_length_distribution_plot.txt:md5,8b5cf1e3429a1ea0b3c63cfb176e1014", + "fastp-seq-content-n-plot_Read_1_Before_filtering.txt:md5,a0502dd4f701c9deb646ffbec80c09de", + "fastqc-1_sequence_duplication_levels_plot.txt:md5,2072cda513c8884047d9d11c8aacbf33", + "fastqc-1_per_base_sequence_quality_plot.txt:md5,cafad80f4e07df53590cbabbbd024629", + "multiqc_general_stats.txt:md5,950d3fb06c211e984084e6de9dad6bb3", + "fastqc-1_per_base_n_content_plot.txt:md5,a0502dd4f701c9deb646ffbec80c09de", + "fastqc_per_base_n_content_plot.txt:md5,d907ac1ac9a4f19908b7b025eb75abfe", + "fastp-seq-quality-plot_Read_1_After_filtering.txt:md5,0742b9813dcc95d4c62c52c83dec390c", + "fastqc_per_sequence_quality_scores_plot.txt:md5,c7aacf1ab75fbe89f86e33273aefaf26", + "fastqc-1_per_sequence_quality_scores_plot.txt:md5,6b8b8ddf52e9dcc22a4ac00c99105301", + true, + 
"fastqc-1_overrepresented_sequences_plot.txt:md5,e2e8f71f05beec9a8a7809e47ae738ad", + "fastqc-1-status-check-heatmap.txt:md5,d9c3ce24536a948e1fe9b84c55421ab7", + "fastqc_sequence_counts_plot.txt:md5,4861f0dc120e57e0359c53f417756b0c", + "fastp-seq-quality-plot_Read_1_Before_filtering.txt:md5,e9d8e3289f84f5a1ae6775813ec5a9b4", + "samtools_alignment_plot.txt:md5,b841ffce110bde994ccc6e977d2f856e", + "fastqc_per_base_sequence_quality_plot.txt:md5,a8adbff96d9adb317079e6becd7a80f6", + "fastp-seq-content-n-plot_Read_1_After_filtering.txt:md5,a0502dd4f701c9deb646ffbec80c09de", + "fastqc_adapter_content_plot.txt:md5,bd0fdc9c856c55598976b5a46c23a677", + "fastqc_sequence_duplication_levels_plot.txt:md5,2b1cbdce195d2aedc4bff4c5e9b618d4", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,6906cedd750c0d43a26fdcddeacce257", + "fastqc-1_per_sequence_gc_content_plot_Counts.txt:md5,96abe7c73fc433142fbdbacb1e67e87f", + "fastqc-1_adapter_content_plot.txt:md5,c9d77edf35d9afb8b6e86b939b5be596", + "fastp-seq-content-gc-plot_Read_1_Before_filtering.txt:md5,910576f4999a406ea37306b8dc4eb45b", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,7cdf079a279cf080f2e2d7ab00b4f134", + true, + "fastp-seq-content-gc-plot_Read_1_After_filtering.txt:md5,585ec288b2514de54e8fb6251d1e0f98" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-19T03:58:28.495279269" + }, + "multiqc": { + "content": [ + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-08-30T20:30:51.144222162" + } +} diff --git a/tests/test_nextflex.nf.test b/tests/test_nextflex.nf.test new file mode 100644 index 00000000..4330c2b0 --- /dev/null +++ b/tests/test_nextflex.nf.test @@ -0,0 +1,140 @@ + +nextflow_pipeline { + + name "Test Workflow main.nf - test_nextflex" + script "main.nf" + profile "test_nextflex" + tag "nextflex" + tag "pipeline" + + test("test_nextflex") { + + when { + params { + outdir = "$outputDir" + } + } + + then { + 
assertAll( + { assert workflow.success }, + { assert snapshot(UTILS.removeNextflowVersion("$outputDir")).match("software_versions") }, + { assert workflow.trace.succeeded().size() == 79 }, + + { assert snapshot( + path("$outputDir/mirna_quant/bam/mature/sample2_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/mature/sample1_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/sample2_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/sample3_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/sample3_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/sample3_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/mature/sample1_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/sample2_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/sample1_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/sample3_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/sample2_mature_hairpin.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/hairpin/sample1_mature_hairpin.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/hairpin/sample1_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/sample3_mature_hairpin.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/hairpin/sample2_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/sample1_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/sample3_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/sample2_mature_hairpin.sorted.idxstats") + ).match("mirna_quant_bam") }, + + { assert snapshot( + path("$outputDir/mirna_quant/edger_qc/hairpin_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_edgeR_MDS_plot_coordinates.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_edgeR_MDS_distance_matrix.txt").exists(), + 
path("$outputDir/mirna_quant/edger_qc/hairpin_log2CPM_sample_distances.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_edgeR_MDS_plot_coordinates.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_edgeR_MDS_distance_matrix.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_log2CPM_sample_distances.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_unmapped_read_counts.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_unmapped_read_counts.txt").exists() + ).match("mirna_quant_edger_qc") }, + + { assert snapshot( + path("$outputDir/mirtrace/sample1/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/sample1/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/sample1/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/sample1/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/sample1/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/sample1/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/sample1/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/sample1/mirtrace-stats-rnatype.tsv"), + + path("$outputDir/mirtrace/sample2/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/sample2/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/sample2/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/sample2/mirtrace-stats-phred.tsv"), + 
path("$outputDir/mirtrace/sample2/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/sample2/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/sample2/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/sample2/mirtrace-stats-rnatype.tsv"), + + path("$outputDir/mirtrace/sample3/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/sample3/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/sample3/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/sample3/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/sample3/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/sample3/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/sample3/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/sample3/mirtrace-stats-rnatype.tsv") + ).match("mirtrace") }, + + { assert snapshot( + path("$outputDir/multiqc/multiqc_data/fastqc-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastp_filtered_reads_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_overrepresented_sequences_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc-1_sequence_counts_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_complexity_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_citations.txt"), + path("$outputDir/multiqc/multiqc_data/samtools-stats-dp.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-n-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_sequence_duplication_levels_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_sequence_quality_plot.txt"), + 
path("$outputDir/multiqc/multiqc_data/multiqc_general_stats.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-quality-plot_Read_1_After_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_qc_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_length_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc-1-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_counts_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_rna_categories_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-quality-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/samtools_alignment_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-n-plot_Read_1_After_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_contamination_check_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_adapter_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_gc_content_plot_Counts.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_adapter_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-gc-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt"), + 
path("$outputDir/multiqc/multiqc_data/multiqc_sources.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-gc-plot_Read_1_After_filtering.txt") + ).match("multiqc_multiqc_data") }, + + ) + } + + } + +} diff --git a/tests/test_nextflex.nf.test.snap b/tests/test_nextflex.nf.test.snap new file mode 100644 index 00000000..dfc54c7f --- /dev/null +++ b/tests/test_nextflex.nf.test.snap @@ -0,0 +1,145 @@ +{ + "mirtrace": { + "content": [ + true, + "mirtrace-stats-contamination_basic.tsv:md5,a299193a1453a3abe1e5a0522af4ba96", + "mirtrace-stats-mirna-complexity.tsv:md5,9c932b03070aa4b13162f01ad9f5066f", + "mirtrace-stats-phred.tsv:md5,b1903783618346f63f267bdeec7ce2ea", + "mirtrace-stats-length.tsv:md5,5b185869894275973fa47e4ae16c8a20", + "mirtrace-stats-contamination_detailed.tsv:md5,fbb92d8e56263aeb00b7bcf8a802306e", + "mirtrace-stats-qcstatus.tsv:md5,d16ecd87584893661a02b88490368406", + "mirtrace-stats-rnatype.tsv:md5,ff383255e7f6cbb632e3055dc8857e2a", + true, + "mirtrace-stats-contamination_basic.tsv:md5,cf2a6799f2b7a1d8b6bfe89139f24b17", + "mirtrace-stats-mirna-complexity.tsv:md5,0410685fb0146f074e6b3731fbd61383", + "mirtrace-stats-phred.tsv:md5,e5266158a2ba50edde8de8c959f2ab9b", + "mirtrace-stats-length.tsv:md5,d260924b340ccc8559e97f769420c2b8", + "mirtrace-stats-contamination_detailed.tsv:md5,b043078ea1db631f8f5c767f86366c60", + "mirtrace-stats-qcstatus.tsv:md5,21b85f590394539be7417c25cc8ffc0a", + "mirtrace-stats-rnatype.tsv:md5,1a3e5156f9ebc9845988dad6932b0bdc", + true, + "mirtrace-stats-contamination_basic.tsv:md5,2536bebcaeaa1dc0334e5484968f67ad", + "mirtrace-stats-mirna-complexity.tsv:md5,408118e08e297e6ffea9692228ae61be", + "mirtrace-stats-phred.tsv:md5,74b89cafe342aa184203c533090ab6b3", + "mirtrace-stats-length.tsv:md5,febc7969e6eeb988675f719756e58000", + "mirtrace-stats-contamination_detailed.tsv:md5,a62fcf38cc7e7e754ed69457efc2f9f3", + "mirtrace-stats-qcstatus.tsv:md5,d4ab1e47d6a32ddeb502bf4001b4d446", + 
"mirtrace-stats-rnatype.tsv:md5,4450d9e0c9804b332c4ac79973d8fbfc" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-11T13:45:08.273052652" + }, + "software_versions": { + "content": [ + "{BOWTIE_MAP_HAIRPIN={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_MATURE={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_SEQCLUSTER={bowtie=1.3.0, samtools=1.16.1}, CSVTK_JOIN={csvtk=0.30.0}, DATATABLE_MERGE={r-base=3.6.2}, FASTP={fastp=0.23.4}, FASTQC_RAW={fastqc=0.12.1}, FASTQC_TRIM={fastqc=0.12.1}, FORMAT_HAIRPIN={fastx_toolkit=0.0.14}, FORMAT_MATURE={fastx_toolkit=0.0.14}, INDEX_HAIRPIN={bowtie=1.3.0}, INDEX_MATURE={bowtie=1.3.0}, MIRTOP_COUNTS={mirtop=0.4.28}, MIRTOP_EXPORT={mirtop=0.4.28}, MIRTOP_GFF={mirtop=0.4.28}, MIRTOP_STATS={mirtop=0.4.28}, MIRTRACE_QC={mirtrace=1.0.1}, PARSE_HAIRPIN={seqkit=2.6.1}, PARSE_MATURE={seqkit=2.6.1}, SAMTOOLS_FLAGSTAT={samtools=1.21}, SAMTOOLS_IDXSTATS={samtools=1.21}, SAMTOOLS_INDEX={samtools=1.21}, SAMTOOLS_SORT={samtools=1.21}, SAMTOOLS_STATS={samtools=1.21}, SEQCLUSTER_COLLAPSE={seqcluster=1.2.9}, Workflow={nf-core/smrnaseq=v2.4.0}}" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:25:57.880948228" + }, + "mirna_quant_bam": { + "content": [ + "sample2_mature.sorted.idxstats:md5,9688f02beeebf9a590dc81e49415ede9", + "sample1_mature.sorted.flagstat:md5,7d61ae305e545c7a66ef8d23a0c8be25", + "sample2_mature.sorted.stats:md5,1a2146975784f51e98f4208f4562a4e6", + "sample3_mature.sorted.stats:md5,d1c48aac54b7b741e1108d97e9231232", + "sample3_mature.sorted.flagstat:md5,1aae00444143bce06cb0f8cf31deb8e4", + "sample3_mature.sorted.idxstats:md5,9688f02beeebf9a590dc81e49415ede9", + "sample1_mature.sorted.stats:md5,80f9a8fe5d498265223e95d9dd04a89a", + "sample2_mature.sorted.flagstat:md5,1aae00444143bce06cb0f8cf31deb8e4", + "sample1_mature.sorted.idxstats:md5,6db0cfab41307285fe5c89dfe95b5d46", + "sample3_mature_hairpin.sorted.stats:md5,e725f70ac1e4e4007714b19e588cbc46", + 
"sample2_mature_hairpin.sorted.flagstat:md5,4e201dd868164d0c53142888dd6ca238", + "sample1_mature_hairpin.sorted.flagstat:md5,7ed3ab444077ddf6c334845e9c4ce75e", + "sample1_mature_hairpin.sorted.idxstats:md5,7b7d142caee6cccbb6d83c8e6568a951", + "sample3_mature_hairpin.sorted.flagstat:md5,4e201dd868164d0c53142888dd6ca238", + "sample2_mature_hairpin.sorted.stats:md5,f5e8be4ea1e9f85c5dc15d3f8cdbae47", + "sample1_mature_hairpin.sorted.stats:md5,052c1f774377058033a165f05d2fcbec", + "sample3_mature_hairpin.sorted.idxstats:md5,8927231d0ea3100fb75a96b4e5317321", + "sample2_mature_hairpin.sorted.idxstats:md5,8927231d0ea3100fb75a96b4e5317321" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:25:57.949148045" + }, + "mirna_quant_edger_qc": { + "content": [ + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-09-06T16:26:35.55181047" + }, + "multiqc_multiqc_data": { + "content": [ + "fastqc-status-check-heatmap.txt:md5,949e6825a7ecc751aa9ba515de7dbd02", + "fastp_filtered_reads_plot.txt:md5,bf3be3a2f4b50b4de0ede0ba46336da3", + "fastqc-1_overrepresented_sequences_plot.txt:md5,11b85c61ea97ca62a9e7c34fae9e575c", + true, + "fastqc-1_sequence_counts_plot.txt:md5,0ab7661061d5e13c5493b60502921f78", + "mirtrace_complexity_plot.txt:md5,127cdbec37b2ce57f6994a20796224d1", + "fastqc-1_per_sequence_gc_content_plot_Percentages.txt:md5,3f7fd27d4553da6a88f4f15dd4b6413b", + "multiqc_citations.txt:md5,3adbccd17a42d0d5d97ee7ebb476f433", + "samtools-stats-dp.txt:md5,1fa31e11ef6c82185d5c9dc2f40d61b2", + "fastqc_sequence_length_distribution_plot.txt:md5,130a5569ba830f7e7abb971d1c8da537", + "fastp-seq-content-n-plot_Read_1_Before_filtering.txt:md5,bd72bc8bfc907c6aab72f315917ab280", + "fastqc-1_sequence_duplication_levels_plot.txt:md5,a53f959bf59ad69d3bcbc53e8fe609b3", + 
"fastqc-1_per_base_sequence_quality_plot.txt:md5,2f85a658bcb8261328449f1642688086", + "multiqc_general_stats.txt:md5,9bc4353b1ae9588391be745fd069c871", + "fastqc-1_per_base_n_content_plot.txt:md5,e3b4bb3ed98e87f2d8acb0c009485ecd", + "fastqc-1_per_base_n_content_plot.txt:md5,e3b4bb3ed98e87f2d8acb0c009485ecd", + "fastp-seq-quality-plot_Read_1_After_filtering.txt:md5,2956382a3f2e855a4dce8e8246a57add", + "fastqc-1_per_sequence_quality_scores_plot.txt:md5,28ed13d328e755aa06a0f13f87c336eb", + "mirtrace_qc_plot.txt:md5,394ac045f75e2300302b7e3c2418cbbc", + "fastqc-1_per_sequence_quality_scores_plot.txt:md5,28ed13d328e755aa06a0f13f87c336eb", + "mirtrace_length_plot.txt:md5,18717f7f295b4d03524e91fd32c2956e", + true, + "fastqc-1-status-check-heatmap.txt:md5,66af5433ebb61bc68905f8219d7419ab", + "fastqc_sequence_counts_plot.txt:md5,c92bc7da83662b8a6b49d5cdbab3dc42", + "mirtrace_rna_categories_plot.txt:md5,d9f621ebb387d289357004d21c9eb209", + "fastp-seq-quality-plot_Read_1_Before_filtering.txt:md5,e5ea2bfd87e957a18fae5239137d6499", + "samtools_alignment_plot.txt:md5,625da9c9da2fe432bee7a5bfca9cf550", + "fastqc_per_base_sequence_quality_plot.txt:md5,1208509fcaff06edcddc377c907dfdaf", + "fastp-seq-content-n-plot_Read_1_After_filtering.txt:md5,dd53a16aebc689109fc8065d08d8a6c7", + "mirtrace_contamination_check_plot.txt:md5,8fd040b7771963863937f0eb31a265f1", + "fastqc_adapter_content_plot.txt:md5,8aa2cbcf256bbb89c4a1d1fd18019c9b", + "fastqc_sequence_duplication_levels_plot.txt:md5,97a930f423f2cd365c2262b0a185f68a", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,5857a8a1980816cf70b34b7b318e1482", + "fastqc-1_per_sequence_gc_content_plot_Counts.txt:md5,d3ecffd88ebbdac463e297a2b98c8b3d", + "fastqc-1_adapter_content_plot.txt:md5,245d96a402988141cbe68b60a42db535", + "fastp-seq-content-gc-plot_Read_1_Before_filtering.txt:md5,9033ad6887da19d96fb9e2504d8de0a5", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,fbe1f23a76ed70b2568d553fc42adef2", + true, + 
"fastp-seq-content-gc-plot_Read_1_After_filtering.txt:md5,ed44d5035150f69bdeb7855c80271c21" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T17:11:24.369706104" + } +} diff --git a/tests/test_skipfastp.nf.test b/tests/test_skipfastp.nf.test new file mode 100644 index 00000000..eb4a0456 --- /dev/null +++ b/tests/test_skipfastp.nf.test @@ -0,0 +1,118 @@ + +nextflow_pipeline { + + name "Test Workflow main.nf - test_skipfastp" + script "main.nf" + profile "test_skipfastp" + tag "skipfastp" + tag "pipeline" + + test("test_skipfastp") { + + when { + params { + outdir = "$outputDir" + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(UTILS.removeNextflowVersion("$outputDir")).match("software_versions") }, + { assert workflow.trace.succeeded().size() == 64 }, + + { assert snapshot( + path("$outputDir/mirna_quant/mirtop/joined_samples_mirtop.tsv").exists(), + path("$outputDir/mirna_quant/mirtop/mirna.tsv").exists(), + ).match("mirna_quant_mirtop") }, + + { assert snapshot( + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N3_mature_hairpin.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N3_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.stats").exists(), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N3_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/Clone1_N1_mature_hairpin.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N3_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.flagstat").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.idxstats").exists(), + path("$outputDir/mirna_quant/bam/mature/Clone1_N1_mature.sorted.stats").exists(), + 
path("$outputDir/mirna_quant/bam/mature/Clone1_N3_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/mature/Clone1_N3_mature.sorted.flagstat") + ).match("mirna_quant_bam") }, + + { assert snapshot( + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_unmapped_read_counts.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_unmapped_read_counts.txt").exists() + ).match("mirna_quant_edger_qc") }, + + { assert snapshot( + path("$outputDir/mirtrace/Clone1_N1/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-rnatype.tsv"), + path("$outputDir/mirtrace/Clone1_N1/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-contamination_detailed.tsv"), + 
path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-rnatype.tsv"), + path("$outputDir/mirtrace/Clone1_N3/mirtrace-stats-qcstatus.tsv") + ).match("mirtrace") }, + + { assert snapshot( + path("$outputDir/genome_quant/bam/Clone1_N3_mature_hairpin_genome.sorted.flagstat"), + path("$outputDir/genome_quant/bam/Clone1_N3_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/Clone1_N1_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/Clone1_N1_mature_hairpin_genome.sorted.flagstat"), + path("$outputDir/genome_quant/bam/Clone1_N3_mature_hairpin_genome.sorted.stats"), + path("$outputDir/genome_quant/bam/Clone1_N1_mature_hairpin_genome.sorted.stats") + ).match("genome_quant_bam") }, + + { assert snapshot( + path("$outputDir/multiqc/multiqc_report.html").exists() + ).match("multiqc") }, + + { assert snapshot( + path("$outputDir/multiqc/multiqc_data/fastqc-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_overrepresented_sequences_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_complexity_plot.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_citations.txt"), + path("$outputDir/multiqc/multiqc_data/samtools-stats-dp.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_general_stats.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_qc_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_length_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_counts_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_rna_categories_plot.txt"), + 
path("$outputDir/multiqc/multiqc_data/samtools_alignment_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtop_read_count_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtop_unique_read_count_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtop_mean_read_count_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_sources.txt").exists() + ).match("multiqc_multiqc_data") } + ) + } + } + +} diff --git a/tests/test_skipfastp.nf.test.snap b/tests/test_skipfastp.nf.test.snap new file mode 100644 index 00000000..2352aaf1 --- /dev/null +++ b/tests/test_skipfastp.nf.test.snap @@ -0,0 +1,145 @@ +{ + "mirtrace": { + "content": [ + true, + "mirtrace-stats-contamination_basic.tsv:md5,34005177b3038da76236b5d8fb468825", + "mirtrace-stats-length.tsv:md5,847f38c5a6bbed4d3200ecaab05e090a", + "mirtrace-stats-mirna-complexity.tsv:md5,d175bfcbcdebbdf768217da0078ad055", + "mirtrace-stats-contamination_detailed.tsv:md5,141f6f46a4f9fcfac91303c125a2fc75", + "mirtrace-stats-phred.tsv:md5,b55f5d6fe7a79dafd618e13f66505dfe", + "mirtrace-stats-rnatype.tsv:md5,424cc10a33b239d01765f981dbe4b60c", + "mirtrace-stats-qcstatus.tsv:md5,7c41c36e544b9aafe68bbeade3de9428", + true, + "mirtrace-stats-contamination_basic.tsv:md5,109ca0f96520aaa21bea6f55dc1b4b03", + "mirtrace-stats-length.tsv:md5,e37044dc40483a893b577ebdbf4fa585", + "mirtrace-stats-mirna-complexity.tsv:md5,bd8e19d2c40c3b1dbfec4037ca288b14", + "mirtrace-stats-contamination_detailed.tsv:md5,d116d38bc104139f7496da66ce0a9436", + "mirtrace-stats-phred.tsv:md5,3e1b63adcc5acd334059ebbaa5141b72", + "mirtrace-stats-rnatype.tsv:md5,8a94743f0e853f8804ae225d4a588e6f", + 
"mirtrace-stats-qcstatus.tsv:md5,469f837e7403e160eaec4c1cf3af0ba3" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-11T14:28:00.975156754" + }, + "genome_quant_bam": { + "content": [ + "Clone1_N3_mature_hairpin_genome.sorted.flagstat:md5,22d4dffd7b6fc1d2ae7827de3fb68ad7", + "Clone1_N3_mature_hairpin_genome.sorted.idxstats:md5,2620288b88bba1ea3315414016c083a1", + "Clone1_N1_mature_hairpin_genome.sorted.idxstats:md5,e0e4a95f5c21a926f7894cf1fbe3110b", + "Clone1_N1_mature_hairpin_genome.sorted.flagstat:md5,62208acf0c7418d590b41318d7e17d67", + "Clone1_N3_mature_hairpin_genome.sorted.stats:md5,c9d4d7a91bdafdb45592574cd942a7ba", + "Clone1_N1_mature_hairpin_genome.sorted.stats:md5,3309ee9de501160337af8c73d46ec682" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:28:49.376604167" + }, + "software_versions": { + "content": [ + "{BOWTIE_MAP_GENOME={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_HAIRPIN={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_MATURE={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_SEQCLUSTER={bowtie=1.3.0, samtools=1.16.1}, CSVTK_JOIN={csvtk=0.30.0}, DATATABLE_MERGE={r-base=3.6.2}, FASTQC_RAW={fastqc=0.12.1}, FORMAT_HAIRPIN={fastx_toolkit=0.0.14}, FORMAT_MATURE={fastx_toolkit=0.0.14}, INDEX_HAIRPIN={bowtie=1.3.0}, INDEX_MATURE={bowtie=1.3.0}, MIRTOP_COUNTS={mirtop=0.4.28}, MIRTOP_EXPORT={mirtop=0.4.28}, MIRTOP_GFF={mirtop=0.4.28}, MIRTOP_STATS={mirtop=0.4.28}, MIRTRACE_QC={mirtrace=1.0.1}, PARSE_HAIRPIN={seqkit=2.6.1}, PARSE_MATURE={seqkit=2.6.1}, SAMTOOLS_FLAGSTAT={samtools=1.21}, SAMTOOLS_IDXSTATS={samtools=1.21}, SAMTOOLS_INDEX={samtools=1.21}, SAMTOOLS_SORT={samtools=1.21}, SAMTOOLS_STATS={samtools=1.21}, SEQCLUSTER_COLLAPSE={seqcluster=1.2.9}, Workflow={nf-core/smrnaseq=v2.4.0}}" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:28:49.241105443" + }, + "mirna_quant_bam": { + "content": [ + 
"Clone1_N3_mature_hairpin.sorted.flagstat:md5,1630edf055b591303d7c68d013745938", + true, + "Clone1_N3_mature_hairpin.sorted.stats:md5,488f860e7aed290a4707dbe9b07899b6", + true, + "Clone1_N3_mature_hairpin.sorted.idxstats:md5,b44fb26f6be2accc7d52bc38efff69f4", + true, + "Clone1_N3_mature.sorted.stats:md5,5263c781ef5db00ba664745ed595841f", + true, + true, + true, + "Clone1_N3_mature.sorted.idxstats:md5,bde0293f0938a8a074ad3ac633d8cb73", + "Clone1_N3_mature.sorted.flagstat:md5,9e287eb7ac83624b262864d0255217fd" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:28:49.30034837" + }, + "mirna_quant_edger_qc": { + "content": [ + true, + true, + true, + true, + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-08-30T20:38:26.540699935" + }, + "multiqc_multiqc_data": { + "content": [ + "fastqc-status-check-heatmap.txt:md5,a16737b9ae7b9b70b0ef7e462101a729", + "fastqc_overrepresented_sequences_plot.txt:md5,3c1ffe7d55bbf2815e6bc427a2d27a2c", + "mirtrace_complexity_plot.txt:md5,e27fb1e870985b3fe76744c027ce1c40", + "multiqc_citations.txt:md5,f46d2983044658a4a89bdec5ba20fda3", + "samtools-stats-dp.txt:md5,d1854b0ed73a4c9ae62a3a625c19d4b2", + "fastqc_sequence_length_distribution_plot.txt:md5,ff2def0eab8321d4ed590b483641f43b", + "multiqc_general_stats.txt:md5,3f5a47801a45a1b62bb519bea4896ab4", + "fastqc_per_base_n_content_plot.txt:md5,c345fe5430e3a17ad1dbcc14e7595f50", + "fastqc_per_sequence_quality_scores_plot.txt:md5,edf4d21e2928d37d94bb33a25e1d92a6", + "mirtrace_qc_plot.txt:md5,49c178fc849a1aa44781ddc67c85927c", + "mirtrace_length_plot.txt:md5,7023ffcd95379998adbd65204b9998ee", + true, + "fastqc_sequence_counts_plot.txt:md5,a8680d3b059401e71c7c0fe2404f5933", + "mirtrace_rna_categories_plot.txt:md5,606174b789bdb9841f8e99206d147bb9", + "samtools_alignment_plot.txt:md5,2778636ce5b6c9432b67014382fc35af", + 
"fastqc_per_base_sequence_quality_plot.txt:md5,60f539c88c503680c0b2603749494948", + "mirtop_read_count_plot.txt:md5,420dda9dbd3348b5c685e888dbe2a85a", + "mirtop_unique_read_count_plot.txt:md5,ade512aac4f23e63239b7db54d2544c8", + "mirtop_mean_read_count_plot.txt:md5,9f9cee399a861fd17f0627fc843b7c15", + "fastqc_sequence_duplication_levels_plot.txt:md5,7e7eb4105b8f963bdf68e422e4ebce67", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,7ac995de6a861676f64879b02d04f819", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,c18bf431a08ec1230720d83781e8903b", + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T16:48:44.52005314" + }, + "multiqc": { + "content": [ + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-08-30T20:38:26.610014871" + }, + "mirna_quant_mirtop": { + "content": [ + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-01T20:19:25.557700049" + } +} diff --git a/tests/test_umi.nf.test b/tests/test_umi.nf.test new file mode 100644 index 00000000..e2c4cff5 --- /dev/null +++ b/tests/test_umi.nf.test @@ -0,0 +1,135 @@ + +nextflow_pipeline { + + name "Test Workflow main.nf - test_umi" + script "main.nf" + profile "test_umi" + tag "umi" + tag "pipeline" + + test("test_umi") { + + when { + params { + outdir = "$outputDir" + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot(UTILS.removeNextflowVersion("$outputDir")).match("software_versions") }, + { assert workflow.trace.succeeded().size() == 74 }, + + { assert snapshot( + path("$outputDir/mirna_quant/bam/mature/SRX8195118_SRR11631014_mature.sorted.stats"), + path("$outputDir/mirna_quant/bam/mature/SRX8195118_SRR11631014_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/SRX8195118_SRR11631014_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/mature/SRX8195117_SRR11631013_mature.sorted.stats"), + 
path("$outputDir/mirna_quant/bam/mature/SRX8195117_SRR11631013_mature.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/mature/SRX8195117_SRR11631013_mature.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/SRX8195117_SRR11631013_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/SRX8195117_SRR11631013_mature_hairpin.sorted.flagstat"), + path("$outputDir/mirna_quant/bam/hairpin/SRX8195117_SRR11631013_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/SRX8195118_SRR11631014_mature_hairpin.sorted.stats"), + path("$outputDir/mirna_quant/bam/hairpin/SRX8195118_SRR11631014_mature_hairpin.sorted.idxstats"), + path("$outputDir/mirna_quant/bam/hairpin/SRX8195118_SRR11631014_mature_hairpin.sorted.flagstat"), + ).match("mirna_quant_bam") }, + + { assert snapshot( + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_unmapped_read_counts.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/hairpin_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_normalized_CPM.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_unmapped_read_counts.txt").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_counts.csv").exists(), + path("$outputDir/mirna_quant/edger_qc/mature_logtpm.txt").exists() + ).match("mirna_quant_edger_qc") }, + + { assert snapshot( + path("$outputDir/mirna_quant/mirtop/joined_samples_mirtop.tsv").exists(), + path("$outputDir/mirna_quant/mirtop/mirna.tsv").exists(), + ).match("mirna_quant_mirtop") }, + + { assert snapshot( + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-report.html").exists(), + 
path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/SRX8195117_SRR11631013/mirtrace-stats-rnatype.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-report.html").exists(), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-contamination_detailed.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-contamination_basic.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-length.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-qcstatus.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-mirna-complexity.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-phred.tsv"), + path("$outputDir/mirtrace/SRX8195118_SRR11631014/mirtrace-stats-rnatype.tsv") + ).match("mirtrace") }, + + { assert snapshot( + path("$outputDir/genome_quant/bam/SRX8195118_SRR11631014_mature_hairpin_genome.sorted.flagstat"), + path("$outputDir/genome_quant/bam/SRX8195117_SRR11631013_mature_hairpin_genome.sorted.flagstat"), + path("$outputDir/genome_quant/bam/SRX8195117_SRR11631013_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/SRX8195118_SRR11631014_mature_hairpin_genome.sorted.idxstats"), + path("$outputDir/genome_quant/bam/SRX8195118_SRR11631014_mature_hairpin_genome.sorted.stats"), + path("$outputDir/genome_quant/bam/SRX8195117_SRR11631013_mature_hairpin_genome.sorted.stats") + ).match("genome_quant_bam") }, + + { assert snapshot( + 
path("$outputDir/multiqc/multiqc_report.html").exists() + ).match("multiqc") }, + + { assert snapshot( + path("$outputDir/multiqc/multiqc_data/fastqc-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastp_filtered_reads_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc-1_sequence_counts_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_complexity_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_citations.txt"), + path("$outputDir/multiqc/multiqc_data/samtools-stats-dp.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-n-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_sequence_duplication_levels_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_sequence_quality_plot.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_general_stats.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-quality-plot_Read_1_After_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_qc_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_quality_scores_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_length_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastqc-1-status-check-heatmap.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_counts_plot.txt"), + 
path("$outputDir/multiqc/multiqc_data/mirtrace_rna_categories_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-quality-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/samtools_alignment_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtop_read_count_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-n-plot_Read_1_After_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/mirtop_unique_read_count_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtop_mean_read_count_plot.txt"), + path("$outputDir/multiqc/multiqc_data/mirtrace_contamination_check_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_adapter_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_per_sequence_gc_content_plot_Counts.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc-1_adapter_content_plot.txt"), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-gc-plot_Read_1_Before_filtering.txt"), + path("$outputDir/multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt"), + path("$outputDir/multiqc/multiqc_data/multiqc_sources.txt").exists(), + path("$outputDir/multiqc/multiqc_data/fastp-seq-content-gc-plot_Read_1_After_filtering.txt") + ).match("multiqc_multiqc_data") } + ) + } + } +} diff --git a/tests/test_umi.nf.test.snap b/tests/test_umi.nf.test.snap new file mode 100644 index 00000000..fb0b6d09 --- /dev/null +++ b/tests/test_umi.nf.test.snap @@ -0,0 +1,163 @@ +{ + "mirtrace": { + "content": [ + true, + "mirtrace-stats-contamination_detailed.tsv:md5,c805a20b67de8a9d4bf1d86a5c92a7e8", + "mirtrace-stats-contamination_basic.tsv:md5,e69bac9cdb8672286714ae046f47ba52", + 
"mirtrace-stats-length.tsv:md5,23c93df112644ec817ba44f2d1a288ac", + "mirtrace-stats-qcstatus.tsv:md5,789cbc8e576cc0ebc1b0d848ef67cc8a", + "mirtrace-stats-mirna-complexity.tsv:md5,bdf6ab8c20e9a6caa5b00ecd3946fda6", + "mirtrace-stats-phred.tsv:md5,5facbd5f028d6f8186c7636b8af831f1", + "mirtrace-stats-rnatype.tsv:md5,c4e4db698aa81f42f49c4d2368a7a735", + true, + "mirtrace-stats-contamination_detailed.tsv:md5,ff78979388fffe5e3805a7080e7a0987", + "mirtrace-stats-contamination_basic.tsv:md5,677d95c9df736edd47db2be36411213c", + "mirtrace-stats-length.tsv:md5,9e904ff06e6872fea420bd7a743b1fe9", + "mirtrace-stats-qcstatus.tsv:md5,c457e8b4da79605dbd33b48217b61412", + "mirtrace-stats-mirna-complexity.tsv:md5,cee0c7d45a47d9a33bbae3df822708db", + "mirtrace-stats-phred.tsv:md5,d1110059417aabd532a0aaac86db9612", + "mirtrace-stats-rnatype.tsv:md5,d1dc9a39d78e5c35764e006773bca76d" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-11T14:33:25.542680002" + }, + "genome_quant_bam": { + "content": [ + "SRX8195118_SRR11631014_mature_hairpin_genome.sorted.flagstat:md5,235383f64a943885f5d899f5b8e03eba", + "SRX8195117_SRR11631013_mature_hairpin_genome.sorted.flagstat:md5,977e88cbe62027285df73e1f7f9cd9bc", + "SRX8195117_SRR11631013_mature_hairpin_genome.sorted.idxstats:md5,cc0413bf90252c3b3af8926fd64bc873", + "SRX8195118_SRR11631014_mature_hairpin_genome.sorted.idxstats:md5,a4874de294706a7ead30258944ff2dad", + "SRX8195118_SRR11631014_mature_hairpin_genome.sorted.stats:md5,f1ff3ed8478070d92f068c10bc2c1222", + "SRX8195117_SRR11631013_mature_hairpin_genome.sorted.stats:md5,a058de81d9a8055bc40e345ae2ed660c" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:34:54.845245469" + }, + "software_versions": { + "content": [ + "{BOWTIE_MAP_GENOME={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_HAIRPIN={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_MATURE={bowtie=1.3.0, samtools=1.16.1}, BOWTIE_MAP_SEQCLUSTER={bowtie=1.3.0, 
samtools=1.16.1}, CSVTK_JOIN={csvtk=0.30.0}, DATATABLE_MERGE={r-base=3.6.2}, FASTP={fastp=0.23.4}, FASTP_LENGTH_FILTER={fastp=0.23.4}, FASTQC_RAW={fastqc=0.12.1}, FASTQC_TRIM={fastqc=0.12.1}, FORMAT_HAIRPIN={fastx_toolkit=0.0.14}, FORMAT_MATURE={fastx_toolkit=0.0.14}, INDEX_HAIRPIN={bowtie=1.3.0}, INDEX_MATURE={bowtie=1.3.0}, MIRTOP_COUNTS={mirtop=0.4.28}, MIRTOP_EXPORT={mirtop=0.4.28}, MIRTOP_GFF={mirtop=0.4.28}, MIRTOP_STATS={mirtop=0.4.28}, MIRTRACE_QC={mirtrace=1.0.1}, PARSE_HAIRPIN={seqkit=2.6.1}, PARSE_MATURE={seqkit=2.6.1}, SAMTOOLS_FLAGSTAT={samtools=1.21}, SAMTOOLS_IDXSTATS={samtools=1.21}, SAMTOOLS_INDEX={samtools=1.21}, SAMTOOLS_SORT={samtools=1.21}, SAMTOOLS_STATS={samtools=1.21}, SEQCLUSTER_COLLAPSE={seqcluster=1.2.9}, UMICOLLAPSE_FASTQ={umicollapse=1.0.0-1}, Workflow={nf-core/smrnaseq=v2.4.0}}" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:34:54.715037951" + }, + "mirna_quant_bam": { + "content": [ + "SRX8195118_SRR11631014_mature.sorted.stats:md5,d2e3a471dc1948dd546015623e4347f9", + "SRX8195118_SRR11631014_mature.sorted.flagstat:md5,57c6d477394d367ebae59f7267b430a5", + "SRX8195118_SRR11631014_mature.sorted.idxstats:md5,8b9cf0f1647b938f058b80522df24667", + "SRX8195117_SRR11631013_mature.sorted.stats:md5,e489a85f4bf467724d7841d7e843a224", + "SRX8195117_SRR11631013_mature.sorted.flagstat:md5,171387fb18ba9868e28ca03d24a7daca", + "SRX8195117_SRR11631013_mature.sorted.idxstats:md5,fb6c4000f82a66654b4f2a40570649b5", + "SRX8195117_SRR11631013_mature_hairpin.sorted.idxstats:md5,4e7c1c98804febf6210cee5e3941709e", + "SRX8195117_SRR11631013_mature_hairpin.sorted.flagstat:md5,b86bd14dc687a26ba5a84d1015f4b70a", + "SRX8195117_SRR11631013_mature_hairpin.sorted.stats:md5,b5d4e6934c5c07ace5ad9a7a4eeade08", + "SRX8195118_SRR11631014_mature_hairpin.sorted.stats:md5,990e41b7818b98083463b3bcd9862283", + "SRX8195118_SRR11631014_mature_hairpin.sorted.idxstats:md5,f4485713620f31d97a5006acdf6d8a5d", + 
"SRX8195118_SRR11631014_mature_hairpin.sorted.flagstat:md5,e0c44533bc7813d552de4864d997c916" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T23:34:54.777375775" + }, + "mirna_quant_edger_qc": { + "content": [ + true, + true, + true, + true, + true, + true, + true, + true, + true, + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-08-30T20:47:03.902417016" + }, + "multiqc_multiqc_data": { + "content": [ + "fastqc-status-check-heatmap.txt:md5,c1509fdd74b21a718fe099de64514995", + "fastp_filtered_reads_plot.txt:md5,54caeafa94c6ec8de2e5fda261aee04a", + true, + "fastqc-1_sequence_counts_plot.txt:md5,036a1ca02aa27567988d53bcefce1959", + "mirtrace_complexity_plot.txt:md5,5a860a872f793250b8c4482d031176a8", + "fastqc-1_per_sequence_gc_content_plot_Percentages.txt:md5,351f949c0abf4fb7587f3f5d9a28d461", + "multiqc_citations.txt:md5,02ab194a83114a3c2c22c2749cd27717", + "samtools-stats-dp.txt:md5,74808822577fb62efb39811272e6919e", + "fastqc_sequence_length_distribution_plot.txt:md5,8c34b57ec084e2da9d62c254c0a517f4", + "fastp-seq-content-n-plot_Read_1_Before_filtering.txt:md5,dfdb23f41359b8a6b84d6626a0474d02", + "fastqc-1_sequence_duplication_levels_plot.txt:md5,b5ae95ecd73055798ed70947dda3747c", + "fastqc-1_per_base_sequence_quality_plot.txt:md5,89adfa92b1cde0ad4e401b430bbc68ce", + "multiqc_general_stats.txt:md5,ec1bbecb2844d95f6282b893b6709a06", + "fastqc-1_per_base_n_content_plot.txt:md5,db081d3aa63007e5a78113f0fc26f27d", + "fastqc_per_base_n_content_plot.txt:md5,5b5b8cee3162d092c0bcddffbd000f34", + "fastp-seq-quality-plot_Read_1_After_filtering.txt:md5,66a47c7ce00ede2053f8e6eb20ec3417", + "fastqc_per_sequence_quality_scores_plot.txt:md5,3aa99649540afc898d32d2e49a364487", + "mirtrace_qc_plot.txt:md5,98a104b1e65164016ae4081b8815f33e", + "fastqc-1_per_sequence_quality_scores_plot.txt:md5,4108da6fe352558a652ee2b17d609e07", + "mirtrace_length_plot.txt:md5,440a84ce9bbdb89b736e4e2446382665", 
+ true, + "fastqc-1-status-check-heatmap.txt:md5,cb2ea844834808ae4c95c6440269cf2e", + "fastqc_sequence_counts_plot.txt:md5,78f80dbdcc711e490c779a998f94b69a", + "mirtrace_rna_categories_plot.txt:md5,57039a101b0062b1849dadf994df3a88", + "fastp-seq-quality-plot_Read_1_Before_filtering.txt:md5,f36b7cfd3057b26281367397db45033a", + "samtools_alignment_plot.txt:md5,b5d5a2f86d2b715f310fa8d6a008123d", + "fastqc_per_base_sequence_quality_plot.txt:md5,e2e187bc0b0c1f0d1abb3b666945c7b3", + "mirtop_read_count_plot.txt:md5,1a250ab7f8cab415c06fbc16714a1be7", + "fastp-seq-content-n-plot_Read_1_After_filtering.txt:md5,bbad2035ada86867c4ed579a93b78d64", + "mirtop_unique_read_count_plot.txt:md5,57463270ca5f6b5fec3f67bdf9dd1970", + "mirtop_mean_read_count_plot.txt:md5,77e38f9e8daea0f3fd2b8161d36403c5", + "mirtrace_contamination_check_plot.txt:md5,2e4a51b79b8d062ff195822bfd5a91a6", + "fastqc_adapter_content_plot.txt:md5,de1d7324ff5146b49fc9a2e6d4633962", + "fastqc_sequence_duplication_levels_plot.txt:md5,fe7598e49f93bb980a7675a2bb4bd3b5", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,c2f2f9282a50c3eef475664cc969b8ec", + "fastqc-1_per_sequence_gc_content_plot_Counts.txt:md5,15d8fa32e0c11ef0d3d10fc28370972c", + "fastqc-1_adapter_content_plot.txt:md5,89cd342fdc6fbba5f67078c9a2f0c684", + "fastp-seq-content-gc-plot_Read_1_Before_filtering.txt:md5,f832e92fb36db181ed1079be110edb2a", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,488e25de89d18d20f29b86f2580a8df9", + true, + "fastp-seq-content-gc-plot_Read_1_After_filtering.txt:md5,d673e3b18c40c5af1edccffba386d678" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T16:55:26.694937898" + }, + "multiqc": { + "content": [ + true + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "23.10.0" + }, + "timestamp": "2024-08-30T20:47:04.136759497" + }, + "mirna_quant_mirtop": { + "content": [ + true, + true + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": 
"2024-09-20T19:12:28.290360163" + } +} diff --git a/workflows/smrnaseq.nf b/workflows/smrnaseq.nf index 9252fec0..d7161c9b 100644 --- a/workflows/smrnaseq.nf +++ b/workflows/smrnaseq.nf @@ -3,23 +3,27 @@ IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ +// nf-core modules include { CAT_FASTQ } from '../modules/nf-core/cat/fastq/main' -include { CONTAMINANT_FILTER } from '../subworkflows/local/contaminant_filter' include { FASTQC } from '../modules/nf-core/fastqc/main' -include { FASTQ_FASTQC_UMITOOLS_FASTP } from '../subworkflows/nf-core/fastq_fastqc_umitools_fastp' include { FASTP as FASTP_LENGTH_FILTER } from '../modules/nf-core/fastp' -include { GENOME_QUANT } from '../subworkflows/local/genome_quant' -include { INDEX_GENOME } from '../modules/local/bowtie_genome' -include { MIRNA_QUANT } from '../subworkflows/local/mirna_quant' -include { MIRDEEP2 } from '../subworkflows/local/mirdeep2' -include { MIRTRACE } from '../subworkflows/local/mirtrace' +include { FASTP as FASTP3 } from '../modules/nf-core/fastp' include { MULTIQC } from '../modules/nf-core/multiqc/main' include { UMICOLLAPSE as UMICOLLAPSE_FASTQ } from '../modules/nf-core/umicollapse/main' include { UMITOOLS_EXTRACT } from '../modules/nf-core/umitools/extract/main' -include { UNTARFILES as UNTAR_BOWTIE_INDEX } from '../modules/nf-core/untarfiles' -include { paramsSummaryMap } from 'plugin/nf-validation' +include { MIRTRACE_QC } from '../modules/nf-core/mirtrace/qc/main' +// nf-core subworkflows +include { FASTQ_FASTQC_UMITOOLS_FASTP } from '../subworkflows/nf-core/fastq_fastqc_umitools_fastp' +include { FASTQ_FIND_MIRNA_MIRDEEP2 } from '../subworkflows/nf-core/fastq_find_mirna_mirdeep2/main' include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' +// local subworkflows +include { CONTAMINANT_FILTER } 
from '../subworkflows/local/contaminant_filter/main' +include { GENOME_QUANT } from '../subworkflows/local/genome_quant' +include { MIRNA_QUANT } from '../subworkflows/local/mirna_quant' +include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_smrnaseq_pipeline' +// plugins +include { paramsSummaryMap } from 'plugin/nf-schema' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -41,34 +45,32 @@ ch_fastp_adapters = Channel.fromPath(params.fastp_known_mirn workflow NFCORE_SMRNASEQ { take: - ch_input // channel: samplesheet file as specified to --input - ch_samplesheet // channel: sample fastqs parsed from --input - ch_versions // channel: [ path(versions.yml) ] + has_fasta // boolean + has_mirtrace_species // boolean + ch_mirna_adapters // channel: [ val(string) ] + ch_mirtrace_species // channel: [ val(string) ] + ch_reference_mature // channel: [ val(meta), path(fasta) ] + ch_reference_hairpin // channel: [ val(meta), path(fasta) ] + ch_mirna_gtf // channel: [ val(meta), path(gtf) ] + ch_fasta // channel: [ val(meta), path(fasta) ] + ch_bowtie_index // channel: [ val(meta), [ path(directory_index) ] ] + ch_rrna // channel: [ val(meta), path(fasta) ] + ch_trna // channel: [ val(meta), path(fasta) ] + ch_cdna // channel: [ val(meta), path(fasta) ] + ch_ncrna // channel: [ val(meta), path(fasta) ] + ch_pirna // channel: [ val(meta), path(fasta) ] + ch_other_contamination // channel: [ val(meta), path(fasta) ] + ch_versions // channel: [ path(versions.yml) ] + ch_samplesheet // channel: sample fastqs parsed from --input + ch_three_prime_adapter // channel: [ val(string) ] + ch_phred_offset // channel: [ val(string) ] main: - //Config checks - // Check optional parameters - if (!params.mirtrace_species) { - exit 1, "Reference species for miRTrace is not defined via the --mirtrace_species parameter." - } - - // Genome options - def mirna_gtf_from_species = params.mirtrace_species ? 
"https://mirbase.org/download/CURRENT/genomes/${params.mirtrace_species}.gff3" : false - def mirna_gtf = params.mirna_gtf ?: mirna_gtf_from_species - - if (!params.mirgenedb) { - if (params.mature) { reference_mature = file(params.mature, checkIfExists: true) } else { exit 1, "Mature miRNA fasta file not found: ${params.mature}" } - if (params.hairpin) { reference_hairpin = file(params.hairpin, checkIfExists: true) } else { exit 1, "Hairpin miRNA fasta file not found: ${params.hairpin}" } - } else { - if (params.mirgenedb_mature) { reference_mature = file(params.mirgenedb_mature, checkIfExists: true) } else { exit 1, "Mature miRNA fasta file not found: ${params.mirgenedb_mature}" } - if (params.mirgenedb_hairpin) { reference_hairpin = file(params.mirgenedb_hairpin, checkIfExists: true) } else { exit 1, "Hairpin miRNA fasta file not found: ${params.mirgenedb_hairpin}" } - if (params.mirgenedb_gff) { mirna_gtf = file(params.mirgenedb_gff, checkIfExists: true) } else { exit 1, "MirGeneDB gff file not found: ${params.mirgenedb_gff}"} - if (!params.mirgenedb_species) { exit 1, "MirGeneDB species not set, please specify via the --mirgenedb_species parameter"} - } + ch_multiqc_files = Channel.empty() // // Create separate channels for samples that have single/multiple FastQ files to merge // - ch_samplesheet + ch_fastq = ch_samplesheet .branch { meta, fastqs -> single : fastqs.size() == 1 @@ -76,24 +78,22 @@ workflow NFCORE_SMRNASEQ { multiple: fastqs.size() > 1 return [ meta, fastqs.flatten() ] } - .set { ch_fastq } - + // // MODULE: Concatenate FastQ files from same sample if required // CAT_FASTQ ( ch_fastq.multiple ) - .reads - .mix(ch_fastq.single) - .set { ch_cat_fastq } - + ch_cat_fastq = CAT_FASTQ.out.reads.mix(ch_fastq.single) ch_versions = ch_versions.mix(CAT_FASTQ.out.versions.first()) - mirna_adapters = params.with_umi ? 
[] : params.fastp_known_mirna_adapters - // // SUBWORKFLOW: Read QC, extract UMI and trim adapters & dedup UMIs if necessary / desired by the user // + if ( params.skip_fastp && params.skip_fastqc ) { + exit 1, "At least one of skip_fastp or skip_fastqc must be false" + } + FASTQ_FASTQC_UMITOOLS_FASTP ( ch_cat_fastq, params.skip_fastqc, @@ -101,36 +101,24 @@ workflow NFCORE_SMRNASEQ { params.skip_umi_extract_before_dedup, params.umi_discard_read, params.skip_fastp, - mirna_adapters, + ch_mirna_adapters, params.save_trimmed_fail, params.save_merged, params.min_trimmed_reads ) ch_versions = ch_versions.mix(FASTQ_FASTQC_UMITOOLS_FASTP.out.versions) - ch_fasta = params.fasta ? file(params.fasta): [] ch_reads_for_mirna = FASTQ_FASTQC_UMITOOLS_FASTP.out.reads - - // even if bowtie index is specified, there still needs to be a fasta. - // without fasta, no genome analysis. - if(params.fasta) { - //Prepare bowtie index, unless specified - //This needs to be done here as the index is used by GENOME_QUANT - if(params.bowtie_index) { - ch_fasta = Channel.fromPath(params.fasta) - if (params.bowtie_index.endsWith(".tar.gz")) { - UNTAR_BOWTIE_INDEX ( [ [], params.bowtie_index ]).files.map { it[1] }.set {ch_bowtie_index} - ch_versions = ch_versions.mix(UNTAR_BOWTIE_INDEX.out.versions) - } else { - Channel.fromPath("${params.bowtie_index}**ebwt", checkIfExists: true).ifEmpty{ error "Bowtie1 index directory not found: ${params.bowtie_index}" }.filter { it != null }.set { ch_bowtie_index } - } - } else { - INDEX_GENOME ( [ [:], ch_fasta ] ) - ch_versions = ch_versions.mix(INDEX_GENOME.out.versions) - ch_bowtie_index = INDEX_GENOME.out.index - // set to reformatted fasta as generated by `bowtie index` - ch_fasta = INDEX_GENOME.out.fasta - } + // Trim 3' end nucleotides after adapter is removed, otherwise they are not really trimmed + if (params.three_prime_clip_r1){ + FASTP3( + ch_reads_for_mirna, + [], + false, + false, + false + ) + ch_reads_for_mirna = FASTP3.out.reads } // UMI 
Dedup for fastq input @@ -147,7 +135,8 @@ workflow NFCORE_SMRNASEQ { // Filter out sequences smaller than params.fastp_min_length FASTP_LENGTH_FILTER ( UMITOOLS_EXTRACT.out.reads, - mirna_adapters, + ch_mirna_adapters, + false, params.save_trimmed_fail, params.save_merged ) @@ -159,18 +148,32 @@ workflow NFCORE_SMRNASEQ { // // MODULE: mirtrace QC // - FASTQ_FASTQC_UMITOOLS_FASTP.out.adapter_seq - .join( ch_reads_for_mirna ) - .dump() - .map { meta, adapter_seq, reads -> [adapter_seq, meta.id, reads] } - .groupTuple() - .set { ch_mirtrace_inputs } - // - // SUBWORKFLOW: MIRTRACE - // - MIRTRACE(ch_mirtrace_inputs) - ch_versions = ch_versions.mix(MIRTRACE.out.versions) + ch_mirtrace_config = ch_reads_for_mirna + .transpose() + .combine(ch_three_prime_adapter) + .combine(ch_phred_offset) + .collectFile { meta, reads, adapter, phred -> + def config_filename = "${meta.id}.data" + [ config_filename, "./${reads.getFileName().toString()},${meta.id},${adapter},${phred}\n" ] + } + .map { config_file -> + def base_name = config_file.getBaseName() + [ ['id':base_name], config_file ] + } + + ch_mirtrace_qc_inputs = ch_reads_for_mirna + .map{meta, reads -> [[id: meta.id], reads]} + .join(ch_mirtrace_config) + + if (has_mirtrace_species){ + + MIRTRACE_QC(ch_mirtrace_qc_inputs, ch_mirtrace_species) + ch_versions = ch_versions.mix(MIRTRACE_QC.out.versions) + + } else { + log.warn "The parameter --mirtrace_species is absent. MIRTRACE quantification skipped." 
+ } // // SUBWORKFLOW: remove contaminants from reads @@ -178,27 +181,27 @@ workflow NFCORE_SMRNASEQ { contamination_stats = Channel.empty() if (params.filter_contamination){ CONTAMINANT_FILTER ( - reference_hairpin, - params.rrna, - params.trna, - params.cdna, - params.ncrna, - params.pirna, - params.other_contamination, + ch_reference_hairpin, + ch_rrna, + ch_trna, + ch_cdna, + ch_ncrna, + ch_pirna, + ch_other_contamination, ch_reads_for_mirna ) - contamination_stats = CONTAMINANT_FILTER.out.filter_stats ch_versions = ch_versions.mix(CONTAMINANT_FILTER.out.versions) ch_reads_for_mirna = CONTAMINANT_FILTER.out.filtered_reads - } + //MIRNA_QUANT process should still run even if mirtrace_species is null, when mirgenedb is true MIRNA_QUANT ( - [ [:], reference_mature], - [ [:], reference_hairpin], - mirna_gtf, - ch_reads_for_mirna + ch_reference_mature, + ch_reference_hairpin, + ch_mirna_gtf, + ch_reads_for_mirna, + ch_mirtrace_species ) ch_versions = ch_versions.mix(MIRNA_QUANT.out.versions) @@ -206,23 +209,33 @@ workflow NFCORE_SMRNASEQ { // GENOME // genome_stats = Channel.empty() - if (params.fasta){ - GENOME_QUANT ( ch_bowtie_index, ch_fasta, MIRNA_QUANT.out.unmapped ) + if (has_fasta){ + GENOME_QUANT ( + ch_bowtie_index, + ch_fasta, + MIRNA_QUANT.out.unmapped + ) genome_stats = GENOME_QUANT.out.stats ch_versions = ch_versions.mix(GENOME_QUANT.out.versions) - hairpin_clean = MIRNA_QUANT.out.fasta_hairpin.map { it -> it[1] } - mature_clean = MIRNA_QUANT.out.fasta_mature.map { it -> it[1] } + ch_hairpin_clean = MIRNA_QUANT.out.fasta_hairpin.map { it -> it[1] } + ch_mature_clean = MIRNA_QUANT.out.fasta_mature.map { it -> it[1] } + + ch_mature_hairpin = ch_mature_clean + .combine(ch_hairpin_clean) + .map { mature, hairpin -> + [[id: 'mature_hairpin'], mature, hairpin, []] + } + .first() if (!params.skip_mirdeep) { - MIRDEEP2 ( - ch_reads_for_mirna, - GENOME_QUANT.out.fasta, - GENOME_QUANT.out.index.collect(), - hairpin_clean, - mature_clean - ) - ch_versions = 
ch_versions.mix(MIRDEEP2.out.versions) + FASTQ_FIND_MIRNA_MIRDEEP2 ( + ch_reads_for_mirna, + ch_fasta, + ch_bowtie_index, + ch_mature_hairpin, + ) + ch_versions = ch_versions.mix(FASTQ_FIND_MIRNA_MIRDEEP2.out.versions) } } @@ -230,18 +243,54 @@ workflow NFCORE_SMRNASEQ { // Collate and save software versions // softwareVersionsToYAML(ch_versions) - .collectFile(storeDir: "${params.outdir}/pipeline_info", name: 'nf_core_smrnaseq_software_mqc_versions.yml', sort: true, newLine: true) - .set {ch_collated_versions} + .collectFile( + storeDir: "${params.outdir}/pipeline_info", + name: 'nf_core_' + 'pipeline_software_' + 'mqc_' + 'versions.yml', + sort: true, + newLine: true + ).set { ch_collated_versions } + // .collectFile(storeDir: "${params.outdir}/pipeline_info", name: 'nf_core_smrnaseq_software_mqc_versions.yml', sort: true, newLine: true) + // .set {ch_collated_versions} + // // MODULE: MultiQC // ch_multiqc_report = Channel.empty() if (!params.skip_multiqc) { - summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + // summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + // ch_workflow_summary = Channel.value(paramsSummaryMultiqc(summary_params)) + + ch_multiqc_config = Channel.fromPath( + "$projectDir/assets/multiqc_config.yml", checkIfExists: true) + ch_multiqc_custom_config = params.multiqc_config ? + Channel.fromPath(params.multiqc_config, checkIfExists: true) : + Channel.empty() + ch_multiqc_logo = params.multiqc_logo ? + Channel.fromPath(params.multiqc_logo, checkIfExists: true) : + Channel.empty() + + summary_params = paramsSummaryMap( + workflow, parameters_schema: "nextflow_schema.json") ch_workflow_summary = Channel.value(paramsSummaryMultiqc(summary_params)) + ch_multiqc_custom_methods_description = params.multiqc_methods_description ? 
+ Channel.fromPath(params.multiqc_methods_description, checkIfExists: true) : + Channel.fromPath("$projectDir/assets/methods_description_template.yml", checkIfExists: true) + // ch_methods_description = Channel.value( + // methodsDescriptionText(ch_multiqc_custom_methods_description)) + ch_multiqc_files = Channel.empty() + ch_multiqc_files = ch_multiqc_files.mix( + ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml')) + ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions) + // ch_multiqc_files = ch_multiqc_files.mix( + // ch_methods_description.collectFile( + // name: 'methods_description_mqc.yaml', + // sort: true + // ) + // ) + ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions) ch_multiqc_files = ch_multiqc_files.mix(ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml')) ch_multiqc_files = ch_multiqc_files.mix(FASTQ_FASTQC_UMITOOLS_FASTP.out.fastqc_raw_zip.collect{it[1]}.ifEmpty([])) @@ -254,14 +303,20 @@ workflow NFCORE_SMRNASEQ { ch_multiqc_files = ch_multiqc_files.mix(genome_stats.collect({it[1]}).ifEmpty([])) ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.mature_stats.collect({it[1]}).ifEmpty([])) ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.hairpin_stats.collect({it[1]}).ifEmpty([])) - ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.mirtop_logs.collect().ifEmpty([])) - ch_multiqc_files = ch_multiqc_files.mix(MIRTRACE.out.results.collect().ifEmpty([])) + ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.mirtop_logs.collect({it[1]}).ifEmpty([])) + if (has_mirtrace_species){ + ch_multiqc_files = ch_multiqc_files.mix(MIRTRACE_QC.out.html.collect({it[1]}).ifEmpty([])) + ch_multiqc_files = ch_multiqc_files.mix(MIRTRACE_QC.out.json.collect({it[1]}).ifEmpty([])) + ch_multiqc_files = ch_multiqc_files.mix(MIRTRACE_QC.out.tsv.collect({it[1]}).ifEmpty([])) + } MULTIQC ( ch_multiqc_files.collect(), ch_multiqc_config.toList(), ch_multiqc_custom_config.toList(), - ch_multiqc_logo.toList() + 
ch_multiqc_logo.toList(), + [], + [] ) ch_multiqc_report = MULTIQC.out.report