Release 0.8.3
Showing 65 changed files with 1,924 additions and 1,948 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
docs/ | ||
|
||
mess.egg-info/ | ||
mess/__pycache__ | ||
build/ | ||
|
||
htmlcov/ | ||
.tox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*,cover | ||
tests/__pycache__ | ||
.pytest_cache | ||
|
||
.snakemake | ||
mess/workflow/conda | ||
mess/workflow/taxonkit |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
name: Docker publish | ||
|
||
on: | ||
push: | ||
branches: ["main"] | ||
tags: ["v*.*.*"] | ||
pull_request: | ||
branches: ["main"] | ||
|
||
env: | ||
REGISTRY: ghcr.io | ||
IMAGE_NAME: ${{ github.repository }} | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
permissions: | ||
contents: read | ||
packages: write | ||
id-token: write | ||
|
||
steps: | ||
- name: Checkout repository | ||
uses: actions/checkout@v4 | ||
|
||
- name: Install cosign | ||
if: github.event_name != 'pull_request' | ||
uses: sigstore/cosign-installer@v3.5.0 | ||
with: | ||
cosign-release: "v2.2.4" | ||
|
||
- name: Set up QEMU | ||
uses: docker/setup-qemu-action@v3 | ||
|
||
- name: Set up Docker Buildx | ||
uses: docker/setup-buildx-action@v3 | ||
|
||
- name: Log into registry ${{ env.REGISTRY }} | ||
if: github.event_name != 'pull_request' | ||
uses: docker/login-action@v3 | ||
with: | ||
registry: ${{ env.REGISTRY }} | ||
username: ${{ github.actor }} | ||
password: ${{ secrets.GITHUB_TOKEN }} | ||
|
||
- name: Extract Docker metadata | ||
id: meta | ||
uses: docker/metadata-action@v5 | ||
with: | ||
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} | ||
|
||
- name: Build and push Docker image | ||
id: build-and-push | ||
uses: docker/build-push-action@v5 | ||
with: | ||
context: . | ||
push: ${{ github.event_name != 'pull_request' }} | ||
tags: ${{ steps.meta.outputs.tags }} | ||
labels: ${{ steps.meta.outputs.labels }} | ||
cache-from: type=gha | ||
cache-to: type=gha,mode=max | ||
|
||
- name: Sign the published Docker image | ||
if: ${{ github.event_name != 'pull_request' }} | ||
env: | ||
TAGS: ${{ steps.meta.outputs.tags }} | ||
DIGEST: ${{ steps.build-and-push.outputs.digest }} | ||
run: echo "${TAGS}" | xargs -I {} cosign sign --yes {}@${DIGEST} |
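The signing step above pipes `TAGS` through `xargs -I {}`, which runs `cosign sign` once per input line with the image digest appended to each tag. A minimal Python sketch of the references this produces, assuming `TAGS` holds one tag per line; the tag and digest values below are hypothetical placeholders, not values from this repository:

```python
def signing_targets(tags: str, digest: str) -> list[str]:
    """Return one `tag@digest` reference per non-empty line of `tags`."""
    return [f"{tag.strip()}@{digest}" for tag in tags.splitlines() if tag.strip()]


# Hypothetical values, for illustration only.
example_tags = "ghcr.io/example/image:main\nghcr.io/example/image:v1.2.3"
example_digest = "sha256:" + "0" * 64

for ref in signing_targets(example_tags, example_digest):
    # Mirrors the command line that xargs would build for each tag.
    print("cosign sign --yes", ref)
```

Signing by digest rather than by tag alone pins each signature to the exact image that was just pushed.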
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
FROM mambaorg/micromamba | ||
LABEL org.opencontainers.image.source=https://github.com/metagenlab/MeSS | ||
LABEL org.opencontainers.image.description="Snakemake pipeline for simulating shotgun metagenomic samples" | ||
LABEL org.opencontainers.image.licenses=MIT | ||
ADD . /tmp/repo | ||
WORKDIR /tmp/repo | ||
ENV LANG=C.UTF-8 | ||
ENV SHELL=/bin/bash | ||
USER root | ||
|
||
RUN micromamba install -q -y -c bioconda -c conda-forge -n base \ | ||
mess --only-deps && \ | ||
micromamba install -q -y -c conda-forge -n base mamba && \ | ||
micromamba clean -afy | ||
|
||
ENV PATH=/opt/conda/bin:${PATH} | ||
RUN pip install . |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Benchmarks | ||
We benchmarked MeSS and CAMISIM, the state-of-the-art metagenome simulator, in terms of species composition and resource usage. | ||
|
||
We demonstrated that MeSS generates the same species composition as CAMISIM while being 10x faster. | ||
|
||
## [Species composition](species-composition.md) | ||
|
||
## [Resource usage](resource-usage.md) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
16 samples were used to benchmark the resource usage of MeSS and CAMISIM. | ||
|
||
Samples were created by subsampling 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 80, 160, 320, 640 genomes from a total of 2000 complete bacterial genomes (downloaded with [assembly_finder](https://github.com/metagenlab/assembly_finder)). | ||
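The subsampling scheme can be sketched as follows. This is an illustration only: it assumes each sample is drawn independently and without replacement, which the text does not specify, and the genome identifiers are hypothetical stand-ins for the 2000 downloaded assemblies.

```python
import random

# Sample sizes listed in the text: 1 to 10, then doubling from 20 up to 640.
SAMPLE_SIZES = list(range(1, 11)) + [20 * 2**i for i in range(6)]


def subsample(genomes, sizes, seed=0):
    """Draw one random subset of genomes (without replacement) per size."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {n: rng.sample(genomes, n) for n in sizes}


# Hypothetical identifiers standing in for the 2000 complete genomes.
pool = [f"genome_{i:04d}" for i in range(2000)]
samples = subsample(pool, SAMPLE_SIZES)
print(len(samples), "samples, largest has", len(samples[640]), "genomes")
```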
|
||
Each genome was covered at 1x using art_illumina with CAMISIM's custom MBARC error model. | ||
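As a rough guide to what 1x coverage means in read counts, the usual relation is reads = coverage × genome length / read length. The sketch below only shows this arithmetic; the 5 Mbp genome and 150 bp read length are assumed example values, not parameters taken from the benchmark.

```python
import math


def reads_for_coverage(genome_length: int, coverage: float, read_length: int) -> int:
    """Number of reads needed to reach the requested mean coverage."""
    return math.ceil(coverage * genome_length / read_length)


# Assumed example values: a 5 Mbp genome sequenced with 150 bp reads.
print(reads_for_coverage(5_000_000, 1.0, 150))
```

For paired-end data, the number of read pairs would be half of this figure.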
|
||
See [this nextflow pipeline](https://github.com/farchaab/benchmark-MeSS-CAMISIM) to run the benchmark. | ||
## Results | ||
### Physical RAM usage | ||
|
||
![ram](../images/ram-usage.svg) | ||
|
||
### CPU usage | ||
|
||
![cpu-usage](../images/cpu-usage.svg) | ||
|
||
### CPU time | ||
|
||
![cpu-time](../images/cpu-time.svg) | ||
|
||
!!! warning | ||
To simulate a sample with 2.4G base pairs, using one CPU, CAMISIM takes 32 hours, while MeSS takes 3 hours. | ||
|
||
## Conclusions | ||
MeSS vs CAMISIM on average: | ||
|
||
- [x] 5x more parallel (CPU usage) | ||
- [x] 10x faster using one CPU (CPU time) | ||
- [x] Uses 16.7x less memory (physical RAM) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
5 samples from the [Human Microbiome Project](https://www.hmpdacc.org/hmp/) were classified with [kraken2](https://github.com/DerrickWood/kraken2) and [bracken](https://github.com/jenniferlu717/Bracken). Taxa with at least 200 reads were kept and used as input to both MeSS and CAMISIM. | ||
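The read-count filter can be sketched as below. This is not the authors' code: it assumes the threshold is applied to a Bracken-style `new_est_reads` column, and the table contents are made up for illustration.

```python
import csv
import io

MIN_READS = 200  # threshold stated in the text

# Hypothetical Bracken-style table (tab-separated); the values are made up.
BRACKEN_TSV = (
    "name\ttaxonomy_id\tnew_est_reads\n"
    "Taxon A\t1\t1500\n"
    "Taxon B\t2\t199\n"
    "Taxon C\t3\t200\n"
)


def keep_taxa(tsv_text, min_reads=MIN_READS, column="new_est_reads"):
    """Return the rows whose read count is at least `min_reads`."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in rows if int(row[column]) >= min_reads]


kept = keep_taxa(BRACKEN_TSV)
print([row["name"] for row in kept])
```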
|
||
Use [this nextflow pipeline](https://github.com/farchaab/benchmark-MeSS-CAMISIM) to generate the fastqs. | ||
|
||
## Results | ||
|
||
[microViz](https://github.com/david-barnett/microViz/) was used for the ordination plots and statistical tests. | ||
|
||
### Bray-Curtis | ||
|
||
![bray](../images/species-bray-NMDS.svg) | ||
|
||
:material-arrow-right: Samples from the same bodysite cluster together. In addition, simulated samples cluster well with real samples (gold_standard and gs_filtered). | ||
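For reference, the Bray-Curtis dissimilarity between two abundance vectors is the sum of absolute differences divided by the sum of all abundances. The sketch below shows the formula on toy counts; any transformation applied to the real data before computing distances is not stated here and is not reproduced.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("at least one abundance must be non-zero")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Toy species counts for two hypothetical samples.
print(bray_curtis([10, 0, 5], [5, 5, 5]))
```

A value of 0 means identical composition and 1 means no shared taxa, which is why samples with similar composition sit close together in the ordination.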
|
||
### PERMANOVA | ||
|
||
:simple-hypothesis: **Null hypothesis** : No significant difference in species composition between simulated and non-simulated samples | ||
|
||
??? info "**Code**" | ||
```R | ||
perm <- dist_permanova(mdist, | ||
variables = "origin:simulated+body_site", | ||
n_perms = 999, | ||
n_processes = 3 | ||
) | ||
``` | ||
|
||
```R | ||
Df SumOfSqs R2 F Pr(>F) | ||
body_site 3 12.153 0.37843 15.6933 0.001 *** | ||
origin:simulated 3 1.117 0.03479 1.4429 0.067 . | ||
Residual 73 18.844 0.58678 | ||
Total 79 32.115 1.00000 | ||
``` | ||
|
||
:material-arrow-right: Significant difference between body sites (p = 0.001). No significant difference between simulated and real samples (p = 0.067). | ||
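The `R2` and `F` columns of the PERMANOVA table follow directly from the sums of squares and degrees of freedom: R2 is a term's share of the total sum of squares, and the pseudo-F is the term's mean square divided by the residual mean square. A short sketch that recomputes both from the rounded values printed in the table, so small rounding differences are expected:

```python
# Degrees of freedom and sums of squares copied from the PERMANOVA table.
TABLE = {
    "body_site": (3, 12.153),
    "origin:simulated": (3, 1.117),
    "Residual": (73, 18.844),
}
TOTAL_SS = 32.115


def pseudo_f(term):
    """Mean square of the term divided by the residual mean square."""
    df, ss = TABLE[term]
    res_df, res_ss = TABLE["Residual"]
    return (ss / df) / (res_ss / res_df)


def r_squared(term):
    """Fraction of the total sum of squares explained by the term."""
    return TABLE[term][1] / TOTAL_SS


for term in ("body_site", "origin:simulated"):
    print(term, round(pseudo_f(term), 2), round(r_squared(term), 3))
```

The p-values themselves come from the 999 permutations and cannot be recomputed from the table alone.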
|
||
### Beta dispersion | ||
|
||
:simple-hypothesis: **Null hypothesis** : No significant difference in dispersion between samples of different origin | ||
|
||
```R | ||
Fit: aov(formula = distances ~ group, data = df) | ||
|
||
$group | ||
diff lwr upr p adj | ||
gs_filtered-gold_standard 2.249163e-03 -0.03593552 0.04043384 0.9986690 | ||
camisim-gold_standard -2.310968e-02 -0.06129435 0.01507500 0.3905351 | ||
mess-gold_standard -2.308946e-02 -0.06127414 0.01509522 0.3913195 | ||
camisim-gs_filtered -2.535884e-02 -0.06354352 0.01282584 0.3082419 | ||
mess-gs_filtered -2.533862e-02 -0.06352330 0.01284606 0.3089344 | ||
mess-camisim 2.021632e-05 -0.03816446 0.03820490 1.0000000 | ||
``` | ||
|
||
:material-arrow-right: No significant difference in dispersion between filtered and non-filtered samples, or between simulated and real samples. | ||
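One way to read the Tukey output is to check, for each pairwise comparison, whether the adjusted p-value falls below the significance level and whether the confidence interval spans zero. The sketch below does this with the values copied from the output; the 0.05 level is an assumption, since no threshold is stated.

```python
# (comparison, lwr, upr, p adj) copied from the Tukey HSD output.
ROWS = [
    ("gs_filtered-gold_standard", -0.03593552, 0.04043384, 0.9986690),
    ("camisim-gold_standard", -0.06129435, 0.01507500, 0.3905351),
    ("mess-gold_standard", -0.06127414, 0.01509522, 0.3913195),
    ("camisim-gs_filtered", -0.06354352, 0.01282584, 0.3082419),
    ("mess-gs_filtered", -0.06352330, 0.01284606, 0.3089344),
    ("mess-camisim", -0.03816446, 0.03820490, 1.0000000),
]
ALPHA = 0.05  # assumed significance level


def significant(rows, alpha=ALPHA):
    """Names of comparisons whose adjusted p-value is below alpha."""
    return [name for name, _lwr, _upr, p_adj in rows if p_adj < alpha]


def spans_zero(rows):
    """True if every confidence interval contains zero."""
    return all(lwr <= 0 <= upr for _name, lwr, upr, _p in rows)


print(significant(ROWS), spans_zero(ROWS))
```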
|
||
## Conclusions | ||
|
||
- [x] Same species composition between original and filtered samples | ||
- [x] Same species composition between MeSS and CAMISIM | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
# Citation | ||
|
||
-![`mess citation`](docs/images/mess-citation.svg) | ||
+![`mess citation`](images/mess-citation.svg) |