Update README.md (#353)
Added optional input for BAM basename and updated documentation
ekiernan authored Jun 11, 2020
1 parent 94e459b commit af305aa
Showing 3 changed files with 16 additions and 12 deletions.
11 changes: 6 additions & 5 deletions pipelines/optimus/Optimus.changelog.md
@@ -1,16 +1,17 @@
# optimus_v3.0.0
2020-06-10 (Date of Last Commit)

-* Removed zarr formatted matrix and metrics outputs and replaced with Loom
-* Removed emptyDrops for sn_rna mode
-* Updated Loom file attribute names: CellID to cell_names, Gene to gene_names, and Accession to ensembl_ids
+* Removed the Zarr-formatted matrix and metrics outputs and replaced them with Loom
+* Removed EmptyDrops for sn_rna mode
+* Updated the Loom file attribute names: CellID to cell_names, Gene to gene_names, and Accession to ensembl_ids
* Added metrics for mitochondrial reads
+* Added an optional input for the BAM basename; this input is listed as 'output_bam_basename' and the default is 'sample_id'

# optimus_v2.0.0
2020-02-08 (Date of Last Commit)

-* Fixed bug that resulted in emptyDrops output being incorrect
-* Updated workflow to WDL 1.0
+* Fixed a bug that resulted in emptyDrops output being incorrect
+* Updated the workflow to WDL 1.0

# optimus_v1.4.0

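The optimus_v3.0.0 entry above renames three Loom attributes. For downstream scripts that still use the old names, the rename can be sketched as a lookup table. This is a minimal illustration, not part of the pipeline: the helper name `rename_loom_attributes` is hypothetical, and a plain dict stands in for the Loom attributes.

```python
# Old -> new Loom attribute names, taken from the optimus_v3.0.0 changelog entry.
LOOM_ATTRIBUTE_RENAMES = {
    "CellID": "cell_names",
    "Gene": "gene_names",
    "Accession": "ensembl_ids",
}


def rename_loom_attributes(attributes):
    """Return a copy of `attributes` with pre-v3.0.0 keys renamed.

    `attributes` is a plain dict standing in for Loom attributes (an
    assumption for illustration). Keys without a rename are kept unchanged.
    """
    return {
        LOOM_ATTRIBUTE_RENAMES.get(key, key): value
        for key, value in attributes.items()
    }
```

Code that reads Optimus Loom files produced before and after v3.0.0 could pass attribute dicts through such a helper so that later steps only see the new names.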
3 changes: 2 additions & 1 deletion pipelines/optimus/Optimus.wdl
@@ -31,6 +31,7 @@ workflow Optimus {
Array[File] r2_fastq
Array[File]? i1_fastq
String sample_id
+String? output_bam_basename = sample_id

# organism reference parameters
File tar_star_reference
@@ -222,7 +223,7 @@ workflow Optimus {
  call Merge.MergeSortBamFiles as MergeSorted {
    input:
      bam_inputs = PreMergeSort.bam_output,
-      output_bam_filename = sample_id + ".bam",
+      output_bam_filename = output_bam_basename + ".bam",
      sort_order = "coordinate"
  }
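
The effect of the change above on the merged BAM filename can be sketched in Python. This mirrors the WDL logic under the assumption that an unset `output_bam_basename` falls back to `sample_id`; the function name `merged_bam_filename` is hypothetical and does not exist in the pipeline.

```python
def merged_bam_filename(sample_id, output_bam_basename=None):
    """Mimic the filename passed to MergeSortBamFiles after this change.

    If no basename is supplied, fall back to the sample ID, which matches
    the default declared in the workflow inputs.
    """
    basename = output_bam_basename if output_bam_basename is not None else sample_id
    return basename + ".bam"
```

With only a sample ID the filename is unchanged from the previous behavior; supplying a basename overrides it without touching `sample_id`.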

14 changes: 8 additions & 6 deletions pipelines/optimus/README.md
@@ -37,6 +37,10 @@ Optimus is a pipeline developed by the Data Coordination Platform (DCP) of the [

Optimus has been validated for analyzing both [human](https://github.com/HumanCellAtlas/skylab/blob/master/benchmarking/optimus/optimus_report.rst) and [mouse](https://docs.google.com/document/d/1_3oO0ZQSrwEoe6D3GgKdSmAQ9qkzH_7wrE7x6_deL10/edit) data sets. More details about the human validation can be found [in the original file](https://docs.google.com/document/d/158ba_xQM9AYyu8VcLWsIvSoEYps6PQhgddTr9H0BFmY/edit).

+| **Update on Single Nuclei RNAseq (sn_rna) Pipeline** |
+| --- |
+| We are in the process of validating Optimus for snRNAseq using the `sn_rna` parameter. These changes are detailed in the documentation. Once the pipeline is validated for snRNAseq, we will provide a link to the validation report in the section above. |
+
## Quick Start Table

| Pipeline Features | Description | Source |
@@ -90,6 +94,7 @@ The JSON file also contains metadata for the reference information in the follow
| Annotations_gtf | Cloud path to GTF containing gene annotations used for gene tagging (must match GTF in STAR reference) | NA |
| Chemistry | Optional string describing whether the data was generated with 10x v2 or v3 chemistry. Optimus validates this string; if it does not match one of the accepted values, the pipeline will fail. You can remove the check by setting "force_no_check = true" in the input JSON | "tenX_v2" (default) or "tenX_v3" |
| Counting_mode | String description of whether data is single-cell or single-nuclei | "sc_rna" or "sn_rna" |
+| Output_bam_basename | Optional string used for the output BAM file basename; the default is sample_id | NA |

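As a rough sketch of how the new optional input might appear in an input JSON, the Python below builds the two relevant entries. It assumes the usual `<workflow>.<input>` key convention for WDL inputs, with the workflow name `Optimus` and the input name `output_bam_basename` taken from the WDL in this commit. The sample values and the helper name `optimus_inputs` are hypothetical, and all other required inputs are omitted.

```python
import json


def optimus_inputs(sample_id, output_bam_basename=None):
    """Build a partial inputs dict covering only the two inputs discussed here."""
    inputs = {"Optimus.sample_id": sample_id}
    # Leave the optional key out entirely so the workflow default applies.
    if output_bam_basename is not None:
        inputs["Optimus.output_bam_basename"] = output_bam_basename
    return inputs


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(json.dumps(optimus_inputs("pbmc_1", "pbmc_1_rerun"), indent=2))
```

Omitting the key, rather than setting it to an empty string, keeps the default behavior of naming the BAM after the sample ID.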

### Sample Inputs for Analyses in a Terra Workspace
@@ -193,23 +198,20 @@ Output files of the pipeline include:
3. Cell metadata, including cell metrics
4. Gene metadata, including gene metrics

The following table lists the output files produced by the pipeline. For samples that have been sequenced over multiple lanes, the pipeline will output one merged version of each listed file.

| Output Name | Filename, if applicable | Output Type |Output Format |
| ------ |------ | ------ | ------ |
| pipeline_version | | Version of the processing pipeline run on this data | String |
-| bam | merged.bam | aligned bam | bam |
+| bam | <sample_id>.bam | Aligned BAM | BAM |
| matrix_row_index | sparse_counts_row_index.npy | Index of cells in expression matrix | Numpy array index |
| matrix_col_index | sparse_counts_col_index.npy | Index of genes in expression matrix | Numpy array index |
| cell_metrics | merged-cell-metrics.csv.gz | Matrix of metrics by cells | Compressed CSV |
| gene_metrics | merged-gene-metrics.csv.gz | Matrix of metrics by genes | Compressed CSV |
| loom_output_file | output.loom | Loom file with expression data and metadata | Loom |


-The Loom is the default output. See the [create_loom_optimus.py](https://github.com/HumanCellAtlas/skylab/blob/master/docker/loom-output/create_loom_optimus.py) for the detailed code.
-
-
-The final Loom output contains the unnormalized (unfiltered), UMI-corrected expression matrices, as well as the gene and cell metrics detailed in the [Loom_schema documentation](https://github.com/HumanCellAtlas/skylab/blob/master/pipelines/optimus/Loom_schema.md).
+The Loom is the default output. See the [create_loom_optimus.py](https://github.com/HumanCellAtlas/skylab/blob/master/docker/loom-output/create_loom_optimus.py) for the detailed code. The final Loom output contains the unnormalized (unfiltered), UMI-corrected expression matrices, as well as the gene and cell metrics detailed in the [Loom_schema documentation](https://github.com/HumanCellAtlas/skylab/blob/master/pipelines/optimus/Loom_schema.md).

| Zarr Array Deprecation Notice June 2020 |
| --- |