Required resources

Stephany Orjuela edited this page Nov 5, 2019 · 8 revisions

The figure shows the resources required to generate each output file (grouped by Snakemake rule) in the real data walk-through, as reported by Snakemake's benchmark directive. The four panels show the amount of data read and written (in MB), the memory usage (in MB), and the run time (in seconds). RSS = Resident Set Size. IO = Input/Output.
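For readers who want to inspect such numbers themselves, the following is a minimal sketch of how a benchmark file written by Snakemake's benchmark directive could be parsed. It assumes the tab-separated layout with the columns `s`, `max_rss`, `io_in` and `io_out` that Snakemake documents for benchmark files. The function name `read_benchmark` and the example values are made up for illustration and are not taken from the ARMOR runs.

```python
import csv
import io


def read_benchmark(handle):
    """Parse a Snakemake benchmark TSV and return one dict per recorded run."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            "seconds": float(row["s"]),            # wall-clock run time
            "max_rss_mb": float(row["max_rss"]),   # peak resident set size
            "io_in_mb": float(row["io_in"]),       # data read
            "io_out_mb": float(row["io_out"]),     # data written
        })
    return rows


# Hypothetical file content, for illustration only.
example = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\n"
    "12.5\t0:00:12\t100.0\t200.0\t90.0\t95.0\t1.5\t2.5\t50.0\n"
)
rows = read_benchmark(io.StringIO(example))
```

In practice the handle would come from `open()` on a file produced by a rule's `benchmark:` entry. The sketch does not handle missing values, which may need extra care on systems where some measurements are unavailable.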

The code for the real data walk-through is in the chiron_readataworkflow branch of the ARMOR GitHub repository. More details can be found in the "Real data walk-through" section of our preprint.

Note that for the most memory-intensive processes (STAR index building and alignment), the memory usage is largely independent of the number of reads and instead depends mainly on the size of the reference genome. The figure above shows that the memory usage is similar for all STAR alignment runs, although the number of reads varies between 12.5 and 41 million across the samples. Similarly, the execution time for the most time-consuming steps (STAR index building and DRIMSeq DTU analysis) is largely independent of the number of reads. Thus, the required resources cannot be predicted from the number of reads alone.