tiny-count is a counting utility that allows for hierarchical assignment of small RNA reads to features based on user-defined selection rules. This tutorial offers an introductory procedure for setting up and running tiny-count using your own data files.
If you instead want to use the tinyRNA workflow, where tiny-count execution is handled automatically, please see the other tutorial.
Standalone installation requires conda. If conda is already installed, you can install tiny-count from the bioconda channel. See the tiny-count installation section in the README for instructions.
Alternatively, if you have already installed tinyRNA, you can use the tiny-count command within the tinyrna conda environment.
Gather the following files for the analysis:
- SAM or BAM files containing small RNA reads aligned to a reference genome, one file per sample
- GFF3 or GFF2/GTF file(s) containing annotations for features that you want to assign reads to
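Before configuring the run, it can help to confirm that the gathered files exist and have the expected extensions. The following is a minimal sketch of such a check. The helper name `check_inputs`, the extension sets, and the placeholder file names are assumptions made for illustration. This check is not part of tiny-count, and tiny-count may accept or reject files by different criteria.

```python
from pathlib import Path

# Extensions inferred from the file types listed above (an assumption, not
# tiny-count's own validation logic).
ALIGNMENT_EXTS = {".sam", ".bam"}
ANNOTATION_EXTS = {".gff", ".gff2", ".gff3", ".gtf"}


def check_inputs(paths, allowed_exts):
    """Return a list of problems found; an empty list means every path looks usable."""
    problems = []
    for p in map(Path, paths):
        if p.suffix.lower() not in allowed_exts:
            problems.append(f"{p}: unexpected extension {p.suffix!r}")
        elif not p.is_file():
            problems.append(f"{p}: file not found")
    return problems


if __name__ == "__main__":
    # Placeholder file names; substitute your own.
    for msg in check_inputs(["sample1.sam", "sample2.bam"], ALIGNMENT_EXTS):
        print(msg)
    for msg in check_inputs(["annotations.gff3"], ANNOTATION_EXTS):
        print(msg)
```

An empty result from both calls means every listed file was found and has one of the expected extensions.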
First, you'll need to obtain template copies of the configuration files. Start by activating the conda environment where tiny-count is installed, then run the following command:
tiny-count --get-templates
Next, fill out the configuration files that were copied:
Samples Sheet: Edit this file to add the paths to your SAM or BAM files and to define the group name, replicate number, and other details for each sample.
Paths File: Edit this file to add the paths to your GFF annotation(s) under the gff_files key. You can leave the alias key as-is for now. All other keys in this file are used in the tinyRNA workflow.
Features Sheet: Edit this file to define the selection rules for assigning reads to features. For now, we'll add a fully permissive rule:
Select for... | with value... | Classify as... | Source Filter | Type Filter | Hierarchy | Strand | 5' End Nucleotide | Length | Overlap |
---|---|---|---|---|---|---|---|---|---|
Any | Any | Any | 0 | Both | Any | Any | Partial |
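To give a feel for what the Hierarchy column does once more than one rule is defined, here is a conceptual sketch. It is not tiny-count's implementation. It assumes that a smaller hierarchy value takes priority and that matches tied at the best value share the read's count equally; check the README for the exact behavior. The function `assign_read` and the feature names are hypothetical.

```python
from collections import defaultdict


def assign_read(read_count, matches):
    """Split one read's count among the matches that share the best hierarchy value.

    `matches` is a list of (feature_id, hierarchy) pairs for rules the read satisfied.
    Assumption: a smaller hierarchy value means higher priority, and ties share
    the count equally.
    """
    if not matches:
        return {}
    best = min(h for _, h in matches)
    winners = sorted({f for f, h in matches if h == best})
    share = read_count / len(winners)
    return {f: share for f in winners}


counts = defaultdict(float)
# Hypothetical read sequenced 6 times that matches three features.
for feature, share in assign_read(6, [("featA", 0), ("featB", 0), ("featC", 1)]).items():
    counts[feature] += share

print(dict(counts))  # {'featA': 3.0, 'featB': 3.0}
```

With the single permissive rule above, every match has the same hierarchy value, so no match outranks another.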
Now you're ready to run tiny-count. Make sure you've activated the conda environment where tiny-count is installed, then run the following command:
tiny-count --paths-file paths.yml
A new directory will be created with the name tiny-count_{timestamp}, where {timestamp} is the date and time at the start of the run. This directory holds the counting results along with a config subdirectory for auto-documenting the run's configuration. The primary output is feature_counts.csv, a table of classified counts per feature. You can read about the other file outputs in the Counts and Pipeline Statistics section of the README.
Now that you've run tiny-count, you can edit the configuration files to customize the analysis. For example, you can make your selection rule more specific, add more selection rules with similar or different hierarchy values, or add more GFF files to the Paths File. You can also add more samples to the Samples Sheet and run tiny-count again to include them in the output.