For an explanation of tiny-plot's parameters in the Run Config and by commandline, see the parameters documentation.
tinyRNA automatically handles file inputs when tiny-plot is called as a step in a pipeline run. It is necessary to identify these files when running tiny-plot as a standalone step. The list of possible input files is lengthy, so you are free to specify only the subset of inputs that are required for your desired plot types. The dependencies per plot type are as follows:
Plot Type | Input File Commandline Argument(s) | Source |
---|---|---|
len_dist | --len-dist FILE FILE FILE ... | tiny-count |
rule_charts | --rule-counts FILE | tiny-count |
class_charts | --raw-counts FILE --summary-stats FILE | tiny-count |
replicate_scatter | --norm-counts FILE | tiny-deseq.r |
sample_avg_scatter_by_dge | --norm-counts FILE --dge-tables FILE FILE FILE ... | tiny-deseq.r |
sample_avg_scatter_by_dge_class | --norm-counts FILE --dge-tables FILE FILE FILE ... | tiny-deseq.r |
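The per-plot dependencies can also be written as a lookup, which is handy for working out the smallest set of inputs for a standalone call. This is a minimal sketch: the dictionary and function names are hypothetical and not part of tiny-plot, and only the plot type names and flags are taken from the table.

```python
# Hypothetical helper, not part of tiny-plot: maps each plot type to the
# input flags listed in the table above.
REQUIRED_INPUTS = {
    "len_dist": ["--len-dist"],
    "rule_charts": ["--rule-counts"],
    "class_charts": ["--raw-counts", "--summary-stats"],
    "replicate_scatter": ["--norm-counts"],
    "sample_avg_scatter_by_dge": ["--norm-counts", "--dge-tables"],
    "sample_avg_scatter_by_dge_class": ["--norm-counts", "--dge-tables"],
}


def required_flags(plot_types):
    """Return the de-duplicated input flags needed for the requested plot types."""
    flags = []
    for plot in plot_types:
        for flag in REQUIRED_INPUTS[plot]:
            if flag not in flags:
                flags.append(flag)
    return flags


print(required_flags(["replicate_scatter", "sample_avg_scatter_by_dge"]))
# prints ['--norm-counts', '--dge-tables']
```

Because both scatter plot types share --norm-counts, requesting them together adds only --dge-tables to the list.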
The distributions of 5' end nucleotides vs. sequence lengths can be used to assess the overall quality of your libraries. This can also be used for analyzing small RNA distributions in non-model organisms without annotations.
Two plots are produced for each replicate:
- Distribution of Mapped Reads, which are counted for every alignment reported in tiny-count's input SAM files
- Distribution of Assigned Reads, which are counted at each alignment where at least one overlapping feature passed selection and was assigned a portion of the sequence's original counts
Lengths are plotted over a continuous range, even if an intermediate length was not observed, and the bounds of this range can be assigned automatically or manually. Manual lengths can be assigned using plot_len_dist_min and plot_len_dist_max.
When tiny-plot is called as a step in a pipeline run, min and max bounds are determined independently in the following order of priority:
1. Manual assignment in the Run Config
2. From the corresponding optional entries for fastp (length_required and length_limit) in the Run Config
3. Automatic assignment from the data. Bounds are determined by considering the min/max lengths across all libraries such that all plots have the same bounds. This determination is performed separately for each plot subtype.
When tiny-plot is called as a standalone step, orders 1 and 3 are used. Manual assignment is performed via the equivalent commandline arguments in order 1.
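The priority order and the continuous length range can be sketched as follows. This is an illustration under stated assumptions rather than tiny-plot's implementation: the function names and the example counts are hypothetical, and the length distribution is simplified to a length-to-count mapping that ignores the 5' nucleotide.

```python
def resolve_bound(manual=None, fastp=None, observed=None, standalone=False):
    """Pick one bound (min or max) using the priority order described above.

    Hypothetical helper: the fastp value is skipped in standalone mode,
    where only manual and automatic assignment apply.
    """
    if manual is not None:
        return manual
    if not standalone and fastp is not None:
        return fastp
    return observed


def fill_lengths(counts_by_length, lo, hi):
    """Return counts for every length in [lo, hi], using 0 for unobserved lengths."""
    return {length: counts_by_length.get(length, 0) for length in range(lo, hi + 1)}


# Made-up example data: one length-to-count mapping per library.
libraries = [{21: 50, 22: 120, 24: 8}, {20: 5, 22: 90}]
observed_min = min(min(lib) for lib in libraries)
observed_max = max(max(lib) for lib in libraries)

# The min and max bounds are resolved independently of one another.
lo = resolve_bound(fastp=15, observed=observed_min)  # fastp entry takes priority
hi = resolve_bound(observed=observed_max)            # falls back to the data
filled = fill_lengths(libraries[0], lo, hi)
```

Taking the automatic bounds across all libraries, rather than per library, is what gives every plot the same x-axis range.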
Placeholder bases, e.g. N, will be reported if they are encountered at the 5' end. Otherwise only the 4 standard bases are reported.
Counts are assigned only to the features that meet selection criteria at each alignment locus. It is useful to see how each selection rule contributed to the overall assignment of counts. The rule_charts plot type shows the percentage of mapped reads that each rule contributed to the total assigned reads.
Rules are referred to by their row number in the Features Sheet and the first non-header row is considered rule 0. Rule N represents the percentage of mapped reads that were unassigned. Sources of unassigned reads include:
- A lack of features passing selection at alignment loci
- Alignments which do not overlap with any features
Percentage label darkness and bar colors reflect the magnitude of the rule's contribution. Magnitude is always considered on a 0-100% scale, rather than scaling down to the chart's view limits. These styles cannot be changed using a plot style sheet.
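As a rough sketch of the arithmetic behind the chart, the percentages can be computed from the mapped total and the counts assigned by each rule. The function names and numbers below are hypothetical, and the sketch assumes that each rule's assigned counts are disjoint portions of the mapped total, so the remainder is the unassigned share.

```python
def rule_percentages(mapped_total, assigned_by_rule):
    """Percentage of mapped reads per rule, plus "N" for unassigned reads."""
    percentages = {
        rule: 100 * count / mapped_total
        for rule, count in assigned_by_rule.items()
    }
    unassigned = mapped_total - sum(assigned_by_rule.values())
    percentages["N"] = 100 * unassigned / mapped_total
    return percentages


def magnitude(percentage):
    """Fraction on a fixed 0-100% scale, independent of the chart's view limits."""
    return percentage / 100


# Made-up counts: rule 0 is the first non-header row of the Features Sheet.
pcts = rule_percentages(1000, {0: 600, 1: 150})
shades = {rule: magnitude(pct) for rule, pct in pcts.items()}
```

With these numbers, a bar at 15% gets a magnitude of 0.15 even if the chart's axis only extends to 60%.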
Features can have multiple classes associated with them, so it is useful to see the proportions of counts by class. The class_charts plot type shows the percentage of mapped reads that were assigned to features by class. Each feature's associated classes are determined by the rules that it matched during Stage 1 selection, and are therefore determined by its GFF annotations.
This category represents the percentage of mapped reads that weren't assigned to any features. Sources of unassigned reads include:
- A lack of features passing selection at alignment loci
- Alignments which do not overlap with any features
You can customize this label using the unassigned class parameter.
This category represents the percentage of mapped reads that matched rules which did not have a specified Classify as... value. You can customize this label using the unknown class parameter.
Proportions in rule_charts and class_charts are plotted using the same function. Styles are the same between the two. See rule chart styles for more info.
Feature count comparisons between replicates can be used to assess the overall quality of your libraries. The replicate_scatter plot type shows these comparisons using DESeq2's normalized counts on Log2 scale axes. A plot is produced for all replicate combinations in each sample group.
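To get a sense of how many plots to expect, the replicate combinations can be enumerated as below. This sketch assumes that "all replicate combinations" means every unordered pair of replicates within a group; the function name and sample names are hypothetical.

```python
from itertools import combinations


def replicate_pairs(sample_groups):
    """Yield (group, rep_a, rep_b) for every pair of replicates within a group."""
    for group, replicates in sample_groups.items():
        for rep_a, rep_b in combinations(replicates, 2):
            yield group, rep_a, rep_b


# Made-up sample groups for illustration.
groups = {"control": ["rep_1", "rep_2", "rep_3"], "mutant": ["rep_1", "rep_2"]}
pairs = list(replicate_pairs(groups))
```

Under this assumption a group with three replicates yields three plots and a group with two replicates yields one.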
Differential gene expression between sample groups can be visualized with this plot type. Normalized feature counts from DESeq2 are averaged across replicates for each sample and plotted on Log2 scale axes. Features with significant differential expression will have their counts plotted with red points.
The P value cutoff can be changed (default: 0.05).
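The averaging and the significance test can be sketched as follows. The function names, column names, and values are hypothetical, and the strict "less than" comparison against the cutoff is an assumption rather than confirmed behavior.

```python
from statistics import mean


def sample_averages(norm_counts, replicates_by_sample):
    """Average each feature's normalized counts across the replicates of each sample."""
    return {
        feature: {
            sample: mean(row[rep] for rep in reps)
            for sample, reps in replicates_by_sample.items()
        }
        for feature, row in norm_counts.items()
    }


def significant_features(p_values, cutoff=0.05):
    """Features whose P value falls below the cutoff (strict '<' is an assumption)."""
    return {feature for feature, p in p_values.items() if p < cutoff}


# Made-up normalized counts and P values for two features.
norm_counts = {
    "featA": {"ctl_1": 10.0, "ctl_2": 14.0, "mut_1": 40.0, "mut_2": 44.0},
    "featB": {"ctl_1": 5.0, "ctl_2": 7.0, "mut_1": 6.0, "mut_2": 6.0},
}
replicates = {"ctl": ["ctl_1", "ctl_2"], "mut": ["mut_1", "mut_2"]}

averages = sample_averages(norm_counts, replicates)
red_points = significant_features({"featA": 0.001, "featB": 0.6})
```

Here featA would be drawn in red at (12.0, 42.0) before log scaling, while featB would keep the default color.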
The control condition is plotted on the x-axis, but it must be specified in your Samples Sheet prior to running an end-to-end or tiny recount analysis. If using tiny replot, it is not possible to change a no-control experiment to a control experiment and have these changes reflected in these plots. This is because tiny-deseq.r must be aware of the control condition in order to perform the proper directional comparisons.
Both the lower and upper bound of the plot's axes can be set manually. Unspecified bounds are automatically calculated to fit the data.
Due to the plot's log scale, points are not plotted for features that have 0 reads in one of the compared conditions. Zero-count features will be supported in a future release.
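The reason zero-count features drop out is that log2(0) is undefined. A minimal sketch of that filtering, with a hypothetical function name and made-up averages:

```python
import math


def log2_points(avg_counts, x_sample, y_sample):
    """Return {feature: (log2 x, log2 y)}, skipping features with a zero in either sample."""
    points = {}
    for feature, by_sample in avg_counts.items():
        x, y = by_sample[x_sample], by_sample[y_sample]
        if x > 0 and y > 0:
            points[feature] = (math.log2(x), math.log2(y))
    return points


# Made-up per-sample averages; featB has 0 reads in the control condition.
avg_counts = {
    "featA": {"ctl": 8.0, "mut": 32.0},
    "featB": {"ctl": 0.0, "mut": 5.0},
}
points = log2_points(avg_counts, "ctl", "mut")
```

Only featA survives the filter, so featB has no point on the plot regardless of its count in the other condition.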
The previous plot type can be extended to group and color differentially expressed features by class.
You can filter which classes are displayed using plot_class_scatter_filter.
If all features have 0 reads for a given class in one of the compared conditions, then that class is omitted from the plot and legend due to the plot's log scale. Zero-count classes will be supported in a future release.
If you find that two groups of interest share proximity and are too similar in color, you can change the group's color with a modified Plot Style Sheet. Group colors are assigned from the axes.prop_cycle color cycler when there are fewer groups than colors, or from the tab20 colormap when groups outnumber colors. First, the total list of unique classes is gathered from the counts table and sorted, and the resulting list of classes is assigned colors in the order produced by the cycler.
For example, changing the color of the miRNA group in the above plot means changing the 6th color in the axes.prop_cycle list (assuming all classes are represented in the plot). The P value outgroup is always the same color and doesn't affect the assignment process. See the config file documentation for more info about the Plot Style Sheet.
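The assignment order can be sketched in plain Python to help predict which color position a class will receive. This is an assumption-laden illustration, not tiny-plot's code: the function name is hypothetical, the two palettes are passed in as plain lists standing in for the axes.prop_cycle colors and the tab20 colormap, and both the handling of an exact tie in size and the wrap-around when a palette is exhausted are guesses.

```python
def assign_class_colors(classes, cycle_colors, fallback_colors):
    """Map sorted unique classes to colors in palette order.

    cycle_colors stands in for the axes.prop_cycle colors and fallback_colors
    for the tab20 colormap; both are plain lists so no plotting library is needed.
    """
    unique = sorted(set(classes))  # note: Python's default sort is case-sensitive
    # Assumption: an equal number of classes and colors still uses the cycler.
    palette = cycle_colors if len(unique) <= len(cycle_colors) else fallback_colors
    # Assumption: colors wrap around if the chosen palette is exhausted.
    return {cls: palette[i % len(palette)] for i, cls in enumerate(unique)}


# Made-up classes and palettes for illustration.
colors = assign_class_colors(
    ["siRNA", "miRNA", "piRNA", "miRNA"],
    cycle_colors=["red", "green", "blue"],
    fallback_colors=["c%d" % i for i in range(20)],
)
```

In this example miRNA sorts first and therefore takes the first color in the cycle, so recoloring it means editing the first entry of the list.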