Designing Reports

Irzam Sarfraz edited this page Aug 5, 2022 · 18 revisions

Introduction

One feature of singleCellTK is the generation of comprehensive reports. A report takes input data and the corresponding parameters and produces a PDF/HTML document that integrates a description of the overall process, the methods or algorithms used, the computed results, and related plots with a brief summary, all in a single easy-to-read document.

Why make reports?

  • The descriptive form makes it easier for users to understand the analysis (data, steps & results)
  • You don’t have to run each step separately (as with the GUI or console)
  • Everything, including the data description, input params, code for the analysis, description of the analysis, results (tables & figures) and summary of the results, is integrated in a single document
  • Easier to share your results or the analysis
  • Only needs input data and params

Examples of existing reports:

Differential Expression Report:

Load the data into an SCE object and define parameters such as the assay to use and the groups for differential expression. The report compiles the results of the differential expression analysis, including a table of the top differentially expressed features and various plots to visualize these features.

Seurat Report: (add figure)

Typically, you load data into an SCE object and use the ‘runSeuratReport’ function to run the workflow and generate a report. This runs all steps of the Seurat workflow, controlled by the parameters passed to the ‘runSeuratReport’ function.

The generated document contains a description of the input data followed by every step of the Seurat workflow. Each step includes a description of that step, the code for that part of the analysis, the resulting plots or tables, and a short summary of that part of the analysis that can be copied into papers or presentations.

The markup of the report is modular and flexible: if a user re-runs the analysis (and re-generates the report document) after changing the parameters of a specific part of the workflow, the report re-computes only that part and the parts that depend on it, saving computational resources and time. (give example of control params)
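A minimal sketch of the re-run logic described above, assuming a strictly linear workflow in which each step depends on the steps before it. The step names and the `stepsToRerun` helper are hypothetical and not part of singleCellTK; the actual reports may track dependencies differently.

```r
# Hypothetical workflow steps, listed in execution order.
# Assumption: each step depends on all steps before it.
steps <- c("normalize", "findVariableFeatures", "pca", "cluster")

# Return a named logical vector marking the steps that must be re-computed
# when the parameters of the steps named in `changed` have been modified.
stepsToRerun <- function(changed, steps) {
  stopifnot(length(changed) > 0, all(changed %in% steps))
  firstChanged <- min(match(changed, steps))
  stats::setNames(seq_along(steps) >= firstChanged, steps)
}

# Parameters of the PCA step changed, so PCA and everything after it re-runs.
rerun <- stepsToRerun("pca", steps)
print(rerun)
```

In this sketch only `pca` and `cluster` are flagged for re-computation; the results of the earlier steps would be reused.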

Overall structure of a report: (add a figure with function call explaining interactions)

  • Header Section: Title, authors, report output params, analysis params
  • Content: Analysis description, code, plots, tables
  • Summary: Paragraph that describes the analysis run and the results computed
  • SessionInfo
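The structure above might map onto an rmarkdown file roughly as follows. This is a sketch only: the parameter names (`object`, `useAssay`, `showCode`), the file name and the section text are illustrative assumptions, not the layout of an actual singleCellTK report. The skeleton is written out from R so that it can be generated and inspected directly.

```r
fence <- strrep("`", 3)  # chunk delimiter, built here to avoid nesting code fences

reportSkeleton <- c(
  # Header section: title, authors, report output params, analysis params
  "---",
  "title: \"Example Report\"",
  "author: \"Author Name\"",
  "output: html_document",
  "params:",
  "  object: !r NULL",       # placeholder; assumed to be supplied at render time
  "  useAssay: \"counts\"",  # hypothetical analysis parameter
  "  showCode: true",        # hypothetical control parameter
  "---",
  "",
  # Content: analysis description, code, plots, tables
  "## Analysis",
  "",
  "Description of the analysis step goes here.",
  "",
  paste0(fence, "{r analysis, echo=params$showCode}"),
  "# analysis code, plots and tables",
  fence,
  "",
  # Summary
  "## Summary",
  "",
  "Paragraph describing the analysis run and the results computed.",
  "",
  # Session information
  "## Session Information",
  "",
  paste0(fence, "{r sessionInfo, echo=FALSE}"),
  "sessionInfo()",
  fence
)

# Write the skeleton to a temporary file (hypothetical file name).
skeletonFile <- file.path(tempdir(), "reportExample.Rmd")
writeLines(reportSkeleton, skeletonFile)
```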

Process Steps: (make a figure)

a. If a short individual report (give examples):

  1. Create an rmarkdown file in the /rmarkdown folder (confirm dir)
  2. Insert/update the header section (add required analysis parameters and control parameters)
  3. Add content including library calls, description/introduction, analysis code, summary, sessionInfo
  4. Create an R function “reportAlgorithmName” that renders this rmarkdown file and stores the output file.

b. If a large report that calls a number of distinct functions, possibly a workflow:

  1. Divide the workflow into individual steps, where each individual step is a separate report (refer to a)
  2. Create an rmarkdown file for the workflow
  3. Insert/update the header section
  4. Add introduction, summary, sessionInfo.
  5. Compile the individual reports using the knit_child function and store the results in variables.
  6. Call the variables to display their content using cat() where appropriate.
  7. Create an R function “reportWorkflowName” that renders this overall workflow rmarkdown file and stores the output file.
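Steps 5 and 6 can be sketched as follows. The example assumes the knitr package is installed. The child file, the `normMethod` parameter and the manually defined `params` list are illustrative stand-ins; in a real report `params` comes from the rmarkdown header and the child would be an existing individual report. The parent document is knitted from text here only to keep the sketch self-contained.

```r
library(knitr)

fence <- strrep("`", 3)  # chunk delimiter, built here to avoid nesting code fences

# Hypothetical individual report for one workflow step.
childFile <- file.path(tempdir(), "childStep.Rmd")
writeLines(c(
  "## Normalization",
  "",
  paste0(fence, "{r childNormalize}"),
  "normMethod <- params$normMethod",
  fence,
  "",
  "Data were normalized with `r normMethod`."
), childFile)

# Hypothetical workflow report: compile the child into a variable (step 5),
# then print it with cat() in a chunk that uses results='asis' (step 6).
parentText <- c(
  "# Workflow report",
  "",
  paste0(fence, "{r compileChild, include=FALSE}"),
  "stepOut <- knitr::knit_child(childFile, quiet = TRUE)",
  fence,
  "",
  paste0(fence, "{r showChild, results='asis', echo=FALSE}"),
  "cat(stepOut, sep = '\\n')",
  fence
)

params <- list(normMethod = "LogNormalize")  # stand-in for header params
out <- knit(text = parentText, quiet = TRUE)
cat(out)
```

Keeping each compiled child in its own variable lets the workflow report decide where, and whether, to display each section.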

Report calling function:

An R function that can be called with input data and params from the R environment to generate a report. Typically, this function checks the validity of the input object and params (or sets default values), checks the output path, calls rmarkdown to render the report, and finally saves both the output object (which includes all computed data) and the report document to the defined directory.
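A minimal sketch of such a function, assuming the rmarkdown package is installed. The function name, its arguments, the parameter names passed to the report and the convention that the report stores its result in a variable called `outSCE` are all assumptions made for illustration; they are not the signature of an existing singleCellTK function.

```r
# Hypothetical report-calling function; all names are illustrative.
reportExample <- function(inSCE, useAssay = "counts",
                          outputFile = "reportExample.html",
                          outputPath = NULL,
                          rmdFile = "reportExample.Rmd") {
  # 1. Check the input object and params (or fall back to defaults).
  if (!inherits(inSCE, "SingleCellExperiment")) {
    stop("'inSCE' must be a SingleCellExperiment object.")
  }
  if (!is.character(useAssay) || length(useAssay) != 1) {
    stop("'useAssay' must be a single assay name.")
  }
  if (!file.exists(rmdFile)) {
    stop("Report template not found: ", rmdFile)
  }

  # 2. Check the output path; default to the working directory.
  if (is.null(outputPath)) outputPath <- getwd()
  if (!dir.exists(outputPath)) dir.create(outputPath, recursive = TRUE)

  # 3. Render the report in a fresh environment.
  reportEnv <- new.env(parent = globalenv())
  rmarkdown::render(
    input = rmdFile,
    output_file = outputFile,
    output_dir = outputPath,
    params = list(object = inSCE, useAssay = useAssay),
    envir = reportEnv,
    quiet = TRUE
  )

  # 4. Save the output object next to the report document.
  #    Assumption: the report assigns its result to 'outSCE'.
  outSCE <- if (exists("outSCE", envir = reportEnv, inherits = FALSE)) {
    get("outSCE", envir = reportEnv)
  } else {
    inSCE
  }
  rdsFile <- paste0(tools::file_path_sans_ext(outputFile), ".rds")
  saveRDS(outSCE, file.path(outputPath, rdsFile))
  invisible(outSCE)
}
```

Rendering in a separate environment keeps the objects created by the report out of the caller's workspace while still letting the function retrieve the result.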

(add figure that shows structure of this function)

Modular Structure for Reports

The modular structure, where a larger report is divided into smaller individual reports, helps with re-usability and easy maintenance of reports. See, for example, how the Seurat Workflow report re-uses individual reports for various sections of the report: (add figure from ppt) Tell benefits of modular structure and explain how they work (figures from ppt): https://drive.google.com/file/d/10BnuGvfJlCTRuK5bAE-Uk5BXieoX5XLS/view?usp=sharing

Control variables in reports:

Control variables differ from other parameters in that they do not affect the analysis, only the output document. They set the heading levels, show or hide the description and plots in a particular report, and determine whether the computation from an analysis is re-run or skipped when the document is re-rendered.
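The effect of control variables can be illustrated with a small sketch in base R. The helper `buildSection` and the control variables `headingLevel`, `showDescription` and `forceRun` are hypothetical names chosen for this example; the control variables used by the actual reports may be named and implemented differently.

```r
# Hypothetical helper that builds the markdown for one report section.
buildSection <- function(title, description, compute, store,
                         headingLevel = 2, showDescription = TRUE,
                         forceRun = FALSE) {
  # Control variable: heading level of this section.
  out <- paste(strrep("#", headingLevel), title)
  # Control variable: show or hide the description text.
  if (showDescription) out <- c(out, "", description)
  # Control variable: re-run the computation or reuse the stored result.
  if (forceRun || is.null(store$result)) store$result <- compute()
  list(markdown = out, result = store$result)
}

store <- new.env()  # holds the result between renders in this sketch
nRuns <- 0          # counts how often the computation is actually run
compute <- function() {
  nRuns <<- nRuns + 1
  sum(1:10)
}

# First render: computes the result and shows the description.
first <- buildSection("PCA", "Runs PCA.", compute, store)

# Re-render with different control variables: the layout changes,
# but the stored result is reused and the computation is skipped.
second <- buildSection("PCA", "Runs PCA.", compute, store,
                       headingLevel = 3, showDescription = FALSE)
```

Both calls return the same result; only the generated markdown differs, which is what distinguishes control variables from analysis parameters.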