Pipeline upload

robes edited this page Jul 16, 2018 · 1 revision

UNDER CONSTRUCTION

The upload configuration for data generated and submitted by the bioinformatics pipeline differs slightly from the normal upload configuration. This procedure is not intended for normal data submissions to FaceBase.

Files should be organized in the following directory structure:

/<pipeline_rid>/<replicate_rid>/proc/<mapping_assembly>/filenames...

Where:

  • pipeline_rid: the RID of the pipeline metadata record in the FaceBase database. We are currently testing this on our staging host, where the RID to use is 1-4H6C;
  • replicate_rid: the RID of the replicate whose data was downloaded from the database and processed (one directory per replicate);
  • proc: a constant that stands for "processed data", since the pipeline by definition produces only processed data from the raw sequences;
  • mapping_assembly: the name of the mapping assembly (a.k.a. reference genome), such as mm10, hg18, or hg19;
  • filenames...: the processed data files, which may include BAM, BAI, count, TSV, FastQC, BED, bigBed, and bigWig files.
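
As an illustration of the layout above, here is a minimal Python sketch that assembles an upload path from its components. The function name `pipeline_upload_path`, the replicate RID `1-XXXX`, and the file name `sample.bam` are placeholders invented for this example; only the directory structure, the `proc` constant, and the staging pipeline RID `1-4H6C` come from this page. It is not part of the pipeline or of any FaceBase tool.

```python
from pathlib import PurePosixPath

# Constant directory name for processed data, as described on this page.
PROC_DIR = "proc"


def pipeline_upload_path(pipeline_rid, replicate_rid, mapping_assembly, filename):
    """Build /<pipeline_rid>/<replicate_rid>/proc/<mapping_assembly>/<filename>.

    Hypothetical helper for illustration only.
    """
    components = {
        "pipeline_rid": pipeline_rid,
        "replicate_rid": replicate_rid,
        "mapping_assembly": mapping_assembly,
        "filename": filename,
    }
    for name, value in components.items():
        # Each component must be a single, non-empty path segment.
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return PurePosixPath(
        "/", pipeline_rid, replicate_rid, PROC_DIR, mapping_assembly, filename
    )


if __name__ == "__main__":
    # "1-XXXX" and "sample.bam" are placeholder values, not real records.
    print(pipeline_upload_path("1-4H6C", "1-XXXX", "mm10", "sample.bam"))
    # prints /1-4H6C/1-XXXX/proc/mm10/sample.bam
```

Checking each component before joining keeps an empty or slash-containing value from silently shifting files into the wrong level of the hierarchy.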