GuyKha edited this page Jul 28, 2016 · 49 revisions

General files that are created during processing

Processing time is limited to 6 hours, except for SnpCgh microarray projects, which are limited to 1 hour of processing.
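The limits above can be written down as a small lookup. This is only a minimal sketch: the names `DEFAULT_LIMIT_SECONDS`, `PROCESSING_LIMITS_SECONDS`, `limit_for`, and `has_timed_out` are hypothetical, timestamps are assumed to be Unix seconds, and how the pipeline actually enforces the limit is not described on this page.

```python
import time

# Limits taken from the text above; all names here are hypothetical.
DEFAULT_LIMIT_SECONDS = 6 * 60 * 60        # up to 6 hours for most data types
PROCESSING_LIMITS_SECONDS = {
    "SnpCgh microarray": 1 * 60 * 60,      # microarray projects: 1 hour
}


def limit_for(data_type):
    """Return the processing time limit in seconds for a data type."""
    return PROCESSING_LIMITS_SECONDS.get(data_type, DEFAULT_LIMIT_SECONDS)


def has_timed_out(data_type, start_time, now=None):
    """Return True if processing has run longer than its limit.

    start_time and now are assumed to be Unix timestamps in seconds.
    """
    if now is None:
        now = time.time()
    return (now - start_time) > limit_for(data_type)
```

Any data type not listed in the lookup falls back to the 6 hour default, which matches the "except for" wording of the rule.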

Note: the original uploaded file and, in the case of an archive, the files inside it are deleted by process_input_files.php.

  1. process_log.text - log file for the project processing.

  2. working.txt - generated at the start of processing; contains the processing start timestamp and signals to the front end that the installation is still running.

  3. working_done.txt - when the installation is done, working.txt is renamed to working_done.txt.

  4. condensed_log.txt - condensed process log; contains the process levels to be displayed in the UI during processing.

  5. completed.txt - created when processing is done; contains the timestamp of when processing finished.

  6. error.txt - created if an error occurs during processing.

    It can contain the following errors:

    • Error : FASTA file uploaded as input. Upload FASTQ, or ZIP or GZ archives.
    • Error : Archive contained a file with no extension and the file type could not be determined.\nUpload FASTQ, or ZIP or GZ archives containing a FASTQ file.
    • Error : File had no extension and the file type could not be determined.\nUpload FASTQ, or ZIP or GZ archives containing a FASTQ file.
    • Error : Unknown file type as input.\nUpload FASTQ, or ZIP or GZ archives containing a FASTQ file.
  7. zipTemp.txt - contains the output of unzipping the uploaded ZIP files

  8. gzTemp.txt - contains the output of gzip (unpacking) for the uploaded files
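The marker files listed above are enough for a client to infer the state of a project. The following is a minimal sketch of such a check, not the front end's actual logic: the function name `project_status`, the returned labels, and the order in which the files are checked (error.txt first) are all assumptions. working_done.txt is deliberately ignored here because completed.txt already signals the end of processing.

```python
from pathlib import Path


def project_status(project_dir):
    """Infer a project's state from the marker files in its directory.

    The precedence (error, then completed, then working) is an assumption.
    """
    directory = Path(project_dir)
    if (directory / "error.txt").exists():
        return "error"
    if (directory / "completed.txt").exists():
        return "completed"
    if (directory / "working.txt").exists():
        return "working"
    return "not_started"
```

Checking error.txt first means a failed run is never reported as still working, even if working.txt was left behind.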

Files created during Whole Genome NGS - Single end read (including intermediate files that may be deleted during the process)

  1. parent.txt - used to save the name of the directory of the project
  2. dataType.txt - contains a number describing the uploaded data type:
    • SnpCgh microarray - 1:0
  3. upload_size_1.txt - saves the size (in bytes) of the uploaded file
  4. datafiles.txt - contains the names of the data files to work on (if a file in Gareth's pileup format was uploaded, the file will contain null1 and null2)
  5. data.sam - the SAM file created from the user input
    • If the user uploaded a BAM file, data.sam is created by scripts_seqModules/bam2sam.sh (the original BAM file and the .bai file are deleted)
  6. data_r1.b.fastq, data_r2.b.fastq, data_r2.c.fastq - temporary files created by scripts_seqModules/sam2fastq.sh when converting SAM files to FASTQ (these files are deleted by scripts_seqModules/sam2fastq.sh)
  7. data_r1.fastq, data_r2.fastq - final files created by scripts_seqModules/sam2fastq.sh when converting a SAM file to paired FASTQ files
  8. SNP_CNV_v1.txt - created by scripts_seqModules/Gareth2pileups.sh when converting the uploaded tab-delimited text data to the pileup formats used in the pipeline. During processing, the file temp_dir/temp.SNP_CNV_v1.txt is created and then deleted.
  9. putative_SNPs_v4.txt - created by scripts_seqModules/Gareth2pileups.sh when converting the uploaded tab-delimited text data to the pileup formats used in the pipeline. During processing, the file temp_dir/temp.putative_SNPs_v4.txt is created and then deleted.

Files created during Whole Genome NGS - Paired end read (including intermediate files that may be deleted during the process)

In addition to the files that are created for a single end read:

  1. upload_size_2.txt - saves the size of the second uploaded file (in bytes)
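As an illustration of how the size files might be consumed, the sketch below sums upload_size_1.txt and, when present, upload_size_2.txt. It assumes each file holds a single integer byte count as plain text, which this page does not confirm; the function name `total_upload_size` is hypothetical.

```python
from pathlib import Path


def total_upload_size(project_dir):
    """Sum the byte counts stored in the upload size files of a project.

    Assumes each file contains one integer written as plain text.
    upload_size_2.txt exists only for paired end projects, so missing
    files are skipped.
    """
    total = 0
    for name in ("upload_size_1.txt", "upload_size_2.txt"):
        size_file = Path(project_dir) / name
        if size_file.exists():
            total += int(size_file.read_text().strip())
    return total
```

Because missing files are skipped, the same function works for both single end and paired end projects.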