GitHub - QuLab-VU/GES_2021: Code (data analysis and model simulations) for GES paper (2020).

Data repository for "An in vitro model of tumor heterogeneity resolves genetic, epigenetic, and stochastic sources of cell state variability," Hayford et al. (2021), PLoS Biology 19: e3000797; DOI: 10.1371/journal.pbio.3000797

Instructions for creating panels in all main and supplementary figures based on experimental and simulated data in this repository

MAIN FIGURES
- FIGURE 1: N/A
- FIGURE 2
  
  Panels A and C: In the DrugResponse directory, run DrugResponse.R, which pulls data from the two Parental-*.csv files in the directory and the well conditions in the DrugResponse/Platemaps subdirectory.
  
  Panels B and D: In the cFP directory, run cFP.R, which pulls data from the 10 cFP_*.csv files in the directory.
- FIGURE 3
  
  Panel A: In the WES directory, run WES.R, which pulls data from mutations_byChromosome.csv.
  
  Panels B, C, and D: In the WES directory, run WES.R, which pulls data from the vep_*.txt files in the directory and uses the database in the RData object in RefCDS_human_GRCH38.p12.rda to cross-reference variants. NOTE: The vep_*.txt files must be manually unzipped before running WES.R.
  
  Panel E: In the scRNAseq/inferCNV subdirectory, run inferCNV.R, which pulls a counts matrix from the R object stored in PC9.CLV.10x.counts.matrix.rds, included in the directory. Necessary annotation and gene order files are also provided.
  
  Panel F: In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories. Scripts to de-multiplex hashed raw data and outputs are included in the scRNAseq/HTO_identification subdirectory. A full matrix of de-multiplexed counts is included as PC9_scRNAseqCounts_HTOdemux.csv.zip.
  
  Panel G: In the GO directory, run GO_correlation.R, which pulls data from mutations_DEGs-hg38.RData, a file that compiles all IMPACT genetic mutations (from the WES directory) and differentially expressed genes (DEGs; from the scRNAseq directory).
- FIGURE 4
  
  Panel A: In the WES directory, run WES.R, which pulls data from mutations_byChromosome.csv.
  
  Panels B, C, and D: In the WES directory, run WES.R, which pulls data from the vep_*.txt files in the directory and uses the database in the RData object in RefCDS_human_GRCH38.p12.rda to cross-reference variants.
  
  Panel E: In the scRNAseq/inferCNV subdirectory, run inferCNV.R, which pulls a counts matrix from the R object stored in PC9.VUDS.10x.counts.matrix.rds (created in inferCNV.R). Necessary annotation and gene order files are also provided.
  
  Panel F: In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories. Scripts to de-multiplex hashed raw data and outputs are provided in the scRNAseq/HTO_identification subdirectory. A full matrix of de-multiplexed counts is included as PC9_scRNAseqCounts_HTOdemux.csv.zip.
  
  Panel G: In the GO directory, run GO_correlation.R, which pulls data from mutations_DEGs-hg38.RData, a file that compiles all IMPACT genetic mutations (from the WES directory) and differentially expressed genes (DEGs; from the scRNAseq directory).
- FIGURE 5
  
  Panels A and E: In the cFP directory, run cFP.R, which pulls data from the trajectories_*.csv files in the directory.
  
  Panels B and F: In the cFP directory, run cFP.R, which pulls simulated data from the trajectories_*.csv files in the directory. Model trajectories are representative examples of a larger simulation scan (*.py models in the Simulations directory).
  
  Panels C and G: In the cFP directory, run cFP.R, which pulls simulated data from the distributions_*.csv files in the directory. Model distributions were calculated from example trajectories as part of a larger simulation scan (*.py models in the Simulations directory). For each subline, the mean and confidence interval reported on the plot are calculated from 100 bootstrapped p-values provided in one of the ADbootstrap*.csv files.
  
  Panels D and H: In the Simulations directory, run plotParameterScan.R, which pulls data from the *_lowVal.csv files in the directory.
- FIGURE 6: N/A
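
The bootstrap summary described for Figure 5, panels C and G, can be illustrated with a minimal Python sketch. It is not taken from cFP.R: the percentile interval, the 95% level, the function names, and the `p_value` column name are all assumptions made for illustration, and the actual layout of the ADbootstrap*.csv files should be checked before adapting it.

```python
import csv
import io
import statistics


def summarize_pvalues(pvalues, level=0.95):
    """Return (mean, lower, upper) for a list of bootstrapped p-values.

    The interval is a nearest-rank percentile interval. This choice is an
    assumption for illustration and may differ from what cFP.R computes.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    ordered = sorted(pvalues)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Map the lower and upper tail probabilities onto valid list indices.
    lo_idx = max(0, min(n - 1, int(round(tail * (n - 1)))))
    hi_idx = max(0, min(n - 1, int(round((1.0 - tail) * (n - 1)))))
    return statistics.fmean(ordered), ordered[lo_idx], ordered[hi_idx]


def read_pvalue_column(csv_text, column="p_value"):
    """Parse one numeric column from CSV text (column name is hypothetical)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader]
```

Passing the contents of one bootstrap file through `read_pvalue_column` and then `summarize_pvalues` yields one mean and one interval per subline, which is the quantity the README says is drawn on the plot.
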
SUPPLEMENTARY FIGURES
- SUPPLEMENTARY FIGURE S1
  
  Panel A: Screenshot of the EGFR gene from the Integrative Genomics Viewer (IGV) based on raw exome sequencing data (available in the Sequence Read Archive (SRA) at accession #PRJNA632351). Image is stored as PC9-EGFRgene_mutations_ex19delCommon.svg in the WES directory.
  
  Panel B: N/A
- SUPPLEMENTARY FIGURE S2
  
  Panels A, B, and C: In the cFP directory, run cFP.R, which pulls data from the trajectories_*.csv files in the directory. Data from overlays in panel C come from the PopD_trajectories.RData object.
- SUPPLEMENTARY FIGURE S3
  
  Panel A: In the WES directory, run WES.R, which pulls data from number_mutations.csv in the directory.
  
  Panel B: In the WES directory, run WES.R, which pulls data from samples_called_vars_named.vcf.gz in the directory. Directions to download reference FASTA and GTF files are provided in WES.R.
  
  Panel C: In the WES directory, run WES.R, which pulls data from shared_variants_CLV.csv in the directory.
  
  Panel D: In the WES directory, run WES.R, which pulls data from shared_variants_sublines.csv in the directory.
  
  Panel E: In the WES directory, run WES.R, which pulls data from shared_variants_VUDSlines.csv in the directory.
- SUPPLEMENTARY FIGURE S4
  
  Panels A and B: In the WES directory, run WES.R, which pulls data from the vep_*.txt files in the directory and uses the database in the RData object in RefCDS_human_GRCH38.p12.rda to cross-reference variants.
- SUPPLEMENTARY FIGURE S5
  
  Panel A: Screenshot of the summarized output from the Cell Ranger quality control analysis on the scRNA-seq library (available in the Gene Expression Omnibus (GEO) data repository at accession #GSE150084). Settings are shown in the image, which is stored as CellRanger_PC9.svg in the scRNAseq directory.
  
  Panel B: In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories.
- SUPPLEMENTARY FIGURE S6
  
  Panels A and B: In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories and subsets data by cell line versions.
  
  Panels C and D: In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories and subsets data by sublines.
  
  Panels E and F: In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories.
- SUPPLEMENTARY FIGURE S7
  
  Panels A and B: In the RNAseq directory, run RNAseq.R, which pulls from all 8 *_featurecounts.txt files in the directory. These files were created using the Bash script in RNAseq_processing.txt. NOTE: The *_featurecounts.txt files must be manually unzipped before running RNAseq.R.
- SUPPLEMENTARY FIGURE S8
  
  In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories. Input hallmark gene signature (.gmt) files can be found in the scRNAseq/VISION_gmt/hallmark subdirectory.
- SUPPLEMENTARY FIGURE S9
  
  In the GO directory, run semanticSimilarity.R, which pulls data from mutations_DEGs-hg38.RData, a file that compiles all IMPACT genetic mutations (from the WES directory) and differentially expressed genes (DEGs; from the scRNAseq directory). Directions for downloading the reference GTF file are provided in semanticSimilarity.R.
- SUPPLEMENTARY FIGURE S10
  
  In the scRNAseq/inferCNV directory, run inferCNV.R, which pulls a counts matrix from the R object stored in PC9.VUDS.10x.counts.matrix.rds (created in inferCNV.R). Necessary annotation and gene order files are also provided in the directory.
- SUPPLEMENTARY FIGURE S11: N/A
- SUPPLEMENTARY FIGURE S12
  
  Panel A: In the cFP directory, run cFP.R, which pulls data from the trajectories_*.csv files in the directory.
  
  Panel B: In the cFP directory, run cFP.R, which pulls data from the trajectories_*.csv files in the directory. Model trajectories are representative examples of a larger simulation scan (*.py models in the Simulations directory).
  
  Panel C: In the cFP directory, run cFP.R, which pulls simulated data from the distributions_*.csv files in the directory. Model distributions were calculated from example trajectories as part of a larger simulation scan (*.py models in the Simulations directory). For each subline, the mean and confidence interval reported on the plot are calculated from 100 bootstrapped p-values provided in one of the ADbootstrap*.csv files.
  
  Panel D: In the Simulations directory, run plotParameterScan.R, which pulls from the *_lowVal.csv files in the directory.
- SUPPLEMENTARY FIGURE S13
  
  Panels A and B: In the WES directory, run WES.R, which pulls data from samples_called_vars_named.vcf.gz in the directory.
- SUPPLEMENTARY FIGURE S14
  
  In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories.
- SUPPLEMENTARY FIGURE S15
  
  In the scRNAseq directory, run scRNAseq.R, which pulls from 10x Genomics reduced data in the scRNAseq/read_count and scRNAseq/umi_count subdirectories.
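
Most entries above follow the same pattern: unzip any compressed inputs, then run one R script from inside its directory. The Python sketch below shows one way to script that pattern. It assumes the compressed inputs are .zip archives (as PC9_scRNAseqCounts_HTOdemux.csv.zip is; the format of the other compressed files is not stated here), that `Rscript` is on the PATH, and that the scripts can run non-interactively. The helper names `unzip_inputs` and `run_r_script` are hypothetical and not part of the repository.

```python
import subprocess
import zipfile
from pathlib import Path


def unzip_inputs(directory, pattern="*.zip"):
    """Extract every archive matching `pattern` into `directory`.

    Returns the sorted names of all extracted members.
    """
    directory = Path(directory)
    extracted = []
    for archive in sorted(directory.glob(pattern)):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(directory)
            extracted.extend(zf.namelist())
    return sorted(extracted)


def run_r_script(directory, script, dry_run=True):
    """Run `Rscript <script>` with `directory` as the working directory.

    With dry_run=True the command is only returned, not executed.
    """
    command = ["Rscript", script]
    if not dry_run:
        # check=True raises CalledProcessError if the R script fails.
        subprocess.run(command, cwd=str(directory), check=True)
    return command
```

For example, `unzip_inputs("WES")` followed by `run_r_script("WES", "WES.R", dry_run=False)` would mirror the instructions for Figure 3, panels B through D, under the assumptions stated above.
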
