These functions perform the downstream analysis of scCITESeq data. All of them are used in pipeline_functions/scCITESeq_Execution.R.
-
scCITESeq_QC function - This function performs quality checks on the scRNA and scADT datasets before and after filtering cells based on the number of cells, number of genes, mitochondrial percentage, and ADT UMI count.
source("./pipeline_functions/CR7_scCITESeq_QC.R")
CR_outdir <- "./post_processing/genotype_demux/Exp01_geno/outs/per_sample_outs/Exp01_geno"
scCITE_QC(samplepath = CR_outdir, samplename = "BB22003", saveDir = "./output/",
          min_cells = 3, min_genes = 200, max_genes = 5000, mitopercent = 10,
          ADT_UMI = 100000)
This creates the filtered RNA and ADT (protein) objects.
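The filtering step can be pictured as a set of threshold checks on per-cell metadata. The following is a minimal base-R sketch, not the code inside scCITE_QC: the data frame `cell_meta`, its column names, and the helper `filter_cells_sketch` are hypothetical, and the way each threshold is applied (for example, treating ADT_UMI as an upper bound) is an assumption. The gene-level `min_cells` filter is not covered here.

```r
# Hypothetical per-cell metadata; all values are made up for illustration.
cell_meta <- data.frame(
  barcode      = c("c1", "c2", "c3", "c4", "c5"),
  nFeature_RNA = c(150, 1200, 6200, 2500, 900),   # genes detected per cell
  percent_mt   = c(2, 4, 3, 25, 8),               # mitochondrial percentage
  nCount_ADT   = c(500, 3000, 4000, 2000, 250000),# ADT UMI count per cell
  stringsAsFactors = FALSE
)

# Keep cells that satisfy every threshold (assumed semantics, see lead-in).
filter_cells_sketch <- function(meta, min_genes = 200, max_genes = 5000,
                                mitopercent = 10, ADT_UMI = 100000) {
  keep <- meta$nFeature_RNA >= min_genes &
    meta$nFeature_RNA <= max_genes &
    meta$percent_mt <= mitopercent &
    meta$nCount_ADT <= ADT_UMI
  meta[keep, , drop = FALSE]
}

filtered_meta <- filter_cells_sketch(cell_meta)
print(filtered_meta$barcode)
```

In this toy table only `c2` passes all four checks; the other cells each fail exactly one threshold, which makes it easy to see what each parameter is assumed to control.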
-
scCITEseq_Doublet_Finder - This function removes doublets from the object.
source("./pipeline_functions/scCITEseq_Doublet_Finder.R")
doublet_scCITEseq(Obj = RNA_obj, dims = 30, res = 0.8, saveDir = "./output/",
                  Obj2 = ADT_obj, samplename = "BB22003", process = "Doublet_finder",
                  Assay = "RNA")
This filters out doublets from both the RNA and ADT objects.
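Removing doublets from both modalities comes down to keeping the same set of singlet barcodes in each object. Here is a small base-R sketch of that idea on toy matrices. It is not the body of doublet_scCITEseq: the `calls` vector, its "Singlet"/"Doublet" labels, and the matrix names are hypothetical.

```r
# Hypothetical doublet calls, one per cell barcode (made-up values).
calls <- c(c1 = "Singlet", c2 = "Doublet", c3 = "Singlet", c4 = "Singlet")

# Toy count matrices with cells in columns; c4 is missing from the ADT matrix.
rna_counts <- matrix(1:8, nrow = 2,
                     dimnames = list(c("geneA", "geneB"), c("c1", "c2", "c3", "c4")))
adt_counts <- matrix(1:6, nrow = 2,
                     dimnames = list(c("CD4", "CD8"), c("c1", "c2", "c3")))

# Barcodes called singlet AND present in both modalities.
singlets <- names(calls)[calls == "Singlet"]
shared   <- intersect(singlets, intersect(colnames(rna_counts), colnames(adt_counts)))

# Subset both matrices to the same cells, in the same order.
rna_singlets <- rna_counts[, shared, drop = FALSE]
adt_singlets <- adt_counts[, shared, drop = FALSE]
```

Subsetting both matrices with one shared barcode vector keeps the two modalities aligned cell for cell, which later steps that pair RNA and ADT measurements rely on.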
-
scCITESeq_sctransform_V2 - The object is normalized using SCTransform v2, and the samples are integrated using CCA.
source("./pipeline_functions/scCITESeq_sctransform_V2.R")
sctransform_V2_integration(obj = RNA_obj, saveDir = "./output/", ngenes = 4000,
                           regress = "mito_score", Assay = "RNA", process = "norm",
                           objname = "BB22003", split_by = "orig.ident", reference = NULL,
                           dims = 30, sample_tree = NULL, k_weight = 100)
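For orientation, an SCTransform v2 plus CCA workflow in Seurat typically chains the calls sketched below. This is a sketch under assumptions, not the body of sctransform_V2_integration: the wrapper name `sct_v2_cca_sketch` is hypothetical, the mapping of `ngenes`, `regress`, `dims`, and `k_weight` onto Seurat arguments is a guess, and it assumes Seurat 4.1 or later is installed and that the column named in `regress` exists in the object metadata.

```r
# Hypothetical wrapper; only defines the workflow, nothing runs until it is called.
sct_v2_cca_sketch <- function(obj, split_by = "orig.ident", ngenes = 4000,
                              regress = "mito_score", dims = 30, k_weight = 100) {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  # One object per sample, each normalized independently with SCTransform v2.
  obj_list <- Seurat::SplitObject(obj, split.by = split_by)
  obj_list <- lapply(obj_list, function(x) {
    Seurat::SCTransform(x, vst.flavor = "v2", vars.to.regress = regress,
                        verbose = FALSE)
  })

  # Choose shared features and prepare the SCT residuals for integration.
  features <- Seurat::SelectIntegrationFeatures(object.list = obj_list,
                                                nfeatures = ngenes)
  obj_list <- Seurat::PrepSCTIntegration(object.list = obj_list,
                                         anchor.features = features)

  # Find CCA anchors between samples, then integrate.
  anchors <- Seurat::FindIntegrationAnchors(object.list = obj_list,
                                            normalization.method = "SCT",
                                            anchor.features = features,
                                            reduction = "cca",
                                            dims = 1:dims)
  Seurat::IntegrateData(anchorset = anchors, normalization.method = "SCT",
                        dims = 1:dims, k.weight = k_weight)
}
```

A call such as `sct_v2_cca_sketch(RNA_obj)` would return an object with an integrated assay, assuming `RNA_obj` is a Seurat object with more than one sample in `orig.ident`.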
-
scCITESeq_RNA_intergation - This function integrates the RNA or ADT object.
source("pipeline_functions/scCITESeq_RNA_intergation.R")
RNA_integration(obj = RNA_obj,  # or ADT_obj
                saveDir = "./output/", dims = 30, RNA_features = c("CD4", "CD8A"),
                Assay = "RNA", process = "integration", objname = "BB22003",
                ncol = 2, ndims = 50)
-
scCITESeq_ADT_merging - This function performs ADT (protein) sample integration using CCA.
source("pipeline_functions/scCITESeq_ADT_merging.R")
ADT_merging(ADT_obj = ADT_obj, saveDir = "./output/", dims = 10, numfeatures = NULL,
            Assay = "ADT", process = "ADT_integration", objname = "BB22003",
            split_by = "orig.ident", reference = NULL, sample_tree = NULL,
            k_weight = 100)
-
cluster_UMAP_QC - This function performs clustering and checks the quality of each cluster.
source("pipeline_functions/cluster_UMAP_QC.R")
cluster_UMAP_and_QC(obj_path = RNA_obj,  # or ADT_obj
                    dims = 30, res = 0.8, saveDir = "./output/", Assay = "RNA",
                    QC_features = c("nCount_RNA", "nFeature_RNA"), objname = "BB22003",
                    process = "UMAP_QC",
                    col_sel = c("Age", "Run", "orig.ident", "gender"))
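One simple way to check quality per cluster is to summarize each QC feature within every cluster. The base-R sketch below illustrates this on a toy metadata table. It is an assumption about what such a check can look like, not the code in cluster_UMAP_and_QC, and the `cell_meta` table with its values is hypothetical.

```r
# Hypothetical per-cell metadata with a cluster label (made-up values).
cell_meta <- data.frame(
  seurat_clusters = factor(c("0", "0", "1", "1", "1")),
  nCount_RNA      = c(1000, 3000, 500, 700, 900),
  nFeature_RNA    = c(400, 800, 200, 300, 400)
)

qc_features <- c("nCount_RNA", "nFeature_RNA")

# Median of each QC feature within each cluster.
cluster_qc <- aggregate(cell_meta[qc_features],
                        by = list(cluster = cell_meta$seurat_clusters),
                        FUN = median)
print(cluster_qc)
```

A cluster whose medians sit far below the others (here cluster 1, with a median of 700 UMIs against 2000) is a candidate for closer inspection as low-quality cells.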
-
scCITEseq_modality_integration - This function integrates the RNA and ADT modalities.
source("pipeline_functions/scCITEseq_modality_integration.R")
modality_integration(RNA_obj = RNA_obj, ADT_obj = ADT_obj, RNA_dims = 30, ADT_dims = 10,
                     saveDir = savedir, Assay = "integrated", process = "RNA_ADT",
                     objname = "BB22003")
These 7 steps perform batch as well as modality (RNA and protein) integration for the scCITESeq experiment.
Pseudobulk PCA and differential expression:
source("pipeline_functions/Psuedobulk_PCA.R")
CD8_subset_dds <- COVID_pseudobulk_within_cluster_AJ(obj = CD8_mem,
                                                     savedir = savedir,
                                                     group1 = group1,
                                                     group2 = group2,
                                                     grouping_by = "Age",
                                                     cluster = "all",
                                                     cell_freq = 20,
                                                     remove_samples = remove_samples,
                                                     cluster_group = "seurat_clusters",
                                                     sample_col = "orig.ident",
                                                     batch_col = "Run",
                                                     gene_min_counts = 5,
                                                     column_name_split = c("sampleid", "virus", "age", "age_number", "gender", "Run"))
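The core of a pseudobulk step is summing single-cell counts per sample, so that each sample becomes one column of a bulk-like matrix. The following base-R sketch shows that aggregation on a toy matrix. It is not the code inside COVID_pseudobulk_within_cluster_AJ: all object names and values are hypothetical, and reading `cell_freq` as a minimum number of cells per sample is an assumption.

```r
# Toy gene-by-cell count matrix (made-up values), cells in columns.
counts <- matrix(
  c(1, 0, 2,   # c1
    3, 1, 0,   # c2
    0, 4, 1,   # c3
    2, 2, 2),  # c4
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("c1", "c2", "c3", "c4"))
)

# Hypothetical mapping from cell barcode to sample identifier.
sample_of_cell <- c(c1 = "S1", c2 = "S1", c3 = "S2", c4 = "S2")

# Sum counts over the cells of each sample: the result is genes x samples.
groups     <- unname(sample_of_cell[colnames(counts)])
pseudobulk <- t(rowsum(t(counts), group = groups))

# Drop samples with too few cells (assumed role of cell_freq; 2 for this toy data).
min_cells_per_sample <- 2
cells_per_sample     <- table(groups)
keep_samples <- names(cells_per_sample)[cells_per_sample >= min_cells_per_sample]
pseudobulk   <- pseudobulk[, keep_samples, drop = FALSE]
print(pseudobulk)
```

Each column of `pseudobulk` can then be treated like one bulk RNA-seq sample for PCA or differential testing.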
source("pipeline_functions/Pseudobulk_differential.R")
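# makeContrasts() comes from limma and colData() from SummarizedExperiment;
# load them here if the sourced script has not already done so.
library(limma)
library(SummarizedExperiment)
design0 <- model.matrix(~ 0 + Age + Run, data = colData(CD8_subset_dds))
# Rename the columns: the two Age levels first, then one label per remaining Run column.
colnames(design0) <- c("O", "Y", paste0("Run", seq_len(ncol(design0) - 2)))
cm <- makeContrasts(O_VS_Y = O - Y, levels = design0)
desl_clus <- LimmaEdgeR_differential(dds = CD8_subset_dds,
                                     design0 = design0,
                                     cm = cm,
                                     savedir = savedir2,
                                     logfc = 0.5,
                                     p_value_adj = 0.05)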
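To see what the design matrix and the O versus Y contrast look like, here is a small base-R sketch on a hypothetical sample table. The `sample_info` data frame and its values are made up, and the contrast vector is built by hand rather than with limma, so the block runs without extra packages.

```r
# Hypothetical sample annotation: two age groups measured across three runs.
sample_info <- data.frame(
  Age = factor(c("O", "O", "O", "Y", "Y", "Y"), levels = c("O", "Y")),
  Run = factor(c("1", "2", "3", "1", "2", "3"))
)

# No-intercept design: one column per Age level, plus Run columns relative to
# the first Run level.
design0 <- model.matrix(~ 0 + Age + Run, data = sample_info)
default_names <- colnames(design0)  # "AgeO" "AgeY" "Run2" "Run3"

# Same renaming pattern as in the pipeline call: Age levels, then Run labels.
colnames(design0) <- c("O", "Y", paste0("Run", seq_len(ncol(design0) - 2)))

# Hand-built contrast equivalent to O - Y: +1 for O, -1 for Y, 0 for batch terms.
contrast_O_vs_Y <- setNames(numeric(ncol(design0)), colnames(design0))
contrast_O_vs_Y[c("O", "Y")] <- c(1, -1)
print(contrast_O_vs_Y)
```

Note that the renamed Run columns are positional labels: with three runs, the first Run level is absorbed as the baseline, so the column labeled `Run1` here corresponds to the second run level. The contrast is unaffected because it puts zero weight on the batch columns.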