Karimi-Lab/TE_aging_manuscript

Aging manuscript (to be updated with actual title)

Scripts and data used in the manuscript.

How to use the scripts

generate_singscores.R: Steps needed to generate singscores from aging-related gene sets using Illumina HumanHT-12 v4 data.
methylation_avail_probes.R: Checks which Illumina Infinium HumanMethylation probes are available in the methylation datasets used. Used to generate Supplementary Figure 1b.
scRNA_analysis.R: Steps needed to run the scRNA analysis and generate the related figures.
plots.Rmd: Steps needed to run the statistical analyses and generate the related plots and figures.

The scripts necessary to convert scRNA data to pseudo-bulk RNA for RTE classes and families can be found under Scripts/scRNA_Pseudobulk.
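
For readers unfamiliar with the term, pseudo-bulk conversion generally means summing per-cell counts into one profile per sample. The following is a minimal base-R sketch of that general idea on made-up data; it is not the code in Scripts/scRNA_Pseudobulk, and the matrix values, the `sample_id` labels and the `pseudobulk` name are all hypothetical.

```r
# Toy count matrix: 4 features (rows) x 6 cells (columns); values are made up.
counts <- matrix(1:24, nrow = 4,
                 dimnames = list(c("L1", "L2", "Alu", "MIR"),
                                 paste0("cell", 1:6)))

# Hypothetical sample label for each cell (same order as the columns).
sample_id <- c("S1", "S1", "S1", "S2", "S2", "S2")

# rowsum() sums rows by group, so transpose to put cells in rows,
# sum within each sample, then transpose back to features x samples.
pseudobulk <- t(rowsum(t(counts), group = sample_id))
print(pseudobulk)
```

The scripts in the repository may aggregate differently (for example per cell type as well as per sample), so treat this only as an illustration of the concept.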

Example figure generation

Generate gene expression probe availability bars (Supp. Fig. 1a)

Run the "Probe bar chart" chunk in plots.Rmd. The necessary file "missing_genes_231221.xlsx" is created by running the generate_singscores.R script. Legend and labels were added manually.

Generate boxplots split by gene set score quartiles for different cohorts and TEs (Supp. Fig. 6-7)

The "quartile.all" function (defined in plots.Rmd) takes a cohort and a TE class/family as parameters. For example, to generate the plot for the GARP cohort (GSE48556) and the TE class LTR, use the following code:

quartile.all(cohort = "GSE48556", te_class = "LTR")
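
To produce several of these plots in one go, the call can be wrapped in a loop. This is only a sketch: it assumes that "quartile.all" accepts each of the cohort IDs and TE classes used below (only the GSE48556/LTR combination is confirmed by the example above), and the `cohorts`, `tes` and `combos` names are hypothetical.

```r
# Cohort IDs and TE classes to iterate over (assumed to be valid inputs).
cohorts <- c("GSE56045", "GSE48556", "GSE58137")
tes <- c("LINE", "LTR", "SINE")

# One row per cohort/TE combination.
combos <- expand.grid(cohort = cohorts, te_class = tes,
                      stringsAsFactors = FALSE)

# quartile.all is defined in plots.Rmd; only call it if that chunk has been run.
if (exists("quartile.all")) {
  for (i in seq_len(nrow(combos))) {
    quartile.all(cohort = combos$cohort[i], te_class = combos$te_class[i])
  }
}
```

If the function returns a plot object rather than drawing it, the call inside the loop may need to be wrapped in print().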

Generate boxplots comparing Control vs. Centenarian groups for gene set scores (Supp. Fig. 13)

Run the "Supercentenarian cohort" part of scRNA_analysis.R. It creates an individual plot for each cell type and a "summary" plot combining all cell types in a single PDF (Summarised_.pdf). Individual plots are saved under Data > Single_Cell > Supercentenarian > Inflammatory_analysis > plot > class/fam > TE Class > Cell Type.pdf
If you encounter an error when saving a file, you may need to create the appropriate subfolders manually.
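
The missing subfolders can also be created from R before running the script. A minimal sketch, assuming the output path mirrors the layout described above; `ensure_plot_dir` is a hypothetical helper, and the "class" and "LTR" values are placeholders for the class/fam and TE Class levels, not names taken from scRNA_analysis.R.

```r
# Hypothetical helper: build a nested output folder path and create it if missing.
ensure_plot_dir <- function(..., root = "Data") {
  out_dir <- file.path(root, ...)
  # recursive = TRUE creates all missing parent folders in one call;
  # showWarnings = FALSE keeps the call quiet when the folder already exists.
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  out_dir
}

# "class" and "LTR" are placeholder values for the class/fam and TE Class levels.
plot_dir <- ensure_plot_dir("Single_Cell", "Supercentenarian",
                            "Inflammatory_analysis", "plot", "class", "LTR")
```

The same helper can be pointed at any other output path that the scripts write to.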

Generate GSVA heatmap for different cohorts and TEs (Fig. 3a)

The "gsva.heatmap" function takes a cohort list and TE class/family list as parameters (plots.Rmd).

# TE classes and families to include in the heatmap
all_tes <- c("LINE", "L1", "L2",
             "LTR", "ERV1", "ERVL", "ERVL-MaLR", "ERVK",
             "SINE", "Alu", "MIR")

gsva.heatmap(cohorts_list = c("GSE56045", "GSE48556", "GSE58137"), te_list = all_tes)

Generate cell-type specific boxplots of RTE expression vs. age for PBMC scRNA-seq cohorts (Fig. 5a-f)

Run the "Inflammation" part of scRNA_analysis.R. It creates an individual plot for each cell type and a "summary" plot combining all cell types in a single PDF (Summarised_<Gene_Set>_.pdf). Individual plots are saved under Data > Single_Cell > Inflammatory_analysis > plot > class/fam > TE Class > Gene Set > Cell Type.pdf
If you encounter an error when saving a file, you may need to create the appropriate subfolders manually.

Additional data required (the destination folder for each file is given in parentheses)

MESA cohort gene expression RDS file (Gene_Expression): https://emckclac-my.sharepoint.com/:u:/g/personal/k2140993_kcl_ac_uk/EePfklf6BYNAiPh_xrss5VkBJsdKhhYBU3zHiNpZ-kvYNQ?e=x51Fa4

hg38 RepeatMasker file (Single_Cell): https://emckclac-my.sharepoint.com/:u:/g/personal/k2140993_kcl_ac_uk/EZFfPw8xHllBs3-5flzFExUBpZtOdGs5L_CS959mVZ5aaw?e=SiRRh2

Japanese cohort scRNA expression RDS file (Single_Cell/Supercentenarian): https://emckclac-my.sharepoint.com/:u:/g/personal/k2140993_kcl_ac_uk/EfHOPUPCjdRDtLARUqNcgecBsD5b1UV6BgN_Zp51vbmYww?e=76txII

UMI expression matrix (Single_Cell/Supercentenarian): https://emckclac-my.sharepoint.com/:t:/g/personal/k2140993_kcl_ac_uk/EaF0O8LoEcFCtT2TJ3pnv98BYSBBYLN7kGrRkWRAgaUwwQ?e=TqFpRb
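
Before running the scripts, it can help to confirm that the destination folders exist and are populated. The sketch below assumes the three folders sit under a top-level "Data" directory, which is inferred from the output paths mentioned earlier and not stated for Gene_Expression; `data_root`, `required_dirs` and `status` are hypothetical names. File names are not checked because the README does not list them.

```r
# Folder names come from the list above; placing them under "Data" is an assumption.
data_root <- "Data"
required_dirs <- file.path(data_root,
                           c("Gene_Expression",
                             "Single_Cell",
                             file.path("Single_Cell", "Supercentenarian")))

# Report which folders are present and how many entries each one contains
# (list.files() returns an empty vector for a folder that does not exist).
status <- data.frame(
  folder    = required_dirs,
  exists    = dir.exists(required_dirs),
  n_entries = vapply(required_dirs, function(d) length(list.files(d)), integer(1)),
  row.names = NULL
)
print(status)
```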