Inquiry on Best Practices for DEG Analysis Using scGPT Model Outputs #276

ellieujin · 2024-12-31T13:40:52Z

I have been utilizing your scGPT model to perform perturbation prediction tasks by fine-tuning it with perturb-seq data. Your model is exceptionally well-designed and has significantly advanced my research.

In my current workflow, I compare the model’s predictions with gene expression values from control cells to identify differentially expressed genes (DEGs). Specifically, I generate as many predictions as there are control cells, taking them before the model averages multiple predictions, and then compare these per-cell values with the control expression values to identify DEGs.
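A minimal sketch of this kind of comparison, assuming the per-cell predictions and the control cells are available as dense (cells x genes) arrays on the same normalized scale. The names `pred`, `ctrl`, `wilcoxon_deg`, and `benjamini_hochberg` are hypothetical and nothing here comes from the scGPT codebase; the data are synthetic stand-ins.

```python
import numpy as np
from scipy.stats import ranksums


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def wilcoxon_deg(pred, ctrl):
    """Per-gene Wilcoxon rank-sum test between two (cells x genes) arrays."""
    n_genes = pred.shape[1]
    pvals = np.array(
        [ranksums(pred[:, g], ctrl[:, g]).pvalue for g in range(n_genes)]
    )
    # Difference of means; on log-normalized data this approximates a log fold change.
    mean_diff = pred.mean(axis=0) - ctrl.mean(axis=0)
    return mean_diff, pvals, benjamini_hochberg(pvals)


# Synthetic stand-in data (assumption: 200 cells, 50 genes per group).
rng = np.random.default_rng(0)
ctrl = rng.normal(1.0, 0.5, size=(200, 50))
pred = rng.normal(1.0, 0.5, size=(200, 50))
pred[:, 0] += 1.0  # gene 0 is shifted upward in the "predictions"

mean_diff, pvals, padj = wilcoxon_deg(pred, ctrl)
print(mean_diff[0], padj[0])
```

Plotting `mean_diff` against `-log10(padj)` gives the volcano plot. Because every predicted cell is treated as an independent observation, the p-values shrink as the number of generated predictions grows.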

However, I have encountered an issue where the gene expression distribution of the model’s output differs from that of the control cells, which results in a distorted volcano plot, as shown in the figure below.

It appears that the model’s output does not adequately reflect the sparsity typically observed in single-cell data. To address this, I performed imputation on the control cells to better align their distribution with the model’s output. Despite this adjustment, DEG analyses using both Wilcoxon and MAST methods still showed p-value inflation and many more downregulated than upregulated DEGs, as shown below.
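One way to quantify the sparsity mismatch described above is to compare the per-gene fraction of zero values in the two matrices. This is only a sketch on toy numbers: `zero_fraction`, `ctrl`, `pred`, and the tolerance `tol` are hypothetical names and values, not part of scGPT.

```python
import numpy as np


def zero_fraction(x, tol=1e-6):
    """Fraction of cells, per gene, whose expression is zero within a tolerance."""
    return (np.abs(np.asarray(x, dtype=float)) <= tol).mean(axis=0)


# Toy (cells x genes) matrices: sparse control cells, dense model-like output.
ctrl = np.array([
    [0.0, 2.1, 0.0],
    [0.0, 0.0, 1.3],
    [1.7, 0.0, 0.0],
    [0.0, 1.9, 0.0],
])
pred = np.array([
    [0.2, 1.1, 0.1],
    [0.3, 0.9, 0.4],
    [0.5, 1.0, 0.2],
    [0.4, 1.2, 0.3],
])

# Positive values mean the control data contain more zeros for that gene.
sparsity_gap = zero_fraction(ctrl) - zero_fraction(pred)
print(sparsity_gap)
```

A gap that stays large for most genes suggests the two groups differ in dropout structure rather than only in biology, which rank-based tests pick up as apparent differential expression.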

Given that DEG analysis is crucial for deriving meaningful biological insights from the model, I am seeking your guidance on best practices for conducting DEG analysis with scGPT model outputs. Your expertise would be invaluable in guiding my research forward. I would greatly appreciate any recommendations or insights you could share.

Thank you very much in advance!
