
Inquiry on Best Practices for DEG Analysis Using scGPT Model Outputs #276

Open
ellieujin opened this issue Dec 31, 2024 · 0 comments

I have been utilizing your scGPT model to perform perturbation prediction tasks by fine-tuning it with perturb-seq data. Your model is exceptionally well-designed and has significantly advanced my research.

In my current workflow, I run differential gene expression (DEG) analysis by comparing the model’s predictions with gene expression values from control cells. Specifically, I generate as many predictions as there are control cells, taking them before the model averages multiple predictions, and then compare these values with the control expression to identify DEGs.
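To make the comparison concrete, here is a minimal sketch of the per-gene test described above, written in plain Python on toy data. The names `pred`, `ctrl`, `genes`, and `deg_table` are hypothetical and not part of scGPT; the test is a Mann-Whitney U (Wilcoxon rank-sum) with a normal approximation and no tie or continuity correction, so it only illustrates the idea.

```python
import math
from statistics import mean

def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_p(x, y):
    """Two-sided p-value from the normal approximation of the U statistic."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def deg_table(pred, ctrl, genes):
    """Per gene: (name, mean difference predicted - control, p-value)."""
    rows = []
    for g, name in enumerate(genes):
        p = [cell[g] for cell in pred]
        c = [cell[g] for cell in ctrl]
        # If values are log-normalized, this difference acts as a log fold change.
        rows.append((name, mean(p) - mean(c), mann_whitney_p(p, c)))
    return rows

# Toy data (assumed): rows are cells, columns are genes.
genes = ["geneA", "geneB"]
ctrl = [[0, 1], [0, 2], [1, 3], [0, 1], [2, 2], [0, 3]]
pred = [[3.0, 1], [4.0, 2], [3.5, 3], [4.2, 1], [3.8, 2], [3.1, 3]]

rows = deg_table(pred, ctrl, genes)
for name, delta, pval in rows:
    print(f"{name}: mean difference = {delta:.3f}, p = {pval:.4f}")
```

With real data, a library routine such as scanpy's `rank_genes_groups` with `method='wilcoxon'` would likely replace the hand-rolled test; the sketch only shows which two groups are being compared.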

However, I have encountered an issue where the gene expression distribution of the model’s output differs from that of the control cells, which results in a distorted volcano plot, as shown in the figures below.
PastedGraphic-1
PastedGraphic-2

It appears that the model’s output does not reflect the sparsity typically observed in single-cell data. To address this, I imputed the control cells to better align their distribution with the model’s output. Despite this adjustment, DEG analyses with both the Wilcoxon test and MAST still showed p-value inflation and far more downregulated than upregulated DEGs, as shown below.
PastedGraphic-5
PastedGraphic-4
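One way to quantify the sparsity mismatch described above is to compare the fraction of zero entries per gene in the control cells and in the predictions. The sketch below uses made-up toy values; `zero_fraction`, `ctrl`, and `pred` are hypothetical names, not scGPT functions or outputs.

```python
def zero_fraction(cells, tol=0.0):
    """Per gene (column), the fraction of cells whose value is within tol of zero."""
    n_cells = len(cells)
    n_genes = len(cells[0])
    return [
        sum(1 for cell in cells if abs(cell[g]) <= tol) / n_cells
        for g in range(n_genes)
    ]

# Toy data (assumed): rows are cells, columns are genes.
ctrl = [[0.0, 1.2], [0.0, 0.0], [2.1, 0.0], [0.0, 0.8]]  # sparse, like raw single-cell counts
pred = [[0.3, 0.9], [0.2, 0.4], [1.5, 0.3], [0.1, 0.7]]  # dense, like a regression output

ctrl_zero = zero_fraction(ctrl)
pred_zero = zero_fraction(pred)
print("control zero fraction per gene:  ", ctrl_zero)
print("predicted zero fraction per gene:", pred_zero)
```

A large gap between the two lists would suggest that a rank-based test is partly detecting the dropout difference rather than a perturbation effect.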

Given that DEG analysis is crucial for deriving meaningful biological insights from the model, I am seeking your guidance on best practices for conducting DEG analysis with scGPT model outputs. Your expertise would be invaluable in guiding my research forward. I would greatly appreciate any recommendations or insights you could share.

Thank you very much in advance!
