Increased heterogeneity is only described on first PC #172

Closed
vali301s opened this issue Aug 13, 2024 · 3 comments

vali301s commented Aug 13, 2024

Hi,

Thank you for developing Splatter; it's been very exciting exploring your package so far.

I have been using Splatter to simulate data for a single group with varying heterogeneity. To set the heterogeneity level I adjust the BCV parameters (for higher heterogeneity: increase bcv.common and decrease bcv.df).
In the following picture you can see that in the PCA plots the cells are more dispersed at higher heterogeneity (as expected). However, when I plotted the elbow plots (below each PCA plot), I noticed that the increase in heterogeneity comes mainly from the first PC.
[Image: Screenshot 2024-08-13 105004]

This looks super unnatural to me, and I have never seen it in real scRNAseq data. Do you know why this is happening? Also, despite this, do you think I can still use the datasets I have created, i.e. is it a problem that the elbow plot looks like this?

PS: Apart from changing only the BCV parameters, I also estimated the parameters from real data, using immune cells with low heterogeneity (naive T/B cells) and high heterogeneity (macrophages/monocytes), and simulated new scRNAseq datasets with those parameters. Once again, the additional heterogeneity of the macrophages and monocytes is described mostly by PC1. Since the elbow plot of the simulated macrophage/monocyte data (with parameters estimated from real data) looks like this, it really seems that it's a feature of Splatter to put the heterogeneity on the first PC only...
[Image: Screenshot 2024-08-13 110415]

Thank you very much in advance.

Collaborator

lazappi commented Aug 14, 2024

Hi @vali301s

Thanks for giving {splatter} a go. Modifying the variation within a single population is something that hasn't come up very often, and it may be something the splat simulation doesn't do very well. As you have seen, the bcv parameters have some effect, though maybe not the one you would like, and introducing enough different kinds of variation is something many simulations struggle with. I would be curious to see what this looks like in real data, though. If you subset a real population to only similar cells, do you see a similar effect on the PCA?

An alternative approach, which has been used previously, is to simulate a single path rather than one homogeneous group. This gives you access to more parameters that you can manipulate to get something closer to what you want, for example by reducing the amount of differential expression along the path so that it gives your cells some variation but not enough to create two separate populations.
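
To put a number on the real-versus-simulated comparison rather than judging elbow plots by eye, one option is to compute the share of total variance that falls on PC1 for each dataset. The following is a minimal sketch in plain Python, not part of {splatter}: the function name `pc1_variance_fraction` and the toy matrices are hypothetical, and it assumes the input is an already normalised cells-by-genes matrix with at least two cells. It uses power iteration to approximate the largest eigenvalue of the covariance matrix and divides it by the total variance.

```python
import random


def pc1_variance_fraction(matrix, iterations=100, seed=0):
    """Approximate the fraction of total variance captured by PC1.

    `matrix` is a list of cells, each a list of gene values
    (assumed to be normalised already; needs at least two cells).
    """
    n_cells = len(matrix)
    n_genes = len(matrix[0])

    # Centre every gene (column) at zero.
    means = [sum(row[g] for row in matrix) / n_cells for g in range(n_genes)]
    centred = [[row[g] - means[g] for g in range(n_genes)] for row in matrix]

    # Total variance is the sum of the per-gene variances.
    total = sum(v * v for row in centred for v in row) / (n_cells - 1)
    if total == 0:
        return 0.0

    # Power iteration on X^T X, starting from a random direction.
    rng = random.Random(seed)
    vec = [rng.gauss(0, 1) for _ in range(n_genes)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, vec)) for row in centred]
        new = [
            sum(centred[c][g] * scores[c] for c in range(n_cells))
            for g in range(n_genes)
        ]
        norm = sum(v * v for v in new) ** 0.5
        if norm == 0:
            return 0.0
        vec = [v / norm for v in new]

    # Variance of the cells projected onto the estimated PC1 direction.
    scores = [sum(x * w for x, w in zip(row, vec)) for row in centred]
    return sum(s * s for s in scores) / (n_cells - 1) / total


# Toy data only (NOT Splatter output): independent noise per gene,
# and the same noise plus one factor shared by all genes in a cell.
rng = random.Random(1)
isotropic = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(200)]
with_shared_factor = []
for row in isotropic:
    factor = rng.gauss(0, 3)
    with_shared_factor.append([v + factor for v in row])

print("PC1 share, independent noise:", pc1_variance_fraction(isotropic))
print("PC1 share, shared factor:", pc1_variance_fraction(with_shared_factor))
```

Running the same function on a real subset and on the matching simulated dataset (after identical normalisation) would show whether the PC1 dominance is specific to the simulation. The toy example only illustrates that variation shared across genes concentrates on PC1, while independent per-gene noise spreads across many PCs; it makes no claim about where the splat model's extra variance comes from.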

Collaborator

lazappi commented Oct 4, 2024

@vali301s Do you have any further questions on this?

Collaborator

lazappi commented Nov 21, 2024

I'm going to close this now. Please comment if you want to discuss further.

lazappi closed this as completed Nov 21, 2024