-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using SCT counts with runEscape? #138
Comments
Hey @alexwskh, As of now, If you are playing around with it, I would love to get more info on SCT vs RNA count results that you see. It is an unknown in the field and might spark future parameters to allow users to select assays. Thanks again, |
Hey @ncborcherding , thank you so much for the quick reply. I have some quick results here to chat about and a question if you have time. Just as a bit of background, each individual scRNAseq run has had its counts adjusted by decontX in an attempt to compensate for ambient RNA, and doubletfinder was used to identify possible doublets for removal. Objects were then SCTransformed and integrated via Seurat. This is actually a subset of the main dataset, but that shouldn't matter. I have two signatures I'm interested in. Our lab studies neuroblastoma, and we have a Mesenchymal(MES) and Adrenergic(ADRN) signatures. I created assays from both the RNA assay counts and the SCT counts, and normalized according to the number of features. RNA assay, non-normalized So, not really any major differences, however it gets weird when I normalize with performNormalization. RNA assay, normalized I'm unsure why the normalization is so different for the SCT counts. I'm guessing it's something about the number of features after SCT scaling? Also, here's a question for you. I don't want to bias myself and cherry pick things, and the non-normalized expression score looks better. The enrichment of the mesenchymal score in the top clusters is exactly what I want to see, and you lose a lot of signal intensity when normalizing. So my question is, does normalizing by number of features detected not artificially deflate scores in cells that are more transcriptionally complex? From my understanding the ssGSEA scoring is taking into consideration the size of the gene set and the number of genes... If mesenchymal cells happen to be doing more things biologically than adrenergic cells and require a broader range of transcripts, they will have a lower normalized score... Given that the SCT counts should be corrected for sequencing depth across conditions, do you think I am able to work with the expression score metric directly instead of normalizing? Or is that an incorrect assumption? |
This is an excellent rundown!! Thank you so much for taking the time. For the normalization, are you using
This is a real possibility, normalizing by features may be a blunt instrument in very complex cells. There is no problem with reporting non-normalized enrichment values, just like unequal distribution of RNA within these cells. But to clarify, the ssGSEA implementation in escape reports raw enrichment score unless you normalize, so there is no two levels of normalization.
Did you use Interesting stuff!! Nick |
@ncborcherding Happy to help when you're spending the time developing tools like this.
The pictures I showed you were normalized with performNormalization using the nFeature_RNA parameter (Pretty sure, I'll confirm... I was playing with my script a lot!). It is also a bit strange when I normalize with the default settings (or vice versa). So to confirm, setting normalize = TRUE for escape.matrix is equivalent to running performNoramlization on default settings? Default would be log1p (enrichment score / #genes from gene set that are present), versus log1p (enrichment score / #total detected genes in each sample) ?
Yeah I did run PrepSCTFindMarkers before extracting the counts, as my understanding is that it strips away the individual SCT scaling for each sample and readjusts as a whole. When you say "it might make sense to use it to reduce noise", do you mean employing normalization if I use SCT? Or that PrepSCTFFindMarkers itself would be good to use? As for the #features, taking a look at the assays on the seurat object can see the SCT assay has 21562 features while the RNA assay has 25109, so it definitely cuts some as it transforms the counts. Here are the #counts and #genes for the cells. I feel a bit better seeing that the high-nFeature/nCount cells with the high mesenchymal scores in the top are separate from the high nFeature cells with the higher Adrn score in the bottom right... So it's not just deeper sequencing/more genes detected solely influencing these scores. At least that's my interpretation, what about yours? |
yes the
My understanding is that until you use
Good to know - so there is some internal filtering when running SCT, but it is maintaining like ~85% of the feature space. |
Great, thanks for the continued discussion.
Hmmmm. Here are some relevant entries from the Seurat web pages. It's possible I'm misunderstanding but... From SCTransform page From PrepSCTFindMarkers page I took from this that the SCT counts slot is a modified value, and if you want the raw counts you need to go back to the RNA assay and extract counts from there. I can confirm this as when I am trying to extract all cells with transgenic GFP mRNA, if I use the SCT assay counts I lose some cells (ones with very low count value likely). Either way, I did run the PrepSCTFindMarkers before running escape.matrix, so that should be accounted for! |
@ncborcherding Just a second question, I am seeing some strange results when I do default normalization on the raw counts from the RNA assay... Any thoughts? |
@ncborcherding Looks like I am using the correct version (2.2.2)?
|
Hi there, and if the developer of this package happens to read this, thank you for developing this tool.
I have an integrated SCTransformed Seurat object from five separate rnaseq experiments. My understanding is that if I have run PrepSCTFindMarkers, the counts layer of the SCT assay should be adjusted for the integrated model. I want to runEscape on my Seurat object, and it works-but it is defaulting to the RNA assay counts instead of the SCT assay.
Is there a way I can tell it which assay to use? Thank you.
The text was updated successfully, but these errors were encountered: