Address lingering seed and variable use issues#25
Conversation
|
@PeteHaitch any ideas for the failing build with status 401? I'll get to it in the next week if you don't have any immediate ideas |
|
Thanks, Alan. I don't think the GHA builds for OSCA have worked for some time ... I'll test the PR locally and report back. |
|
Ah yeah, sounds about right. I'll try block some time but my cup doesn't exactly runneth over either. Probably best to fix them or disable them anyways |
|
The second sanity check fails when I tried on my machine. > stopifnot(mean(vars_du) > mean(vars_tu))
Error: mean(vars_du) > mean(vars_tu) is not TRUE
> vars_du
[1] 0.2258 0.2745
> vars_tu
UMAP1 UMAP2
0.1803 0.6826 |
|
Works for me now on devel and release. PCAtools will probably need to be removed |
|
Unfortunately one of the sanity checks from this PR still appear broken on macOS (arm64) and Linux (x86_64). Separately, there are discussions to remove PCAtools from the book. macOS (arm64)#--- loading ---#
suppressPackageStartupMessages(library(scRNAseq))
sce.zeisel <- ZeiselBrainData()
suppressPackageStartupMessages(library(scater))
sce.zeisel <- aggregateAcrossFeatures(sce.zeisel,
id=sub("_loc[0-9]+$", "", rownames(sce.zeisel)))
#--- gene-annotation ---#
suppressPackageStartupMessages(library(org.Mm.eg.db))
rowData(sce.zeisel)$Ensembl <- mapIds(org.Mm.eg.db,
keys=rownames(sce.zeisel), keytype="SYMBOL", column="ENSEMBL")
#> 'select()' returned 1:many mapping between keys and columns
#--- quality-control ---#
stats <- perCellQCMetrics(sce.zeisel, subsets=list(
Mt=rowData(sce.zeisel)$featureType=="mito"))
qc <- quickPerCellQC(stats, percent_subsets=c("altexps_ERCC_percent",
"subsets_Mt_percent"))
sce.zeisel <- sce.zeisel[,!qc$discard]
#--- normalization ---#
library(scran)
set.seed(1000)
clusters <- quickCluster(sce.zeisel)
sce.zeisel <- computeSumFactors(sce.zeisel, cluster=clusters)
sce.zeisel <- logNormCounts(sce.zeisel)
#--- variance-modelling ---#
dec.zeisel <- modelGeneVarWithSpikes(sce.zeisel, "ERCC")
top.hvgs <- getTopHVGs(dec.zeisel, prop=0.1)
library(scran)
top.zeisel <- getTopHVGs(dec.zeisel, n=2000)
set.seed(100)
sce.zeisel <- fixedPCA(sce.zeisel, subset.row=top.zeisel)
set.seed(123)
library(densvis)
dt <- densne(reducedDim(sce.zeisel, "PCA"), dens_frac = 0.4, dens_lambda = 0.4)
reducedDim(sce.zeisel, "dens-SNE") <- dt
dm <- densmap(reducedDim(sce.zeisel, "PCA"), dens_frac = 0.4, dens_lambda = 0.4)
reducedDim(sce.zeisel, "densMAP") <- dm
sce.zeisel <- runUMAP(sce.zeisel) # for comparison
sce.zeisel <- runTSNE(sce.zeisel) # for comparison
ts <- reducedDim(sce.zeisel, "TSNE")
tu <- reducedDim(sce.zeisel, "UMAP")
ds <- scale(dt)
ts <- scale(ts)
du <- scale(dm)
tu <- scale(tu)
vars_ds <- colVars(ds[sce.zeisel$level1class == "astrocytes_ependymal", ])
vars_ts <- colVars(ts[sce.zeisel$level1class == "astrocytes_ependymal", ])
stopifnot(mean(vars_ds) > mean(vars_ts))
vars_ds
#> [1] 0.5256126 0.2784563
vars_ts
#> TSNE1 TSNE2
#> 0.5767240 0.1227339
vars_du <- colVars(du[sce.zeisel$level1class == "astrocytes_ependymal", ])
vars_tu <- colVars(tu[sce.zeisel$level1class == "astrocytes_ependymal", ])
stopifnot(mean(vars_du) > mean(vars_tu))
#> Error: mean(vars_du) > mean(vars_tu) is not TRUE
vars_du
#> [1] 0.1903438 0.1699268
vars_tu
#> UMAP1 UMAP2
#> 0.1802935 0.6826027
|
|
Hm that's disconcerting as I should have had up-to-date devel and release installs on Linux that both passed fine. The difference between platforms is also a bit worrying, seemingly for all 4 methods? (t-SNE and UMAP with and without density weighting). I will set this up on github actions so I can test across platforms rather than relying just on my local build, that might be relying on some errant caching... |
|
Righto, I changed the checks and disabled UMAP as that seems to be the one suffering from platform inconsistency. You've alluded to discussions elsewhere a few times; is this something I should be aware of? I've not seen anything on Zulip |
|
Thanks, Alan, it seems to be working for me now on arm64 (now using BioC 3.23).
Sorry, I now see you weren't on the email from Lori and Vince. |
|
Cheers, no worries re emails just wanted to be sure I hadn't snoozed something important. And yeah I came across pcatools failing on an unrelated project right before we move to publish 😄 |
From #24