perf: "two-pass" seurat hvg via scanpy.get.aggregate #4013

Draft
ilan-gold wants to merge 8 commits into main from ig/two_pass_hvg_v3


@ilan-gold
Contributor

An idea that popped into my head for disk-bound datasets, though it likely helps normal ones too. In theory, this should greatly improve on-disk access and speed up disk-bound data by reducing the amount of I/O in the worst-case, unordered scenario (while, I would guess, leaving in-memory datasets untouched or maybe improved thanks to better memory access and a more efficient mean/var computation).
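The description does not spell out what "two-pass" means here, so the following is only a rough sketch of one plausible reading: stream the matrix in row chunks once to get per-gene mean and variance, then stream it a second time to accumulate the clipped sums needed for a standardized variance. It uses plain NumPy and does not call `scanpy.get.aggregate`; the names `iter_chunks` and `two_pass_standardized_variance` are hypothetical, and a quadratic `np.polyfit` trend stands in for the loess fit that the `seurat_v3` flavor uses.

```python
import numpy as np


def iter_chunks(X, chunk_size):
    """Yield consecutive row blocks of X (hypothetical helper)."""
    for start in range(0, X.shape[0], chunk_size):
        yield X[start:start + chunk_size]


def two_pass_standardized_variance(X, chunk_size=256):
    """Per-gene mean, variance, and clipped standardized variance.

    Assumption: X is a dense, non-negative (n_obs, n_vars) count matrix
    that is cheap to read in contiguous row chunks.
    """
    n_obs, n_vars = X.shape

    # Pass 1: accumulate sum and sum of squares per gene.
    total = np.zeros(n_vars)
    total_sq = np.zeros(n_vars)
    for chunk in iter_chunks(X, chunk_size):
        chunk = np.asarray(chunk, dtype=np.float64)
        total += chunk.sum(axis=0)
        total_sq += np.square(chunk).sum(axis=0)
    mean = total / n_obs
    var = (total_sq - n_obs * mean**2) / (n_obs - 1)

    # Mean-variance trend in log10 space.
    # Assumption: a quadratic fit replaces the loess fit, for brevity.
    not_const = var > 0
    log_mean = np.log10(mean[not_const])
    coeffs = np.polyfit(log_mean, np.log10(var[not_const]), deg=2)
    reg_std = np.ones(n_vars)
    reg_std[not_const] = np.sqrt(10 ** np.polyval(coeffs, log_mean))
    clip_val = mean + reg_std * np.sqrt(n_obs)

    # Pass 2: accumulate sums of values clipped at the per-gene cap.
    clip_sum = np.zeros(n_vars)
    clip_sq = np.zeros(n_vars)
    for chunk in iter_chunks(X, chunk_size):
        clipped = np.minimum(np.asarray(chunk, dtype=np.float64), clip_val)
        clip_sum += clipped.sum(axis=0)
        clip_sq += np.square(clipped).sum(axis=0)

    # Sum of (clipped - mean)^2, scaled by the expected variance.
    norm_var = (n_obs * mean**2 + clip_sq - 2 * clip_sum * mean) / (
        (n_obs - 1) * reg_std**2
    )
    norm_var[~not_const] = 0.0
    return mean, var, norm_var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = rng.poisson(np.linspace(0.5, 20.0, 50), size=(1000, 50))
    mean, var, norm_var = two_pass_standardized_variance(counts, chunk_size=128)
    print(norm_var.shape)
```

Each pass touches every chunk exactly once and in storage order, which is the property that would matter for disk-bound data; whether the PR structures its passes this way is not stated.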

  • Closes #
  • Tests included or not required because:

@ilan-gold ilan-gold added this to the 1.12.1 milestone Mar 26, 2026
@ilan-gold ilan-gold changed the title from 'perf: "two-pass" seurat hvg3 via scanpy.get.aggregate' to 'perf: "two-pass" seurat hvg via scanpy.get.aggregate' Mar 26, 2026
@codecov

codecov bot commented Mar 26, 2026

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 1638 | 1 | 1637 | 1103 |
View the top 1 failed test(s) by shortest run time
tests/test_neighbors.py::test_connectivities_euclidean[umap]
Stack Traces | 0.009s run time
[XPASS(strict)] umap<0.6 is broken with numba≥0.62.0rc1


@scverse-benchmark

scverse-benchmark bot commented Mar 26, 2026

Benchmark changes

| Change | Before [9bc2c1e] | After [7e0390e] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| + | 1.09±0s | 2.39±0s | 2.19 | preprocessing_log.HVGSuite.time_highly_variable_genes('seurat_v3') |

Comparison: https://github.com/scverse/scanpy/compare/9bc2c1ed1bd1cf6f6c06cec71cce99916d048163..7e0390ee10fe2c38d382b97ad2ff0bf8e6280e6b
Last changed: Thu, 9 Apr 2026 08:15:10 +0000

More details: https://github.com/scverse/scanpy/pull/4013/checks?check_run_id=70564424668
