perf: more efficient _sparse_nanmean by Reovirus · Pull Request #3570 · scverse/scanpy

Reovirus · 2025-04-08T14:58:23Z

Fixes #1894. Reduced redundant data copying; the original matrix is now copied once instead of twice. One copy remains necessary to replace NaNs with zeros without modifying the original matrix.
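As an illustration of the single-copy idea described above, here is a minimal sketch. It is not the code from this PR: it assumes the matrix is available as raw CSR arrays (`data`, `indices`, as in `scipy.sparse.csr_matrix`), and the function name `csr_nanmean_axis0` is hypothetical.

```python
import numpy as np


def csr_nanmean_axis0(data, indices, n_rows, n_cols):
    """Column-wise mean ignoring NaNs, for a CSR matrix given as raw arrays."""
    nan_mask = np.isnan(data)
    # The one remaining copy: NaNs become zeros, the caller's `data` is untouched.
    clean = np.where(nan_mask, 0.0, data)
    # Sum of the non-NaN stored values per column (implicit zeros add nothing).
    sums = np.bincount(indices, weights=clean, minlength=n_cols)
    # NaNs per column, so they can be excluded from the denominator.
    n_nan = np.bincount(indices[nan_mask], minlength=n_cols)
    return sums / (n_rows - n_nan)


# Dense equivalent:
# [[1, nan, 0],
#  [0, 2,   0],
#  [3, 4, nan]]
data = np.array([1.0, np.nan, 2.0, 3.0, 4.0, np.nan])
indices = np.array([0, 1, 1, 0, 1, 2])
means = csr_nanmean_axis0(data, indices, n_rows=3, n_cols=3)
```

Only the `data` array is copied here; `indices` is reused as is, and the input still contains its NaNs afterwards.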

Closes #1894
Tests included
Release notes not necessary because

Reovirus · 2025-04-08T15:51:24Z

@flying-sheep Could you please assign a reviewer? The failing test looks like a bug in the test setup.

Reovirus · 2025-04-08T17:45:41Z

I could also rewrite the logic with numba so that it needs no copying at all, but that might be slower than the existing implementation.

Reovirus · 2025-04-09T09:24:46Z

I know that the "No milestone" check is failing, but I don't have permission to set a milestone. I probably need a maintainer's help to set it.

Zethson · 2025-04-11T09:07:15Z

@Reovirus I restarted the CI as there should hopefully be no flaky tests at the moment.

Don't worry about the milestone, please.

Reovirus · 2025-04-11T09:56:57Z

The failure comes from scipy.sparse.csr_matrix.count_nonzero, which only has an axis argument since scipy 1.15.0. I'll rewrite it with numba.
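For context, per-axis nonzero counts can also be derived from the raw CSR arrays without the `axis` argument. The sketch below is a NumPy-only illustration under that assumption, not the numba rewrite mentioned above; the function name `csr_count_nonzero` is hypothetical.

```python
import numpy as np


def csr_count_nonzero(data, indices, indptr, n_cols, axis):
    """Count nonzero stored values per column (axis=0) or per row (axis=1)."""
    nonzero = data != 0  # explicitly stored zeros must not be counted
    if axis == 0:
        return np.bincount(indices[nonzero], minlength=n_cols)
    n_rows = len(indptr) - 1
    # Expand indptr into one row index per stored value.
    rows = np.repeat(np.arange(n_rows), np.diff(indptr))
    return np.bincount(rows[nonzero], minlength=n_rows)


# Row 0 stores a 1 and an explicit 0, row 1 stores a 2, row 2 stores a 3.
data = np.array([1.0, 0.0, 2.0, 3.0])
indices = np.array([0, 2, 1, 0])
indptr = np.array([0, 2, 3, 4])
per_col = csr_count_nonzero(data, indices, indptr, n_cols=3, axis=0)
per_row = csr_count_nonzero(data, indices, indptr, n_cols=3, axis=1)
```

Note that `np.diff(indptr)` alone would count stored entries, including explicit zeros, which is why the `data != 0` filter is applied first.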

Reovirus · 2025-04-11T14:38:48Z

@Zethson Could you please restart the CI? I'm hitting a very strange HTTP error, and it's not caused by my code.

Reovirus · 2025-04-11T14:53:31Z

CI passed. Could somebody review this when they have time, please?

flying-sheep · 2025-04-11T15:29:50Z

I made the benchmarks run, let’s see how well this works!

Could you please add a release note fragment?

hatch run towncrier:create 3570.performance.md

Reovirus · 2025-04-11T15:35:32Z

Yes, I'll add a note, thanks!

scverse-benchmark · 2025-04-11T16:11:49Z

Benchmark changes

Change	Before [`9bc2c1e`]	After [`24f042e`]	Ratio	Benchmark (Parameter)
+	290M	338M	1.16	tools.ToolsSuite.peakmem_score_genes
-	19.4±0.2ms	17.5±0.1ms	0.9	tools.ToolsSuite.time_score_genes

Comparison: https://github.com/scverse/scanpy/compare/9bc2c1ed1bd1cf6f6c06cec71cce99916d048163..24f042ecdcc124a1ebcc526606373b9b3a940024
Last changed: Tue, 7 Apr 2026 15:34:15 +0000

More details: https://github.com/scverse/scanpy/pull/3570/checks?check_run_id=70256269762

flying-sheep · 2025-04-11T16:38:20Z

according to the benchmarks, this is actually slower than before.

the benchmarks are a bit flaky, so it could be wrong, but usually a factor of 2.5 like here is real.

Reovirus · 2025-04-11T17:47:48Z

Understood. I'll try to change it and speed it up.

Reovirus · 2025-04-11T23:26:10Z

@flying-sheep
We have a small win in score_genes time (23.4±2ms | 21.8±0.8ms | 0.93 | tools.time_score_genes; the p-value is too high for the change to be displayed because the standard error is high, and the implementation is less effective on data with few zeros) and a small loss in memory. Should I optimize memory usage? (It may be caused by mask = np.isnan(data); instead, I could let np.nan_to_num handle the NaNs implicitly and save one array.)

Intron7 · 2025-04-14T06:19:48Z

I think these should be parallel numba kernels with mask arguments for the major and minor axes. I have a working implementation in rapids-singlecell that doesn't need to copy at all. I think starting from _get_mean_var is the best way forward.

Intron7

Refactor kernels to be 0 copy

Reovirus · 2025-04-30T18:11:41Z

@Intron7 I rewrote the logic. But my local benchmarking gives a different result (I just measure peak memory by ), and I'm trying to fix that. I also don't understand some metrics, such as preprocessing_counts.FastSuite.time_log1p('pbmc3k', 'counts-off-axis'). Its sign changes randomly. Is that normal?
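The tool used for the local measurement is not stated in the comment above. Purely as an illustration, one standard-library way to get a peak-memory figure for a single call is `tracemalloc`. The helper name `peak_memory_bytes` and the `make_buffer` workload are hypothetical.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Return (result, peak traced bytes) for one call of `func`."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def make_buffer(n):
    # Stand-in workload: allocates roughly n bytes.
    return bytearray(n)


result, peak = peak_memory_bytes(make_buffer, 1_000_000)
print(f"peak traced memory: {peak / 1e6:.2f} MB")
```

`tracemalloc` only sees allocations made through Python's memory allocators, so its numbers can differ from a benchmark that tracks the peak memory of the whole process, which may explain part of a local discrepancy.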

Intron7 · 2025-05-13T09:18:49Z

The kernels already look good. However, the main point I was trying to make is that you can compute the means without subsetting. That means you wouldn't need the get-subset function; you would just use masks within your kernel.
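To make the masking idea concrete, here is a minimal sketch of a per-row NaN-aware mean over a CSR matrix that skips unselected columns inside the loop instead of subsetting first. It is written in plain Python for readability and is not the kernel from this PR or from rapids-singlecell; the name `csr_row_nanmean_masked` is hypothetical, and compiling a loop of this shape with numba is an assumption, not something shown here.

```python
def csr_row_nanmean_masked(data, indices, indptr, col_mask):
    """Per-row mean over the columns where `col_mask` is True, ignoring NaNs."""
    n_rows = len(indptr) - 1
    n_selected = sum(1 for keep in col_mask if keep)
    means = [0.0] * n_rows
    for row in range(n_rows):
        total = 0.0
        n_nan = 0
        for k in range(indptr[row], indptr[row + 1]):
            if not col_mask[indices[k]]:
                continue  # column not selected: skip it instead of subsetting
            value = data[k]
            if value != value:  # NaN is the only value unequal to itself
                n_nan += 1
            else:
                total += value
        # Implicit zeros count toward the denominator, NaNs do not.
        denom = n_selected - n_nan
        means[row] = total / denom if denom > 0 else float("nan")
    return means


# Dense equivalent, with columns 0 and 2 selected:
# [[1, nan, 0],
#  [0, 2,   0],
#  [3, 4, nan]]
nan = float("nan")
data = [1.0, nan, 2.0, 3.0, 4.0, nan]
indices = [0, 1, 1, 0, 1, 2]
indptr = [0, 2, 3, 6]
means = csr_row_nanmean_masked(data, indices, indptr, [True, False, True])
```

Each row writes only its own output element, so the outer loop has no shared accumulator and is the natural candidate for parallelization.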

Zethson

Just typos I think. These should be renamed.

src/scanpy/tools/_score_genes.py

codecov · 2026-01-12T13:32:29Z

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.49%. Comparing base (9bc2c1e) to head (24f042e).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/scanpy/tools/_score_genes.py	77.77%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3570      +/-   ##
==========================================
- Coverage   78.51%   78.49%   -0.02%     
==========================================
  Files         117      117              
  Lines       12753    12751       -2     
==========================================
- Hits        10013    10009       -4     
- Misses       2740     2742       +2

Flag	Coverage Δ
hatch-test.low-vers	`77.78% <77.77%> (-0.02%)`	⬇️
hatch-test.pre	`77.45% <77.77%> (+12.61%)`	⬆️

Files with missing lines	Coverage Δ
src/scanpy/tools/_score_genes.py	`85.45% <77.77%> (-2.05%)`	⬇️

flying-sheep · 2026-01-12T13:45:53Z

I merged upstream changes. Do you still plan to work on the changes @Intron7 requested?

Reovirus and others added 11 commits April 11, 2025 16:52

Refactored scanpy.tools._sparse_nanmean to eliminate unnecessary data copying

acfbd11

[pre-commit.ci] auto fixes from pre-commit.com hooks

2882948

for more information, see https://pre-commit.ci

rewrite logics with numba (for scipy <1.15.0)

a36b33c

Add types

cdb443b

[pre-commit.ci] auto fixes from pre-commit.com hooks

8021f8c

for more information, see https://pre-commit.ci

correct jit docorators

523cccc

add correct fuction names

968e093

rewrite logics without prange (prange tries to rewrite one element in the same time

9d06854

[pre-commit.ci] auto fixes from pre-commit.com hooks

4087447

for more information, see https://pre-commit.ci

one ptr copy

45688e7

[pre-commit.ci] auto fixes from pre-commit.com hooks

e089438

for more information, see https://pre-commit.ci

Reovirus force-pushed the _sparse_nanmean_is_inefficient branch from 68e1e9c to e089438 Compare April 11, 2025 14:55

add score_genes benchmark

fa3c5c2

flying-sheep added the benchmark label Apr 11, 2025

flying-sheep added this to the 1.11.2 milestone Apr 11, 2025

Reovirus and others added 3 commits April 11, 2025 20:59

some changes

cad1db9

replace np.add.at by np.bincount + add some njint

baa0959

[pre-commit.ci] auto fixes from pre-commit.com hooks

6b3e891

for more information, see https://pre-commit.ci

Intron7 requested changes Apr 14, 2025

flying-sheep and others added 6 commits April 14, 2025 14:47

Merge branch 'main' into _sparse_nanmean_is_inefficient

2bb6d2a

Merge branch 'scverse:main' into _sparse_nanmean_is_inefficient

ffee669

add release notes

1c3a67e

[pre-commit.ci] auto fixes from pre-commit.com hooks

2630ee0

for more information, see https://pre-commit.ci

style

7130f25

[pre-commit.ci] auto fixes from pre-commit.com hooks

7e23b19

for more information, see https://pre-commit.ci

flying-sheep modified the milestones: 1.11.2, 1.11.3 May 28, 2025

ilan-gold modified the milestones: 1.11.3, 1.11.4 Jul 1, 2025

flying-sheep modified the milestones: 1.11.4, 1.11.6 Oct 21, 2025

Zethson reviewed Dec 19, 2025

flying-sheep added 2 commits January 12, 2026 14:22

Merge branch 'main' into pr/Reovirus/3570

0506e85

fix: typo

762096f

smaller relnote

8de0847

Merge branch 'main' into pr/Reovirus/3570

67dffea

flying-sheep modified the milestones: 1.11.6, 1.12.1 Apr 7, 2026

flying-sheep added 2 commits April 7, 2026 15:49

oops

e42a63e

Rename performance notes file to 3570.perf.md

24f042e

flying-sheep changed the title ~~Refactored scanpy.tools._sparse_nanmean to eliminate unnecessary data…~~ perf: more efficient _sparse_nanmean Apr 7, 2026

flying-sheep removed the benchmark label Apr 7, 2026
