
Conversation


@hersle hersle commented Nov 29, 2025

I noticed two things when profiling with sparse+analytical Jacobians generated by MTK in my package:

using SymBoltz
M = ΛCDM(K = nothing) # creates a System with 219 equations
pars = parameters_Planck18(M)
ks = 1.0:1.0:100.0

@profview prob = CosmologyProblem(M, pars; jac = true, sparse = true) # creates ODEProblem with sparse 54x54 Jacobian
@profview sol = solve(prob, ks; thread = false) # solves 100x ODEProblems with sparse Jacobians
  1. Almost all time in ODEProblem(...; jac = true, sparse = true) is spent inside this iszero check:
     [screenshot: flamegraph dominated by the iszero check]
  2. The solve is slowed by this sparsity-pattern check every time the analytically generated Jacobian function is called:
     [screenshot: flamegraph dominated by the sparsity-pattern check]

I suggest two changes:

  1. Circumvent the iszero check altogether. It is used when the nonzeros of W are stored explicitly in J, but the same effect can be achieved by recreating the sparse Jacobian in (I, J, V) form with explicit zeros added at all of W's nonzero entries.
  2. Make it possible to disable the sparsity-pattern check at runtime. My first suggestion is to perform this check only if checkbounds = true. I am sure there are also other ways to solve this.
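Suggestion 1 can be sketched in a few lines (an illustrative, self-contained example with made-up 3×3 patterns, not the actual MTK code):

```julia
using SparseArrays

# J: analytical Jacobian pattern; W: pattern whose nonzeros must be storable
# in J (e.g. the W-matrix of an implicit solver). W has an extra entry at (1, 3).
J = sparse([1, 2, 3], [1, 2, 3], [1.0, 2.0, 3.0], 3, 3)
W = sparse([1, 2, 3, 1], [1, 2, 3, 3], ones(4), 3, 3)

Iw, Jw, _ = findnz(W)
Ij, Jj, Vj = findnz(J)
# sparse() combines duplicate (i, j) pairs by summing and retains explicitly
# passed zeros, so this preserves J's values and inserts stored zeros at every
# position where only W has an entry.
J2 = sparse(vcat(Ij, Iw), vcat(Jj, Jw), vcat(Vj, zeros(length(Iw))), 3, 3)

nnz(J2)   # 4: the original 3 diagonal entries plus a stored zero at (1, 3)
J2[1, 3]  # 0.0, but now structurally present, so W fits in J2 without iszero checks
```

Because every nonzero position of W is now a stored entry of the rebuilt Jacobian, the per-entry iszero test becomes unnecessary.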

With these changes, the respective sections of the flamegraph disappear and the timings improve:

@time prob = CosmologyProblem(M, pars; jac = true, sparse = true)
@time sol = solve(prob, ks; thread = false)
 31.711084 seconds (262.74 M allocations: 8.367 GiB, 1.73% gc time, 0.78% compilation time: <1% of which was recompilation)
  0.594824 seconds (1.32 M allocations: 346.698 MiB, 34.96% gc time)

to

  4.061829 seconds (40.72 M allocations: 1.243 GiB, 2.49% gc time, 0.10% compilation time: <1% of which was recompilation)
  0.396266 seconds (1.10 M allocations: 154.179 MiB, 7.32% gc time)

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
  • Any new documentation only uses public API


hersle commented Nov 29, 2025

Alternatively, if you think the bounds check should be preserved for safety, I suggest changing it to check only the number of nonzero entries, and not the correctness of every index. Something like length(out.nzval) == length(jac_precomputed.nzval).
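As a sketch (variable names assumed from the comment above), the cheaper check would compare only the count of stored entries on each Jacobian call:

```julia
using SparseArrays

jac_precomputed = sparse([1, 2], [1, 2], [1.0, 1.0])
out = similar(jac_precomputed)  # similar() preserves the sparsity structure
out_bad = sparse([1, 2, 2], [1, 1, 2], zeros(3))  # 3 stored entries, not 2

# O(1) check: only the number of stored entries, not every row/column index.
cheap_check(out, ref) = length(out.nzval) == length(ref.nzval)

cheap_check(out, jac_precomputed)      # true: same number of stored entries
cheap_check(out_bad, jac_precomputed)  # false: entry counts differ
```

The trade-off is that two patterns with the same number of entries at different positions would pass this check, which the full index comparison would catch.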

@hersle hersle force-pushed the sparse_performance branch from fe2296b to e14cc64 on November 29, 2025 19:18

@AayushSabharwal AayushSabharwal left a comment


Thanks for the contribution! I think this definitely makes sense. I have just a couple of minor suggestions below. Could you also add this case to the benchmark script? We can then notice future regressions.


hersle commented Dec 1, 2025

Good idea, I will try to make an example that doesn't need any external dependencies and add it to the benchmarks.


hersle commented Dec 1, 2025

I set checkbounds = false by default now. I also made sure that it is still forwarded to build_function_wrapper, like it was before this PR. The default in ODEProblem, which is propagated to generate_*, is also false.

I also added an example to the benchmarks with a "semi-complicated" sparsity pattern. It indeed hits problems 1 and 2 described in the OP (before this PR):

julia> prob.f.jac_prototype
25×25 SparseArrays.SparseMatrixCSC{Float64, Int64} with 225 stored entries:
⎡⠕⣥⢊⠔⡡⢊⠔⡡⢊⠔⡡⢊⠄⎤
⎢⢊⠔⡱⢎⠔⡡⢊⠔⡡⢊⠔⡡⠂⎥
⎢⡡⢊⠔⡡⢛⢔⡡⢊⠔⡡⢊⠔⡁⎥
⎢⠔⡡⢊⠔⡡⢊⠕⣥⢊⠔⡡⢊⠄⎥
⎢⢊⠔⡡⢊⠔⡡⢊⠔⡱⢎⠔⡡⠂⎥
⎢⡡⢊⠔⡡⢊⠔⡡⢊⠔⡡⢛⢔⡁⎥
⎣⠀⠁⠈⠀⠁⠈⠀⠁⠈⠀⠁⠈⠁⎦

julia> @profview prob = ODEProblem(model, u0, tspan; jac, sparse)

julia> @profview for i in 1:100000 (prob.f.jac.f_iip)(out, prob.u0, prob.p, first(tspan)) end
[screenshots: flamegraphs of the ODEProblem construction and the in-place Jacobian calls]

So the timings should improve with this PR.


hersle commented Dec 1, 2025

I think these are the benchmark results, although they failed to be posted here automatically. Copy-paste:

println(replace(copypasta, "\\n" => "\n"))

Benchmark Results (Julia vlts)

Time benchmarks

| benchmark | master | 9ea5906... | master / 9ea5906... |
|---|---|---|---|
| ODEProblem | 0.0901 ± 0.007 s | 0.0921 ± 0.0076 s | 0.978 ± 0.11 |
| init | 0.0581 ± 0.0018 ms | 0.0582 ± 0.0017 ms | 0.999 ± 0.042 |
| large_parameter_init/ODEProblem | 1.05 ± 0.018 s | 1.02 ± 0.011 s | 1.03 ± 0.021 |
| large_parameter_init/init | 0.108 ± 0.0025 ms | 0.109 ± 0.0025 ms | 0.992 ± 0.032 |
| mtkcompile | 0.0326 ± 0.0013 s | 0.0335 ± 0.0014 s | 0.975 ± 0.056 |
| sparse_analytical_jacobian/ODEProblem | 0.348 ± 0.0097 s | 0.221 ± 0.01 s | 1.57 ± 0.084 |
| sparse_analytical_jacobian/f_iip | 2.24 ± 2.4 μs | 0.07 ± 0.01 μs | 32.1 ± 35 |
| sparse_analytical_jacobian/f_oop | 0.38 ± 0.0092 ms | 0.384 ± 0.0092 ms | 0.991 ± 0.034 |
| time_to_load | 5.13 ± 0.038 s | 5.05 ± 0.035 s | 1.02 ± 0.01 |

Memory benchmarks

| benchmark | master | 9ea5906... | master / 9ea5906... |
|---|---|---|---|
| ODEProblem | 0.659 M allocs: 22.1 MB | 0.659 M allocs: 22.1 MB | 1 |
| init | 0.892 k allocs: 0.046 MB | 0.892 k allocs: 0.046 MB | 1 |
| large_parameter_init/ODEProblem | 4.21 M allocs: 0.163 GB | 4.21 M allocs: 0.163 GB | 0.999 |
| large_parameter_init/init | 1.74 k allocs: 0.0856 MB | 1.74 k allocs: 0.0856 MB | 1 |
| mtkcompile | 0.229 M allocs: 8.46 MB | 0.229 M allocs: 8.46 MB | 1 |
| sparse_analytical_jacobian/ODEProblem | 1.21 M allocs: 0.042 GB | 0.82 M allocs: 29.3 MB | 1.47 |
| sparse_analytical_jacobian/f_iip | 3 allocs: 5.95 kB | 0 allocs: 0 B | |
| sparse_analytical_jacobian/f_oop | 0.634 k allocs: 19.6 kB | 0.634 k allocs: 19.6 kB | 1 |
| time_to_load | 0.153 k allocs: 14.5 kB | 0.153 k allocs: 14.5 kB | 1 |

@AayushSabharwal

Yeah, the workflow isn't able to comment if the PR comes from a fork. I don't yet know how to fix that.


AayushSabharwal commented Dec 1, 2025

You might have to update some of the tests for this - they expect an AssertionError but it now throws a BoundsError (InterfaceI)


hersle commented Dec 1, 2025

Got it. Running those failing tests locally segfaults instead of giving me a BoundsError, though. That is probably because the tests take the default checkbounds = false and then crash when writing to non-existent elements in the sparse matrix. I think the CI tests don't segfault because they are run with --check-bounds=yes.
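The mechanism can be shown in miniature (a hedged sketch, not the actual generated Jacobian code): with checkbounds = false the generated function writes through @inbounds, so an index past the stored entries is undefined behavior and may segfault, whereas julia --check-bounds=yes forces all bounds checks and turns it into a BoundsError.

```julia
using SparseArrays

A = sparse([1], [1], [1.0], 2, 2)  # only (1, 1) is stored, so nnz(A) == 1

# Write directly into the nonzero buffer, skipping the bounds check.
write_nzval!(A, k, v) = @inbounds nonzeros(A)[k] = v

write_nzval!(A, 1, 2.0)  # fine: index 1 is within the stored entries
# write_nzval!(A, 2, 0.0) would be undefined behavior (possible segfault),
# unless Julia runs with --check-bounds=yes, which raises a BoundsError instead.
A[1, 1]  # 2.0
```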


hersle commented Dec 1, 2025

I set checkbounds = true instead (to never run into the segfault) and kept the AssertionError. I can easily change it to BoundsError (which relies on --check-bounds=yes to not segfault) if that makes more sense.

@AayushSabharwal

That makes sense for the test.


hersle commented Dec 1, 2025

I believe the tests are ok now.
