Skip to content

[Bug]: User-supplied jacfn silently overwritten when sparsity is provided #46

@WilliamRoyce

Description

@WilliamRoyce

scikit-SUNDAE Version

sksundae 1.1.1

Python Version

python 3.11

Describe the bug

Summary

When both jacfn and sparsity are provided to IDA(), the user-supplied Jacobian callback is silently replaced by the internal finite-difference sparse Jacobian. The validation layer correctly warns that the sparsity FD approximation will be "ignored in favor of jacfn", but the opposite happens at runtime — the user's jacfn is the one that gets ignored.

Root Cause

In _cy_ida.pyx, the _idaLSSparseDQJac._setup_memory() method unconditionally overwrites aux.jacfn:

# _cy_ida.pyx, line 453-466
cdef _setup_memory(self, void* mem, sunindextype NEQ):
    self.mem = mem
    self.aux.jacfn = self          # <-- UNCONDITIONAL OVERWRITE (line 460)

    if self.aux.linsolver == "sparse":
        nnz = self.sparsity.nnz
        self.aux.np_JJ = np.zeros(nnz, DTYPE)
    else:
        self.aux.np_JJ = np.zeros((NEQ, NEQ), DTYPE)

This is called from the solver initialization at line 720, before the conditional that's supposed to preserve the user's callback:

# _cy_ida.pyx, lines 717-729
sparsity = self._options["sparsity"]
if sparsity is not None:
    spjac = _idaLSSparseDQJac(self.aux, sparsity)
    spjac._setup_memory(self.mem, self.NEQ)    # aux.jacfn overwritten here

    if self._options["jacfn"] is None:          # This check is correct but too late
        self._options["jacfn"] = spjac

jacfn = self._options["jacfn"]                  # Reads user's function (from _options)
if jacfn:
    flag = IDASetJacFn(self.mem, _jacfn_wrapper)  # Installs the C wrapper

Execution Flow

  1. AuxData.__cinit__ (line 309): self.jacfn = options["jacfn"]aux.jacfn = user's function
  2. _setup_memory (line 460): self.aux.jacfn = selfaux.jacfn = FD sparse object (user's function lost)
  3. Line 722: if self._options["jacfn"] is None: → False (user provided one), so _options dict is preserved
  4. Line 727: IDASetJacFn(self.mem, _jacfn_wrapper) → C-level wrapper installed
  5. At runtime, _jacfn_wrapper (line 146) calls aux.jacfn(...) → calls the FD sparse object, not the user's function

The _options dict correctly preserves the user's jacfn, but the C-level wrapper reads from aux.jacfn, which was overwritten at step 2.

Contradicts Documented Behavior

The validation at line 1587-1589 explicitly states the intended precedence:

if (sparsity is not None) and (jacfn is not None):
    warn("Sparse Jacobian approximation will be ignored in favor of 'jacfn'.")

The sparsity docstring also implies jacfn should take precedence:

If 'jacfn' is None, this argument activates a custom Jacobian routine [...]

Both indicate the user's jacfn should win when both are provided. The opposite occurs.

Suggested Fix

In _setup_memory, only overwrite aux.jacfn if the user didn't provide one:

cdef _setup_memory(self, void* mem, sunindextype NEQ):
    self.mem = mem
    if self.aux.jacfn is None:       # <-- Only set if user didn't provide jacfn
        self.aux.jacfn = self

    if self.aux.linsolver == "sparse":
        nnz = self.sparsity.nnz
        self.aux.np_JJ = np.zeros(nnz, DTYPE)
    else:
        self.aux.np_JJ = np.zeros((NEQ, NEQ), DTYPE)

This preserves the memory allocation logic (np_JJ sizing) while respecting user-supplied callbacks.

Steps to Reproduce

Minimal Reproduction

import numpy as np
from scipy.sparse import eye as speye
import sksundae

call_log = []

def resfn(t, y, yp, res):
    """Simple test DAE: yp[0] = -y[0], 0 = y[0] - y[1]."""
    res[0] = yp[0] + y[0]
    res[1] = y[0] - y[1]

def my_jacfn(t, y, yp, res, cj, JJ):
    """Analytical Jacobian — should be called but isn't."""
    call_log.append("my_jacfn called")
    JJ[0] = 1.0 + cj   # dres0/dy0 + cj*dres0/dyp0
    JJ[1] = 0.0         # dres0/dy1 + cj*dres0/dyp1
    JJ[2] = 1.0         # dres1/dy0 + cj*dres1/dyp0
    JJ[3] = -1.0        # dres1/dy1 + cj*dres1/dyp1

sparsity = np.ones((2, 2))  # Dense sparsity (all non-zero)

solver = sksundae.ida.IDA(
    resfn,
    jacfn=my_jacfn,
    sparsity=sparsity,
    linsolver="sparse",
)

sol = solver.solve([0, 1], [1.0, 1.0], [0.0, 0.0])

if not call_log:
    print("BUG: my_jacfn was never called despite being provided!")
    print(f"aux.jacfn is: {type(solver.aux.jacfn)}")
else:
    print(f"OK: my_jacfn was called {len(call_log)} times")

Impact

This prevents users from supplying analytical sparse Jacobians to direct sparse solvers (SuperLU_MT). The user's callback is silently discarded with no error — the solver runs successfully but uses finite-difference approximation instead.

For our project (TIDAL), we have a fully implemented and tested sparse analytical Jacobian (_create_sparse_jacfn()) that we cannot use due to this bug. We currently fall through to colored finite-difference + SuperLU_MT, which works but adds unnecessary FD evaluation cost for systems with 2,000-200,000 state variables.

Relevant log output

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions