Speed up large-index mapping by avoiding repeated unique and using fast integer deduplicator by romainsacchi · Pull Request #33 · brightway-lca/matrix_utils

romainsacchi · 2026-02-26T12:52:32Z

This PR fixes a performance bottleneck in matrix_utils when building large mapped matrices (especially technosphere).

In my large case (~4.46M index entries), almost all runtime was spent inside index deduplication for ArrayMapper, with np.unique taking hundreds of seconds on unsorted integer arrays.

What changed

ArrayMapper: For large integer arrays, use np.sort(pd.unique(array)) instead of plain np.unique(array). Keep np.unique as the fallback and for smaller arrays.
MappedMatrix: Collect raw row/col indices from groups and let ArrayMapper deduplicate once. This avoids redundant per-group unique calls before mapper construction.
ResourceGroup: Added raw mapping index accessors (row_indices_for_mapping, col_indices_for_mapping) used by MappedMatrix.
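
A minimal sketch of the two ideas in this list, not the code from the PR itself. The size cutoff `LARGE_ARRAY_THRESHOLD` and the helper names `sorted_unique` and `collect_and_dedup` are assumptions made for illustration; the actual implementation in matrix_utils/array_mapper.py may differ. `pd.unique` returns values in order of first appearance, so the result is sorted afterwards to match the sorted output of `np.unique`.

```python
from typing import Sequence

import numpy as np
import pandas as pd

# Hypothetical cutoff; the PR description does not state the real threshold.
LARGE_ARRAY_THRESHOLD = 100_000


def sorted_unique(array: np.ndarray) -> np.ndarray:
    """Return the sorted unique values of a 1-D array."""
    is_integer = np.issubdtype(array.dtype, np.integer)
    if is_integer and array.size >= LARGE_ARRAY_THRESHOLD:
        # pd.unique is hash-based and keeps order of first appearance,
        # so sort the (much smaller) deduplicated result afterwards.
        return np.sort(pd.unique(array))
    # Fallback for small or non-integer arrays.
    return np.unique(array)


def collect_and_dedup(groups: Sequence[np.ndarray]) -> np.ndarray:
    """Concatenate raw indices from all groups and deduplicate once,
    instead of calling unique on every group separately."""
    if len(groups) == 0:
        return np.array([], dtype=np.int64)
    return sorted_unique(np.concatenate(groups))
```

Called with raw per-group index arrays, `collect_and_dedup([np.array([5, 1, 5]), np.array([2, 1])])` yields the sorted unique indices across all groups in a single pass.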

Result

On the same large technosphere case: build_tech_mm: ~500s -> ~0.65s

No other changes are intended. This should not break anything, but I have not tested that thoroughly.


matrix_utils/array_mapper.py

cmutel · 2026-02-26T13:06:10Z

It's crazy that using pandas is faster than numpy for unique, but this is apparently well known...
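
For readers who want to check this on their own machine, here is a rough timing sketch. The array size and value range are arbitrary assumptions, not the 4.46M-entry case from the PR, and the relative timings will depend on the installed NumPy and pandas versions and on the data.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Unsorted int64 values with many repeats, loosely mimicking matrix indices.
indices = rng.integers(0, 50_000, size=1_000_000)

# Both approaches should produce the same sorted unique values.
numpy_result = np.unique(indices)
pandas_result = np.sort(pd.unique(indices))
assert np.array_equal(numpy_result, pandas_result)

# Time three runs of each approach.
numpy_time = timeit.timeit(lambda: np.unique(indices), number=3)
pandas_time = timeit.timeit(lambda: np.sort(pd.unique(indices)), number=3)

print(f"np.unique:          {numpy_time:.3f} s for 3 runs")
print(f"np.sort(pd.unique): {pandas_time:.3f} s for 3 runs")
```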

cmutel · 2026-02-26T13:07:30Z

@jsvgoncalves This PR proposes including pandas in the core Brightway calculation path, with significant calculation speed benefits. I know you have strong opinions on this, care to weigh in?

romainsacchi · 2026-02-26T13:18:54Z

There still seems to be a test failing (tests/monte_carlo.py::test_distributions_without_uncertainties), but it does not appear to be caused by the proposed changes.

cmutel · 2026-02-27T14:22:43Z

The failing test is probabilistic - it should fail a small percentage of the time, or at least that's the way it currently works.
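
As a generic illustration of why a statistical check can fail occasionally, and not a reproduction of test_distributions_without_uncertainties, the sketch below uses a hypothetical helper `sample_mean_is_close` with assumed sample size and tolerance. With a fixed seed the outcome is reproducible; without one, some fraction of runs falls outside the tolerance.

```python
import numpy as np


def sample_mean_is_close(seed=None, n=50, tolerance=0.3):
    """Draw n standard-normal samples and check that the mean is near zero."""
    rng = np.random.default_rng(seed)
    return bool(abs(rng.normal(size=n).mean()) < tolerance)


# With a fixed seed, the outcome is identical on every call.
assert sample_mean_is_close(seed=123) == sample_mean_is_close(seed=123)

# Without a seed, a share of runs is expected to miss the tolerance.
failures = sum(not sample_mean_is_close() for _ in range(2_000))
print(f"{failures} of 2000 unseeded runs failed the check")
```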

Speed up large-index mapping by avoiding repeated unique and using fast integer dedup

fc588c3

cmutel reviewed Feb 26, 2026

matrix_utils/array_mapper.py

romainsacchi added 2 commits February 26, 2026 13:58

fix seed handling in stochastic paths

d828cb3

remove object type check, since we always receive int64 objects.

830489f

cmutel reviewed Feb 26, 2026

matrix_utils/array_mapper.py (outdated)

make exception handling more specific

ffdddbf

jsvgoncalves approved these changes Feb 27, 2026

Improved tests coverage

86af532

cmutel merged commit ec89c2f into brightway-lca:main Feb 27, 2026

cmutel merged 5 commits into brightway-lca:main from romainsacchi:main


3 participants
