Add Comparator equality for pathlib, numpy arrays, and pandas objects #1115

Open
ghostiee-11 wants to merge 2 commits into holoviz:main from ghostiee-11:feat/comparator-more-types

@ghostiee-11

Description

Comparator.is_equal lacks support for common types like numpy arrays, pandas DataFrames/Series, and pathlib paths. Assigning an identical array to a parameter always triggers watchers because the comparator falls through to return False.

This PR adds equality support for:

  • pathlib.PurePath - uses operator.eq
  • numpy arrays - uses np.array_equal with shape/dtype pre-checks
  • pandas DataFrame/Series - uses .equals() with shape pre-check
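The three comparisons listed above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names `paths_equal`, `arrays_equal`, and `frames_equal` are hypothetical, and the size cutoff is left out to keep the sketch short.

```python
import operator
from pathlib import PurePath, PurePosixPath


def paths_equal(a, b):
    """Compare two pathlib paths with plain ==, via operator.eq."""
    return isinstance(a, PurePath) and isinstance(b, PurePath) and operator.eq(a, b)


def arrays_equal(a, b):
    """Compare numpy arrays, rejecting shape or dtype mismatches first."""
    import numpy as np  # imported lazily so numpy stays an optional dependency
    if a.shape != b.shape or a.dtype != b.dtype:
        return False
    return bool(np.array_equal(a, b))


def frames_equal(a, b):
    """Compare pandas Series/DataFrames, rejecting shape mismatches first."""
    if a.shape != b.shape:
        return False
    return bool(a.equals(b))
```

On its own, `np.array_equal(np.arange(3), np.arange(3.0))` is True because the values compare equal, so it is the dtype pre-check that makes an int array and a float array count as different here.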

To avoid expensive element-wise comparisons on large data, both numpy and pandas paths bail out (return False) when the object has more than array_max_size elements (default 1,000,000). Better to trigger an unnecessary watcher than to block on a million-element comparison.
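A minimal sketch of the cutoff idea, under stated assumptions: `equal_with_cutoff` and `FakeArray` are hypothetical names, the identity check running before the cutoff is an assumption, and only the 1,000,000 default comes from the description. The helper relies on nothing but a `.size` attribute, which numpy arrays and pandas objects both provide.

```python
import operator

ARRAY_MAX_SIZE = 1_000_000  # default named in the PR description


def equal_with_cutoff(a, b, compare, max_size=ARRAY_MAX_SIZE):
    """Return compare(a, b), or False without comparing if `a` is too large."""
    if a is b:
        return True  # assumption: the same object is equal regardless of size
    if a.size > max_size:
        # Reporting "not equal" only costs a possibly redundant watcher call.
        return False
    return bool(compare(a, b))


class FakeArray:
    """Tiny stand-in with a .size, used only to exercise the cutoff."""

    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)

    def __eq__(self, other):
        return self.values == other.values
```

With `max_size=3`, two equal five-element `FakeArray` objects are reported as not equal, which is the deliberate false negative the description argues for.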

The numpy and pandas checks use lambda predicates (type(o).__module__.startswith(...)) so neither library is imported at module level.
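The module-name check can be illustrated with a small predicate factory. `from_module` is a hypothetical helper, not part of this PR; it only shows that inspecting `type(o).__module__` recognises a library's objects without importing that library.

```python
from fractions import Fraction


def from_module(prefix, *required_attrs):
    """Build a predicate matching objects whose type lives under `prefix`."""
    def predicate(obj):
        module = type(obj).__module__
        return module.startswith(prefix) and all(
            hasattr(obj, attr) for attr in required_attrs
        )
    return predicate


# Neither numpy nor pandas is imported to build or call these predicates.
is_numpy_array = from_module("numpy", "shape")
is_pandas_object = from_module("pandas", "equals")
```

One caveat of a plain prefix match is that it would also accept a type from any other module whose name merely starts with the same string, so additional attribute requirements help narrow the match.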

Closes #902

How Has This Been Tested?

38 tests in tests/testcomparator.py:

  • Parametrized identity tests for all supported types (str, int, float, decimal, bytes, None, list, tuple, set, dict, date, datetime, pathlib.Path, PurePosixPath, np.datetime64, np.array variants, pd.Timestamp, pd.Series, pd.DataFrame)
  • pathlib: equal paths, unequal paths, PurePosixPath
  • numpy: equal arrays, unequal arrays, different shape, different dtype, identity, size cutoff, NaN handling
  • pandas: equal/unequal Series and DataFrames, different shape, NaN handling (.equals() treats NaNs in the same location as equal), size cutoff
pytest tests/testcomparator.py -v  # 38 passed

Checklist

  • Tests added and passing
  • Added documentation

Extends Comparator.equalities with support for pathlib.PurePath (using
operator.eq), numpy arrays (using np.array_equal with dtype and shape
checks), and pandas DataFrame/Series (using .equals()). Large arrays
and frames above array_max_size (default 1M elements) skip comparison
and return False to avoid expensive element-wise checks. Adds 38 tests
covering equal, not-equal, shape mismatch, dtype mismatch, NaN handling,
identity, and size cutoff behavior.

Closes holoviz#902
Copilot AI review requested due to automatic review settings March 13, 2026 05:53

Copilot AI left a comment

Pull request overview

Adds richer equality semantics to param.parameterized.Comparator.is_equal so that watcher triggering better reflects “no real change” for common scientific/Python types, addressing #902.

Changes:

  • Add Comparator equality handling for pathlib.PurePath, numpy array-like objects, and pandas objects with size-based cutoffs to avoid expensive comparisons.
  • Introduce Comparator.array_max_size threshold (default 1,000,000 elements) to bail out of large numpy/pandas comparisons.
  • Expand tests/testcomparator.py coverage for the newly supported types and cutoff behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
param/parameterized.py | Extends Comparator with pathlib + numpy/pandas equality logic and introduces array_max_size cutoff.
tests/testcomparator.py | Adds new parametrized and targeted tests for pathlib/numpy/pandas comparator behavior.

Comment on lines +2124 to +2128
def _array_equal(obj1, obj2):
    """Equality check for numpy arrays with a size cutoff."""
    import numpy as np
    if obj1 is obj2:
        return True
Comment on lines +2116 to +2117
lambda o: type(o).__module__.startswith('numpy') and hasattr(o, 'shape'): lambda a, b: Comparator._array_equal(a, b),
lambda o: type(o).__module__.startswith('pandas') and hasattr(o, 'equals'): lambda a, b: Comparator._pandas_equal(a, b),
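The two entries above use a predicate function as the dictionary key and a comparison function as the value. A simplified stand-in for that kind of lookup table might look like this; `MiniComparator` is hypothetical and is not param's actual `Comparator`.

```python
import operator
from pathlib import PurePath, PurePosixPath


class MiniComparator:
    """Simplified stand-in for a comparator with a type/predicate table."""

    equalities = {
        str: operator.eq,
        PurePath: operator.eq,
        # Predicate key: matches any object exposing a callable .equals
        (lambda o: callable(getattr(o, "equals", None))): lambda a, b: a.equals(b),
    }

    @classmethod
    def is_equal(cls, obj1, obj2):
        for key, eq in cls.equalities.items():
            if isinstance(key, type):
                # Type key: both objects must be instances of it.
                matches = isinstance(obj1, key) and isinstance(obj2, key)
            else:
                # Predicate key: both objects must satisfy it.
                matches = key(obj1) and key(obj2)
            if matches:
                return bool(eq(obj1, obj2))
        return False  # unknown types fall through to "not equal"
```

Because the first matching entry wins, the order of the table matters: a broad predicate placed early would shadow more specific entries after it.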
        b = pd.Series(range(10))
        assert not Comparator.is_equal(a, b)
    finally:
        Comparator.array_max_size = old
…dd DataFrame cutoff test

- Tighten numpy lambda to require shape, dtype, and size attributes
- Tighten pandas lambda to require shape, size, and equals attributes
- Move identity and type checks before numpy import in _array_equal
- Catch ImportError in _array_equal in case numpy is not available
- Wrap attribute access (shape, dtype, size) in try/except AttributeError
- Add test_dataframe_large_skips for DataFrame size cutoff coverage
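
The check ordering described in this commit message could look roughly like the sketch below. It is not the committed code: `safe_array_equal` is a hypothetical name, and reading "type checks" as "both arguments have the same type" is an assumption.

```python
def safe_array_equal(obj1, obj2, max_size=1_000_000):
    """Array equality where every cheap check runs before numpy is imported."""
    if obj1 is obj2:
        return True
    if type(obj1) is not type(obj2):  # assumption: "type check" means same type
        return False
    try:
        if obj1.shape != obj2.shape or obj1.dtype != obj2.dtype:
            return False
        if obj1.size > max_size:
            return False  # too large to compare cheaply; report "not equal"
    except AttributeError:
        return False  # not array-like after all
    try:
        import numpy as np
    except ImportError:
        return False  # numpy unavailable, so equality cannot be confirmed
    return bool(np.array_equal(obj1, obj2))
```

Every early exit returns before the import, so mismatched or oversized inputs never pay for loading numpy or for an element-wise comparison.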
Development

Successfully merging this pull request may close these issues.

Define Comparator equality functions for more types
