Inconsistent behavior of hexbin mincnt parameter when C is provided (parity fix) #63

@rowan-stein

Inconsistent behavior of hexbin's mincnt parameter, depending on C parameter

Task reference: ID 282

Bug report

Bug summary

Different behavior of hexbin's mincnt parameter, depending on whether the C parameter is supplied.

Code for reproduction

from matplotlib import pyplot
import numpy as np

np.random.seed(42)

X, Y = np.random.multivariate_normal([0.0, 0.0], [[1.0, 0.1], [0.1, 1.0]], size=250).T
Z = np.ones_like(X)

extent = [-3., 3., -3., 3.]
gridsize = (7, 7)

# no mincnt, no C
fig, ax = pyplot.subplots(1, 1)
ax.hexbin(X, Y, extent=extent, gridsize=gridsize, linewidth=0.0, cmap='Blues')
ax.set_facecolor("green")

# mincnt=1, no C
fig, ax = pyplot.subplots(1, 1)
ax.hexbin(X, Y, mincnt=1, extent=extent, gridsize=gridsize, linewidth=0.0, cmap='Blues')
ax.set_facecolor("green")

# no mincnt, with C
fig, ax = pyplot.subplots(1, 1)
ax.hexbin(X, Y, C=Z, reduce_C_function=np.sum, extent=extent, gridsize=gridsize, linewidth=0.0, cmap='Blues')
ax.set_facecolor("green")

# mincnt=1, with C
fig, ax = pyplot.subplots(1, 1)
ax.hexbin(X, Y, C=Z, reduce_C_function=np.sum, mincnt=1, extent=extent, gridsize=gridsize, linewidth=0.0, cmap='Blues')
ax.set_facecolor("green")

# mincnt=0, with C
fig, ax = pyplot.subplots(1, 1)
ax.hexbin(X, Y, C=Z, reduce_C_function=np.sum, mincnt=0, extent=extent, gridsize=gridsize, linewidth=0.0, cmap='Blues')
ax.set_facecolor("green")

Actual outcome

  • With no C specified, mincnt=1 shows bins with at least one datum.
  • With C specified but not mincnt, bins with at least one datum are shown.
  • With C specified and mincnt=1, bins require at least two datapoints (unexpected off-by-one).

Expected outcome

With mincnt == 1, the same gridpoints should be plotted whether C is supplied or not—i.e., bins with at least one point should be included.

Additional resources

Research and proposed resolution (by Emerson Gray)

Code locations (branch matplotlib__matplotlib-26113):

  • File: lib/matplotlib/axes/_axes.py, Axes.hexbin
  • C is None path effectively applies counts >= mincnt by setting accum[accum < mincnt] = np.nan.
  • C is not None path defaults mincnt to 0 when it is None and calls reduce_C_function only where len(acc) > mincnt. The strict comparison makes this path off by one relative to the C is None path.
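
The difference between the two paths can be reproduced without matplotlib. The following is a minimal sketch, not the code in _axes.py: the helper names counts_path, c_path_current, and shown are hypothetical, and each bin is modeled as a plain list of the C values that fell into it.

```python
import math

def counts_path(bins, mincnt):
    """Model of the C is None path: keep a bin's count when count >= mincnt."""
    out = []
    for acc in bins:
        n = len(acc)
        # Mirrors accum[accum < mincnt] = nan.
        # Assumption: no filtering is applied when mincnt is None.
        out.append(math.nan if mincnt is not None and n < mincnt else float(n))
    return out

def c_path_current(bins, mincnt, reduce_c=sum):
    """Model of the reported C is not None path: strict len(acc) > mincnt."""
    if mincnt is None:
        mincnt = 0
    return [float(reduce_c(acc)) if len(acc) > mincnt else math.nan
            for acc in bins]

def shown(values):
    """Indices of the bins that would be drawn (non-NaN values)."""
    return [i for i, v in enumerate(values) if not math.isnan(v)]

# Three bins holding zero, one, and two points, all with C = 1.
bins = [[], [1.0], [1.0, 1.0]]

print(shown(counts_path(bins, 1)))        # [1, 2]: at least one point
print(shown(c_path_current(bins, 1)))     # [2]: needs two points
print(shown(c_path_current(bins, None)))  # [1, 2]: default keeps non-empty bins
```

In this model, bin 1 (a single point) is drawn by the counts path but dropped by the C path when mincnt=1, which matches the reported off-by-one.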

Proposed fix:

  • In the C is not None path:
    • If mincnt is None, set mincnt = 1 (default excludes empty bins consistently).
    • Change reduction decision to len(acc) >= mincnt (parity with the C is None path).
  • Update docstring for mincnt to state “at least mincnt” points instead of “more than mincnt”.
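
The proposed change can be sketched in the same simplified form. This is an illustration under stated assumptions rather than the actual patch: c_path_fixed, counts_path, and shown are hypothetical helpers, and bins are modeled as plain lists of C values.

```python
import math

def counts_path(bins, mincnt):
    """Model of the C is None path: a bin is kept when its count >= mincnt."""
    return [float(len(acc)) if len(acc) >= mincnt else math.nan for acc in bins]

def c_path_fixed(bins, mincnt=None, reduce_c=sum):
    """Model of the proposed C is not None path."""
    if mincnt is None:
        mincnt = 1  # proposed default: empty bins are excluded
    # Inclusive comparison: a bin needs at least mincnt points.
    return [float(reduce_c(acc)) if len(acc) >= mincnt else math.nan
            for acc in bins]

def shown(values):
    """Indices of the bins that would be drawn (non-NaN values)."""
    return [i for i, v in enumerate(values) if not math.isnan(v)]

# Three bins holding zero, one, and two points, all with C = 1.
bins = [[], [1.0], [1.0, 1.0]]

# Both paths now select the same bins for any explicit mincnt.
for mincnt in (1, 2):
    assert shown(counts_path(bins, mincnt)) == shown(c_path_fixed(bins, mincnt))

print(shown(c_path_fixed(bins, 1)))  # [1, 2]
print(shown(c_path_fixed(bins)))     # [1, 2]
print(shown(c_path_fixed(bins, 2)))  # [2]
```

With the inclusive comparison and a default of 1, the default output of the C path stays the same as before (non-empty bins only), while mincnt=1 now agrees with the counts path.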

Potential side effects considered:

  • Defaulting mincnt to 1 avoids calling reduce_C_function([]) on empty bins, so parity is achieved without changing the default output (non-empty bins only).
  • Masked arrays and color mapping continue to treat failing bins as NaN; unaffected.
  • bins='log' and norm autoscaling are unaffected; the marginals code applies the reduction wherever len(ci) > 0, independent of mincnt.
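
The first point can be illustrated with a reducer that rejects empty input. This is a minimal sketch with a hypothetical helper, reduce_bins, that applies the proposed len(acc) >= mincnt comparison to bins modeled as plain lists; the built-in max stands in for a user-supplied reduce_C_function.

```python
def reduce_bins(bins, mincnt, reduce_c):
    """Apply reduce_c to every bin with len(acc) >= mincnt; None marks a skipped bin."""
    return [reduce_c(acc) if len(acc) >= mincnt else None for acc in bins]

# Bins holding zero, one, and two C values.
bins = [[], [3.0], [1.0, 2.0]]

# mincnt=1 (the proposed default): the empty bin never reaches the reducer.
print(reduce_bins(bins, 1, max))  # [None, 3.0, 2.0]

# mincnt=0 with the inclusive comparison: the empty bin is passed to the
# reducer, and max([]) raises ValueError.
try:
    reduce_bins(bins, 0, max)
except ValueError as exc:
    print("reducer failed on an empty bin:", exc)
```

Under the inclusive comparison, a default of 0 would pass empty bins to the reducer, which is why the proposal moves the default to 1 instead of keeping 0.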

Unit tests to add:

  • Deterministic dataset with two points in separate bins: X=Y=[0.1, 1.6], gridsize=(2,2), extent=(0,2,0,2).
    • Test A: C=None, mincnt=1 includes both bins, values equal to 1.
    • Test B: C=np.ones_like(X), reduce_C_function=np.sum, mincnt=1 includes both bins, sums equal to 1.
    • Optional: mincnt=2 excludes bins with a single point for both paths.

We will submit a PR implementing this change and tests, targeting base branch matplotlib__matplotlib-26113.
