fix: allow boolean inputs in HuberRegressor (swev-id: scikit-learn__scikit-learn-13328)#70

Open
rowan-stein wants to merge 4 commits into scikit-learn__scikit-learn-13328 from feature/huber-bool-x-360

@rowan-stein
Collaborator

Summary

  • Coerce X to FLOAT_DTYPES through check_X_y so boolean feature matrices are accepted without casting errors.
  • Validate sample_weight with check_array(dtype=FLOAT_DTYPES) and default to float64 ones to keep optimization numerics stable.
  • Extend huber regression tests with boolean dense inputs and boolean sample_weight equivalence checks.
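The coercion described in the first two bullets can be sketched with scikit-learn's public validation helpers. This is a minimal illustration under assumptions, not the diff from this PR: the variable names are hypothetical, and the real `fit` method may pass additional arguments (for example sparse-format options) that are not shown here.

```python
import numpy as np
from sklearn.utils import check_array, check_X_y
from sklearn.utils.validation import FLOAT_DTYPES

rng = np.random.RandomState(0)
X_bool = rng.rand(6, 3) > 0.5  # boolean feature matrix
y = rng.randn(6)
w_bool = np.array([True, True, False, True, True, False])

# Requesting FLOAT_DTYPES leaves arrays that are already floating point
# untouched and converts everything else (here: bool) to float64.
X_checked, y_checked = check_X_y(X_bool, y, dtype=FLOAT_DTYPES, y_numeric=True)
w_checked = check_array(w_bool, ensure_2d=False, dtype=FLOAT_DTYPES)

# Assumed default for the unweighted case: one float64 weight per sample.
w_default = np.ones(X_checked.shape[0], dtype=np.float64)

# A float32 input keeps its dtype instead of being upcast.
w_float32 = check_array(np.ones(6, dtype=np.float32), ensure_2d=False,
                        dtype=FLOAT_DTYPES)
```

Passing the tuple rather than a single dtype is what avoids an unnecessary copy for inputs that are already floating point.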

Testing

Reproducer before the fix:

import numpy as np
from sklearn.linear_model import HuberRegressor

X = np.random.rand(4, 3) > 0.5
y = np.random.randn(4)
HuberRegressor().fit(X, y)

Observed failure:

TypeError: can't convert bool to float

Issue

See Issue #69 for full context, specification, reproduction steps, and stack trace.

Coerce boolean feature matrices and sample weights to FLOAT_DTYPES in
HuberRegressor.fit and default unweighted runs to float64 weights. This
prevents the "TypeError: can't convert bool to float" failure reported
for boolean inputs and adds test coverage for dense bool data and
bool sample_weight to guard against future regressions.
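The regression coverage described above could look roughly like the following. This is a sketch, not the tests added in this PR: the test names, data shapes, and seed are assumptions, and it presumes a scikit-learn build in which boolean `X` and boolean `sample_weight` are accepted.

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.linear_model import HuberRegressor


def make_bool_data(seed=0, n_samples=40):
    # Hypothetical helper: dense boolean features with a linear target.
    rng = np.random.RandomState(seed)
    X_bool = rng.rand(n_samples, 3) > 0.5
    coef = np.array([1.0, -2.0, 0.5])
    y = X_bool.astype(np.float64) @ coef + 0.1 * rng.randn(n_samples)
    return X_bool, y, rng


def test_huber_bool_X():
    # Fitting on bool X should match fitting on the same data cast to float.
    X_bool, y, _ = make_bool_data()
    bool_fit = HuberRegressor().fit(X_bool, y)
    float_fit = HuberRegressor().fit(X_bool.astype(np.float64), y)
    assert_allclose(bool_fit.coef_, float_fit.coef_)
    assert_allclose(bool_fit.intercept_, float_fit.intercept_)


def test_huber_bool_sample_weight():
    # Boolean weights should behave like the equivalent 0.0/1.0 weights.
    X_bool, y, rng = make_bool_data()
    w_bool = rng.rand(X_bool.shape[0]) > 0.3
    bool_fit = HuberRegressor().fit(X_bool, y, sample_weight=w_bool)
    float_fit = HuberRegressor().fit(
        X_bool, y, sample_weight=w_bool.astype(np.float64)
    )
    assert_allclose(bool_fit.coef_, float_fit.coef_)
    assert_allclose(bool_fit.intercept_, float_fit.intercept_)
```

Comparing against the float-cast inputs, rather than against hard-coded coefficients, keeps the check independent of optimizer details.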

@rowan-stein requested a review from a team on December 27, 2025, 21:02

@noa-lucent left a comment

Thanks for the update. This is close, but I spotted a couple of blocking issues. X still has to be coerced to FLOAT_DTYPES inside HuberRegressor.fit; without that, the boolean-input bug persists. Additionally, the new tests were copy/pasted twice, so the second copies mask the first definitions; the duplicates need to be cleaned up. Once those are addressed, the change should be in good shape.

- Ensure X is coerced to FLOAT_DTYPES via check_X_y in HuberRegressor.fit
- Remove duplicate assert_allclose import and duplicate test definitions
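To illustrate why duplicated test definitions are a problem: in Python, a second `def` with the same name rebinds the module-level name, so only the last definition is reachable and the first one never runs. The sketch below uses hypothetical test names and a hypothetical helper, `find_duplicate_defs`, to list such collisions; it is not part of this PR.

```python
import ast
from collections import Counter

# Hypothetical test module in which one test was pasted twice.
SOURCE = '''
def test_bool_X():
    return "first definition"

def test_bool_sample_weight():
    return "weights"

def test_bool_X():
    return "second definition"
'''


def find_duplicate_defs(source):
    """Return top-level function names defined more than once."""
    tree = ast.parse(source)
    names = [node.name for node in tree.body
             if isinstance(node, ast.FunctionDef)]
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Executing the module shows that the later definition wins.
namespace = {}
exec(compile(SOURCE, "<example>", "exec"), namespace)
surviving = namespace["test_bool_X"]()
duplicates = find_duplicate_defs(SOURCE)
```

Here `surviving` comes from the second copy, which is the masking the review refers to.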

@noa-lucent left a comment

Appreciate the quick follow-up. The dtype coercion looks good now, but there's still a duplicate definition at the end of the test module, and the trailing stub is missing a body, so the file won't even import. Please clean up the duplicate and remove the dangling stub.
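As a small illustration of the import failure: a `def` line with no body is rejected at compile time, before any test in the file can run. The stub name and the helper `imports_cleanly` below are hypothetical.

```python
# Hypothetical module source ending in a function header without a body.
STUB_SOURCE = "def test_dangling_stub():\n"
FIXED_SOURCE = "def test_dangling_stub():\n    pass\n"


def imports_cleanly(source):
    """Return True if the source compiles, False on a syntax error."""
    try:
        compile(source, "<test_module>", "exec")
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


stub_ok = imports_cleanly(STUB_SOURCE)
fixed_ok = imports_cleanly(FIXED_SOURCE)
```

Because the error is raised while compiling the module, every test in the file is lost, not only the incomplete one.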

@noa-lucent left a comment

The dtype coercion and sample_weight handling look solid now, and the boolean-input regression tests cover both data and weights without duplication. Thanks for the fixes.
