swev-id: scikit-learn__scikit-learn-25232 add fill_value support to IterativeImputer #57

Open
casey-brooks wants to merge 2 commits into scikit-learn__scikit-learn-25232 from feat/iterative-imputer-fill-value

@casey-brooks commented Dec 26, 2025

Summary

  • allow IterativeImputer to accept an imputer instance for initial_strategy and sync compatible parameters
  • add the optional fill_value for constant initialization and document the behavior
  • cover the new behavior with regression tests for strategy instances and fill values

Testing

  • LD_LIBRARY_PATH=/workspace/venv/lib/python3.11/site-packages/scikit_learn.libs:/nix/store/qipd93x9gjyiygqk673rd2ssnf8y7jj0-gcc-14.3.0-lib/lib:/nix/store/4wdz42ns29ys6fm1xak68bnp51nxhd2s-zlib-1.3.1/lib /workspace/venv/bin/pytest sklearn/impute/tests/test_impute.py -k "fill_value or imputer_instance"
  • /workspace/venv/bin/flake8 sklearn/impute/_iterative.py sklearn/impute/tests/test_impute.py

Failure Reproduction

from sklearn.experimental import enable_iterative_imputer  # noqa
from sklearn.impute import IterativeImputer, SimpleImputer
import numpy as np

X = np.array([[np.nan, 1.], [2., 3.]])
IterativeImputer(
    initial_strategy=SimpleImputer(strategy='constant', fill_value=np.nan),
    max_iter=0,
).fit(X)
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/workspace/venv/lib/python3.11/site-packages/sklearn/impute/_iterative.py", line 879, in fit
    self.fit_transform(X)
  File "/workspace/venv/lib/python3.11/site-packages/sklearn/utils/_set_output.py", line 157, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venv/lib/python3.11/site-packages/sklearn/base.py", line 1145, in wrapper
    estimator._validate_params()
  File "/workspace/venv/lib/python3.11/site-packages/sklearn/base.py", line 638, in _validate_params
    validate_parameter_constraints(
  File "/workspace/venv/lib/python3.11/site-packages/sklearn/utils/_param_validation.py", line 96, in validate_parameter_constraints
    raise InvalidParameterError(
sklearn.utils._param_validation.InvalidParameterError: The 'initial_strategy' parameter of IterativeImputer must be a str among {'median', 'mean', 'most_frequent', 'constant'}. Got SimpleImputer(fill_value=nan, strategy='constant') instead.
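
The error above comes from a constraint that only admits a fixed set of strategy names. As a rough, self-contained illustration of why any estimator instance is rejected, the sketch below mimics a string-only check. It is not scikit-learn's validation code: `ALLOWED_STRATEGIES`, `check_initial_strategy`, `FakeImputer`, and `is_accepted` are hypothetical names used only for this example.

```python
# Strategy names taken from the error message in the traceback.
ALLOWED_STRATEGIES = {"mean", "median", "most_frequent", "constant"}


def check_initial_strategy(value):
    """Hypothetical stand-in for a string-only parameter constraint."""
    if isinstance(value, str) and value in ALLOWED_STRATEGIES:
        return value
    raise ValueError(
        "The 'initial_strategy' parameter must be a str among "
        f"{sorted(ALLOWED_STRATEGIES)}. Got {value!r} instead."
    )


class FakeImputer:
    """Placeholder object standing in for an imputer instance."""

    def __repr__(self):
        return "FakeImputer()"


def is_accepted(value):
    """Return True when the constraint lets `value` through."""
    try:
        check_initial_strategy(value)
    except ValueError:
        return False
    return True
```

Under this check `is_accepted("constant")` holds while `is_accepted(FakeImputer())` does not, which matches the behaviour reported in the reproduction.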

Additional Notes

@casey-brooks casey-brooks requested a review from a team December 26, 2025 13:33
@casey-brooks
Author

Test & Lint Summary

  • LD_LIBRARY_PATH=/workspace/venv/lib/python3.11/site-packages/scikit_learn.libs:/nix/store/qipd93x9gjyiygqk673rd2ssnf8y7jj0-gcc-14.3.0-lib/lib:/nix/store/4wdz42ns29ys6fm1xak68bnp51nxhd2s-zlib-1.3.1/lib /workspace/venv/bin/pytest sklearn/impute/tests/test_impute.py -k "fill_value or imputer_instance" → 5 passed (216 deselected)
  • /workspace/venv/bin/flake8 sklearn/impute/_iterative.py sklearn/impute/tests/test_impute.py → no lint errors

@noa-lucent noa-lucent left a comment

[major] The feature branch is based on an extremely old snapshot of scikit-learn (pre-2012). The diff reintroduces the legacy scikits.learn package layout, deletes the modern sklearn package, and drops all of our current build/test infrastructure (e.g. .github/workflows/unit-tests.yml). This would break every import and nuke CI across the project. Please rebase on the current base branch and limit the change set to the IterativeImputer improvements described in the PR summary.

Because the diff rewrites the entire repository, I cannot verify the intended feature or its tests. Once the branch is rebased and trimmed to the actual functional change, I can take another pass.

@noa-lucent noa-lucent left a comment

Thanks for tightening the initialisation story and adding the regression coverage. The behaviour overall looks solid, but there is one blocker noted inline: once we accept arbitrary imputers for initial_strategy, we must guarantee they expose get_feature_names_out, otherwise the existing IterativeImputer.get_feature_names_out path will explode with an AttributeError. Please tighten the parameter constraint (and docstring) to keep that invariant intact.
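
To make the requested invariant concrete, here is a minimal sketch of a duck-typed check that accepts either a known strategy name or an imputer-like object exposing `get_feature_names_out`. It is an assumption about the shape of the fix, not the code in this PR: `REQUIRED_METHODS`, `missing_methods`, `check_strategy_or_instance`, and the two toy classes are hypothetical, and requiring `fit` and `transform` as well is my own addition.

```python
ALLOWED_STRATEGIES = {"mean", "median", "most_frequent", "constant"}
# Assumption: only get_feature_names_out is named in the review;
# fit and transform are included here for illustration.
REQUIRED_METHODS = ("fit", "transform", "get_feature_names_out")


def missing_methods(candidate):
    """Return the required method names that `candidate` does not provide."""
    return [
        name
        for name in REQUIRED_METHODS
        if not callable(getattr(candidate, name, None))
    ]


def check_strategy_or_instance(value):
    """Accept a known strategy name or an imputer-like object."""
    if isinstance(value, str):
        if value not in ALLOWED_STRATEGIES:
            raise ValueError(f"Unknown initial_strategy {value!r}.")
        return value
    missing = missing_methods(value)
    if missing:
        raise TypeError(
            f"{type(value).__name__} cannot be used as initial_strategy; "
            f"missing: {', '.join(missing)}"
        )
    return value


class CompleteImputer:
    """Toy object that provides every required method."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X

    def get_feature_names_out(self, input_features=None):
        return input_features


class NoNamesImputer:
    """Toy object that lacks get_feature_names_out."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X
```

Rejecting `NoNamesImputer()` at validation time surfaces the problem when the parameter is checked, rather than later as an AttributeError inside `IterativeImputer.get_feature_names_out`.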

@noa-lucent noa-lucent left a comment

Thanks for addressing the invariant. Requiring get_feature_names_out keeps IterativeImputer.get_feature_names_out safe, and the updated docstring/tests look good. The new fill_value plumbing and instance-handling both behave as advertised, with regression coverage in place. LGTM.

2 participants