Description
User request
IterativeImputer has no parameter "fill_value".
In the first imputation round of IterativeImputer, an initial value needs to be set for the missing values. From its docs:
initial_strategy {'mean', 'median', 'most_frequent', 'constant'}, default='mean'
Which strategy to use to initialize the missing values. Same as the strategy parameter in SimpleImputer.
I set initial_strategy='constant' and want to define the constant via fill_value, mirroring SimpleImputer’s behavior (including allowing np.nan). Also, passing a SimpleImputer instance as initial_strategy was suggested but currently fails.
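For reference, the `SimpleImputer` behavior the request wants to mirror can be illustrated with a minimal sketch. The array `X` is made up for illustration; only the documented `strategy` and `fill_value` parameters of `SimpleImputer` are used.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Small illustrative array with two missing entries.
X = np.array([[1.0, np.nan], [np.nan, 4.0], [5.0, 6.0]])

# strategy="constant" replaces every missing entry with fill_value.
explicit = SimpleImputer(strategy="constant", fill_value=-1.0)
X_explicit = explicit.fit_transform(X)

# With fill_value=None (the default), numeric data is filled with 0.
default = SimpleImputer(strategy="constant")
X_default = default.fit_transform(X)

print(X_explicit)
print(X_default)
```

The request is for `IterativeImputer` to expose the same control over its first imputation round.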
Proposed solution
Add a fill_value parameter to IterativeImputer used when initial_strategy == 'constant'. Additionally, allow passing an imputer instance (e.g., SimpleImputer) as the value of initial_strategy for custom initialization.
Researcher specification
- API changes
  - Add `fill_value=None` to `IterativeImputer.__init__`.
  - Broaden `initial_strategy` to accept either a string in {"mean", "median", "most_frequent", "constant"} or an imputer instance implementing `fit`, `transform`, and `get_params` (e.g., `SimpleImputer`).
  - Semantics: `fill_value` is used only for the `'constant'` strategy; if an imputer instance is provided, `fill_value` is ignored (document and warn).
  - Defaults match `SimpleImputer`: when `fill_value=None`, defaults to `0` for numeric and `'missing_value'` for object/string arrays (though IterativeImputer validates numeric input).
- Initialization behavior
  - In `_initial_imputation`, construct `self.initial_imputer_` as follows:
    - If `initial_strategy` is a string: build `SimpleImputer(missing_values=self.missing_values, strategy=self.initial_strategy, fill_value=self.fill_value, keep_empty_features=self.keep_empty_features)`.
    - If `initial_strategy` is an imputer instance: `clone()` it; align `missing_values` and `keep_empty_features` via `set_params` if supported; set as `self.initial_imputer_`.
  - Fit/transform unchanged beyond delegating to the constructed imputer.
- Validation
  - Update `_parameter_constraints` to allow either string options or a `HasMethods(["fit", "transform", "get_params"])` for `initial_strategy`.
  - `fill_value` accepts any Python object; validation is delegated to the chosen imputer.
  - If both an imputer instance and `fill_value` are provided, issue a `UserWarning` that `fill_value` is ignored.
  - Explicitly accept `np.nan` as `fill_value`.
- Serialization/params
  - `fill_value` is discoverable by `get_params`/`set_params`.
  - Cloning behavior remains compatible when an imputer instance is passed.
- Documentation
  - Update docstring for `initial_strategy` and `fill_value` plus examples showing `initial_strategy=SimpleImputer(strategy='constant', fill_value=np.nan)`.
- Tests
  - Verify `'constant'` strategy uses `fill_value` for numeric arrays, including `np.nan`.
  - Default `fill_value=None` behaves as `SimpleImputer(strategy='constant')`.
  - Passing a `SimpleImputer` instance works and matches its initialization behavior.
  - Warning emitted when both an imputer instance and `fill_value` are supplied.
  - Align `missing_values` when instance is provided.
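The initialization, validation, and params rules in the specification could be sketched roughly as follows. This is not scikit-learn's implementation: `ToyIterativeImputer`, `_build_initial_imputer`, and `initial_fit_transform` are hypothetical names, and the class covers only the first imputation round. The `HasMethods` constraint is approximated with a plain duck-typing check, and `keep_empty_features` is left out to keep the sketch short.

```python
import warnings

import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.impute import SimpleImputer

_STRING_STRATEGIES = ("mean", "median", "most_frequent", "constant")


class ToyIterativeImputer(BaseEstimator):
    """Hypothetical stand-in covering only the initial-imputation step."""

    def __init__(self, missing_values=np.nan, initial_strategy="mean",
                 fill_value=None):
        # Store arguments untouched so get_params/set_params/clone work.
        self.missing_values = missing_values
        self.initial_strategy = initial_strategy
        self.fill_value = fill_value

    def _build_initial_imputer(self):
        strategy = self.initial_strategy
        if isinstance(strategy, str):
            if strategy not in _STRING_STRATEGIES:
                raise ValueError(f"Unknown initial_strategy: {strategy!r}")
            # String case: delegate to SimpleImputer, forwarding fill_value.
            return SimpleImputer(
                missing_values=self.missing_values,
                strategy=strategy,
                fill_value=self.fill_value,
            )
        # Instance case: duck-typed check standing in for HasMethods.
        for name in ("fit", "transform", "get_params"):
            if not callable(getattr(strategy, name, None)):
                raise TypeError(f"initial_strategy must implement {name}()")
        if self.fill_value is not None:
            warnings.warn(
                "fill_value is ignored when initial_strategy is an "
                "imputer instance.",
                UserWarning,
            )
        imputer = clone(strategy)  # leave the caller's object unfitted
        if "missing_values" in imputer.get_params():
            imputer.set_params(missing_values=self.missing_values)
        return imputer

    def initial_fit_transform(self, X):
        self.initial_imputer_ = self._build_initial_imputer()
        return self.initial_imputer_.fit_transform(X)


X = np.array([[1.0, np.nan], [3.0, 4.0]])

# String strategy: fill_value is forwarded to SimpleImputer.
by_name = ToyIterativeImputer(initial_strategy="constant", fill_value=7.0)
X_by_name = by_name.initial_fit_transform(X)

# Imputer instance plus fill_value: the instance wins and a warning is issued.
user_imputer = SimpleImputer(strategy="constant", fill_value=7.0)
by_instance = ToyIterativeImputer(initial_strategy=user_imputer, fill_value=0)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    X_by_instance = by_instance.initial_fit_transform(X)

# Nested parameters of the instance are visible through get_params.
params = by_instance.get_params(deep=True)
```

Cloning the supplied instance keeps the user's object unfitted, and because `__init__` stores its arguments unchanged, `fill_value` and the nested `initial_strategy__fill_value` both show up in `get_params(deep=True)`.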
Additional context
Current behavior when passing a SimpleImputer instance to initial_strategy raises a ValueError (only string strategies are accepted). The change will resolve that and add fill_value support in IterativeImputer.
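A minimal reproduction sketch of the failure described above, assuming a scikit-learn release that predates the proposed change. The array `X` is illustrative. The exact exception class may vary by version, so the sketch catches `ValueError` and its subclasses.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan]])

try:
    # Passing an imputer instance where a strategy string is expected.
    IterativeImputer(
        initial_strategy=SimpleImputer(strategy="constant", fill_value=0)
    ).fit(X)
    outcome = "accepted"
except ValueError as exc:
    outcome = "rejected"
    print(f"{type(exc).__name__}: {exc}")

print(outcome)
```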