IterativeImputer: add fill_value and allow imputer instance for initial_strategy #54

@rowan-stein

Description

User request

IterativeImputer has no parameter "fill_value".

In the first imputation round of IterativeImputer, an initial value needs to be set for the missing values. From its docs:

initial_strategy {'mean', 'median', 'most_frequent', 'constant'}, default='mean'
Which strategy to use to initialize the missing values. Same as the strategy parameter in SimpleImputer.

I set initial_strategy='constant' and want to define the constant via fill_value, mirroring SimpleImputer’s behavior (including allowing np.nan). Also, passing a SimpleImputer instance as initial_strategy was suggested but currently fails.
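
For reference, the SimpleImputer behavior the request wants to mirror can be illustrated as below. This is a minimal sketch on a made-up array; it only exercises SimpleImputer itself, not the proposed IterativeImputer parameter.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (assumed for illustration): two missing entries.
X = np.array([[1.0, np.nan], [np.nan, 4.0], [5.0, 6.0]])

# With strategy="constant" and fill_value left as None, numeric data is filled with 0.
default_fill = SimpleImputer(strategy="constant").fit_transform(X)

# An explicit fill_value replaces every missing entry with that constant.
custom_fill = SimpleImputer(strategy="constant", fill_value=-1.0).fit_transform(X)
```

The request is for IterativeImputer to expose the same choice of constant for its first imputation round.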

Proposed solution

Add a fill_value parameter to IterativeImputer used when initial_strategy == 'constant'. Additionally, allow passing an imputer instance (e.g., SimpleImputer) as the value of initial_strategy for custom initialization.

Researcher specification

  • API changes

    • Add fill_value=None to IterativeImputer.__init__.
    • Broaden initial_strategy to accept either a string in {"mean","median","most_frequent","constant"} or an imputer instance implementing fit, transform, and get_params (e.g., SimpleImputer).
    • Semantics: fill_value is used only for the 'constant' strategy; if an imputer instance is provided, fill_value is ignored (document and warn).
    • Defaults match SimpleImputer: when fill_value=None, the constant is 0 for numeric data and 'missing_value' for object/string arrays. Because IterativeImputer validates its input as numeric, the effective default is 0.
  • Initialization behavior

    • In _initial_imputation, construct self.initial_imputer_ as follows:
      • If initial_strategy is a string: build SimpleImputer(missing_values=self.missing_values, strategy=self.initial_strategy, fill_value=self.fill_value, keep_empty_features=self.keep_empty_features).
      • If initial_strategy is an imputer instance: clone() it; align missing_values and keep_empty_features via set_params if supported; set as self.initial_imputer_.
    • Fit/transform unchanged beyond delegating to the constructed imputer.
  • Validation

    • Update _parameter_constraints to allow either string options or a HasMethods(["fit","transform","get_params"]) for initial_strategy.
    • fill_value accepts any Python object; validation is delegated to the chosen imputer.
    • If both an imputer instance and fill_value are provided, issue a UserWarning that fill_value is ignored.
    • Explicitly accept np.nan as fill_value.
  • Serialization/params

    • fill_value is discoverable by get_params/set_params.
    • Cloning behavior remains compatible when an imputer instance is passed.
  • Documentation

    • Update docstring for initial_strategy and fill_value plus examples showing initial_strategy=SimpleImputer(strategy='constant', fill_value=np.nan).
  • Tests

    • Verify 'constant' strategy uses fill_value for numeric arrays, including np.nan.
    • Default fill_value=None behaves as SimpleImputer(strategy='constant').
    • Passing a SimpleImputer instance works and matches its initialization behavior.
    • Warning emitted when both an imputer instance and fill_value are supplied.
    • When an imputer instance is provided, missing_values on the cloned instance is aligned with IterativeImputer's missing_values.
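
The construction rules in the specification above could be sketched roughly as follows. The helper name make_initial_imputer and its standalone form are hypothetical (the spec places this logic inside _initial_imputation), and keep_empty_features is left out for brevity; it would be forwarded in the same way as missing_values.

```python
import warnings

import numpy as np
from sklearn.base import clone
from sklearn.impute import SimpleImputer

STRING_STRATEGIES = {"mean", "median", "most_frequent", "constant"}


def make_initial_imputer(initial_strategy, fill_value=None, missing_values=np.nan):
    """Hypothetical helper: build the imputer used for the first round."""
    if isinstance(initial_strategy, str):
        if initial_strategy not in STRING_STRATEGIES:
            raise ValueError(f"unknown initial_strategy: {initial_strategy!r}")
        # fill_value only takes effect when strategy == "constant".
        return SimpleImputer(
            missing_values=missing_values,
            strategy=initial_strategy,
            fill_value=fill_value,
        )

    # An imputer instance was passed: fill_value is ignored, so warn about it.
    if fill_value is not None:
        warnings.warn(
            "fill_value is ignored when initial_strategy is an imputer instance.",
            UserWarning,
        )
    imputer = clone(initial_strategy)  # do not mutate the caller's object
    # Align missing_values only if the instance exposes that parameter.
    if "missing_values" in imputer.get_params():
        imputer.set_params(missing_values=missing_values)
    return imputer


# Usage on toy data: both routes should give the same initial fill.
X = np.array([[1.0, np.nan], [np.nan, 4.0]])
from_string = make_initial_imputer("constant", fill_value=-1.0).fit_transform(X)
from_instance = make_initial_imputer(
    SimpleImputer(strategy="constant", fill_value=-1.0)
).fit_transform(X)
```

Cloning before calling set_params keeps the user's original imputer untouched, which is what the "cloning behavior remains compatible" item asks for.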

Additional context

Current behavior when passing a SimpleImputer instance to initial_strategy raises a ValueError (only string strategies are accepted). The change will resolve that and add fill_value support in IterativeImputer.
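
To make the accepted forms concrete, here is a pure-Python sketch of the duck-typing rule the specification describes for initial_strategy. The function and the DummyImputer class are hypothetical stand-ins; the actual validation would go through _parameter_constraints rather than a helper like this.

```python
REQUIRED_METHODS = ("fit", "transform", "get_params")
STRING_STRATEGIES = ("mean", "median", "most_frequent", "constant")


def is_valid_initial_strategy(value):
    """Hypothetical check: a known strategy name or an imputer-like object."""
    if isinstance(value, str):
        return value in STRING_STRATEGIES
    # Any object exposing the three required callables is accepted.
    return all(callable(getattr(value, name, None)) for name in REQUIRED_METHODS)


class DummyImputer:
    """Stand-in object exposing the three required methods."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X

    def get_params(self, deep=True):
        return {}
```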

Labels: enhancement (New feature or request)
