swev-id: scikit-learn__scikit-learn-25931: Fix IsolationForest feature names warning during fit when contamination != 'auto' #58
Open
casey-brooks wants to merge 1 commit into scikit-learn__scikit-learn-25931 from
Local Testing
noa-lucent approved these changes on Dec 26, 2025 and left a comment:
The internal helper cleanly separates validation from scoring so fit no longer retriggers the feature-name warning, and the new pandas regression tests cover the relevant pathways. Looks ready to go.
Reproduction (before fix)
```bash
PYTHONWARNINGS=always python - <<'PY'
import pandas as pd
from sklearn.ensemble import IsolationForest
X = pd.DataFrame({"a": [-1.1, 0.3, 0.5, 100]})
IsolationForest(random_state=0, contamination=0.05).fit(X)
PY
```
Observed warning
```text
/workspace/scikit-learn/.venv/lib/python3.11/site-packages/pandas/core/algorithms.py:1743: DeprecationWarning: is_sparse is deprecated and will be removed in a future version. Check `isinstance(dtype, pd.SparseDtype)` instead.
  return lib.map_infer(values, mapper, convert=convert)
/workspace/scikit-learn/sklearn/utils/validation.py:605: DeprecationWarning: is_sparse is deprecated and will be removed in a future version. Check `isinstance(dtype, pd.SparseDtype)` instead.
  if is_sparse(pd_dtype):
/workspace/scikit-learn/sklearn/utils/validation.py:614: DeprecationWarning: is_sparse is deprecated and will be removed in a future version. Check `isinstance(dtype, pd.SparseDtype)` instead.
  if is_sparse(pd_dtype) or not is_extension_array_dtype(pd_dtype):
/workspace/scikit-learn/sklearn/base.py:451: UserWarning: X does not have valid feature names, but IsolationForest was fitted with feature names
  warnings.warn(
```
Summary
- Split `IsolationForest.score_samples`: validation stays in the public method, and scoring moves to a new `_score_samples_no_validation` helper
- Update `fit` to compute `offset_` through the helper, without re-validating the already-sanitized training data

Tests
- `pytest -q sklearn/ensemble/tests/test_iforest.py -k "fit_dataframe_contamination_no_warning or dataframe_then_ndarray_warns_on_score_and_predict or fit_dataframe_auto_no_warning_and_offset"`
- `pytest -q sklearn/ensemble/tests/test_iforest.py -k "chunks_works1 or chunks_works2"`
- `flake8 sklearn/ensemble/_iforest.py sklearn/ensemble/tests/test_iforest.py`

Fixes #51
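
Illustration of the pattern

For readers who want the shape of the change without opening the diff, here is a minimal, stdlib-only sketch of the validate-once pattern described in the summary. It is not the scikit-learn code: `ToyForest`, its `_validate` method, the dict-as-DataFrame convention, and the distance-based score are all hypothetical stand-ins. Only the helper name `_score_samples_no_validation` and the `offset_` attribute come from this PR.

```python
import warnings


class ToyForest:
    """Hypothetical stand-in for IsolationForest (not scikit-learn code)."""

    def __init__(self, contamination="auto"):
        self.contamination = contamination

    def _validate(self, X, reset):
        # A dict plays the role of a DataFrame: its keys are feature names.
        names = list(X.keys()) if isinstance(X, dict) else None
        if reset:
            self.feature_names_in_ = names
        elif self.feature_names_in_ is not None and names is None:
            warnings.warn(
                "X does not have valid feature names, but ToyForest "
                "was fitted with feature names",
                UserWarning,
            )
        # Return plain rows; the names are dropped, as with an ndarray.
        if isinstance(X, dict):
            return list(zip(*X.values()))
        return [tuple(row) for row in X]

    def fit(self, X):
        rows = self._validate(X, reset=True)
        self.center_ = sum(r[0] for r in rows) / len(rows)
        if self.contamination == "auto":
            self.offset_ = -0.5
        else:
            # Score the already-validated rows directly. Calling
            # self.score_samples(rows) here would re-validate nameless
            # data and emit the spurious warning during fit.
            scores = sorted(self._score_samples_no_validation(rows))
            # Crude quantile, standing in for a real percentile.
            self.offset_ = scores[int(self.contamination * (len(scores) - 1))]
        return self

    def score_samples(self, X):
        # Public entry point: validate, then delegate to the helper.
        rows = self._validate(X, reset=False)
        return self._score_samples_no_validation(rows)

    def _score_samples_no_validation(self, rows):
        # Toy score: negative distance from the training mean.
        return [-abs(r[0] - self.center_) for r in rows]


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model = ToyForest(contamination=0.05).fit({"a": [-1.1, 0.3, 0.5, 100]})
    fit_warning_count = len(caught)
    model.score_samples([[0.3]])  # nameless input after a named fit
    score_warning_count = len(caught) - fit_warning_count

print(fit_warning_count, score_warning_count)  # prints "0 1"
```

In this sketch `fit` stays silent, while the public `score_samples` still warns when it receives nameless input after a fit on named data. That is the behavior the new regression tests are described as covering.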