fix: MultiOutputClassifier exposes classes_ (swev-id: scikit-learn__scikit-learn-14629) #44

Open
casey-brooks wants to merge 2 commits into scikit-learn__scikit-learn-14629 from
fix/multioutput-classes-cross-val-predict

Conversation

@casey-brooks

Summary

  • add wrapper-level classes_ to MultiOutputClassifier and refresh it during partial_fit
  • document classes_ and add regression coverage for cross_val_predict with MultiOutputClassifier
  • ensure ClassifierChain order validation keeps expected indentation
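
The first bullet can be illustrated with a minimal sketch. It is not the diff from this PR: `ToyLabelRecorder` and `ToyMultiOutputWrapper` are hypothetical stand-ins written in plain Python, and the only assumption carried over is that the wrapper collects each fitted sub-estimator's `classes_` into a list, one entry per output, after both `fit` and `partial_fit`.

```python
import copy


class ToyLabelRecorder:
    """Hypothetical base classifier; it only records the labels it has seen."""

    def fit(self, X, y):
        # Sorted distinct labels for this single output.
        self.classes_ = sorted(set(y))
        return self

    def partial_fit(self, X, y):
        # Merge newly seen labels with any labels recorded earlier.
        known = set(getattr(self, "classes_", []))
        self.classes_ = sorted(known | set(y))
        return self


class ToyMultiOutputWrapper:
    """Hypothetical wrapper that trains one copy of `estimator` per output column."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, Y):
        n_outputs = len(Y[0])
        self.estimators_ = [
            copy.deepcopy(self.estimator).fit(X, [row[j] for row in Y])
            for j in range(n_outputs)
        ]
        # Wrapper-level attribute: one class list per output, in column order.
        self.classes_ = [est.classes_ for est in self.estimators_]
        return self

    def partial_fit(self, X, Y):
        if not hasattr(self, "estimators_"):
            self.estimators_ = [
                copy.deepcopy(self.estimator) for _ in range(len(Y[0]))
            ]
        for j, est in enumerate(self.estimators_):
            est.partial_fit(X, [row[j] for row in Y])
        # Refresh the wrapper-level attribute after every incremental update.
        self.classes_ = [est.classes_ for est in self.estimators_]
        return self


X = [[0.0], [1.0], [2.0], [3.0]]
Y = [[0, "a"], [1, "b"], [0, "c"], [1, "a"]]

fitted = ToyMultiOutputWrapper(ToyLabelRecorder()).fit(X, Y)

incremental = ToyMultiOutputWrapper(ToyLabelRecorder())
incremental.partial_fit(X[:2], Y[:2])
first_batch_classes = incremental.classes_
incremental.partial_fit(X[2:], Y[2:])

print(fitted.classes_)
print(first_batch_classes, incremental.classes_)
```

In this sketch the second `partial_fit` call picks up the label `"c"`, which is why the attribute is rebuilt on each call instead of being set once.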

Testing

  • . /workspace/sklearn-py37/bin/activate && PYTHONPATH=. pytest sklearn/model_selection/tests/test_validation.py -k multioutput -vv
  • . /workspace/sklearn-py37/bin/activate && flake8 --ignore=E121,E123,E126,E24,E704,W503,W504,E731,E303 sklearn/multioutput.py sklearn/model_selection/tests/test_validation.py
  • . /workspace/sklearn-py37/bin/activate && PYTHONPATH=. python - <<'PY'
    from sklearn.datasets import make_multilabel_classification
    from sklearn.multioutput import MultiOutputClassifier
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_predict
    X, Y = make_multilabel_classification(random_state=0)
    mo_lda = MultiOutputClassifier(LinearDiscriminantAnalysis())
    cross_val_predict(mo_lda, X, Y, cv=5, method='predict_proba')
    PY

Failure stack trace (before fix)

Traceback (most recent call last):
  File "<stdin>", line 9, in <module>
  File "/workspace/scikit-learn/sklearn/model_selection/_validation.py", line 766, in cross_val_predict
    for train, test in cv.split(X, y, groups))
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/parallel.py", line 1029, in __call__
    if self.dispatch_one_batch(iterator):
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/parallel.py", line 847, in dispatch_one_batch
    self._dispatch(tasks)
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/parallel.py", line 765, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/parallel.py", line 253, in __call__
    for func, args, kwargs in self.items]
  File "/workspace/sklearn-py37/lib/python3.7/site-packages/joblib/parallel.py", line 253, in <listcomp>
    for func, args, kwargs in self.items]
  File "/workspace/scikit-learn/sklearn/model_selection/_validation.py", line 861, in _fit_and_predict
    for i_label in range(len(predictions))]
  File "/workspace/scikit-learn/sklearn/model_selection/_validation.py", line 861, in <listcomp>
    for i_label in range(len(predictions))]
AttributeError: 'MultiOutputClassifier' object has no attribute 'classes_'
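
The traceback ends in a list comprehension over `range(len(predictions))`. The sketch below reproduces that access pattern in isolation, assuming the validation code reads `estimator.classes_[i_label]` once per output. The helper name and the `SimpleNamespace` stand-ins are hypothetical and are not scikit-learn code.

```python
from types import SimpleNamespace


def classes_for_each_output(estimator, predictions):
    # One lookup per output, mirroring the loop shown in the traceback.
    return [estimator.classes_[i_label] for i_label in range(len(predictions))]


# Two outputs, one sample each; the values are made up for illustration.
predictions = [[[0.2, 0.8]], [[0.6, 0.4]]]

# Stand-in for the wrapper before the fix: the class lists exist only on
# the per-output sub-estimators, not on the wrapper itself.
without_attr = SimpleNamespace(
    estimators_=[SimpleNamespace(classes_=[0, 1]), SimpleNamespace(classes_=[0, 1])]
)
try:
    classes_for_each_output(without_attr, predictions)
    failed = False
except AttributeError:
    failed = True

# Stand-in for the wrapper after the fix: a wrapper-level list of class lists.
with_attr = SimpleNamespace(classes_=[[0, 1], [0, 1]])
looked_up = classes_for_each_output(with_attr, predictions)

print(failed, looked_up)
```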

Fixes #42

@casey-brooks requested a review from a team on December 26, 2025, 03:35
@casey-brooks
Author

Test & Lint Summary

  • . /workspace/sklearn-py37/bin/activate && PYTHONPATH=. pytest sklearn/model_selection/tests/test_validation.py -k multioutput -vv
    → 3 passed, 57 deselected
  • . /workspace/sklearn-py37/bin/activate && flake8 --ignore=E121,E123,E126,E24,E704,W503,W504,E731,E303 sklearn/multioutput.py sklearn/model_selection/tests/test_validation.py
    → no issues
  • . /workspace/sklearn-py37/bin/activate && PYTHONPATH=. python - <<'PY'
    from sklearn.datasets import make_multilabel_classification
    from sklearn.multioutput import MultiOutputClassifier
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_predict
    X, Y = make_multilabel_classification(random_state=0)
    mo_lda = MultiOutputClassifier(LinearDiscriminantAnalysis())
    pred_proba = cross_val_predict(mo_lda, X, Y, cv=5, method='predict_proba')
    print(len(pred_proba))
    print([arr.shape for arr in pred_proba])
    PY
    → prints 5 and [(100, 2), (100, 2), (100, 2), (100, 2), (100, 2)]

@casey-brooks
Author

Repro script output: 5 and [(100, 2), (100, 2), (100, 2), (100, 2), (100, 2)]

@noa-lucent noa-lucent left a comment

Thanks for tackling the MultiOutputClassifier regression and adding coverage. One existing regression guard lost its assertion on the 3D-input smoke test; please reinstate the check so we keep that protection in place.

@noa-lucent noa-lucent left a comment

Looks good—thanks for restoring the 3D-input assertion and for the overall fix. The tests and MultiOutputClassifier updates resolve the issue on my side.
