swev-id: scikit-learn__scikit-learn-13142 align GaussianMixture fit_predict with predict by casey-brooks · Pull Request #26 · agyn-sandbox/scikit-learn

casey-brooks · 2025-12-26T01:30:29Z

Summary

load the best EM parameters before computing fit_predict labels
reuse predict(X) to produce labels and add regression coverage

Regression evidence

Before fix (scikit-learn__scikit-learn-13142)

source /workspace/sklearn-py37/bin/activate && python - <<'PY'
import numpy as np
from sklearn.mixture import GaussianMixture
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
gmm = GaussianMixture(n_components=2, n_init=5, random_state=0)
labels_fit_predict = gmm.fit_predict(X)
labels_predict = gmm.predict(X)
print('mismatch count:', (labels_fit_predict != labels_predict).sum())
PY
mismatch count: 100

After fix (this branch)

source /workspace/sklearn-py37/bin/activate && python - <<'PY'
import numpy as np
from sklearn.mixture import GaussianMixture
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
gmm = GaussianMixture(n_components=2, n_init=5, random_state=0)
labels_fit_predict = gmm.fit_predict(X)
labels_predict = gmm.predict(X)
print('mismatch count:', (labels_fit_predict != labels_predict).sum())
PY
mismatch count: 0

Testing

source /workspace/sklearn-py37/bin/activate && pytest sklearn/mixture/tests/test_gaussian_mixture.py -k fit_predict # 8 passed
source /workspace/sklearn-py37/bin/activate && flake8 sklearn/mixture/base.py sklearn/mixture/tests/test_gaussian_mixture.py # no issues

Resolves #25

casey-brooks · 2025-12-26T01:30:41Z

Test & Lint Summary

source /workspace/sklearn-py37/bin/activate && pytest sklearn/mixture/tests/test_gaussian_mixture.py -k fit_predict
- 8 passed, 0 failed
source /workspace/sklearn-py37/bin/activate && flake8 sklearn/mixture/base.py sklearn/mixture/tests/test_gaussian_mixture.py
- no issues

noa-lucent · 2025-12-26T01:33:46Z

[major] The branch diverges from main massively: this PR deletes roughly 500k lines, including the entire sklearn package (e.g. sklearn/init.py) and most infrastructure, then reintroduces files from an old snapshot. We cannot review or merge this as-is. Please rebase on the current main branch and keep the diff limited to the intended SGD example moves.

noa-lucent

This branch removes most of the project (over 500k lines), including the sklearn package itself, and replaces it with an obsolete snapshot. Please rebase on current main and limit the PR to the intended SGD example renames before we can proceed.

rowan-stein · 2025-12-26T01:34:17Z

Hi Noa — this PR focuses solely on the GaussianMixture inconsistency fix:

Changed files: sklearn/mixture/base.py and sklearn/mixture/tests/test_gaussian_mixture.py
Diff: 46 additions, 6 deletions
Purpose: align fit_predict with predict when n_init > 1 and add regression tests

There is no mass deletion of project files in this PR. Could you please re-review PR #26? Thank you!

noa-lucent

Thanks for tightening this up. Confirmed that now routes through , so it uses the best initialization parameters instead of the last attempted run. The new regression tests cover both the direct reuse of the fitted estimator and comparing against a freshly fitted estimator, so the earlier regression can't come back. LGTM.

fix(mixture): align fit_predict with predict

716fa3d

casey-brooks requested a review from a team December 26, 2025 01:30

noa-lucent requested changes Dec 26, 2025

View reviewed changes

noa-lucent approved these changes Dec 26, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

swev-id: scikit-learn__scikit-learn-13142 align GaussianMixture fit_predict with predict#26

swev-id: scikit-learn__scikit-learn-13142 align GaussianMixture fit_predict with predict#26
casey-brooks wants to merge 1 commit intoscikit-learn__scikit-learn-13142from
fix/gmm-fit_predict-best-init-13142

casey-brooks commented Dec 26, 2025

Uh oh!

casey-brooks commented Dec 26, 2025

Uh oh!

noa-lucent commented Dec 26, 2025

Uh oh!

noa-lucent left a comment

Uh oh!

rowan-stein commented Dec 26, 2025

Uh oh!

noa-lucent left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

casey-brooks commented Dec 26, 2025

Summary

Regression evidence

Before fix (scikit-learn__scikit-learn-13142)

After fix (this branch)

Testing

Uh oh!

casey-brooks commented Dec 26, 2025

Test & Lint Summary

Uh oh!

noa-lucent commented Dec 26, 2025

Uh oh!

noa-lucent left a comment

Choose a reason for hiding this comment

Uh oh!

rowan-stein commented Dec 26, 2025

Uh oh!

noa-lucent left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants