bin_and_custom_split only returns split test data #4

Open
henrymj opened this issue Jun 22, 2022 · 0 comments
henrymj commented Jun 22, 2022

X_train_all, X_test_all, y_train, y_test = train_test_split(xdata_train_binned,ydata_train_binned,stratify=ydata_train_binned)

The values returned on this line are overwritten by the next line, so this split of the training data is discarded. The call also doesn't pass test_size, so you get no control over the train proportion.

I'm concerned this means that if, for example, you thought you trained on color and tested on orientation (as the docstring describes), you actually trained and tested on orientation only.
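A minimal sketch of the failure mode, on synthetic data. The second call is an assumption about what the overwriting line looks like (the issue does not quote it), and the arrays are hypothetical stand-ins: every "train domain" feature is 0 and every "test domain" feature is 1, so it is easy to see which domain the final splits came from.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the binned data: the train domain (e.g. color)
# is all zeros, the test domain (e.g. orientation) is all ones.
xdata_train_binned = np.zeros((40, 3))
ydata_train_binned = np.repeat([0, 1], 20)
xdata_test_binned = np.ones((40, 3))
ydata_test_binned = np.repeat([0, 1], 20)

# The line quoted in the issue: splits the train-domain data.
X_train_all, X_test_all, y_train, y_test = train_test_split(
    xdata_train_binned, ydata_train_binned, stratify=ydata_train_binned)

# Assumed form of the following line: same target names, so the results
# above are silently replaced by a split of the test-domain data.
X_train_all, X_test_all, y_train, y_test = train_test_split(
    xdata_test_binned, ydata_test_binned, stratify=ydata_test_binned)

# Both splits now come from the test domain only.
train_from_test_domain = bool((X_train_all == 1).all())
test_from_test_domain = bool((X_test_all == 1).all())

# With no test_size given, scikit-learn falls back to its default of 0.25.
default_test_fraction = len(X_test_all) / len(xdata_test_binned)
```

Under these assumptions, no train-domain sample survives into X_train_all, and the held-out fraction is the scikit-learn default rather than anything the caller chose.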

This is what we've temporarily changed it to in our local copy:

X_train_all, _, y_train, _ = train_test_split(xdata_train_binned,ydata_train_binned,stratify=ydata_train_binned,test_size=test_size)
_, X_test_all, _, y_test = train_test_split(xdata_test_binned,ydata_test_binned,stratify=ydata_test_binned,test_size=test_size)
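For anyone who wants to check the workaround in isolation, here is a sketch that wraps the two calls in a helper and runs it on synthetic data. The function name `split_across_domains`, its `random_state` parameter, and the toy arrays are all hypothetical and not part of the repository; only the two `train_test_split` calls mirror the lines above.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_across_domains(x_train_dom, y_train_dom, x_test_dom, y_test_dom,
                         test_size=0.25, random_state=None):
    """Hypothetical helper: train split from one domain, test split from another."""
    # Keep only the training portion of the train-domain data.
    X_train, _, y_train, _ = train_test_split(
        x_train_dom, y_train_dom, stratify=y_train_dom,
        test_size=test_size, random_state=random_state)
    # Keep only the held-out portion of the test-domain data.
    _, X_test, _, y_test = train_test_split(
        x_test_dom, y_test_dom, stratify=y_test_dom,
        test_size=test_size, random_state=random_state)
    return X_train, X_test, y_train, y_test


# Toy data: train domain is all zeros, test domain is all ones.
x_color = np.zeros((40, 3))
y_color = np.repeat([0, 1], 20)
x_orient = np.ones((40, 3))
y_orient = np.repeat([0, 1], 20)

X_train, X_test, y_train, y_test = split_across_domains(
    x_color, y_color, x_orient, y_orient, test_size=0.25, random_state=0)

# Each split should contain samples from its own domain only.
train_is_color_only = bool((X_train == 0).all())
test_is_orientation_only = bool((X_test == 1).all())
```

Note that this version discards the unused portion of each domain (the held-out part of the train domain and the training part of the test domain), so with test_size=0.25 only 30 of 40 train-domain samples and 10 of 40 test-domain samples are used. Whether that is the intended behavior is a question for the maintainers.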