I get the following error when trying to use the autosplit feature, regardless of whether I split on a numerical or categorical variable.
2019-12-02 20:08:06,514 ERROR Traceback (most recent call last):
  File "<string>", line 182, in auto_split
  File "/home/dataiku/dss/plugins/installed/decision-tree-builder/python-lib/dku_idtb_decision_tree/autosplit.py", line 24, in autosplit
    return compute_splits(df[[feature]], df[target], max_splits)
  File "/home/dataiku/dss/plugins/installed/decision-tree-builder/python-lib/dku_idtb_decision_tree/autosplit.py", line 58, in compute_splits
    tree_estimator.fit(feature_df, target_col)
  File "/home/dataiku/dataiku-dss-5.1.2/python.packages/sklearn/tree/tree.py", line 801, in fit
    X_idx_sorted=X_idx_sorted)
  File "/home/dataiku/dataiku-dss-5.1.2/python.packages/sklearn/tree/tree.py", line 140, in fit
    check_classification_targets(y)
  File "/home/dataiku/dataiku-dss-5.1.2/python.packages/sklearn/utils/multiclass.py", line 171, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'
Could there be a problem in my data causing this? Alternatively, I wonder if the autosplit module should use DecisionTreeRegressor rather than DecisionTreeClassifier.
This issue is due to your target feature containing floats. Autosplit would work with any discrete values (booleans, integers or strings).
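To check a target column outside the plugin, something like the following sketch may help. It is not the plugin's code: it calls scikit-learn's `type_of_target` directly on small made-up arrays (`X` and the sample values are assumptions for illustration) and then reproduces the error with a bare `DecisionTreeClassifier`. The exact wording of the error message can differ between scikit-learn versions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.multiclass import type_of_target

# Hypothetical single-feature data, only for illustration.
X = np.array([[1.0], [2.0], [3.0], [4.0]])

# How scikit-learn classifies a few kinds of target values.
target_types = {
    "non-integer floats": type_of_target(np.array([0.5, 1.5, 0.5, 1.5])),
    "integers": type_of_target(np.array([0, 1, 0, 1])),
    "strings": type_of_target(np.array(["a", "b", "a", "b"])),
}
for kind, inferred in target_types.items():
    print(kind, "->", inferred)

# Fitting a classifier on the float target raises the ValueError from the traceback.
try:
    DecisionTreeClassifier().fit(X, np.array([0.5, 1.5, 0.5, 1.5]))
    fit_error = None
except ValueError as exc:
    fit_error = str(exc)
print(fit_error)
```

If the target comes back as `continuous`, the classifier will refuse it, whichever feature is chosen for the split.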
If you are doing a multiclass classification with floats as classes, a workaround would be to convert your classes (for instance, add a string prefix in a Prepare recipe) so that they would be handled properly by scikit-learn.
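For reference, here is a rough pandas equivalent of that conversion, as a sketch only. In DSS the step would be done in a Prepare recipe; the column names `feature`, `target` and `target_label` and the prefix `class_` are made up for this example, and the classifier is plain scikit-learn rather than the plugin's autosplit code.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical dataset whose classes happen to be stored as floats.
df = pd.DataFrame({
    "feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "target": [0.5, 0.5, 1.5, 1.5, 2.5, 2.5],
})

# Turn each float class into a string label, e.g. 0.5 -> "class_0.5".
df["target_label"] = "class_" + df["target"].astype(str)

# With string labels the classifier treats the target as discrete classes.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(df[["feature"]], df["target_label"].tolist())

# Thresholds of the internal (non-leaf) nodes, i.e. the split points found.
internal_nodes = tree.tree_.feature >= 0
split_thresholds = sorted(tree.tree_.threshold[internal_nodes].tolist())
print(split_thresholds)
```

The prefix itself does not matter; any change that makes the values non-numeric strings should have the same effect.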
Please note that for now, we do not support regression but only classification (binary or multiclass); thus we use DecisionTreeClassifier.