Influence of tuning to CART #1

Open
Gu-Youngfeng opened this issue Jul 9, 2018 · 1 comment

@Gu-Youngfeng
Owner

In Guo et al. (ASE '13), the CART parameters minbucket and minsplit are set as follows:

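import numpy as np
from sklearn.tree import DecisionTreeRegressor

# train_set is assumed to hold the training sample, one row per measured configuration
S = len(train_set)
if S <= 100:
    minbucket = np.floor(S / 10.0 + 0.5)
    minsplit = 2 * minbucket
else:
    minsplit = np.floor(S / 10.0 + 0.5)
    minbucket = np.floor(minsplit / 2)

# enforce the lower bounds on both parameters
if minbucket < 2:
    minbucket = 2
if minsplit < 4:
    minsplit = 4

# scikit-learn interprets float values as fractions of the sample count, so cast to int
minbucket = int(minbucket)
minsplit = int(minsplit)

cart_model = DecisionTreeRegressor(min_samples_split=minsplit, min_samples_leaf=minbucket)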
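
To see what the rule above produces in practice, here is a minimal sketch that wraps it in a helper and evaluates it for a few training-set sizes. The name `guo_cart_params` is hypothetical (it does not come from either paper); the arithmetic simply mirrors the snippet in this comment.

```python
import math


def guo_cart_params(n_samples):
    """Return (minsplit, minbucket) for a training set of n_samples rows.

    Hypothetical helper; it mirrors the rule quoted in this issue.
    """
    if n_samples <= 100:
        minbucket = math.floor(n_samples / 10.0 + 0.5)
        minsplit = 2 * minbucket
    else:
        minsplit = math.floor(n_samples / 10.0 + 0.5)
        minbucket = math.floor(minsplit / 2.0)

    # Lower bounds from the original snippet
    minbucket = max(minbucket, 2)
    minsplit = max(minsplit, 4)
    return int(minsplit), int(minbucket)


for n in (10, 50, 100, 200, 1000):
    print(n, guo_cart_params(n))
# 10 (4, 2)
# 50 (10, 5)
# 100 (20, 10)
# 200 (20, 10)
# 1000 (100, 50)
```

For comparison, scikit-learn's defaults are min_samples_split=2 and min_samples_leaf=1, so even with 50 training samples the tuned tree must keep at least 5 samples per leaf, while the default tree may grow until each leaf holds a single sample.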

But Nair et al. (ASE '17) simply drop these parameters and use the defaults:

cart_model = DecisionTreeRegressor()

So is there a big difference between the tuned CART and the non-tuned CART?

Gu-Youngfeng added the question label (Further information is requested) on Jul 9, 2018
Gu-Youngfeng self-assigned this on Jul 9, 2018
@Gu-Youngfeng
Owner Author

In the performance prediction experiment, the non-tuned CART seems to perform better than the tuned CART in terms of MMRE (mean magnitude of relative error): the tuned CART yields a larger MMRE.
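
For reference, a minimal sketch of MMRE under its usual definition, the mean of |actual - predicted| / |actual|. The function name `mmre` is an assumption, and the thread does not say whether this repository reports the value as a fraction or as a percentage.

```python
import numpy as np


def mmre(y_true, y_pred):
    """Mean magnitude of relative error, returned as a fraction (not a percentage)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Relative error is undefined when an actual value is 0; inputs are assumed non-zero
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))


# Relative errors are 0.1, 0.1 and 0.0, so the mean is about 0.067
print(mmre([100, 200, 400], [110, 180, 400]))
```

A lower MMRE means predictions are closer to the measured values on average.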

Comparing the two kinds of CART across all datasets in /data/ might be a good way to settle this.
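
A rough sketch of what such a comparison could look like for a single dataset. Everything here is an assumption made for illustration: the helper names (`guo_cart_params`, `mmre`, `compare_cart`), the train/test split, and the synthetic data, which stands in for a file from /data/ because the format of those files is not described in this thread.

```python
import math

import numpy as np
from sklearn.tree import DecisionTreeRegressor


def guo_cart_params(n_samples):
    """(minsplit, minbucket) following the rule quoted earlier in this issue."""
    if n_samples <= 100:
        minbucket = math.floor(n_samples / 10.0 + 0.5)
        minsplit = 2 * minbucket
    else:
        minsplit = math.floor(n_samples / 10.0 + 0.5)
        minbucket = math.floor(minsplit / 2.0)
    return int(max(minsplit, 4)), int(max(minbucket, 2))


def mmre(y_true, y_pred):
    """Mean magnitude of relative error (as a fraction)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))


def compare_cart(X_train, y_train, X_test, y_test, seed=0):
    """Fit a tuned and a default CART on the same split; return MMRE per variant."""
    minsplit, minbucket = guo_cart_params(len(X_train))
    models = {
        "tuned": DecisionTreeRegressor(
            min_samples_split=minsplit,
            min_samples_leaf=minbucket,
            random_state=seed,
        ),
        "default": DecisionTreeRegressor(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = mmre(y_test, model.predict(X_test))
    return results


# Synthetic stand-in for one dataset: 8 binary options, strictly positive performance
rng = np.random.RandomState(0)
X = rng.randint(0, 2, size=(300, 8)).astype(float)
y = 100.0 + X.dot(rng.uniform(5, 50, size=8))

# Train on the first 50 rows, evaluate on the rest
print(compare_cart(X[:50], y[:50], X[50:], y[50:]))
```

Repeating this over every dataset, and over several random training samples per dataset, would show whether the gap between the two variants is consistent or depends on the sample size.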
