-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update plr.py : classifier prediction method option added to ml_l #283
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it also be possible to add a test e.g. test_plr_binary
to test the setting with binary outcomes (this might require adjustment of the manual function fit_plr_single_split
def fit_plr_single_split(y, x, d, learner_l, learner_m, learner_g, smpls, score, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PragyanTiwari Thanks.
I would suggest slight changes.
The learner type is checked via the _check_learner()
method. To allow for both options i would suggest to define
ml_l_is_classifier = self._check_learner(ml_l, 'ml_l', regressor=True, classifier=True)
This could then be used directly in the _predict_method
.
Further it would be great if there would be a UserWarning if ml_l
is a classifer to alert the user that this model treats the probability as additive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @SvenKlaassen,
If this is the case, then both ml_l_is_classifier
and ml_m_is_classifier
would likely behave similarly when setting the _predict_method
.
Since ml_l_is_classifier
also requires the same if-else logic as ml_m_is_classifier
(
doubleml-for-py/doubleml/plm/plr.py
Lines 131 to 138 in 9d8f50f
if ml_m_is_classifier: | |
if self._dml_data.binary_treats.all(): | |
self._predict_method['ml_m'] = 'predict_proba' | |
else: | |
raise ValueError(f'The ml_m learner {str(ml_m)} was identified as classifier ' | |
'but at least one treatment variable is not binary with values 0 and 1.') | |
else: | |
self._predict_method['ml_m'] = 'predict' |
Which option do you prefer??
-
Define a separate function that takes the model as parameter and sets its
_predict_method
accordingly for that particular model i.e.ml_l
orml_m
. -
Or, Simply replicate the same conditions for
ml_l_is_classifier
.
Which approach do you think is more maintainable in the long run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe approach 2 would be slightly prefered as the conditions logic is not completely the same.
Since only one outcome is possible in the package ml_l
does not need to check if all treatments are binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, indeed we need a test for the ml_l
(classifier) to check the binary outcomes; hence test_plr_binary
becomes essential. Right now, I've made some basic changes, please look into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. That looks fine.
I think a check on the binary outcome should look similar to this
doubleml-for-py/doubleml/irm/irm.py
Lines 147 to 151 in 92b057d
if obj_dml_data.binary_outcome: | |
self._predict_method = {'ml_g': 'predict_proba', 'ml_m': 'predict_proba'} | |
else: | |
raise ValueError(f'The ml_g learner {str(ml_g)} was identified as classifier ' | |
'but the outcome variable is not binary with values 0 and 1.') |
Further, the tests should at least check the Userwarning and the exception if the outcome is not binary.
I will not have time to look into changes immediately.
Base condition written for ml_l (classifier).
If you want to proceed with this pull request, then please reply. I would be happy to reopen this pull request. |
Description
Added the classification condition to the
ml_l
model. Hence, it will set the prediction (predict or predict_proba) method accordingly.Reference to Issues or PRs
PR for the issue: #236
PR Checklist
Please fill out this PR checklist (see our contributing guidelines for details).