Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update plr.py : classifier prediction method option added to ml_l #283

Closed
wants to merge 5 commits into from

Conversation

PragyanTiwari
Copy link

Description

Added the classification condition to the ml_l model. Hence, it will set the prediction (predict or predict_proba) method accordingly.

Reference to Issues or PRs

PR for the issue: #236

PR Checklist

Please fill out this PR checklist (see our contributing guidelines for details).

  • The title of the pull request summarizes the changes made.
  • The PR contains a detailed description of all changes and additions.
  • References to related issues or PRs are added.
  • The code passes all (unit) tests.
  • Enhancements or new feature are equipped with unit tests.
  • The changes adhere to the PEP8 standards.

Copy link
Member

@SvenKlaassen SvenKlaassen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it also be possible to add a test e.g. test_plr_binary to test the setting with binary outcomes (this might require adjustment of the manual function fit_plr_single_split

def fit_plr_single_split(y, x, d, learner_l, learner_m, learner_g, smpls, score,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PragyanTiwari Thanks.
I would suggest slight changes.
The learner type is checked via the _check_learner() method. To allow for both options i would suggest to define

ml_l_is_classifier = self._check_learner(ml_l, 'ml_l', regressor=True, classifier=True)

This could then be used directly in the _predict_method.

Further it would be great if there would be a UserWarning if ml_l is a classifer to alert the user that this model treats the probability as additive.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @SvenKlaassen,

If this is the case, then both ml_l_is_classifier and ml_m_is_classifier would likely behave similarly when setting the _predict_method.

Since ml_l_is_classifier also requires the same if-else logic as ml_m_is_classifier (

if ml_m_is_classifier:
if self._dml_data.binary_treats.all():
self._predict_method['ml_m'] = 'predict_proba'
else:
raise ValueError(f'The ml_m learner {str(ml_m)} was identified as classifier '
'but at least one treatment variable is not binary with values 0 and 1.')
else:
self._predict_method['ml_m'] = 'predict'
),

Which option do you prefer??

  1. Define a separate function that takes the model as parameter and sets its _predict_method accordingly for that particular model i.e. ml_l or ml_m.

  2. Or, Simply replicate the same conditions for ml_l_is_classifier.

Which approach do you think is more maintainable in the long run?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe approach 2 would be slightly prefered as the conditions logic is not completely the same.
Since only one outcome is possible in the package ml_l does not need to check if all treatments are binary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, indeed we need a test for the ml_l (classifier) to check the binary outcomes; hence test_plr_binary becomes essential. Right now, I've made some basic changes, please look into that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That looks fine.
I think a check on the binary outcome should look similar to this

if obj_dml_data.binary_outcome:
self._predict_method = {'ml_g': 'predict_proba', 'ml_m': 'predict_proba'}
else:
raise ValueError(f'The ml_g learner {str(ml_g)} was identified as classifier '
'but the outcome variable is not binary with values 0 and 1.')

Further, the tests should at least check the Userwarning and the exception if the outcome is not binary.
I will not have time to look into changes immediately.

@PragyanTiwari PragyanTiwari reopened this Jan 11, 2025
Base condition written for ml_l (classifier).
@PragyanTiwari
Copy link
Author

If you want to proceed with this pull request, then please reply. I would be happy to reopen this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants