ENH, FIX i) build_oob_forest backwards compatibility with sklearn and ii) HonestForest stratification during bootstrap #283
Signed-off-by: Adam Li <adam2392@gmail.com>
Interestingly, this is not an issue with RandomForestClassifier, so I suspect it is related to the empty leaves, or to the fact that we use a separate dataset to estimate the posteriors.
Codecov Report
Coverage diff, main vs. #283:

Metric     main     #283     +/-
Coverage   76.79%   78.55%   +1.75%
Files      25       24       -1
Lines      2267     2252     -15
Branches   409      414      +5
Hits       1741     1769     +28
Misses     402      352      -50
Partials   124      131      +7
Changes proposed in this pull request:
- build_oob_forest will work with any sklearn Forest that has estimators_samples_ (in-bag sample indices).
- Stratification should occur every time we sample the dataset, whether it's subsampling or bootstrapping.
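To illustrate why in-bag sample indices are all that is needed, here is a minimal, dependency-free sketch of how out-of-bag (OOB) indices can be derived from them. The helper name `oob_indices` and the toy index lists are hypothetical; this is not the implementation of build_oob_forest, only the underlying idea, assuming each tree exposes the indices it was fit on (as estimators_samples_ does).

```python
def oob_indices(n_samples, in_bag):
    """Return the sorted indices in range(n_samples) that never appear in in_bag."""
    seen = set(in_bag)
    return [i for i in range(n_samples) if i not in seen]


# Hypothetical in-bag indices for two trees fit on 6 samples.
# Repeats are expected because bootstrapping samples with replacement.
estimators_samples = [
    [0, 0, 2, 3, 3, 5],
    [1, 1, 2, 4, 4, 4],
]

# Each tree's OOB set is the complement of its in-bag set.
oob_per_tree = [oob_indices(6, in_bag) for in_bag in estimators_samples]
```

Under these toy inputs, the first tree never saw samples 1 and 4, and the second never saw samples 0, 3 and 5, so those are the samples each tree could be evaluated on.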
Summary
On the main branch, using the following test:
we get the error:
However, if we run it on this branch, we get 0.50498046875 [0.484375, 0.53076171875, 0.513671875, 0.46533203125, 0.53076171875], which shows that the stratification fixes the bias.
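As a rough illustration of what stratifying a bootstrap means, the sketch below resamples indices with replacement within each class, so the class proportions of the resample match those of the original labels. The function name `stratified_bootstrap` and the toy labels are assumptions made for this example; it uses only the Python standard library and is not the code from this PR.

```python
import random
from collections import defaultdict


def stratified_bootstrap(y, rng):
    """Draw a bootstrap sample of indices, resampling separately within each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    sample = []
    for label in sorted(by_class):
        members = by_class[label]
        # Sample with replacement, keeping the per-class count unchanged.
        sample.extend(rng.choices(members, k=len(members)))
    return sample


rng = random.Random(0)
y = [0] * 8 + [1] * 2  # imbalanced toy labels
indices = stratified_bootstrap(y, rng)

# Class counts in the resample equal the class counts in y.
counts = {label: sum(y[i] == label for i in indices) for label in set(y)}
```

A plain bootstrap over all indices can, by chance, draw few or no samples from the minority class, whereas here each class keeps exactly its original count in every resample.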