TLDR: This PR fixes #250, adds tests with xgb models, and finds a bug/inconsistency in `shap` and `xgboost.sklearn.XGBClassifier` that is not present in `shapiq`.

**Bugfix of #250**
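The fix below revolves around xgboost's log-odds base score. As context, here is a minimal, hedged sketch (plain Python with made-up numbers, not shapiq code) of where that base score enters a binary classifier's prediction:

```python
import math

def sigmoid(x: float) -> float:
    """Map a log-odds margin to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical leaf values (margins, in log-odds) that the trees of a
# binary xgboost classifier would contribute for one instance.
leaf_values = [0.3, -0.1, 0.25]

# Hypothetical model-level base score, also stored in log-odds
# (the attribute the model exposes as `base_score` / `intercept_`).
base_score = -0.5

# Ignoring the base score gives the wrong margin ...
wrong_margin = sum(leaf_values)                 # 0.45
# ... the fix is to add the base score before applying the sigmoid.
correct_margin = base_score + sum(leaf_values)  # -0.05

print(sigmoid(wrong_margin))    # probability without the base score
print(sigmoid(correct_margin))  # probability with the base score
```

In a real `xgboost.sklearn.XGBClassifier` these numbers come from the fitted model; they are made up here purely to show where the base score enters the prediction.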
The bug that the baseline prediction was not properly set stems from the fact that `xgboost` models (note: models, not the individual boosters) carry `model.base_score` and/or `model.intercept_` attributes that store the empty prediction of the xgb model (as log-odds). This base_score/intercept is now added to the values of the xgb model.

**Uncovers a bug in `shap` (not in `shapiq`)**
`test_tree_explainer.test_xgboost_shap_error` contains a test uncovering some inconsistencies with `shap`. The test shows that the `shapiq` implementation is correct while the `shap` implementation is doing something weird: for some instances (e.g. the one used in this test), the SHAP values differ from the shapiq values. However, when we round the `thresholds` of the xgboost trees in shapiq, the computed explanations match. This is strange behavior, since rounding the thresholds makes the model less true to the original model, yet only then do the explanations match.
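Why can rounding the thresholds change the explanations at all? A toy, hypothetical stump (not the actual trees from the test) illustrates that a feature value lying between a stored threshold and its rounded version takes different branches in the two model variants, so the two models genuinely disagree on such instances:

```python
# One hypothetical decision stump from a tree: go left when
# x < threshold; leaf values are log-odds contributions.
def stump_value(x: float, threshold: float) -> float:
    left_leaf, right_leaf = -1.0, 1.0
    return left_leaf if x < threshold else right_leaf

x = 0.5                     # feature value sitting right at the split
stored = 0.5000001          # threshold as stored in the model (hypothetical)
rounded = round(stored, 4)  # 0.5 after rounding

print(stump_value(x, stored))   # 0.5 < 0.5000001 -> left leaf (-1.0)
print(stump_value(x, rounded))  # 0.5 < 0.5 is False -> right leaf (1.0)
```

In other words, the rounded trees are a slightly different model, which is why getting matching explanations only after rounding is suspicious.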