Releases: oegedijk/explainerdashboard

v0.3.0: reducing memory footprint

27 Jan 13:49
f58767a

Version 0.3.0:

This is a major release and comes with lots of breaking changes to the lower-level
ClassifierExplainer and RegressionExplainer API. The higher-level ExplainerComponent and ExplainerDashboard API has not
changed, however, except for the deprecation of the cats and hide_cats parameters.

Explainers generated with version explainerdashboard <= 0.2.20.1 will not work
with this version! So if you have stored explainers to disk you either have to
rebuild them with this new version, or downgrade back to explainerdashboard==0.2.20.1!
(hope you pinned your dependencies in production! ;)
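
A stored explainer could be guarded against this kind of mismatch before loading. The following is a minimal sketch, not part of the library: `parse_version` and `can_load` are hypothetical helpers, and treating the reverse case (a 0.3.0 explainer under an older install) as incompatible is an assumption.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '0.2.20.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# First release with the new explainer layout, per these release notes.
FIRST_INCOMPATIBLE = parse_version("0.3.0")


def can_load(stored_with: str, installed: str) -> bool:
    """Return True if both versions sit on the same side of the 0.3.0 boundary.

    Assumption: explainers are only interchangeable within the same side.
    """
    stored_is_old = parse_version(stored_with) < FIRST_INCOMPATIBLE
    installed_is_old = parse_version(installed) < FIRST_INCOMPATIBLE
    return stored_is_old == installed_is_old


print(can_load("0.2.20.1", "0.3.0"))     # old explainer, new install
print(can_load("0.2.20.1", "0.2.20.1"))  # old explainer, pinned install
```

In practice the installed version could be read with `importlib.metadata.version("explainerdashboard")`, and the version used to build the explainer would have to be recorded next to the stored file.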

Main motivation for these breaking changes was to improve memory usage of the
dashboards, especially in production. This led to the deprecation of the
dual cats grouped/not grouped functionality of the dashboard. Once I had committed
to that breaking change, I decided to clean up the entire API and do all the
needed breaking changes at once.

Breaking Changes

  • onehot encoded features (passed with the cats parameter) are now merged by default. This means that the cats=True
    parameter has been removed from all explainer methods, and the group cats
    toggle has been removed from all ExplainerComponents. This saves both
    on code complexity and memory usage. If you wish to see the individual
    contributions of onehot encoded columns, simply don't pass them to the
    cats parameter upon construction.

  • Deprecated explainer attributes:

    • BaseExplainer:
      • shap_values_cats
      • shap_interaction_values_cats
      • permutation_importances_cats
      • get_dfs()
      • formatted_contrib_df()
      • to_sql()
      • check_cats()
      • equivalent_col
    • ClassifierExplainer:
      • get_prop_for_label
  • Naming changes to attributes:

    • BaseExplainer:
      • importances_df() -> get_importances_df()
      • feature_permutations_df() -> get_feature_permutations_df()
      • get_int_idx(index) -> get_idx(index)
      • contrib_df() -> get_contrib_df() *
      • contrib_summary_df() -> self.get_summary_contrib_df() *
      • interaction_df() -> get_interactions_df() *
      • shap_values -> get_shap_values_df
      • plot_shap_contributions() -> plot_contributions()
      • plot_shap_summary() -> plot_importances_detailed()
      • plot_shap_dependence() -> plot_dependence()
      • plot_shap_interaction() -> plot_interaction()
      • plot_shap_interaction_summary() -> plot_interactions_detailed()
      • plot_interactions() -> plot_interactions_importance()
      • n_features() -> n_features
      • shap_top_interaction() -> top_shap_interactions
      • shap_interaction_values_by_col() -> shap_interactions_values_for_col()
    • ClassifierExplainer:
      • self.pred_probas -> self.pred_probas()
      • precision_df() -> get_precision_df() *
      • lift_curve_df() -> get_liftcurve_df() *
    • RandomForestExplainer/XGBExplainer:
      • decision_trees -> shadow_trees
      • decisiontree_df() -> get_decisionpath_df()
      • decisiontree_summary_df() -> get_decisionpath_summary_df()
      • decision_path_file() -> decisiontree_file()
      • decision_path() -> decisiontree()
      • decision_path_encoded() -> decisiontree_encoded()
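
To find call sites that still use the old names, a small scan over your own source can help. This is a sketch only: `RENAMES` covers just a few of the renames listed above, and `find_old_calls` is a hypothetical helper, not something shipped with explainerdashboard.

```python
import re

# A subset of the renames listed above (old method name -> new method name).
RENAMES = {
    "importances_df": "get_importances_df",
    "contrib_df": "get_contrib_df",
    "plot_shap_contributions": "plot_contributions",
    "plot_shap_summary": "plot_importances_detailed",
    "plot_shap_dependence": "plot_dependence",
    "decisiontree_df": "get_decisionpath_df",
}

# Match ".old_name(" so that ".get_importances_df(" is not flagged.
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")\(")


def find_old_calls(source: str):
    """Yield (line_number, old_name, new_name) for each old-style method call."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            yield lineno, old, RENAMES[old]


example = (
    "df = explainer.importances_df()\n"
    "explainer.plot_shap_summary()\n"
    "ok = explainer.get_contrib_df(0)\n"
)
for lineno, old, new in find_old_calls(example):
    print(f"line {lineno}: {old}() -> {new}()")
```

The scan is purely textual, so it only reports candidates; renames that turn a method into a property (or the other way round) still need a manual look.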

New Features

  • new Explainer parameter precision: defaults to 'float64'. Can be set to
    'float32' to save on memory usage: ClassifierExplainer(model, X, y, precision='float32')
  • new memory_usage() method to show which internal attributes take the most memory.
  • for multiclass classifiers: keep_shap_pos_label_only(pos_label) method:
    • drops shap values and shap interactions for all labels except pos_label
    • this should significantly reduce memory usage for multi class classification
      models.
    • not needed for binary classifiers.
  • added get_index_list(), get_X_row(index), and get_y(index) methods.
    • these can be overridden with .set_index_list_func(), .set_X_row_func()
      and .set_y_func().
    • by overriding these functions you can for example sample observations
      from a database or other external storage instead of from X_test, y_test.
  • added Popout buttons to all the major graphs that open a large modal
    showing just the graph. This makes it easier to focus on a particular
    graph without distraction from the rest of the dashboard and all its toggles.
  • added max_cat_colors parameter to plot_importances_detailed, plot_dependence and plot_interactions_detailed
    • prevents plotting from getting slow for categorical features with many categories.
    • defaults to 5
    • can be set as **kwarg to ExplainerDashboard
  • adds category limits and sorting to RegressionVsCol component
  • adds property X_merged that gives a dataframe with the onehot columns merged.
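
As an illustration of the overridable get_index_list(), get_X_row(index) and get_y(index) hooks mentioned above, here is a rough sketch of lookup functions backed by a database. The table layout, column names and function names are invented for the example, and the commented hookup at the end assumes the setters accept plain callables and that a single-row DataFrame is expected for X.

```python
import sqlite3

# Hypothetical storage: one row per observation, keyed by a string index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations "
    "(idx TEXT PRIMARY KEY, age REAL, fare REAL, survived INTEGER)"
)
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?, ?)",
    [("p1", 22.0, 7.25, 0), ("p2", 38.0, 71.28, 1)],
)

FEATURES = ["age", "fare"]


def index_list():
    """Return every available index, sorted."""
    rows = conn.execute("SELECT idx FROM observations ORDER BY idx")
    return [row[0] for row in rows]


def X_row(index):
    """Return the feature values for one index as a dict."""
    row = conn.execute(
        "SELECT age, fare FROM observations WHERE idx = ?", (index,)
    ).fetchone()
    if row is None:
        raise KeyError(index)
    return dict(zip(FEATURES, row))


def y_value(index):
    """Return the label for one index."""
    row = conn.execute(
        "SELECT survived FROM observations WHERE idx = ?", (index,)
    ).fetchone()
    if row is None:
        raise KeyError(index)
    return row[0]


# Assumed hookup (not executed here; needs an explainer and pandas as pd):
# explainer.set_index_list_func(index_list)
# explainer.set_X_row_func(lambda index: pd.DataFrame([X_row(index)]))
# explainer.set_y_func(y_value)
```

Keeping the lookups as small standalone functions makes them easy to test without a running dashboard.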

Bug Fixes

  • shap dependence: when no point cloud, do not highlight!
  • Fixed bug with calculating contributions plot/table for whatif component,
    when InputFeatures had not fully loaded, resulting in shap error.

Improvements

  • saving X.copy(), instead of using a reference to X
    • this would result in more memory usage in development
      though, so you can del X_test to save memory.
  • ClassifierExplainer only stores shap (interaction) values for the positive
    class: shap values for the negative class are generated on the fly
    by multiplying with -1.
  • encoding onehot columns as np.int8 to save memory
  • encoding categorical features as pandas category dtype to save memory
  • added base TreeExplainer class that RandomForestExplainer and XGBExplainer both derive from
    • will make it easier to extend tree explainers to other models in the future
      • e.g. catboost and lightgbm
  • got rid of the callable properties (that were there to ensure backward compatibility),
    and replaced them with regular methods.
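
The idea of storing only the positive-class shap values and deriving the negative class on the fly can be sketched as follows. `BinaryShapStore` is a hypothetical stand-in using plain lists, not the explainer's internal data structure; it assumes a binary classifier whose two class contributions mirror each other.

```python
class BinaryShapStore:
    """Keeps shap values for the positive class only."""

    def __init__(self, pos_shap_values, pos_label=1):
        self._pos = pos_shap_values  # rows x features, positive class
        self._pos_label = pos_label

    def get(self, label):
        """Return shap values for `label`, negating for the other class."""
        if label == self._pos_label:
            return self._pos
        # Negative class: same magnitudes, opposite sign, computed on demand.
        return [[-value for value in row] for row in self._pos]


store = BinaryShapStore([[0.2, -0.1], [-0.05, 0.3]])
print(store.get(1))
print(store.get(0))
```

Only one copy of the values is held in memory; the negated copy exists just for the duration of the call that asks for it.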

v0.2.20.1: backward compatibility fix

14 Jan 10:25
e1e6254

0.2.20.1:

Bug Fixes

  • fixes bug so that a single list of logins for an ExplainerDashboard is
    accepted when it is passed on to ExplainerHub
  • fixes bug with explainers generated with explainerdashboard <= 0.2.19
    that did not have a .onehot_cols property

v0.2.20: supporting categorical features

12 Jan 14:18
0bc863c

0.2.20:

Breaking Changes

  • WhatIfComponent deprecated. Use WhatIfComposite or connect components
    yourself to a FeatureInputComponent
  • renaming properties:
    explainer.cats -> explainer.onehot_cols
    explainer.cats_dict -> explainer.onehot_dict

New Features

  • Adds support for models with categorical features (e.g. CatBoost)
  • Adds filter on number of categories to display in violin plots and pdp plot,
    and how to sort the categories (alphabetical, by frequency or by mean abs shap)

Bug Fixes

  • fixes bug where str tab indicators returned e.g. the old ImportancesTab instead of ImportancesComposite

Improvements

  • No longer depends on PDPbox: built own partial dependence
    functions with categorical feature support
  • autodetects xgboost.core.Booster or lightgbm.Booster and raises a ValueError
    asking you to use the sklearn compatible wrappers instead.
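
For readers unfamiliar with partial dependence on a categorical feature, the computation can be sketched in a few lines. This is a generic illustration with a toy model and invented feature names, not the functions that replaced PDPbox in the library.

```python
def partial_dependence_categorical(predict, rows, feature, categories):
    """Average prediction when `feature` is forced to each category in turn."""
    result = {}
    for category in categories:
        # Overwrite the feature for every row, keep all other values as-is.
        modified = [{**row, feature: category} for row in rows]
        predictions = [predict(row) for row in modified]
        result[category] = sum(predictions) / len(predictions)
    return result


def toy_predict(row):
    """Stand-in model: a per-category offset plus the age value."""
    offsets = {"S": 1.0, "C": 2.0}
    return offsets[row["embarked"]] + row["age"]


rows = [{"age": 1.0, "embarked": "S"}, {"age": 3.0, "embarked": "C"}]
pd_values = partial_dependence_categorical(toy_predict, rows, "embarked", ["S", "C"])
print(pd_values)
```

Unlike a numeric feature, no grid of values is needed: each category is simply substituted for the whole sample and the predictions are averaged.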

Other Changes

  • Introduces list of categorical columns: explainer.categorical_cols
  • Introduces dictionary with categorical columns categories: explainer.categorical_dict
  • Introduces list of all categorical features: explainer.cat_cols

Bugfix: support custom dashboard components that don't take name or kwargs

09 Jan 19:23

Bug fix:

  • custom ExplainerComponents that do not have name or **kwargs parameters in the __init__ are no longer broken.

v0.2.19: ExplainerHub improvements (NavBar!)

07 Jan 11:31

0.2.19

Breaking Changes

  • ExplainerHub: parameter user_json is now called users_file (and defaults to a users.yaml file)
  • Renamed a bunch of ExplainerHub private methods:
    • _validate_user_json -> _validate_users_file
    • _add_user_to_json -> _add_user_to_file
    • _add_user_to_dashboard_json -> _add_user_to_dashboard_file
    • _delete_user_from_json -> _delete_user_from_file
    • _delete_user_from_dashboard_json -> _delete_user_from_dashboard_file

New Features

  • Added NavBar to ExplainerHub
  • Made users.yaml the default file for storing users and hashed passwords
    for ExplainerHub for easier manual editing.
  • Added option min_height to ExplainerHub to set the size of the iFrame
    containing the dashboard.
  • Added option fluid=True to ExplainerHub to stretch bootstrap container
    to width of the browser.
  • added parameter bootstrap to ExplainerHub to override default bootstrap theme.
  • added option dbs_open_by_default=True to ExplainerHub so that no login
    is required for dashboards for which there wasn't a specific list
    of users declared through db_users. So only dashboards for which users
    have been defined are password protected.
  • Added option no_index to ExplainerHub so that no flask route is created
    for index "/", so that you can add your own custom index. The dashboards
    are still loaded on their respective routes, so you can link to them
    or embed them in iframes, etc.
  • Added a "wizard" perfect prediction to the lift curve.
    • hide it with hide_wizard=True, or default to not showing it with wizard=False.
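
What a "wizard" line on a lift curve represents can be sketched as follows: a perfect model ranks every positive first, so the cumulative number of positives found rises as fast as possible. Both functions are hypothetical illustrations of the concept, not the code used by the dashboard.

```python
def wizard_curve(y_true):
    """Cumulative positives found by a perfect ranking, for the top 0..n rows."""
    n_pos = sum(y_true)
    return [min(k, n_pos) for k in range(len(y_true) + 1)]


def model_curve(y_true, scores):
    """Cumulative positives found when rows are sorted by model score."""
    order = sorted(range(len(y_true)), key=lambda i: scores[i], reverse=True)
    found, curve = 0, [0]
    for i in order:
        found += y_true[i]
        curve.append(found)
    return curve


y = [0, 1, 1, 0, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.7]
print(wizard_curve(y))
print(model_curve(y, scores))
```

The gap between the two curves shows how far the model is from a perfect ranking at each sample size.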

Bug Fixes

  • ExplainerHub.from_config() now works with non-cwd paths
  • ExplainerHub.to_yaml("subdirectory/hub.yaml") now correctly stores
    the users.yaml file in the correct subdirectory when specified.

Improvements

  • added a "powered by: explainerdashboard" footer. Hide it with hide_poweredby=True.
  • added option "None" to shap dependence color col. Also removes the point cloud
    from the violin plots for categorical features.
  • added option mode to ExplainerDashboard.run() that can override self.mode.

v0.2.18.2: fix bug with ExplainerHub and logins=None

29 Dec 21:21

v0.2.18: ExplainerHub user management + CLI

24 Dec 19:51

v0.2.18.1

New Features

  • ExplainerHub now does user management through Flask-Login and a user.json file
  • Can now set specific access policies for specific explainer with db_users parameter
  • adds an explainerhub cli to start explainerhubs and do user management from the command-line

v0.2.17.3: fixes version bump

23 Dec 10:39
2da34a0

v0.2.17.2: sklearn v0.24 RandomForestRegressor bugfix

23 Dec 10:35
459c650

v0.2.17: Introducing ExplainerHub

20 Dec 13:12

0.2.17:

New Features

  • Introducing ExplainerHub: combine multiple dashboards together behind a single frontend with convenient url paths.
    • code example:
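    from explainerdashboard import ExplainerDashboard, ExplainerHub

    # explainer and explainer2 are explainer objects that were built earlier
    db1 = ExplainerDashboard(explainer, title="Dashboard One", name='db1')
    db2 = ExplainerDashboard(explainer2, title="Dashboard Two", name='project_alpha', description="New proposed model")
    
    hub = ExplainerHub([db1, db2])
    hub.run()  # blocks until the server is stopped
    
    # store and recover from config:
    hub.to_yaml("hub.yaml")
    hub2 = ExplainerHub.from_config("hub.yaml")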
  • adds option dump_explainer to ExplainerDashboard.to_yaml() to automatically
    dump the explainer along with the .yaml.
  • adds option use_waitress to ExplainerDashboard.run() and ExplainerHub.run(), to use the waitress python webserver instead of the Flask development server
  • adds parameters to ExplainerDashboard:
    • name: this will be used to assign a url for ExplainerHub (otherwise defaults to dashboard1, dashboard2, etc.)
    • description: this will be used for the title tooltip in the dashboard
      and in the ExplainerHub frontend.

Improvements

  • the cli now uses the waitress server by default.