0.2.45

haghish · May 29, 2024 · 5c6d05c
1 parent 29c4b2c
Showing 1 changed file with 28 additions and 26 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -8,35 +8,37 @@ Authors@R:
            email = "haghish@uio.no")
 Depends: 
     R (>= 3.5.0)
-Description: This R package introduces an innovative method for calculating SHapley Additive 
-    exPlanations (SHAP) values for a grid of fine-tuned base-learner machine learning 
-    models as well as stacked ensembles, a method not previously available due to the 
+Description: This R package introduces Weighted Mean SHapley Additive exPlanations (WMSHAP),
+    an innovative method for calculating SHAP values for a grid of fine-tuned base-learner machine  
+    learning models as well as stacked ensembles, a method not previously available due to the 
     common reliance on single best-performing models. By integrating the weighted mean 
     SHAP values from individual base-learners comprising the ensemble or individual 
     base-learners in a tuning grid search, the package weights SHAP contributions 
-    according to each model's performance, assessed by the Area Under the Precision-Recall
-    Curve (AUCPR) for binary classifiers (currently implemented). It further extends this 
-    framework to implement weighted confidence intervals for weighted mean SHAP values, 
-    offering a more comprehensive and robust feature importance evaluation over a grid of 
-    machine learning models, instead of solely computing SHAP values for the best model. 
-    This methodology is particularly beneficial for addressing the severe class imbalance 
-    (class rarity) problem by providing a transparent, generalized measure of feature 
-    importance that mitigates the risk of reporting SHAP values for an overfitted or 
-    biased model and maintains robustness under severe class imbalance, where there is no 
-    universal criteria of identifying the absolute best model. Furthermore, the package 
-    implements hypothesis testing to ascertain the statistical significance of SHAP values 
-    for individual features, as well as comparative significance testing of SHAP 
-    contributions between features. Additionally, it tackles a critical gap in feature 
-    selection literature by presenting criteria for the automatic feature selection of the 
-    most important features across a grid of models or stacked ensembles, eliminating the 
-    need for arbitrary determination of the number of top features to be extracted. This 
-    utility is invaluable for researchers analyzing feature significance, particularly 
-    within severely imbalanced outcomes where conventional methods fall short. Moreover, 
-    it is also expected to report democratic feature importance across a grid of models, 
-    resulting in a more comprehensive and generalizable feature selection. The package 
-    further implements a novel method for visualizing SHAP values both at subject level 
-    and feature level as well as a plot for feature selection based on the weighted mean 
-    SHAP ratios.
+    according to each model's performance, assessed by R squared 
+    (for both regression and classification models). Alternatively, the package 
+    can weight SHAP values by the area under the precision-recall curve (AUCPR), 
+    the area under the curve (AUC), or the F2 measure for binary classifiers. 
+    It further extends this framework to implement weighted confidence intervals for 
+    weighted mean SHAP values, offering a more comprehensive and robust feature importance 
+    evaluation over a grid of machine learning models, instead of solely computing SHAP 
+    values for the best model. This methodology is particularly beneficial for addressing 
+    the severe class imbalance (class rarity) problem by providing a transparent, 
+    generalized measure of feature importance that mitigates the risk of reporting 
+    SHAP values for an overfitted or biased model and maintains robustness under severe 
+    class imbalance, where there is no universal criterion for identifying the absolute 
+    best model. Furthermore, the package implements hypothesis testing to ascertain the 
+    statistical significance of SHAP values for individual features, as well as 
+    comparative significance testing of SHAP contributions between features. Additionally, 
+    it tackles a critical gap in feature selection literature by presenting criteria for 
+    the automatic feature selection of the most important features across a grid of models 
+    or stacked ensembles, eliminating the need for arbitrary determination of the number 
+    of top features to be extracted. This utility is invaluable for researchers analyzing 
+    feature significance, particularly within severely imbalanced outcomes where 
+    conventional methods fall short. Moreover, it is also expected to report democratic 
+    feature importance across a grid of models, resulting in a more comprehensive and 
+    generalizable feature selection. The package further implements a novel method for 
+    visualizing SHAP values both at subject level and feature level as well as a plot 
+    for feature selection based on the weighted mean SHAP ratios.
 License: MIT + file LICENSE
 Encoding: UTF-8
 Imports: