The goal of splendid is to provide a supervised learning pipeline that
implements the major components of a multiclass classification workflow.
We guide the user through fitting a classifier, obtaining predictions, and
ultimately evaluating performance using metrics and visualizations.
You can install splendid from GitHub with:
# install.packages("devtools")
devtools::install_github("AlineTalhouk/splendid")
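As a lighter alternative (not part of the package's own instructions, but the call is equivalent), the remotes package provides the same installer:
# install.packages("remotes")
remotes::install_github("AlineTalhouk/splendid")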
The following example shows how to use the main function of the package,
splendid(). The data matrix hgsc contains a subset of gene expression
measurements of High Grade Serous Ovarian Carcinoma patients from the
publicly available Cancer Genome Atlas (TCGA) datasets, with samples as
rows and features as columns. The example below runs the pipeline through
the splendid() function. First we extract the reference class labels
(assigned by TCGA) from the "class.true" attribute of hgsc. Then we fit
the random forest and extreme gradient boosting classifiers to one
bootstrap replicate of the data.
library(splendid)
data(hgsc)
class <- attr(hgsc, "class.true")  # reference class labels assigned by TCGA
sl_result <- splendid(data = hgsc, class = class, n = 1,
                      algorithms = c("rf", "xgboost"), seed_boot = 5)
str(sl_result, max.level = 2)
#> List of 8
#> $ models :List of 2
#> ..$ rf :List of 1
#> ..$ xgboost:List of 1
#> $ preds :List of 2
#> ..$ rf :List of 1
#> ..$ xgboost:List of 1
#> $ evals :List of 2
#> ..$ rf :'data.frame': 48 obs. of 1 variable:
#> .. ..- attr(*, "err_632plus")= num 0.433
#> ..$ xgboost:'data.frame': 48 obs. of 1 variable:
#> .. ..- attr(*, "err_632plus")= num 0.858
#> $ bests : Named chr [1:2] "rf" "rf"
#> ..- attr(*, "names")= chr [1:2] "1" "X1"
#> $ ensemble_algs: chr "rf"
#> $ ensemble_mods:List of 1
#> ..$ :List of 18
#> .. ..- attr(*, "class")= chr "randomForest"
#> $ seq_mods : NULL
#> $ seq_preds : NULL
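The element names used below are taken from the str() output above; this is a minimal sketch of inspecting the returned object, assuming no structure beyond what is shown there:
# .632+ bootstrap error estimate attached to each algorithm's evaluation
sapply(sl_result$evals, attr, "err_632plus")

# best-performing algorithm selected for each bootstrap replicate
sl_result$bests

# algorithms retained for the ensemble and their refitted models
sl_result$ensemble_algs
sl_result$ensemble_mods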