gglm, The Grammar of Graphics for Linear Model Diagnostics, is an R package and official ggplot2 extension that creates beautiful diagnostic plots using ggplot2 for a variety of model objects. These diagnostic plots are easy to use and adhere to the Grammar of Graphics. The purpose of this package is to provide a sensible alternative to using the base-R plot() function to produce diagnostic plots for model objects. Currently, gglm supports all model objects that are supported by broom::augment(), broom.mixed::augment(), or ggplot2::fortify(). For example, objects returned by stats::lm(), lme4::lmer(), brms::brm(), and many other common modeling functions will work with gglm. The function gglm::list_model_classes() provides a full list of model classes supported by gglm.
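To get a feel for the per-observation quantities that diagnostic plots rely on, it can help to call broom::augment() directly on a fitted model. The sketch below assumes the broom package is installed and uses the built-in mtcars data. It only illustrates what augment() returns for an lm object and is not a description of gglm's internals.

```r
library(broom)  # assumed to be installed; provides augment()

# Fit a simple linear model on a built-in dataset
fit <- lm(mpg ~ wt, data = mtcars)

# augment() returns one row per observation, with diagnostic columns
# such as .fitted, .resid, .hat, and .cooksd added to the model data
diagnostics <- augment(fit)

names(diagnostics)
head(diagnostics)
```

If augment() works on a model object in this way, that is a reasonable hint that the object is of a kind gglm can handle.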
gglm can be installed from CRAN:
install.packages("gglm")
Or, the development version of gglm can be installed from GitHub:
devtools::install_github("graysonwhite/gglm")
gglm has two main types of functions. First, the gglm() function is used for quickly creating the four main diagnostic plots, and behaves much like plot() does on an lm object. Second, the stat_*() functions produce diagnostic plots by creating ggplot2 layers. These layers make it possible to draw individual diagnostic plots within the ggplot2 framework.
Consider a simple linear model used to predict miles per gallon from weight. We can fit this model with lm(), and then diagnose it easily by using gglm().
library(gglm) # Load gglm
model <- lm(mpg ~ wt, data = mtcars) # Fit the simple linear model
gglm(model) # Plot the four main diagnostic plots
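For comparison, the base-R approach that gglm() is meant to replace looks roughly like the following. This sketch uses only base R. The 2-by-2 layout set with par() is a common convention for viewing the plot.lm() panels together, not something either function requires.

```r
# Fit the same simple linear model
model <- lm(mpg ~ wt, data = mtcars)

# Arrange the plotting device as a 2 x 2 grid and remember the old settings
old_par <- par(mfrow = c(2, 2))

# Draw the default base-R diagnostic plots for an lm object
plot(model)

# Restore the previous graphics settings
par(old_par)
```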
Now, one may be interested in a more complicated model, such as a mixed model with a varying intercept on cyl, fit with the lme4 package. Luckily, gglm accommodates a variety of models and modeling packages, so the diagnostic plots for the mixed model can be created in the same way as they were for the simple linear model.
library(lme4) # Load lme4 to fit the mixed model
mixed_model <- lmer(mpg ~ wt + (1 | cyl), data = mtcars) # Fit the mixed model
gglm(mixed_model) # Plot the four main diagnostic plots.
gglm also provides functionality to stay within the Grammar of Graphics by providing functions that can be used as ggplot2 layers. An example of one of these functions is the stat_fitted_resid() function. With this function, we can take a closer look at just the fitted vs. residual plot from the mixed model fit in Example 1.
ggplot(data = mixed_model) +
  stat_fitted_resid()
After taking a closer look, we may want to clean up the look of the plot for a presentation or a project. This can be done by adding other layers from ggplot2 to the plot. Note that any ggplot2 layers can be added on to any of the stat_*() functions provided by gglm.
ggplot(data = mixed_model) +
  stat_fitted_resid(alpha = 1) +
  theme_bw() + # add a clean theme
  labs(title = "Residual vs fitted values for the mixed model") + # change the title
  theme(plot.title = element_text(hjust = 0.5)) # center the title
Wow! What a beautiful and production-ready diagnostic plot!
gglm() plots the four default diagnostic plots when supplied a model object (this is similar to plot.lm() in the case of an object generated by lm()). Note that gglm() can take many classes of model object as its input, and the possible classes can be seen with list_model_classes().
stat_normal_qq(), stat_fitted_resid(), stat_resid_hist(), stat_scale_location(), stat_cooks_leverage(), stat_cooks_obs(), and stat_resid_leverage() are all ggplot2 layers used to create individual diagnostic plots. To use these, follow Example 2.
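As a further sketch of the same pattern, any of the layers listed above can be swapped in for stat_fitted_resid(). The example below assumes gglm and ggplot2 are installed, refits the simple linear model from Example 1 so that it is self-contained, and calls stat_normal_qq() and stat_resid_hist() with their default arguments only. The object names qq_plot and hist_plot are illustrative.

```r
library(ggplot2)
library(gglm)

# Refit the simple linear model so the example is self-contained
model <- lm(mpg ~ wt, data = mtcars)

# Normal Q-Q plot of the residuals as a single ggplot2 layer
qq_plot <- ggplot(data = model) +
  stat_normal_qq() +
  theme_bw()

# Histogram of the residuals, built the same way
hist_plot <- ggplot(data = model) +
  stat_resid_hist() +
  theme_bw()

qq_plot
hist_plot
```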
list_model_classes() lists the model classes compatible with gglm.
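A quick way to check whether a particular model is covered is to look up its class in that list. The sketch below assumes that list_model_classes() returns a character vector of class names. If the return type differs in your installed version, adjust the lookup accordingly.

```r
library(gglm)

# Retrieve the classes gglm reports as supported
# (assumed here to be a character vector of class names)
supported <- list_model_classes()

# Peek at the first few entries
head(supported)

# Check whether any class of a fitted model appears in the list
model <- lm(mpg ~ wt, data = mtcars)
any(class(model) %in% supported)
```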