name | topic | maintainer | email | version |
---|---|---|---|---|
MixedModels | Mixed, Multilevel, and Hierarchical Models in R | Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer | bolker@mcmaster.ca | 2024-05-08 |
Contributors: Maintainers plus Michael Agronah, Matthew Fidler, Thierry Onkelinx
Mixed (or mixed-effect) models are a broad class of statistical models used to analyze data where observations can be assigned a priori to discrete groups, and where the parameters describing the differences between groups are treated as random (or latent) variables. They are one category of multilevel, or hierarchical models; longitudinal data are often analyzed in this framework. In econometrics, longitudinal or cross-sectional time series data are often referred to as panel data and are sometimes fitted with mixed models. Mixed models can be fitted in either frequentist or Bayesian frameworks.
This task view only includes models that incorporate continuous (usually although not always Gaussian) latent variables. This excludes packages that handle hidden Markov models, latent Markov models, and finite (discrete) mixture models (some of these are covered by the r view("Cluster") task view). Dynamic linear models and other state-space models that do not incorporate a discrete grouping variable are also excluded (some of these are covered by the r view("TimeSeries") task view). Bioinformatic applications of mixed models hosted on Bioconductor are mostly excluded as well.
Linear mixed models (LMMs) make the following assumptions:
- The expected values of the responses are linear combinations of the fixed predictor variables and the random effects.
- The conditional distribution of the responses is Gaussian (equivalently, the errors are Gaussian).
- The random effects are normally distributed.
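As a rough illustration of these three assumptions, the following base-R sketch simulates data from a random-intercept LMM. All names and parameter values (`n_groups`, `sd_group`, `sim_data`, and so on) are hypothetical choices made for this example, not taken from any package.

```r
set.seed(101)

n_groups <- 20          # number of groups (hypothetical)
n_per    <- 10          # observations per group (hypothetical)
beta     <- c(2, 0.5)   # fixed-effect intercept and slope (hypothetical)
sd_group <- 1.0         # SD of the random intercepts
sd_resid <- 0.5         # SD of the residual errors

group <- factor(rep(seq_len(n_groups), each = n_per))
x     <- runif(n_groups * n_per)

# Random effects: one normally distributed intercept deviation per group
b <- rnorm(n_groups, mean = 0, sd = sd_group)

# Expected value: a linear combination of fixed predictors and random effects
mu <- beta[1] + beta[2] * x + b[as.integer(group)]

# Conditional distribution of the response: Gaussian around mu
y <- rnorm(length(mu), mean = mu, sd = sd_resid)

sim_data <- data.frame(y = y, x = x, group = group)
str(sim_data)
```

A data frame of this shape (one response, one predictor, one grouping factor) is what most of the fitting functions listed below expect.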
Frequentist:
The most commonly used packages and/or functions for frequentist LMMs are:
- r pkg("nlme", priority = "core"): nlme::lme() provides REML or ML estimation. Allows multiple nested random effects, and provides structures for modeling heteroscedastic and/or correlated errors. Wald estimates of parameter uncertainty.
- r pkg("lme4", priority = "core"): lme4::lmer() provides REML or ML estimation. Allows multiple nested or crossed random effects, can compute profile confidence intervals and conduct parametric bootstrapping.
- r pkg("mbest"): fits large nested LMMs using a fast moment-based approach.
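A minimal sketch of a frequentist LMM fit, assuming the `Orthodont` data set that ships with nlme. The object names (`fit_reml`, `fit_ml`, `fit_het`) are illustrative; the model formula is one reasonable choice for these data, not a recommended analysis.

```r
library(nlme)  # recommended package, distributed with R

# Random intercept for each Subject; REML is the default estimation method
fit_reml <- lme(distance ~ age + Sex, random = ~ 1 | Subject, data = Orthodont)

# The same model refitted by maximum likelihood
fit_ml <- update(fit_reml, method = "ML")

# Heteroscedastic errors: a separate residual variance for each Sex
fit_het <- update(fit_reml, weights = varIdent(form = ~ 1 | Sex))

# Approximate (Wald-type) confidence intervals for the fixed effects
intervals(fit_reml, which = "fixed")

# Compare the homoscedastic and heteroscedastic REML fits
anova(fit_reml, fit_het)
```

REML fits are generally compared only when they share the same fixed-effect structure, as the two models in the `anova()` call do; comparisons across different fixed effects would use the ML fit instead.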
Bayesian:
Most Bayesian R packages use Markov chain Monte Carlo (MCMC) estimation: r pkg("MCMCglmm", priority = "core"), r pkg("rstanarm"), and r pkg("brms", priority = "core"); the latter two packages use the Stan infrastructure. r pkg("blme"), built on r pkg("lme4", priority = "core"), uses maximum a posteriori (MAP) estimation. r pkg("bamlss") provides a flexible set of modular functions for Bayesian regression modeling.
Generalized linear mixed models (GLMMs) can be described as hierarchical extensions of generalized linear models (GLMs), or as extensions of LMMs to different response distributions, typically in the exponential family. The random-effect distributions are typically assumed to be Gaussian on the scale of the linear predictor.
Frequentist:
- r pkg("MASS"): MASS::glmmPQL() fits via penalized quasi-likelihood.
- r pkg("lme4", priority = "core"): lme4::glmer() uses Laplace approximation and adaptive Gauss-Hermite quadrature; fits negative binomial as well as exponential-family models.
- r pkg("glmmTMB", priority = "core") uses Laplace approximation; allows some correlation structures; fits some non-exponential families (Beta, COM-Poisson, etc.) and zero-inflated/hurdle models.
- r pkg("GLMMadaptive") uses adaptive Gauss-Hermite quadrature; fits exponential family, negative binomial, beta, zero-inflated/hurdle/censored Gaussian models, user-specified log-densities.
- r pkg("hglm") fits hierarchical GLMs using $h$-likelihood (sensu Nelder, Lee and Pawitan (2017)).
- r pkg("glmm") fits GLMMs using Monte Carlo likelihood approximation.
- r pkg("glmmEP") fits probit mixed models for binary data by expectation propagation.
- r pkg("mbest") fits large nested GLMMs using a fast moment-based approach.
- r pkg("galamm") fits a wide variety of models (heteroscedastic, mixed response types, factor loadings, etc.).
- r pkg("glmmrBase") uses MCMC and Laplace approximations to fit models with Gaussian, binomial, Poisson, Beta, and Gamma responses and flexible correlation structures.
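As one concrete instance from the list above, here is a sketch of a binomial GLMM fitted by penalized quasi-likelihood, assuming the `bacteria` data set that ships with MASS. The model formula follows the example in the `glmmPQL()` help page; the object name `fit_pql` is illustrative.

```r
library(MASS)  # recommended package; glmmPQL() relies on nlme::lme() internally

# Presence/absence of bacteria (y) by treatment and period,
# with a Gaussian random intercept for each child (ID) on the logit scale
fit_pql <- glmmPQL(y ~ trt + I(week > 2),
                   random = ~ 1 | ID,
                   family = binomial,
                   data = bacteria,
                   verbose = FALSE)

# Fixed-effect estimates on the scale of the linear predictor (log-odds)
nlme::fixef(fit_pql)

# Estimated random-intercept and residual standard deviations
nlme::VarCorr(fit_pql)
```

PQL is fast but can be biased for binary data with few observations per group, which is one reason the Laplace and quadrature-based packages above are often preferred.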
Bayesian:
Most Bayesian mixed model packages use some form of Markov chain Monte Carlo (or other Monte Carlo methods).
- r pkg("MCMCglmm", priority = "core"): Gibbs sampling. Exponential family, multinomial, ordinal, zero-inflated/altered/hurdle, censored, multimembership, multi-response models. Pedigree (animal/kinship/phylogenetic) models.
- r pkg("rstanarm"): Hamiltonian Monte Carlo (based on Stan); designed for lme4 compatibility.
- r pkg("brms", priority = "core"): Hamiltonian Monte Carlo. Linear, robust linear, count data, survival, response times, ordinal, zero-inflated/hurdle/censored data.
- r pkg("bamlss"): optimization and derivative-based Metropolis-Hastings/slice sampling. Wide range of distributions and link functions.
The following packages (in addition to r pkg("bamlss")) find maximum a posteriori fits to Bayesian (G)LMMs by optimization:
- r pkg("blme") wraps r pkg("lme4", priority = "core") to add prior distributions.
- INLA uses integrated nested Laplace approximation to fit GLMMs using a wide range of latent models (especially for spatial estimation), priors, and distributions.
- r pkg("inlabru") facilitates spatial modeling using integrated nested Laplace approximation via the R-INLA package. Additionally, extends the GAM-like model class to more general nonlinear predictor expressions and implements a log-Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data.
- r github("inbo/inlatools") provides tools to set sensible priors and check the dispersion and distribution of INLA models.
- r pkg("vglmer") estimates GLMMs by variational Bayesian methods.
Nonlinear mixed models incorporate arbitrary nonlinear responses that cannot be accommodated in the framework of GLMMs. Only a few packages can accommodate generalized nonlinear mixed models (i.e., parametric nonlinear mixed models with non-Gaussian responses). However, many packages allow smooth nonparametric components (see "Additive models" below). Otherwise, users may need to implement GNLMMs themselves in a more general hierarchical modeling framework.
Frequentist:
- nlme::nlme() from r pkg("nlme") and lme4::nlmer() from r pkg("lme4", priority = "core") fit nonlinear mixed effects models by maximum likelihood.
- nlmixr2::nlmixr2() from r pkg("nlmixr2") fits nonlinear mixed effects models by first order conditional estimation (focei) maximum likelihood approximation (a different approximation than nlme::nlme() and lme4::nlmer()), and allows generalized likelihood as well as a selection of built-in link functions.
- gnlmm() and gnlmm3() from r pkg("repeated") fit GNLMMs by Gauss-Hermite integration.
- r pkg("saemix") and r pkg("nlmixr2") both use a stochastic approximation of the EM algorithm to fit a wide range of GNLMMs.
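A short sketch of a nonlinear mixed model with Gaussian errors, reproducing the asymptotic growth example from the `nlme()` help page on the `Loblolly` data set shipped with R. The starting values come from that help page; the object name `fit_nlme` is illustrative.

```r
library(nlme)

# Asymptotic regression of tree height on age. Loblolly is a groupedData
# object, so the grouping factor (Seed) is taken from its built-in formula.
fit_nlme <- nlme(height ~ SSasymp(age, Asym, R0, lrc),
                 data = Loblolly,
                 fixed = Asym + R0 + lrc ~ 1,   # population-level parameters
                 random = Asym ~ 1,             # asymptote varies among seed sources
                 start = c(Asym = 103, R0 = -8.5, lrc = -3.3))

fixef(fit_nlme)     # estimated population-level parameters
VarCorr(fit_nlme)   # among-group SD of Asym and residual SD
```

Here only the asymptote is given a random effect; letting all three parameters vary by group is possible in principle but tends to be harder to fit.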
Bayesian:
r pkg("brms") supports GNLMMs.
Generalized estimating equations (GEEs) are an alternative approach to fitting clustered, longitudinal, or otherwise correlated data. These models produce estimates of the marginal effects (averaged across the group-level variation) rather than conditional effects (conditioned on group-level information).
- r pkg("geepack", priority = "core"), r pkg("gee"), and r pkg("geeM") are standard GEE solvers, providing GEE estimation of the parameters in mean structures with possible correlation between the outcomes.
- r pkg("geesmv"): GEE estimator using the original sandwich variance estimator proposed by Liang and Zeger (1986), and eight types of variance estimators for improving small-sample performance.
- r pkg("multgee") is a GEE solver for correlated nominal or ordinal multinomial responses.
- r pkg("glmtoolbox") handles a wide variety of model types (GLMs, beta-binomial and negative binomial, zero-inflation and zero-alteration, mixed models) via GEEs.
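The distinction between marginal and conditional effects can be made concrete with a small numerical sketch in base R. It assumes a logistic model with a Gaussian random intercept and hypothetical parameter values (`beta0`, `beta1`, `sigma`); it does not fit a GEE, it only averages the conditional probabilities over the random-effect distribution.

```r
beta0 <- 0   # conditional intercept (hypothetical)
beta1 <- 1   # conditional log-odds ratio for a unit change in x (hypothetical)
sigma <- 2   # SD of the Gaussian random intercept (hypothetical)

# Marginal P(y = 1 | x): the conditional probability averaged over
# the random intercept b ~ N(0, sigma^2), by numerical integration
marginal_prob <- function(x) {
  integrate(function(b) plogis(beta0 + beta1 * x + b) * dnorm(b, sd = sigma),
            lower = -Inf, upper = Inf)$value
}

p0 <- marginal_prob(0)
p1 <- marginal_prob(1)

# Log-odds ratio on the marginal (population-averaged) scale
marginal_slope <- qlogis(p1) - qlogis(p0)

c(conditional = beta1, marginal = marginal_slope)
```

Under these assumptions the marginal log-odds ratio is smaller in magnitude than the conditional one, which is why GEE and GLMM coefficients for the same data are not directly comparable when the link function is nonlinear.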
- [Additive models]{#additive-models} (models incorporating smooth functional components such as regression splines or Gaussian processes; also known as semiparametric models): r pkg("gamm4"), r pkg("mgcv"), r pkg("brms", priority = "core"), r pkg("lmeSplines"), r pkg("bamlss"), r pkg("gamlss"), r github("Biometris/LMMsolver"), r pkg("R2BayesX"), r pkg("GLMMRR"), r pkg("glmmTMB", priority = "core"), r pkg("galamm").
- Big data/distributed computation: r pkg("lmmpar"), r pkg("mbest"). See also MixedModels.jl (Julia), diamond (Python).
- Bioinformatics/quantitative genetics: r pkg("MCMC.qpcr"), r pkg("QGglmm"), r pkg("CpGassoc") (methylation studies).
- Censored data (response data known only up to lower/upper bounds): r pkg("brms", priority = "core") and r pkg("nlmixr2") (general), r pkg("ARpLMEC") (censored Gaussian, autoregressive errors). Censored Gaussian (Tobit) responses: r pkg("GLMMadaptive"), r pkg("MCMCglmm", priority = "core"), r pkg("gamlss").
- Denominator degree-of-freedom computation: Satterthwaite and/or Kenward-Roger corrections are computed by r pkg("lmerTest"), r pkg("pbkrtest"), r pkg("glmmrBase").
- [Differential equations]{#differential-equations} (fitting DEs with group-structured parameters; this category overlaps considerably with pharmacokinetic modeling): r pkg("mixedsde") for stochastic DEs. Ordinary DEs can be run with r pkg("nlmixr2") using the "focei" or "saem" (EM) methods, or using the r pkg("nlme") package; see also the r view("DifferentialEquations") and r view("Pharmacokinetics") task views.
- Doubly hierarchical GLMs: r pkg("dhglm"), r pkg("mdhglm") (multivariate).
- Factor analytic, latent variable, and structural equation models: r pkg("lavaan", priority = "core"), r pkg("nlmm"), r pkg("sem"), r pkg("piecewiseSEM"), r pkg("semtree"), and r pkg("blavaan"); see also the r view("Psychometrics") task view.
- Flexible correlation structures: r pkg("brms"), r pkg("glmmTMB"), r pkg("sommer"), r pkg("glmmrBase"), r pkg("regress").
- Kinship-augmented models (responses where individuals have a known family relationship): r pkg("pedigreemm"), r pkg("coxme"), r pkg("kinship2"), r github("Biometris/LMMsolver"), r pkg("MCMCglmm", priority = "core"), r pkg("sommer", priority = "core"), r pkg("rrBLUP"), r pkg("BGLR"), r github("perpdgo/lme4GS"), r github("variani/lme4qtl"), r github("cheuerde/cpgen"), r pkg("QTLRel").
- Location-scale models: r pkg("nlme", priority = "core"), r pkg("glmmTMB", priority = "core"), r pkg("brms", priority = "core"), r pkg("mgcv") [with family chosen from one of the *ls/*lss options] all allow modeling of the dispersion/scale component.
- Missing values: r pkg("mice"), r pkg("CRTgeeDR"), r pkg("JointAI"), r pkg("mdmb"), r pkg("pan"); see also the r view("MissingData") task view.
- [Multiple membership models]{#multimembership-models}: (Bayesian) r pkg("MCMCglmm", priority = "core"), r pkg("brms", priority = "core"), r github("benrosche/rmm"); (frequentist) r github("jvparidon/lmerMultiMember") (can also fit the Bradley-Terry model).
- Multinomial responses: r pkg("bamlss"), r pkg("R2BayesX"), r pkg("MCMCglmm", priority = "core"), r pkg("mgcv"), r pkg("mclogit").
- Multivariate responses/multi-trait analysis (multiple dependent variables; the response variables may or may not be constrained to be from the same family): r pkg("MCMCglmm", priority = "core"), r github("deruncie/MegaLMM"), r pkg("brms"), r pkg("sommer"), INLA. Many mixed-effect packages allow fitting of (homogeneous) multivariate responses by "melting" the data (converting to long format) and treating each observation in the original data as a cluster.
- Non-Gaussian random effects: r pkg("brms", priority = "core"), r pkg("repeated"), r pkg("spaMM").
- Ordinal-valued responses (responses measured on an ordinal scale): r pkg("ordinal"), r pkg("GLMMadaptive"), r pkg("multgee") (frequentist); r pkg("MCMCglmm"), r pkg("brms") (Bayesian); r pkg("cplm") (both).
- Over-dispersed models: r pkg("aod"), r pkg("aods3").
- Panel data: in econometrics, panel data typically refers to subjects (individuals or firms) that are sampled repeatedly over time. The theoretical and computational approaches used by econometricians overlap with mixed models (e.g., see here). The r pkg("plm") package can fit mixed-effects panel models; see also the r view("Econometrics") task view.
- Quantile regression: r pkg("lqmm"), r pkg("qrLMM"), r pkg("qrNLMM").
- Phylogenetic models: r pkg("pez"), r pkg("phyr"), r pkg("MCMCglmm", priority = "core"), r pkg("brms", priority = "core").
- Repeated measures (packages with specialized covariance structures for handling repeated measures): r pkg("nlme", priority = "core"), r pkg("mmrm"), r pkg("glmmTMB", priority = "core"), r github("Biometris/LMMsolver"), r pkg("repeated").
- Regularized/penalized models (regularization or variable selection by ridge, lasso, or elastic net penalties): r pkg("splmm") fits LMMs for high-dimensional data by imposing penalties on both the fixed effects and random effects for variable selection. r pkg("glmmLasso") fits GLMMs with L1-penalized (LASSO) fixed effects. r pkg("bamlss") implements LASSO-like penalization for generalized additive models.
- Robust/heavy-tailed estimation (downweighting the importance of extreme observations): r pkg("robustlmm"), r pkg("robustBLME") (Bayesian robust LME), r pkg("CRTgeeDR") for the doubly robust inverse probability weighted augmented GEE estimator. Some packages (r pkg("brms", priority = "core"), r pkg("bamlss"), r pkg("GLMMadaptive"), r pkg("glmmTMB"), r pkg("mgcv") with family = "scat", r pkg("nlmixr2")) allow heavy-tailed response distributions such as Student-$t$.
- Skewed data/response transformation: r pkg("skewlmm") fits a scale mixture of skew-normal linear mixed models using expectation-maximization (EM). r pkg("nlmixr2") can fit skewed data by dynamically transforming both sides with the boxCox() or yeoJohnson() transformations, using maximum likelihood or the EM method "saem". r pkg("bcmixed") fits Box-Cox-transformed LMMs and provides inferences for differences between treatment levels. r pkg("boxcoxmix") fits Box-Cox-transformed LMMs and logistic mixed models.
- Spatial models: r pkg("nlme", priority = "core") (with corStruct functions), r pkg("CARBayesST"), r pkg("sphet"), r pkg("spind"), r pkg("spaMM"), r pkg("glmmfields"), r pkg("glmmTMB"), r pkg("inlabru") (spatial point processes via log-Gaussian Cox processes), r pkg("brms", priority = "core"), r github("Biometris/LMMsolver"), r pkg("bamlss"); see also the r view("Spatial") and r view("SpatioTemporal") CRAN task views.
- Sports analytics: r pkg("mvglmmRank"), multivariate generalized linear mixed models for ranking sports teams.
- Survival analysis: r pkg("coxme").
- Tree-based models: r pkg("glmertree"), r pkg("semtree"), r pkg("gpboost").
- Weighted models: r pkg("WeMix") (linear and logit models with weights at multiple levels).
- Zero-inflated models: (frequentist) r pkg("glmmTMB"), r pkg("cplm"), r pkg("mgcv") (zi Poisson only), r pkg("GLMMadaptive"); (Bayesian) r pkg("MCMCglmm", priority = "core"), r pkg("brms", priority = "core"), r pkg("bamlss").
- Zero-one inflated Beta regression: r pkg("brms"), r pkg("zoib"), r pkg("glmmTMB") (zero-inflated only). Ordered beta regression is an alternative framework to address the same type of data: r pkg("ordbetareg"), r pkg("glmmTMB").
These packages do not directly provide functions to fit mixed models, but instead implement interfaces to general-purpose sampling and optimization toolboxes that can be used to fit mixed models. While models require extra effort to set up, and often require programming in a domain-specific language other than R, these frameworks are more flexible than most of the other packages listed here.
- Interfaces to JAGS/OpenBUGS: r pkg("R2jags"), r pkg("rjags"), r pkg("R2OpenBUGS") (BUGS language).
- Interfaces to Stan (C++ extensions): r pkg("rstan"), r github("stan-dev/cmdstanr"), r github("rmcelreath/rethinking") (ulam() function).
- Other frameworks: r pkg("TMB") (automatic differentiation and Laplace approximation via C++ extensions), r pkg("RTMB") (simplified R interface to TMB), r pkg("tmbstan"), r pkg("nimble"), r pkg("greta") (R interface to TensorFlow).
- general: r pkg("HLMdiag") (diagnostic tools for hierarchical (multilevel) linear models), r pkg("rockchalk"), r pkg("performance"), r pkg("multilevelTools"), r pkg("merTools") (for models fitted using lme4), r pkg("ggResidpanel"), r pkg("mlmtools"), r pkg("DHARMa").
- influential data points: r pkg("influence.ME"), r pkg("influence.SEM").
- residuals: r pkg("DHARMa").
- Correlations: r pkg("iccbeta") (intraclass correlation), r pkg("rptR") (repeatabilities).
- $R^2$ calculations: r pkg("r2glmm") ($R^2$ and partial $R^2$), r pkg("MuMIn") (r.squaredGLMM() function), r pkg("partR2"), r pkg("performance") (r2() function), r pkg("rr2"), r pkg("mlmtools"), r pkg("mlmhelpr"). (Note that there are many different methods for computing $R^2$ values for (G)LMMs: see e.g. Nakagawa, Johnson and Schielzeth (2017), Jaeger et al. (2017).) Many of these packages also compute intra-class correlations.
- Information criteria: r pkg("cAIC4") (conditional AIC), r pkg("blmeco") (WAIC).
- Robust variance-covariance estimates: r pkg("clubSandwich"), r pkg("merDeriv"), r pkg("mlmhelpr"), r pkg("glmmrBase").
The first and second derivatives of log-likelihood with respect to parameters can be useful for various model evaluation tasks (e.g., computing sensitivities, robust variance-covariance matrices, or delta-method variances).
r pkg("lmeInfo"), r pkg("merDeriv").
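To illustrate one of the tasks mentioned above, the sketch below computes a delta-method standard error for a nonlinear function of the fixed effects. It is an assumption-laden example: it uses the `Orthodont` data from nlme, starts from the covariance matrix returned by `vcov()` rather than from explicit log-likelihood derivatives, and the helper names (`g`, `num_grad`) and the chosen ratio are hypothetical.

```r
library(nlme)

fit  <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)
beta <- unname(fixef(fit))   # (intercept, slope for age)
V    <- vcov(fit)            # estimated covariance matrix of the fixed effects

# Quantity of interest (hypothetical): ratio of the predicted mean
# distance at age 14 to the predicted mean distance at age 8
g <- function(b) (b[1] + 14 * b[2]) / (b[1] + 8 * b[2])

# Gradient of g with respect to the parameters, by central differences
num_grad <- function(f, b, h = 1e-6) {
  vapply(seq_along(b), function(j) {
    e <- numeric(length(b))
    e[j] <- h
    (f(b + e) - f(b - e)) / (2 * h)
  }, numeric(1))
}

grad <- num_grad(g, beta)

# Delta method: Var[g(beta)] is approximately grad' V grad
se_g <- sqrt(drop(t(grad) %*% V %*% grad))

c(estimate = g(beta), se = se_g)
```

The packages listed above supply analytical derivatives, which avoid the finite-difference step and extend the same idea to variance parameters.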
Many packages include small example data sets (e.g., r pkg("lme4", priority = "core"), r pkg("nlme", priority = "core")). These packages provide previously described data sets often used in evaluating mixed models.
- r pkg("mlmRev"): examples from the Multilevel Software Comparative Reviews.
- r pkg("SASmixed"): data sets from SAS System for Mixed Models.
- r pkg("StroupGLMM"): R scripts and data sets for Generalized Linear Mixed Models.
- r pkg("blmeco"): Data and functions accompanying Bayesian Data Analysis in Ecology using R, BUGS and Stan.
- r pkg("nlmeU"): Data sets, functions and scripts described in Linear Mixed-Effects Models: A Step-by-Step Approach.
- r pkg("VetResearchLMM"): R scripts and data sets for Linear Mixed Models. An Introduction with applications in Veterinary Research.
- r pkg("languageR"): R scripts and data sets for Analyzing Linguistic Data: A practical introduction to statistics using R.
- r pkg("nlmixr2data"): includes the data sets for testing r pkg("nlmixr2") against commercial competitors like 'NONMEM' and 'Monolix'.
Functions and frameworks for convenient tabular and graphical output of mixed model results:
- Tables: r pkg("huxtable"), r pkg("broom.mixed", priority = "core"), r pkg("rockchalk"), r pkg("parameters"), r pkg("modelsummary").
- Figures/visualization: r pkg("dotwhisker"), r pkg("sjPlot"), r pkg("rockchalk"), r pkg("mlmtools").
These functions provide convenient frameworks to fit and interpret mixed models.
- Model fitting: r pkg("multilevelmod", priority = "core"), r pkg("ez"), r pkg("mixlm"), r pkg("afex"), and r pkg("nimble").
- Model summaries: r pkg("broom.mixed", priority = "core"), r pkg("insight").
- Variable selection & model averaging: r pkg("LMERConvenienceFunctions"), r pkg("MuMIn"), r pkg("glmulti") (see, e.g., maintainer's blog or here for use with mixed models), r pkg("mlmhelpr").
- Centering/scaling predictors at the population or group level: r pkg("mlmhelpr"), r pkg("mlmtools"), arm::standardize().
- Fixed effects: r pkg("car"), r pkg("lmerTest"), r pkg("RVAideMemoire"), r pkg("emmeans"), r pkg("afex"), r pkg("pbkrtest"), r pkg("CLME").
- Random effects: r pkg("varTestnlme"), r pkg("RLRsim"), r pkg("mvctm").
r pkg("emmeans"), r pkg("effects"), r pkg("margins"), r pkg("MarginalMediation"), r pkg("marginaleffects"), r pkg("ggeffects").
r pkg("pbkrtest"), r pkg("lme4", priority = "core") (lme4::bootMer() function), r pkg("lmeresampler"), r pkg("boot.pval"), r pkg("mlmhelpr").
These topics are closely related because there are few available analytical methods for computing statistical power for mixed models; power usually needs to be estimated by simulation.
- Power: r pkg("longpower"), r pkg("pass.lme"), r pkg("simr"), r pkg("powerEQTL") (powerLME function), r github("DejanDraschkow/mixedpower").
- Simulation: r pkg("faux"); simulate() in lme4 (for formula arguments), glmmTMB::simulate_new(); r pkg("rxode2"), r pkg("mrgsolve"), r pkg("PKPDsim") (ODE/pharmacokinetic models).
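The idea of estimating power by simulation can be sketched with base R and nlme alone. This is a bare-bones illustration under stated assumptions: a random-intercept design, hypothetical effect size and variance components, and a Wald test at the 0.05 level. The dedicated packages above handle more general designs and report uncertainty in the power estimate.

```r
library(nlme)
set.seed(2024)

# Simulate one data set, fit the LMM, and report whether the slope is
# significant at the 0.05 level. All default values are hypothetical.
simulate_once <- function(n_groups = 15, n_per = 8, slope = 0.2,
                          sd_group = 1, sd_resid = 1) {
  group <- factor(rep(seq_len(n_groups), each = n_per))
  x <- rnorm(n_groups * n_per)
  b <- rnorm(n_groups, sd = sd_group)
  y <- slope * x + b[as.integer(group)] + rnorm(n_groups * n_per, sd = sd_resid)
  dat <- data.frame(y = y, x = x, group = group)

  # Treat a failed fit as a missing result rather than stopping the loop
  fit <- tryCatch(lme(y ~ x, random = ~ 1 | group, data = dat),
                  error = function(e) NULL)
  if (is.null(fit)) return(NA)
  summary(fit)$tTable["x", "p-value"] < 0.05
}

n_sim <- 100
rejected <- replicate(n_sim, simulate_once())

# Estimated power: the proportion of simulations that reject the null
power_est <- mean(rejected, na.rm = TRUE)
power_est
```

With only 100 replicates the estimate is noisy; in practice the number of simulations is increased until the Monte Carlo error is acceptably small.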
r pkg("cAIC4") (cAIC4::stepcAIC), r pkg("buildmer"), r pkg("MuMIn"), r github("timnewbold/StatisticalModels") (GLMERSelect).
- Mplus: r pkg("MplusAutomation").
- ASReml-R: r pkg("asremlPlus").
- r pkg("babelmixr2") allows r pkg("nlmixr2") models to be translated and run in either of the commercial tools Monolix or NONMEM, and then reads the results back in to create a standardized nlmixr2 fit object. This fit object runs the diagnostics in nlmixr2 and compares them to the ones output by the commercial software to "validate" the fit object against the output of the commercial tool. It also interfaces with free tools such as r pkg("PKNCA") to derive initial estimates for PK models automatically from observed pharmacokinetic (PK) data.
- Help: R-SIG-mixed-models mailing list for discussion of mixed-model-related questions, course announcements, etc.
- Help: [r] + [mixed-models] tags on Stack Overflow.
- Help: Cross Validated.
- Other software: Mixed models on Bioconductor
- Other software: ASReml-R (commercial: r pkg("asremlPlus")).
- Other software: assist.
- Other software: INLA.
- Other software: Zelig Project
- Other software: MixWild/MixRegLS for scale-location modeling.
- Other software: MixedModels.jl for mixed models in Julia.
- Other software: Monolix for ODE based mixed models (commercial).
- Other software: NONMEM for ODE based mixed models (commercial).
- Book: Mixed-Effects Models in S and S-PLUS.
- Book: SAS System for Mixed Models.
- Book: Generalized Linear Mixed Models.
- Book: Bayesian Data Analysis in Ecology using R, BUGS and Stan.
- Book: Linear Mixed-Effects Models: A Step-by-Step Approach.
- Book: Mixed Effects Models and Extensions in Ecology with R.
- Online Book: Embrace Uncertainty: Mixed-effects models with Julia.
- Online Book: Generalized Linear Mixed Models with Applications in Agriculture and Biology