Commit 5981dcd

Keefe-Murphy committed
Minor MoE_stepwise speed-ups by avoiding duplication of initialisation for certain steps involving only the potential addition of a gating covariate.
Minor fix to MoE_stepwise for univariate data without covariates. Prettier axis labels for MoE_Uncertainty plots. Minor CRAN compliance edits to the vignette. \doi edits. CRAN 1.3.3 release.
1 parent abd501d commit 5981dcd

15 files changed, with 84 additions and 45 deletions

DESCRIPTION

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 Package: MoEClust
 Type: Package
-Date: 2020-11-17
+Date: 2020-12-29
 Title: Gaussian Parsimonious Clustering Models with Covariates and a Noise Component
-Version: 1.3.2
+Version: 1.3.3
 Authors@R: c(person("Keefe", "Murphy", email = "keefe.murphy@mu.ie", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-7709-3159")),
 person("Thomas Brendan", "Murphy", email = "brendan.murphy@ucd.ie", role = "ctb", comment = c(ORCID = "0000-0002-5668-7046")))
 Description: Clustering via parsimonious Gaussian Mixtures of Experts using the MoEClust models introduced by Murphy and Murphy (2020) <doi:10.1007/s11634-019-00373-8>. This package fits finite Gaussian mixture models with a formula interface for supplying gating and/or expert network covariates using a range of parsimonious covariance parameterisations from the GPCM family via the EM/CEM algorithm. Visualisation of the results of such models using generalised pairs plots and the inclusion of an additional noise component is also facilitated. A greedy forward stepwise search algorithm is provided for identifying the optimal model in terms of the number of components, the GPCM covariance parameterisation, and the subsets of gating/expert network covariates.

R/Functions.R

Lines changed: 36 additions & 14 deletions
Large diffs are not rendered by default.

R/MoEClust.R

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
#' MoEClust: Gaussian Parsimonious Clustering Models with Covariates and a Noise Component
22
#'
3-
#' Clustering via parsimonious Gaussian Mixtures of Experts using the \emph{MoEClust} models introduced by Murphy and Murphy (2020) <\href{https://doi.org/10.1007/s11634-019-00373-8}{doi:10.1007/s11634-019-00373-8}>. This package fits finite Gaussian mixture models with gating and/or expert network covariates using a range of parsimonious covariance parameterisations from the GPCM family via the EM/CEM algorithm. Visualisation of the results of such models using generalised pairs plots and the inclusion of an additional noise component is also facilitated.
3+
#' Clustering via parsimonious Gaussian Mixtures of Experts using the \emph{MoEClust} models introduced by Murphy and Murphy (2020) <\doi{10.1007/s11634-019-00373-8}>. This package fits finite Gaussian mixture models with gating and/or expert network covariates using a range of parsimonious covariance parameterisations from the GPCM family via the EM/CEM algorithm. Visualisation of the results of such models using generalised pairs plots and the inclusion of an additional noise component is also facilitated.
44
#' @section Usage:
55
#' The most important function in the \pkg{MoEClust} package is: \code{\link{MoE_clust}}, for fitting the model via EM/CEM with gating and/or expert network covariates, supplied via formula interfaces.
66
#'
@@ -24,8 +24,8 @@
2424
#' \itemize{
2525
#' \item{Type: }{Package}
2626
#' \item{Package: }{MoEClust}
27-
#' \item{Version: }{1.3.2}
28-
#' \item{Date: }{2020-11-17 (this version), 2017-11-28 (original release)}
27+
#' \item{Version: }{1.3.3}
28+
#' \item{Date: }{2020-12-29 (this version), 2017-11-28 (original release)}
2929
#' \item{Licence: }{GPL (>=2)}
3030
#' }
3131
#'
@@ -37,7 +37,7 @@
3737
#' Keefe Murphy [aut, cre], Thomas Brendan Murphy [ctb]
3838
#'
3939
#' \strong{Maintainer}: Keefe Murphy - <\email{keefe.murphy@@mu.ie}>
40-
#' @references Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. \emph{Advances in Data Analysis and Classification}, 14(2): 293-325. <\href{https://doi.org/10.1007/s11634-019-00373-8}{doi:10.1007/s11634-019-00373-8}>.
40+
#' @references Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. \emph{Advances in Data Analysis and Classification}, 14(2): 293-325. <\doi{10.1007/s11634-019-00373-8}>.
4141
#' @examples
4242
#' \donttest{data(ais)
4343
#'

R/Plotting_Functions.R

Lines changed: 11 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
#' The subsetting must include at least two variables, whether they be the MAP, a response variable, or a covariate, in order to be valid for plotting purposes. The arguments \code{data.ind} and \code{cov.ind} can also be used to simply reorder the panels, without actually subsetting.
1212
#' @param response.type The type of plot desired for the scatter plots comparing continuous response variables. Defaults to \code{"points"}.
1313
#'
14-
#' Points can also be sized according to their associated clustering uncertainty with the option \code{"uncertainty"}. In so doing, the transparency of the points will also be proportional to their clustering uncertainty, provided the device supports transparency. See also \code{\link{MoE_Uncertainty}} for an alternative means of visualising observation-specific cluster uncertainties (especially for univariate data).
14+
#' Points can also be sized according to their associated clustering uncertainty with the option \code{"uncertainty"}. In doing so, the transparency of the points will also be proportional to their clustering uncertainty, provided the device supports transparency. See also \code{\link{MoE_Uncertainty}} for an alternative means of visualising observation-specific cluster uncertainties (especially for univariate data).
1515
#'
1616
#' Alternatively, the bivariate \code{"density"} contours can be displayed (see \code{density.pars}), provided there is at least one Gaussian component in the model. Caution is advised when producing density plots for models with covariates in the expert network; the required number of evaluations of the (multivariate) Gaussian density for each panel (\code{res$G * prod(density.pars$grid.size)}) increases by a factor of \code{res$n}, thus plotting may be slow (particularly for large data sets).
1717
#' @param scatter.type A vector of length 2 (or 1) giving the plot type for the upper and lower triangular portions of the plot, respectively, pertaining to the associated covariates. Defaults to \code{"lm"} for covariate vs. response panels and \code{"points"} otherwise. Only relevant for models with continuous covariates in the gating &/or expert network. \code{"ci"} and \code{"lm"} type plots are only produced for plots pairing covariates with response, and never response vs. response or covariate vs. covariate. Note that lines &/or confidence intervals will only be drawn for continuous covariates included in the expert network; to include covariates included only in the gating network also, the options \code{"lm2"} or \code{"ci2"} can be used but this is not generally advisable.
@@ -84,7 +84,7 @@
 #' \code{\link{plot.MoEClust}} is a wrapper to \code{\link{MoE_gpairs}} which accepts the default arguments, and also produces other types of plots. Caution is advised producing generalised pairs plots when the dimension of the data is large.
 #' @export
 #' @author Keefe Murphy - <\email{keefe.murphy@@mu.ie}>
-#' @references Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. \emph{Advances in Data Analysis and Classification}, 14(2): 293-325. <\href{https://doi.org/10.1007/s11634-019-00373-8}{doi:10.1007/s11634-019-00373-8}>.
+#' @references Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. \emph{Advances in Data Analysis and Classification}, 14(2): 293-325. <\doi{10.1007/s11634-019-00373-8}>.
 #'
 #' Emerson, J. W., Green, W. A., Schloerke, B., Crowley, J., Cook, D., Hofmann, H. and Wickham, H. (2013). The generalized pairs plot. \emph{Journal of Computational and Graphical Statistics}, 22(1): 79-91.
 #' @seealso \code{\link{MoE_clust}}, \code{\link{MoE_stepwise}}, \code{\link{plot.MoEClust}}, \code{\link{MoE_Uncertainty}}, \code{\link{expert_covar}}, \code{\link[lattice]{panel.stripplot}}, \code{\link[lattice]{panel.bwplot}}, \code{\link[lattice]{panel.violin}}, \code{\link[vcd]{strucplot}}
@@ -633,6 +633,7 @@ MoE_plotGate.MoEClust <- function(res, x.axis = NULL, type = "b", pch = 1, xla
 suppressWarnings(graphics::par(pty="m"))
 N <- res$n
 G <- res$G
+if(G == 1) message("No clustering has taken place!\n")
 Tau <- .mat_byrow(res$parameters$pro, nrow=N, ncol=ncol(res$z))
 vars <- all.vars(stats::as.formula(attr(res$gating, "Formula")))
 ncovs <- length(vars) > 1
@@ -763,6 +764,7 @@ MoE_plotLogLik.MoEClust <- function(res, type = "l", xlab = "Iteration", ylab =
 on.exit(suppressWarnings(graphics::par(oldpar)))
 suppressWarnings(graphics::par(pty="m"))
 xll <- res$loglik
+if(res$G == 1) message("EM algorithm not used; no clustering has taken place!\n")
 if(all(xll != cummax(xll))) warning("Log-likelihoods are not strictly increasing\n", call.=FALSE)
 base::plot(xll, type = ifelse(length(xll) == 1, "p", type), xlab = xlab, ylab = ylab, xaxt = xaxt, ...)
 if(length(xaxt) == 1 && is.character(xaxt)) {
@@ -840,17 +842,20 @@ MoE_Uncertainty.MoEClust <- function(res, type = c("barplot", "profile"), truth
 mC <- classError(classification=res$classification, class=as.numeric(as.factor(truth)))$misclassified
 }
 G <- res$G + noise
+if(G == 1) message("No clustering has taken place!\n")
 oneG <- 1/G
 min1G <- 1 - oneG
 yx <- unique(c(0, pretty(c(0, min1G))))
-yx[length(yx)] <- min1G
+YX <- which.min(abs(yx - min1G))
+yx[YX] <- min1G
+yx <- abs(yx[yx < 1])
 cm <- mclust.options("classPlotColors")
 if(type == "barplot") {
 cu <- if(tmiss) replace(rep(cm[1L], n.obs), mC, cm[2L]) else cm[seq_len(2L)][(ucert >= oneG) + 1L]
 cu[ucert == 0] <- NA
 base::plot(ucert, type="h", ylim=range(yx), col=cu, yaxt="n", ylab="", xlab="Observations", lend=1)
 graphics::lines(x=c(0, n.obs), y=c(oneG, oneG), lty=2, col=cm[3L])
-graphics::axis(2, at=yx, labels=replace(yx, length(yx), ifelse(noise, expression(1 - frac(1, widehat(G^{'(0)'}))), expression(1 - frac(1, hat(G))))), las=2, xpd=TRUE)
+graphics::axis(2, at=yx, labels=replace(yx, YX, ifelse(noise, expression(1 - frac(1, widehat(G^{'(0)'}))), expression(1 - frac(1, hat(G))))), las=2, xpd=TRUE)
 graphics::axis(2, at=oneG, labels=ifelse(noise, expression(frac(1, widehat(G^{'(0)'}))), expression(frac(1, hat(G)))), las=2, xpd=TRUE, side=4)
 } else {
 ord <- order(ucert, decreasing=decreasing)
@@ -861,7 +866,7 @@ MoE_Uncertainty.MoEClust <- function(res, type = c("barplot", "profile"), truth
 graphics::lines(ucord)
 graphics::points(ucord, pch=15, cex=if(tmiss) replace(rep(0.5, n.obs), mcO, 0.75) else 0.5, col=if(tmiss) replace(rep(1, n.obs), mcO, cm[2L]) else 1)
 graphics::lines(x=c(0, n.obs), y=c(oneG, oneG), lty=2, col=cm[3L])
-graphics::axis(2, at=yx, labels=replace(yx, length(yx), ifelse(noise, expression(1 - frac(1, widehat(G^{'(0)'}))), expression(1 - frac(1, hat(G))))), las=2, xpd=TRUE)
+graphics::axis(2, at=yx, labels=replace(yx, YX, ifelse(noise, expression(1 - frac(1, widehat(G^{'(0)'}))), expression(1 - frac(1, hat(G))))), las=2, xpd=TRUE)
 graphics::axis(2, at=oneG, labels=ifelse(noise, expression(frac(1, widehat(G^{'(0)'}))), expression(frac(1, hat(G)))), las=2, xpd=TRUE, side=4)
 if(tmiss) {
 Nseq <- (seq_len(n.obs))
@@ -904,7 +909,7 @@ MoE_Uncertainty.MoEClust <- function(res, type = c("barplot", "profile"), truth
 #' Other types of plots are available by first calling \code{\link{as.Mclust}} on the fitted object, and then calling \code{\link[mclust]{plot.Mclust}} on the results. These can be especially useful for univariate data.
 #' @return The visualisation according to \code{what} of the results of a fitted \code{MoEClust} model.
 #' @seealso \code{\link{MoE_clust}}, \code{\link{MoE_stepwise}}, \code{\link{MoE_gpairs}}, \code{\link{MoE_plotGate}}, \code{\link{MoE_plotCrit}}, \code{\link{MoE_plotLogLik}}, \code{\link{MoE_Uncertainty}}, \code{\link{as.Mclust}}, \code{\link[mclust]{plot.Mclust}}
-#'@references Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. \emph{Advances in Data Analysis and Classification}, 14(2): 293-325. <\href{https://doi.org/10.1007/s11634-019-00373-8}{doi:10.1007/s11634-019-00373-8}>.
+#'@references Murphy, K. and Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. \emph{Advances in Data Analysis and Classification}, 14(2): 293-325. <\doi{10.1007/s11634-019-00373-8}>.
 #' @author Keefe Murphy - <\email{keefe.murphy@@mu.ie}>
 #' @export
 #' @method plot MoEClust
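
The tick-placement change in the MoE_Uncertainty.MoEClust hunks can be tried in isolation. The sketch below copies the four added lines into a standalone function; the name `uncertainty_ticks` and its return structure are hypothetical and not part of MoEClust, and `G` stands for the value the package computes as `res$G + noise`. It shows that the tick nearest to 1 - 1/G (rather than always the last tick) is overwritten and later relabelled, and that any tick at or above 1 is dropped.

```r
# Hypothetical helper, not part of MoEClust: mirrors the tick logic added in this commit.
uncertainty_ticks <- function(G) {
  min1G <- 1 - 1/G                          # upper bound on the clustering uncertainty
  yx    <- unique(c(0, pretty(c(0, min1G))))  # candidate axis ticks starting at 0
  YX    <- which.min(abs(yx - min1G))       # index of the tick closest to the bound
  yx[YX] <- min1G                           # snap that tick onto the bound itself
  yx    <- abs(yx[yx < 1])                  # drop ticks at or beyond 1
  list(at = yx, idx = YX)
}

# Print the tick positions for a few component counts.
for (G in 2:6) {
  tk <- uncertainty_ticks(G)
  cat("G =", G, "| ticks:", format(tk$at, digits = 3), "| labelled tick index:", tk$idx, "\n")
}
```

Because 1 - 1/G is always below 1, the snapped tick survives the final filter, so the index `YX` still points at it when the axis labels are replaced.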

inst/CITATION

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ citEntry(entry = "Article",
1919
"URL: https://doi.org/10.1007/s11634-019-00373-8")
2020
)
2121

22-
year <- sub(".*(2[[:digit:]]{3})-.*", "\\1", meta$Date)
22+
year <- 2020
2323
vers <- paste("package version", meta$Version)
2424

2525
citEntry(entry = "Manual",
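
For reference, the removed line derived the citation year from the package date with a regular expression, whereas the new line fixes it at 2020. The sketch below wraps the removed expression in a hypothetical function, `extract_year`, and applies it to illustrative date strings rather than to `meta$Date`; the commit itself does not state why the value was hard-coded.

```r
# Hypothetical wrapper around the expression removed from inst/CITATION.
# The inputs below are illustrative strings, not the real meta$Date.
extract_year <- function(date) {
  # Capture a four-digit year starting with 2 that is followed by a hyphen.
  sub(".*(2[[:digit:]]{3})-.*", "\\1", date)
}

extract_year("2020-12-29")  # "2020"
extract_year("no date")     # input returned unchanged when the pattern does not match
```

When the pattern does not match, `sub` returns its input unchanged, so the old code relied on the date being in a year-first, hyphen-separated form.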

inst/NEWS.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,13 @@ __with Gating and Expert Network Covariates__
55
__and a Noise Component__
66
=======================================================
77

8+
## MoEClust v1.3.3 - (_11<sup>th</sup> release [patch update]: 2020-12-29_)
9+
### New Features, Improvements, Bug Fixes & Miscellaneous Edits
10+
* Minor `MoE_stepwise` speed-ups by avoiding duplication of initialisation for certain steps.
11+
* Minor fix to `MoE_stepwise` for univariate data sets without covariates.
12+
* Prettier axis labels for `MoE_uncertainty` plots.
13+
* Minor CRAN compliance edits to the vignette.
14+
815
## MoEClust v1.3.2 - (_10<sup>th</sup> release [patch update]: 2020-11-17_)
916
### New Features, Improvements, Bug Fixes & Miscellaneous Edits
1017
* New `MoE_control` arg. `posidens=TRUE` ensures code no longer crashes when observations

man/MoEClust-package.Rd

Lines changed: 4 additions & 4 deletions
Some generated files are not rendered by default.
