From fecdef4254c9861f84eaf2abd234327bf8be710a Mon Sep 17 00:00:00 2001
From: "Mattan S. Ben-Shachar" <mattansb@msbstats.info>
Date: Sun, 30 Jul 2023 09:28:30 +0300
Subject: [PATCH] improve pd docs

---
 R/p_direction.R    | 134 ++++++++++++++++++++++++---------------------
 man/p_direction.Rd | 126 +++++++++++++++++++++++++-----------------
 man/p_map.Rd       |   3 +-
 man/sexit.Rd       |   3 +-
 4 files changed, 151 insertions(+), 115 deletions(-)

diff --git a/R/p_direction.R b/R/p_direction.R
index e989ecddb..7de3fe9f7 100644
--- a/R/p_direction.R
+++ b/R/p_direction.R
@@ -1,81 +1,91 @@
 #' Probability of Direction (pd)
 #'
-#' Compute the **Probability of Direction** (***pd***, also known
-#' as the Maximum Probability of Effect - *MPE*). It varies between `50%`
-#' and `100%` (*i.e.*, `0.5` and `1`) and can be interpreted as
-#' the probability (expressed in percentage) that a parameter (described by its
-#' posterior distribution) is strictly positive or negative (whichever is the
-#' most probable). It is mathematically defined as the proportion of the
-#' posterior distribution that is of the median's sign. Although differently
-#' expressed, this index is fairly similar (*i.e.*, is strongly correlated)
-#' to the frequentist **p-value**.
-#' \cr\cr
-#' Note that in some (rare) cases, especially when used with model averaged
-#' posteriors (see [weighted_posteriors()] or
-#' `brms::posterior_average`), `pd` can be smaller than `0.5`,
-#' reflecting high credibility of `0`.
+#' Compute the **Probability of Direction** (***pd***, also known as the Maximum
+#' Probability of Effect - *MPE*). This can be interpreted as the probability
+#' that a parameter (described by its posterior distribution) is strictly
+#' positive or negative (whichever is the most probable). Although differently
+#' expressed, this index is fairly similar (*i.e.*, is strongly correlated) to
+#' the frequentist **p-value** (see details).
 #'
-#' @param x Vector representing a posterior distribution. Can also be a Bayesian model (`stanreg`, `brmsfit` or `BayesFactor`).
-#' @param method Can be `"direct"` or one of methods of [density estimation][estimate_density], such as `"kernel"`, `"logspline"` or `"KernSmooth"`. If `"direct"` (default), the computation is based on the raw ratio of samples superior and inferior to 0. Else, the result is based on the [Area under the Curve (AUC)][auc] of the estimated [density][estimate_density] function.
-#' @param null The value considered as a "null" effect. Traditionally 0, but could also be 1 in the case of ratios.
+#' @param x A vector representing a posterior distribution, or a data frame of
+#'   posterior draws (samples by parameter). Can also be a Bayesian model.
+#' @param method Can be `"direct"` or one of the methods of [`estimate_density()`],
+#'   such as `"kernel"`, `"logspline"` or `"KernSmooth"`. See details.
+#' @param null The value considered as a "null" effect. Traditionally 0, but
+#'   could also be 1 in the case of ratios of change (OR, IRR, ...).
 #' @inheritParams hdi
 #'
 #' @details
-#' \subsection{What is the *pd*?}{
+#' ## What is the *pd*?
 #' The Probability of Direction (pd) is an index of effect existence, ranging
-#' from `50%` to `100%`, representing the certainty with which an effect goes in
-#' a particular direction (*i.e.*, is positive or negative). Beyond its
-#' simplicity of interpretation, understanding and computation, this index also
-#' presents other interesting properties:
-#' \itemize{
-#'   \item It is independent from the model: It is solely based on the posterior
-#'   distributions and does not require any additional information from the data
-#'   or the model.
-#'   \item It is robust to the scale of both the response variable and the predictors.
-#'   \item It is strongly correlated with the frequentist p-value, and can thus
+#' from 0 to 1, representing the certainty with which an effect goes in a
+#' particular direction (*i.e.*, is positive or negative / has a sign). Beyond
+#' its simplicity of interpretation, understanding and computation, this index
+#' also presents other interesting properties:
+#' - It is robust to the scale of both the response variable and the predictors.
+#' - It is strongly correlated with the frequentist p-value, and can thus
 #'   be used to draw parallels and give some reference to readers non-familiar
-#'   with Bayesian statistics.
-#' }
-#' }
-#' \subsection{Relationship with the p-value}{
-#' In most cases, it seems that the *pd* has a direct correspondence with the frequentist one-sided *p*-value through the formula \ifelse{html}{\out{p<sub>one&nbsp;sided</sub>&nbsp;=&nbsp;1&nbsp;-&nbsp;<sup>p(<em>d</em>)</sup>/<sub>100</sub>}}{\eqn{p_{one sided}=1-\frac{p_{d}}{100}}} and to the two-sided p-value (the most commonly reported one) through the formula \ifelse{html}{\out{p<sub>two&nbsp;sided</sub>&nbsp;=&nbsp;2&nbsp;*&nbsp;(1&nbsp;-&nbsp;<sup>p(<em>d</em>)</sup>/<sub>100</sub>)}}{\eqn{p_{two sided}=2*(1-\frac{p_{d}}{100})}}. Thus, a two-sided p-value of respectively `.1`, `.05`, `.01` and `.001` would correspond approximately to a *pd* of `95%`, `97.5%`, `99.5%` and `99.95%`. See also [pd_to_p()].
-#' }
-#' \subsection{Methods of computation}{
-#'  The most simple and direct way to compute the *pd* is to 1) look at the
-#'  median's sign, 2) select the portion of the posterior of the same sign and
-#'  3) compute the percentage that this portion represents. This "simple" method
-#'  is the most straightforward, but its precision is directly tied to the
-#'  number of posterior draws. The second approach relies on [density
-#'  estimation][estimate_density]. It starts by estimating the density function
-#'  (for which many methods are available), and then computing the [area under
-#'  the curve][area_under_curve] (AUC) of the density curve on the other side of
-#'  0.
-#' }
-#' \subsection{Strengths and Limitations}{
-#' **Strengths:** Straightforward computation and interpretation. Objective
-#' property of the posterior distribution. 1:1 correspondence with the
-#' frequentist p-value.
-#' \cr \cr
-#' **Limitations:** Limited information favoring the null hypothesis.
-#' }
+#'   with Bayesian statistics (Makowski et al., 2019). See also [`pd_to_p()`].
 #'
-#' @return
-#' Values between 0.5 and 1 corresponding to the probability of direction (pd).
+#' ## Possible Range of Values
+#' The largest value *pd* can take is 1 - the posterior is strictly directional.
+#' However, the smallest value *pd* can take depends on the parameter space
+#' represented by the posterior.
 #' \cr\cr
-#' Note that in some (rare) cases, especially when used with model averaged
-#' posteriors (see [weighted_posteriors()] or
-#' `brms::posterior_average`), `pd` can be smaller than `0.5`,
-#' reflecting high credibility of `0`. To detect such cases, the
-#' `method = "direct"` must be used.
+#' **For a continuous parameter space**, exact values of 0 (or any point null
+#' value) are not possible, and so 100% of the posterior has _some_ sign, some
+#' positive, some negative. Therefore, the smallest the *pd* can be is 0.5 -
+#' with an equal posterior mass of positive and negative values. Values close to
+#' 0.5 _cannot_ be used to support the null hypothesis (that the parameter does
+#' _not_ have a direction) in a similar way to how large p-values cannot be used
+#' to support the null hypothesis (see [`pd_to_p()`]; Makowski et al., 2019).
+#' \cr\cr
+#' **For a discrete parameter space or a parameter space that is a mixture
+#' between discrete and continuous spaces**, exact values of 0 (or any point
+#' null value) _are_ possible! Therefore, the smallest the *pd* can be is 0 -
+#' with 100% of the posterior mass on 0. Thus values close to 0 can be used to
+#' support the null hypothesis (see van den Bergh et al., 2021).
+#' \cr\cr
+#' Examples of posteriors representing a (partially) discrete parameter space:
+#' - When a parameter can only take discrete values.
+#' - When a mixture prior/posterior is used (such as the spike-and-slab prior;
+#'   see van den Bergh et al., 2021).
+#' - When conducting Bayesian model averaging (e.g., [weighted_posteriors()] or
+#'   `brms::posterior_average`).
+#'
+#' ## Methods of computation
+#' The *pd* is defined as:
+#' \deqn{p_d = max({Pr(\hat{\theta} < \theta_{null}), Pr(\hat{\theta} > \theta_{null})})}{pd = max(mean(x < null), mean(x > null))}
+#' \cr\cr
+#' The simplest and most direct way to obtain the *pd* is to compute the
+#' proportion of positive (or larger than `null`) posterior samples, the
+#' proportion of negative (or smaller than `null`) posterior samples, and take
+#' the larger of the two. This "direct" method is the most straightforward, but
+#' its precision is directly tied to the number of posterior draws.
+#' \cr\cr
+#' The second approach relies on density estimation (see
+#' [`estimate_density()`]): it first estimates a smooth, continuous density
+#' function (for which many methods are available), then computes the
+#' [area under the curve][area_under_curve] (AUC) of that density on either side
+#' of `null` and takes the larger of the two. Note that this approach assumes a
+#' continuous density, and so **when the posterior represents a (partially)
+#' discrete parameter space, the direct method _must_ be used** (see above).
+#'
+#' @return
+#' Values between 0.5 and 1 *or* between 0 and 1 (see above) corresponding to
+#' the probability of direction (pd).
 #'
 #' @seealso [pd_to_p()] to convert between Probability of Direction (pd) and p-value.
 #'
 #' @note There is also a [`plot()`-method](https://easystats.github.io/see/articles/bayestestR.html) implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}.
 #'
 #' @references
-#' Makowski D, Ben-Shachar MS, Chen SHA, Lüdecke D (2019) Indices of Effect
-#' Existence and Significance in the Bayesian Framework. Frontiers in Psychology
-#' 2019;10:2767. \doi{10.3389/fpsyg.2019.02767}
+#' - Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., & Lüdecke, D. (2019).
+#'   Indices of effect existence and significance in the Bayesian framework.
+#'   Frontiers in Psychology, 10, 2767. \doi{10.3389/fpsyg.2019.02767}
+#' - van den Bergh, D., Haaf, J. M., Ly, A., Rouder, J. N., & Wagenmakers, E. J.
+#'   (2021). A cautionary note on estimating effect size. Advances in Methods
+#'   and Practices in Psychological Science, 4(1). \doi{10.1177/2515245921992035}
 #'
 #' @examples
 #' library(bayestestR)
diff --git a/man/p_direction.Rd b/man/p_direction.Rd
index cf90a3531..ca54f27d4 100644
--- a/man/p_direction.Rd
+++ b/man/p_direction.Rd
@@ -48,13 +48,16 @@ pd(x, ...)
 \method{p_direction}{BFBayesFactor}(x, method = "direct", null = 0, ...)
 }
 \arguments{
-\item{x}{Vector representing a posterior distribution. Can also be a Bayesian model (\code{stanreg}, \code{brmsfit} or \code{BayesFactor}).}
+\item{x}{A vector representing a posterior distribution, or a data frame of
+posterior draws (samples by parameter). Can also be a Bayesian model.}
 
 \item{...}{Currently not used.}
 
-\item{method}{Can be \code{"direct"} or one of methods of \link[=estimate_density]{density estimation}, such as \code{"kernel"}, \code{"logspline"} or \code{"KernSmooth"}. If \code{"direct"} (default), the computation is based on the raw ratio of samples superior and inferior to 0. Else, the result is based on the \link[=auc]{Area under the Curve (AUC)} of the estimated \link[=estimate_density]{density} function.}
+\item{method}{Can be \code{"direct"} or one of the methods of \code{\link[=estimate_density]{estimate_density()}},
+such as \code{"kernel"}, \code{"logspline"} or \code{"KernSmooth"}. See details.}
 
-\item{null}{The value considered as a "null" effect. Traditionally 0, but could also be 1 in the case of ratios.}
+\item{null}{The value considered as a "null" effect. Traditionally 0, but
+could also be 1 in the case of ratios of change (OR, IRR, ...).}
 
 \item{effects}{Should results for fixed effects, random effects or both be
 returned? Only applies to mixed models. May be abbreviated.}
@@ -70,65 +73,81 @@ filtered by default, so only parameters that typically appear in the
 for the output.}
 }
 \value{
-Values between 0.5 and 1 corresponding to the probability of direction (pd).
-\cr\cr
-Note that in some (rare) cases, especially when used with model averaged
-posteriors (see \code{\link[=weighted_posteriors]{weighted_posteriors()}} or
-\code{brms::posterior_average}), \code{pd} can be smaller than \code{0.5},
-reflecting high credibility of \code{0}. To detect such cases, the
-\code{method = "direct"} must be used.
+Values between 0.5 and 1 \emph{or} between 0 and 1 (see above) corresponding to
+the probability of direction (pd).
 }
 \description{
-Compute the \strong{Probability of Direction} (\emph{\strong{pd}}, also known
-as the Maximum Probability of Effect - \emph{MPE}). It varies between \verb{50\%}
-and \verb{100\%} (\emph{i.e.}, \code{0.5} and \code{1}) and can be interpreted as
-the probability (expressed in percentage) that a parameter (described by its
-posterior distribution) is strictly positive or negative (whichever is the
-most probable). It is mathematically defined as the proportion of the
-posterior distribution that is of the median's sign. Although differently
-expressed, this index is fairly similar (\emph{i.e.}, is strongly correlated)
-to the frequentist \strong{p-value}.
-\cr\cr
-Note that in some (rare) cases, especially when used with model averaged
-posteriors (see \code{\link[=weighted_posteriors]{weighted_posteriors()}} or
-\code{brms::posterior_average}), \code{pd} can be smaller than \code{0.5},
-reflecting high credibility of \code{0}.
+Compute the \strong{Probability of Direction} (\emph{\strong{pd}}, also known as the Maximum
+Probability of Effect - \emph{MPE}). This can be interpreted as the probability
+that a parameter (described by its posterior distribution) is strictly
+positive or negative (whichever is the most probable). Although differently
+expressed, this index is fairly similar (\emph{i.e.}, is strongly correlated) to
+the frequentist \strong{p-value} (see details).
 }
 \details{
 \subsection{What is the \emph{pd}?}{
+
 The Probability of Direction (pd) is an index of effect existence, ranging
-from \verb{50\%} to \verb{100\%}, representing the certainty with which an effect goes in
-a particular direction (\emph{i.e.}, is positive or negative). Beyond its
-simplicity of interpretation, understanding and computation, this index also
-presents other interesting properties:
+from 0 to 1, representing the certainty with which an effect goes in a
+particular direction (\emph{i.e.}, is positive or negative / has a sign). Beyond
+its simplicity of interpretation, understanding and computation, this index
+also presents other interesting properties:
 \itemize{
-\item It is independent from the model: It is solely based on the posterior
-distributions and does not require any additional information from the data
-or the model.
 \item It is robust to the scale of both the response variable and the predictors.
 \item It is strongly correlated with the frequentist p-value, and can thus
 be used to draw parallels and give some reference to readers non-familiar
-with Bayesian statistics.
+with Bayesian statistics (Makowski et al., 2019). See also \code{\link[=pd_to_p]{pd_to_p()}}.
+}
 }
+
+\subsection{Possible Range of Values}{
+
+The largest value \emph{pd} can take is 1 - the posterior is strictly directional.
+However, the smallest value \emph{pd} can take depends on the parameter space
+represented by the posterior.
+\cr\cr
+\strong{For a continuous parameter space}, exact values of 0 (or any point null
+value) are not possible, and so 100\% of the posterior has \emph{some} sign, some
+positive, some negative. Therefore, the smallest the \emph{pd} can be is 0.5 -
+with an equal posterior mass of positive and negative values. Values close to
+0.5 \emph{cannot} be used to support the null hypothesis (that the parameter does
+\emph{not} have a direction) in a similar way to how large p-values cannot be used
+to support the null hypothesis (see \code{\link[=pd_to_p]{pd_to_p()}}; Makowski et al., 2019).
+\cr\cr
+\strong{For a discrete parameter space or a parameter space that is a mixture
+between discrete and continuous spaces}, exact values of 0 (or any point
+null value) \emph{are} possible! Therefore, the smallest the \emph{pd} can be is 0 -
+with 100\% of the posterior mass on 0. Thus values close to 0 can be used to
+support the null hypothesis (see van den Bergh et al., 2021).
+\cr\cr
+Examples of posteriors representing a (partially) discrete parameter space:
+\itemize{
+\item When a parameter can only take discrete values.
+\item When a mixture prior/posterior is used (such as the spike-and-slab prior;
+see van den Bergh et al., 2021).
+\item When conducting Bayesian model averaging (e.g., \code{\link[=weighted_posteriors]{weighted_posteriors()}} or
+\code{brms::posterior_average}).
 }
-\subsection{Relationship with the p-value}{
-In most cases, it seems that the \emph{pd} has a direct correspondence with the frequentist one-sided \emph{p}-value through the formula \ifelse{html}{\out{p<sub>one&nbsp;sided</sub>&nbsp;=&nbsp;1&nbsp;-&nbsp;<sup>p(<em>d</em>)</sup>/<sub>100</sub>}}{\eqn{p_{one sided}=1-\frac{p_{d}}{100}}} and to the two-sided p-value (the most commonly reported one) through the formula \ifelse{html}{\out{p<sub>two&nbsp;sided</sub>&nbsp;=&nbsp;2&nbsp;*&nbsp;(1&nbsp;-&nbsp;<sup>p(<em>d</em>)</sup>/<sub>100</sub>)}}{\eqn{p_{two sided}=2*(1-\frac{p_{d}}{100})}}. Thus, a two-sided p-value of respectively \code{.1}, \code{.05}, \code{.01} and \code{.001} would correspond approximately to a \emph{pd} of \verb{95\%}, \verb{97.5\%}, \verb{99.5\%} and \verb{99.95\%}. See also \code{\link[=pd_to_p]{pd_to_p()}}.
 }
+
 \subsection{Methods of computation}{
-The most simple and direct way to compute the \emph{pd} is to 1) look at the
-median's sign, 2) select the portion of the posterior of the same sign and
-3) compute the percentage that this portion represents. This "simple" method
-is the most straightforward, but its precision is directly tied to the
-number of posterior draws. The second approach relies on \link[=estimate_density]{density estimation}. It starts by estimating the density function
-(for which many methods are available), and then computing the \link[=area_under_curve]{area under the curve} (AUC) of the density curve on the other side of
-0.
-}
-\subsection{Strengths and Limitations}{
-\strong{Strengths:} Straightforward computation and interpretation. Objective
-property of the posterior distribution. 1:1 correspondence with the
-frequentist p-value.
-\cr \cr
-\strong{Limitations:} Limited information favoring the null hypothesis.
+
+The \emph{pd} is defined as:
+\deqn{p_d = max({Pr(\hat{\theta} < \theta_{null}), Pr(\hat{\theta} > \theta_{null})})}{pd = max(mean(x < null), mean(x > null))}
+\cr\cr
+The simplest and most direct way to obtain the \emph{pd} is to compute the
+proportion of positive (or larger than \code{null}) posterior samples, the
+proportion of negative (or smaller than \code{null}) posterior samples, and take
+the larger of the two. This "direct" method is the most straightforward, but
+its precision is directly tied to the number of posterior draws.
+\cr\cr
+The second approach relies on density estimation (see
+\code{\link[=estimate_density]{estimate_density()}}): it first estimates a smooth, continuous density
+function (for which many methods are available), then computes the
+\link[=area_under_curve]{area under the curve} (AUC) of that density on either side
+of \code{null} and takes the larger of the two. Note that this approach assumes a
+continuous density, and so \strong{when the posterior represents a (partially)
+discrete parameter space, the direct method \emph{must} be used} (see above).
 }
 }
 \note{
@@ -184,9 +203,14 @@ if (require("BayesFactor")) {
 }
 }
 \references{
-Makowski D, Ben-Shachar MS, Chen SHA, Lüdecke D (2019) Indices of Effect
-Existence and Significance in the Bayesian Framework. Frontiers in Psychology
-2019;10:2767. \doi{10.3389/fpsyg.2019.02767}
+\itemize{
+\item Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., & Lüdecke, D. (2019).
+Indices of effect existence and significance in the Bayesian framework.
+Frontiers in Psychology, 10, 2767. \doi{10.3389/fpsyg.2019.02767}
+\item van den Bergh, D., Haaf, J. M., Ly, A., Rouder, J. N., & Wagenmakers, E. J.
+(2021). A cautionary note on estimating effect size. Advances in Methods
+and Practices in Psychological Science, 4(1). \doi{10.1177/2515245921992035}
+}
 }
 \seealso{
 \code{\link[=pd_to_p]{pd_to_p()}} to convert between Probability of Direction (pd) and p-value.
diff --git a/man/p_map.Rd b/man/p_map.Rd
index 9e9e625fe..94d9c21f2 100644
--- a/man/p_map.Rd
+++ b/man/p_map.Rd
@@ -41,7 +41,8 @@ of models (see, for example, \code{methods("hdi")}) and not all of those are
 documented in the 'Usage' section, because methods for other classes mostly
 resemble the arguments of the \code{.numeric} or \code{.data.frame}methods.}
 
-\item{null}{The value considered as a "null" effect. Traditionally 0, but could also be 1 in the case of ratios.}
+\item{null}{The value considered as a "null" effect. Traditionally 0, but
+could also be 1 in the case of ratios of change (OR, IRR, ...).}
 
 \item{precision}{Number of points of density data. See the \code{n} parameter in \code{density}.}
 
diff --git a/man/sexit.Rd b/man/sexit.Rd
index 8b0f8029a..74d4cd5ce 100644
--- a/man/sexit.Rd
+++ b/man/sexit.Rd
@@ -7,7 +7,8 @@
 sexit(x, significant = "default", large = "default", ci = 0.95, ...)
 }
 \arguments{
-\item{x}{Vector representing a posterior distribution. Can also be a Bayesian model (\code{stanreg}, \code{brmsfit} or \code{BayesFactor}).}
+\item{x}{A vector representing a posterior distribution, or a data frame of
+posterior draws (samples by parameter). Can also be a Bayesian model.}
 
 \item{significant, large}{The threshold values to use for significant and
 large probabilities. If left to 'default', will be selected through