diff --git a/python/src/GeneralLinearModelAlgorithm_doc.i.in b/python/src/GeneralLinearModelAlgorithm_doc.i.in
index 41c6f48056f..33f2ce5c36c 100644
--- a/python/src/GeneralLinearModelAlgorithm_doc.i.in
+++ b/python/src/GeneralLinearModelAlgorithm_doc.i.in
@@ -9,14 +9,15 @@ Available constructors:
 Parameters
 ----------
 inputSample, outputSample : :class:`~openturns.Sample` or 2d-array
-    The samples :math:`(\vect{x}_k)_{1 \leq k \leq N} \in \Rset^n` and :math:`(\vect{y}_k)_{1 \leq k \leq N}\in \Rset^d`.
+    The samples :math:`(\vect{x}_k)_{1 \leq k \leq \sampleSize} \in \Rset^\inputDim` and
+    :math:`(\vect{y}_k)_{1 \leq k \leq \sampleSize} \in \Rset^\outputDim`.
 covarianceModel : :class:`~openturns.CovarianceModel`
     Covariance model of the Gaussian process. See notes for the details.
 basis : :class:`~openturns.Basis`
-    Functional basis to estimate the trend: :math:`(\varphi_j)_{1 \leq j \leq n_1}: \Rset^n \rightarrow \Rset`.
-    If :math:`d>1`, the same basis is used for each marginal output.
+    Functional basis to estimate the trend: :math:`(\varphi_j)_{1 \leq j \leq n_1}: \Rset^\inputDim \rightarrow \Rset`.
+    If :math:`\outputDim > 1`, the same basis is used for each marginal output.
 keepCovariance : bool, optional
     Indicates whether the covariance matrix has to be stored in the result structure *GeneralLinearModelResult*.
@@ -24,9 +25,13 @@ keepCovariance : bool, optional
 Notes
 -----
-We suppose we have a sample :math:`(\vect{x}_k, \vect{y}_k)_{1 \leq k \leq N}` where :math:`\vect{y}_k = \cM(\vect{x}_k)` for all :math:`k`, with :math:`\cM:\Rset^n \mapsto \Rset^d` a given function.
+We suppose we have a sample :math:`(\vect{x}_k, \vect{y}_k)_{1 \leq k \leq \sampleSize}` where
+:math:`\vect{y}_k = \model(\vect{x}_k)` for all :math:`k`, with :math:`\model: \Rset^\inputDim \mapsto \Rset^\outputDim` a given function.
 
-The objective is to build a metamodel :math:`\tilde{\cM}`, using a **general linear model**: the sample :math:`(\vect{y}_k)_{1 \leq k \leq N}` is considered as the restriction of a Gaussian process :math:`\vect{Y}(\omega, \vect{x})` on :math:`(\vect{x}_k)_{1 \leq k \leq N}`. The Gaussian process :math:`\vect{Y}(\omega, \vect{x})` is defined by:
+The objective is to build a metamodel :math:`\metaModel`, using a **general linear model**: the sample
+:math:`(\vect{y}_k)_{1 \leq k \leq \sampleSize}` is considered as the restriction of a Gaussian process
+:math:`\vect{Y}(\omega, \vect{x})` on :math:`(\vect{x}_k)_{1 \leq k \leq \sampleSize}`.
+The Gaussian process :math:`\vect{Y}(\omega, \vect{x})` is defined by:
 
 .. math::
 
@@ -44,9 +49,12 @@ where:
         \end{array}
         \right)
 
-with :math:`\mu_\ell(\vect{x}) = \sum_{j=1}^{n_\ell} \beta_j^\ell \varphi_j^\ell(\vect{x})` and :math:`\varphi_j^\ell: \Rset^n \rightarrow \Rset` the trend functions.
+with :math:`\mu_\ell(\vect{x}) = \sum_{j=1}^{n_\ell} \beta_j^\ell \varphi_j^\ell(\vect{x})` and
+:math:`\varphi_j^\ell: \Rset^\inputDim \rightarrow \Rset` the trend functions.
 
-:math:`\vect{W}` is a Gaussian process of dimension :math:`d` with zero mean and covariance function :math:`C = C(\vect{\theta}, \vect{\sigma}, \mat{R}, \vect{\lambda})` (see :class:`~openturns.CovarianceModel` for the notations).
+:math:`\vect{W}` is a Gaussian process of dimension :math:`\outputDim` with zero mean and covariance function
+:math:`C = C(\vect{\theta}, \vect{\sigma}, \mat{R}, \vect{\lambda})` (see :class:`~openturns.CovarianceModel`
+for the notations).
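+
+For instance, a minimal sketch of such a model in input and output dimension 1 (the
+function, design points, linear trend basis and squared exponential covariance model
+below are illustrative choices only):
+
+>>> import openturns as ot
+>>> f = ot.SymbolicFunction(['x'], ['x * sin(x)'])
+>>> X = ot.Sample([[1.0], [3.0], [5.0], [6.0], [7.0], [8.0]])
+>>> Y = f(X)
+>>> # trend functions: phi_1(x) = 1 and phi_2(x) = x
+>>> basis = ot.LinearBasisFactory(1).build()
+>>> # covariance model of W, with scale theta and amplitude sigma
+>>> covarianceModel = ot.SquaredExponential([1.0], [1.0])
+>>> algo = ot.GeneralLinearModelAlgorithm(X, Y, covarianceModel, basis)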
 
 We note:
@@ -64,12 +72,15 @@ We note:
         \begin{array}{l}
           \vect{\beta}^1\\
          \vdots \\
-          \vect{\beta}^d
+          \vect{\beta}^\outputDim
        \end{array}
        \right)\in \Rset^{\sum_{\ell=1}^p n_\ell}
 
-The *GeneralLinearModelAlgorithm* class estimates the coefficients :math:`\beta_j^\ell` and :math:`\vect{p}` where :math:`\vect{p}` is the vector of parameters of the covariance model (a subset of :math:`\vect{\theta}, \vect{\sigma}, \mat{R}, \vect{\lambda}`) that has been declared as *active* (by default, the full vectors :math:`\vect{\theta}` and :math:`\vect{\sigma}`).
+The *GeneralLinearModelAlgorithm* class estimates the coefficients :math:`\beta_j^\ell` and :math:`\vect{p}`,
+where :math:`\vect{p}` is the vector of parameters of the covariance model (a subset of
+:math:`\vect{\theta}, \vect{\sigma}, \mat{R}, \vect{\lambda}`) that has been declared as
+*active* (by default, the full vectors :math:`\vect{\theta}` and :math:`\vect{\sigma}`).
 
 The estimation is done by maximizing the *reduced* log-likelihood of the model, see its expression below.
 
@@ -113,14 +124,17 @@ The model likelihood writes:
 
 .. math::
 
-    \cL(\vect{\beta}, \vect{p};(\vect{x}_k, \vect{y}_k)_{1 \leq k \leq N}) = \dfrac{1}{(2\pi)^{dN/2} |\det \mat{C}_{\vect{p}}|^{1/2}} \exp\left[ -\dfrac{1}{2}\Tr{\left( \vect{y}-\vect{m} \right)} \mat{C}_{\vect{p}}^{-1} \left( \vect{y}-\vect{m} \right) \right]
+    \cL(\vect{\beta}, \vect{p}; (\vect{x}_k, \vect{y}_k)_{1 \leq k \leq \sampleSize})
+    = \dfrac{1}{(2\pi)^{\outputDim \sampleSize / 2} |\det \mat{C}_{\vect{p}}|^{1/2}} \exp\left[ -\dfrac{1}{2}\Tr{\left( \vect{y}-\vect{m} \right)} \mat{C}_{\vect{p}}^{-1} \left( \vect{y}-\vect{m} \right) \right]
 
-If :math:`\mat{L}` is the Cholesky factor of :math:`\mat{C}`, ie the lower triangular matrix with positive diagonal such that :math:`\mat{L}\,\Tr{\mat{L}} = \mat{C}`, then:
+If :math:`\mat{L}` is the Cholesky factor of :math:`\mat{C}`, i.e. the lower triangular matrix
+with positive diagonal such that :math:`\mat{L}\,\Tr{\mat{L}} = \mat{C}`, then:
 
 .. math::
     :label: logLikelihood
 
-    \log \cL(\vect{\beta}, \vect{p};(\vect{x}_k, \vect{y}_k)_{1 \leq k \leq N}) = cste - \log \det \mat{L}_{\vect{p}} -\dfrac{1}{2} \| \mat{L}_{\vect{p}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2_2
+    \log \cL(\vect{\beta}, \vect{p}; (\vect{x}_k, \vect{y}_k)_{1 \leq k \leq \sampleSize})
+    = cste - \log \det \mat{L}_{\vect{p}} - \dfrac{1}{2} \| \mat{L}_{\vect{p}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2_2
 
 The maximization of :eq:`logLikelihood` leads to the following optimality condition for :math:`\vect{\beta}`:
 
@@ -128,35 +142,45 @@ The maximization of :eq:`logLikelihood` leads to the following optimality condit
 
     \vect{\beta}^*(\vect{p}^*)=\argmin_{\vect{\beta}} \| \mat{L}_{\vect{p}^*}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2_2
 
-This expression of :math:`\vect{\beta}^*` as a function of :math:`\vect{p}^*` is taken as a general relation between :math:`\vect{\beta}` and :math:`\vect{p}` and is substituted into :eq:`logLikelihood`, leading to a *reduced log-likelihood* function depending solely on :math:`\vect{p}`.
+This expression of :math:`\vect{\beta}^*` as a function of :math:`\vect{p}^*` is taken as a
+general relation between :math:`\vect{\beta}` and :math:`\vect{p}` and is substituted into
+:eq:`logLikelihood`, leading to a *reduced log-likelihood* function depending solely on :math:`\vect{p}`.
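+
+Continuing the sketch above, the reduced log-likelihood function can be retrieved once
+the algorithm has been run, e.g. to plot it or to maximize it with an external solver:
+
+>>> algo.run()
+>>> # function of the active covariance parameters p
+>>> reducedLogLikelihood = algo.getObjectiveFunction()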
 
-In the particular case where :math:`d=\dim(\vect{\sigma})=1` and :math:`\sigma` is a part of :math:`\vect{p}`, then a further reduction is possible. In this case, if :math:`\vect{q}` is the vector :math:`\vect{p}` in which :math:`\sigma` has been substituted by 1, then:
+In the particular case where :math:`\outputDim = \dim(\vect{\sigma}) = 1` and :math:`\sigma` is part of
+:math:`\vect{p}`, a further reduction is possible. In this case, if :math:`\vect{q}` is
+the vector :math:`\vect{p}` in which :math:`\sigma` has been replaced by 1, then:
 
 .. math::
 
-    \| \mat{L}_{\vect{p}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2 = \frac{1}{\sigma^2} \| \mat{L}_{\vect{q}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2_2
+    \| \mat{L}_{\vect{p}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2_2
+    = \frac{1}{\sigma^2} \| \mat{L}_{\vect{q}}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}}) \|^2_2
 
-showing that :math:`\vect{\beta}^*` is a function of :math:`\vect{q}^*` only, and the optimality condition for :math:`\sigma` reads:
+showing that :math:`\vect{\beta}^*` is a function of :math:`\vect{q}^*` only, and the optimality
+condition for :math:`\sigma` reads:
 
 .. math::
 
-    \vect{\sigma}^*(\vect{q}^*)=\dfrac{1}{N}\| \mat{L}_{\vect{q}^*}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}^*(\vect{q}^*)}) \|^2_2
+    \vect{\sigma}^*(\vect{q}^*) = \dfrac{1}{\sampleSize} \| \mat{L}_{\vect{q}^*}^{-1}(\vect{y}-\vect{m}_{\vect{\beta}^*(\vect{q}^*)}) \|^2_2
 
-which leads to a further reduction of the log-likelihood function where both :math:`\vect{\beta}` and :math:`\sigma` are replaced by their expression in terms of :math:`\vect{q}`.
+which leads to a further reduction of the log-likelihood function where both :math:`\vect{\beta}`
+and :math:`\sigma` are replaced by their expression in terms of :math:`\vect{q}`.
 
 The default optimizer is :class:`~openturns.TNC` and can be changed thanks to the *setOptimizationAlgorithm* method.
-User could also change the default optimization solver by setting the `GeneralLinearModelAlgorithm-DefaultOptimizationAlgorithm` resource map key to one of the :class:`~openturns.NLopt` solver names.
+The user can also change the default optimization solver by setting the `GeneralLinearModelAlgorithm-DefaultOptimizationAlgorithm`
+resource map key to one of the :class:`~openturns.NLopt` solver names.
 
 It is also possible to proceed as follows:
 
-    - ask for the reduced log-likelihood function of the *GeneralLinearModelAlgorithm* thanks to the *getObjectiveFunction()* method
-    - optimize it with respect to the parameters :math:`\vect{\theta}` and :math:`\vect{\sigma}` using any optimization algorithms (that can take into account some additional constraints if needed)
-    - set the optimal parameter value into the covariance model used in the *GeneralLinearModelAlgorithm*
-    - tell the algorithm not to optimize the parameter using *setOptimizeParameters*
+- ask for the reduced log-likelihood function of the *GeneralLinearModelAlgorithm* thanks to the *getObjectiveFunction()* method
+- optimize it with respect to the parameters :math:`\vect{\theta}` and :math:`\vect{\sigma}` using any
+  optimization algorithm (which can take into account some additional constraints if needed)
+- set the optimal parameter value into the covariance model used in the *GeneralLinearModelAlgorithm*
+- tell the algorithm not to optimize the parameter using *setOptimizeParameters*, as in the sketch below
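+
+A minimal sketch of this procedure, continuing the example above (the choice of
+:class:`~openturns.Cobyla` is illustrative, and the sketch assumes that the input of the
+objective function corresponds to the active parameters of the covariance model):
+
+>>> problem = ot.OptimizationProblem(reducedLogLikelihood)
+>>> problem.setMinimization(False)  # the reduced log-likelihood is maximized
+>>> solver = ot.Cobyla(problem)
+>>> solver.setStartingPoint([1.0] * reducedLogLikelihood.getInputDimension())
+>>> solver.run()
+>>> # inject the optimal parameters into the covariance model and freeze them
+>>> covarianceModel.setParameter(solver.getResult().getOptimalPoint())
+>>> algo = ot.GeneralLinearModelAlgorithm(X, Y, covarianceModel, basis)
+>>> algo.setOptimizeParameters(False)
+>>> algo.run()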
 
 The behaviour of the reduction is controlled by the following keys in :class:`~openturns.ResourceMap`:
-    - *ResourceMap.SetAsBool('GeneralLinearModelAlgorithm-UseAnalyticalAmplitudeEstimate', True)* to use the reduction associated to :math:`\sigma`. It has no effect if :math:`d>1` or if :math:`d=1` and :math:`\sigma` is not part of :math:`\vect{p}`
-    - *ResourceMap.SetAsBool('GeneralLinearModelAlgorithm-UnbiasedVariance', True)* allows one to use the *unbiased* estimate of :math:`\sigma` where :math:`\dfrac{1}{N}` is replaced by :math:`\dfrac{1}{N-p}` in the optimality condition for :math:`\sigma`.
+
+- *ResourceMap.SetAsBool('GeneralLinearModelAlgorithm-UseAnalyticalAmplitudeEstimate', True)* to use the reduction associated with :math:`\sigma`. It has no effect if :math:`\outputDim > 1` or if :math:`\outputDim = 1` and :math:`\sigma` is not part of :math:`\vect{p}`.
+- *ResourceMap.SetAsBool('GeneralLinearModelAlgorithm-UnbiasedVariance', True)* allows one to use the *unbiased* estimate of :math:`\sigma`, where :math:`\dfrac{1}{\sampleSize}` is replaced by :math:`\dfrac{1}{\sampleSize - p}` in the optimality condition for :math:`\sigma`.
 
 With huge samples, the `hierarchical matrix <http://en.wikipedia.org/wiki/Hierarchical_matrix>`_ implementation could be used if OpenTURNS had been compiled with `hmat-oss` support.
 
@@ -165,11 +189,11 @@ This implementation, which is based on a compressed representation of an approxi
 
 A known centered gaussian observation noise :math:`\epsilon_k` can be taken into account with :func:`setNoise()`:
 
 .. math:: \hat{\vect{y}}_k = \vect{y}_k + \epsilon_k, \epsilon_k \sim \mathcal{N}(0, \tau_k^2)
 
 Examples
 --------
-Create the model :math:`\cM: \Rset \mapsto \Rset` and the samples:
+Create the model :math:`\model: \Rset \mapsto \Rset` and the samples:
 
 >>> import openturns as ot
 >>> f = ot.SymbolicFunction(['x'], ['x+x * sin(x)'])
@@ -212,7 +236,7 @@ result : :class:`~openturns.GeneralLinearModelResult`
 
 Returns
 -------
 inputSample : :class:`~openturns.Sample`
-    The input sample :math:`(\vect{x}_k)_{1 \leq k \leq N}`."
+    The input sample :math:`(\vect{x}_k)_{1 \leq k \leq \sampleSize}`."
 
 // ---------------------------------------------------------------------
 
@@ -222,7 +246,7 @@ inputSample : :class:`~openturns.Sample`
 
 Returns
 -------
 outputSample : :class:`~openturns.Sample`
-    The output sample :math:`(\vect{y}_k)_{1 \leq k \leq N}` ."
+    The output sample :math:`(\vect{y}_k)_{1 \leq k \leq \sampleSize}`."
 
 // ---------------------------------------------------------------------
 
@@ -241,7 +265,7 @@ The log-likelihood function may be useful for some postprocessing: maximization
 
 Examples
 --------
-Create the model :math:`\cM: \Rset \mapsto \Rset` and the samples:
+Create the model :math:`\model: \Rset \mapsto \Rset` and the samples:
 
 >>> import openturns as ot
 >>> f = ot.SymbolicFunction(['x0'], ['x0 * sin(x0)'])