From 6af8b4b5fc85e952f3e09248d0ace014dcd48c6d Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Fri, 5 Apr 2024 22:19:13 +0000 Subject: [PATCH] build based on 32623c4 --- dev/.documenter-siteinfo.json | 2 +- dev/API/CalibrateEmulateSample/index.html | 2 +- dev/API/Emulators/index.html | 24 +++---- dev/API/GaussianProcess/index.html | 10 +-- dev/API/MarkovChainMonteCarlo/index.html | 16 ++--- dev/API/RandomFeatures/index.html | 64 +++++++++---------- dev/API/Utilities/index.html | 4 +- dev/GaussianProcessEmulator/index.html | 2 +- dev/calibrate/index.html | 2 +- dev/contributing/index.html | 2 +- dev/emulate/index.html | 2 +- dev/examples/Cloudy_example/index.html | 2 +- dev/examples/edmf_example/index.html | 2 +- .../emulators/ishigami_3d_1d/index.html | 2 +- .../lorenz_integrator_3d_3d/index.html | 2 +- .../emulators/regression_2d_2d/index.html | 2 +- dev/examples/lorenz_example/index.html | 2 +- dev/examples/sinusoid_example/index.html | 2 +- dev/glossary/index.html | 2 +- dev/index.html | 2 +- dev/installation_instructions/index.html | 2 +- dev/random_feature_emulator/index.html | 2 +- dev/sample/index.html | 2 +- 23 files changed, 77 insertions(+), 77 deletions(-) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index cea9ecca..02d09d63 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.2","generation_timestamp":"2024-04-05T19:40:41","documenter_version":"1.2.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.2","generation_timestamp":"2024-04-05T22:19:05","documenter_version":"1.2.1"}} \ No newline at end of file diff --git a/dev/API/CalibrateEmulateSample/index.html b/dev/API/CalibrateEmulateSample/index.html index 9bcd3823..21d61f8c 100644 --- a/dev/API/CalibrateEmulateSample/index.html +++ b/dev/API/CalibrateEmulateSample/index.html @@ -1,2 +1,2 @@ -- · CalibrateEmulateSample.jl
+- · CalibrateEmulateSample.jl
diff --git a/dev/API/Emulators/index.html b/dev/API/Emulators/index.html index 2ea6fb8f..797af95c 100644 --- a/dev/API/Emulators/index.html +++ b/dev/API/Emulators/index.html @@ -1,10 +1,10 @@ -General Interface · CalibrateEmulateSample.jl

Emulators

CalibrateEmulateSample.Emulators.EmulatorType
struct Emulator{FT<:AbstractFloat}

Structure used to represent a general emulator, independently of the algorithm used.

Fields

  • machine_learning_tool::CalibrateEmulateSample.Emulators.MachineLearningTool: Machine learning tool, defined as a struct of type MachineLearningTool.

  • training_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT} where FT<:AbstractFloat: Normalized, standardized, transformed pairs given the Booleans normalize_inputs, standardize_outputs, retained_svd_frac.

  • input_mean::AbstractVector{FT} where FT<:AbstractFloat: Mean of input; length input_dim.

  • normalize_inputs::Bool: If normalizing: whether to fit models on normalized inputs ((inputs - input_mean) * sqrt_inv_input_cov).

  • normalization::Union{Nothing, LinearAlgebra.UniformScaling{FT}, AbstractMatrix{FT}} where FT<:AbstractFloat: (Linear) normalization transformation; size input_dim × input_dim.

  • standardize_outputs::Bool: Whether to fit models on normalized outputs: outputs / standardize_outputs_factor.

  • standardize_outputs_factors::Union{Nothing, AbstractVector{FT}} where FT<:AbstractFloat: If standardizing: Standardization factors (characteristic values of the problem).

  • decomposition::Union{Nothing, LinearAlgebra.SVD}: The singular value decomposition of obs_noise_cov, such that obs_noise_cov = decomposition.U * Diagonal(decomposition.S) * decomposition.Vt. NB: the SVD may be reduced in dimensions.

  • retained_svd_frac::AbstractFloat: Fraction of singular values kept in decomposition. A value of 1 implies full SVD spectrum information.

source
CalibrateEmulateSample.Emulators.optimize_hyperparameters!Method
optimize_hyperparameters!(
+General Interface · CalibrateEmulateSample.jl

Emulators

CalibrateEmulateSample.Emulators.EmulatorType
struct Emulator{FT<:AbstractFloat}

Structure used to represent a general emulator, independently of the algorithm used.

Fields

  • machine_learning_tool::CalibrateEmulateSample.Emulators.MachineLearningTool: Machine learning tool, defined as a struct of type MachineLearningTool.

  • training_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT} where FT<:AbstractFloat: Normalized, standardized, transformed pairs given the Booleans normalize_inputs, standardize_outputs, retained_svd_frac.

  • input_mean::AbstractVector{FT} where FT<:AbstractFloat: Mean of input; length input_dim.

  • normalize_inputs::Bool: If normalizing: whether to fit models on normalized inputs ((inputs - input_mean) * sqrt_inv_input_cov).

  • normalization::Union{Nothing, LinearAlgebra.UniformScaling{FT}, AbstractMatrix{FT}} where FT<:AbstractFloat: (Linear) normalization transformation; size input_dim × input_dim.

  • standardize_outputs::Bool: Whether to fit models on normalized outputs: outputs / standardize_outputs_factor.

  • standardize_outputs_factors::Union{Nothing, AbstractVector{FT}} where FT<:AbstractFloat: If standardizing: Standardization factors (characteristic values of the problem).

  • decomposition::Union{Nothing, LinearAlgebra.SVD}: The singular value decomposition of obs_noise_cov, such that obs_noise_cov = decomposition.U * Diagonal(decomposition.S) * decomposition.Vt. NB: the SVD may be reduced in dimensions.

  • retained_svd_frac::AbstractFloat: Fraction of singular values kept in decomposition. A value of 1 implies full SVD spectrum information.

source
CalibrateEmulateSample.Emulators.EmulatorMethod
Emulator(
     machine_learning_tool::CalibrateEmulateSample.Emulators.MachineLearningTool,
     input_output_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT<:AbstractFloat};
     obs_noise_cov,
@@ -14,48 +14,48 @@
     decorrelate,
     retained_svd_frac
 ) -> CalibrateEmulateSample.Emulators.Emulator{FT} where Float64<:FT<:AbstractFloat
-

Positional Arguments

  • machine_learning_tool ::MachineLearningTool,
  • input_output_pairs ::PairedDataContainer

Keyword Arguments

  • obs_noise_cov: A matrix/uniform scaling to provide the observational noise covariance of the data - used for data processing (default nothing),
  • normalize_inputs: Normalize the inputs to be unit Gaussian, in the smallest full-rank space of the data (default true),
  • standardize_outputs: Standardize outputs by dividing by a vector of provided factors (default false),
  • standardize_outputs_factors: If standardizing, the provided dim_output-length vector of factors,
  • decorrelate: Apply (truncated) SVD to the outputs. Predictions are returned in the decorrelated space (default true),
  • retained_svd_frac: The cumulative sum of singular values retained after output SVD truncation (default 1.0 - no truncation)
source
GaussianProcesses.predictFunction
predict(
+

Positional Arguments

  • machine_learning_tool ::MachineLearningTool,
  • input_output_pairs ::PairedDataContainer

Keyword Arguments

  • obs_noise_cov: A matrix/uniform scaling to provide the observational noise covariance of the data - used for data processing (default nothing),
  • normalize_inputs: Normalize the inputs to be unit Gaussian, in the smallest full-rank space of the data (default true),
  • standardize_outputs: Standardize outputs by dividing by a vector of provided factors (default false),
  • standardize_outputs_factors: If standardizing, the provided dim_output-length vector of factors,
  • decorrelate: Apply (truncated) SVD to the outputs. Predictions are returned in the decorrelated space (default true),
  • retained_svd_frac: The cumulative sum of singular values retained after output SVD truncation (default 1.0 - no truncation)
source
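
As a usage sketch (the data arrays inputs and outputs, the noise covariance Γy and the machine learning tool gauss_proc are assumptions for illustration), an Emulator is built from input-output pairs and then trained:

using CalibrateEmulateSample.Emulators
using EnsembleKalmanProcesses.DataContainers

# `inputs` is input_dim × N_samples, `outputs` is output_dim × N_samples (assumed data)
input_output_pairs = PairedDataContainer(inputs, outputs, data_are_columns = true)

emulator = Emulator(
    gauss_proc,               # any MachineLearningTool, e.g. a GaussianProcess
    input_output_pairs;
    obs_noise_cov = Γy,       # observational noise covariance, used for output decorrelation
    normalize_inputs = true,
    retained_svd_frac = 1.0,  # keep the full SVD spectrum
)
optimize_hyperparameters!(emulator)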
GaussianProcesses.predictFunction
predict(
     gp::CalibrateEmulateSample.Emulators.GaussianProcess{CalibrateEmulateSample.Emulators.GPJL},
     new_inputs::AbstractArray{FT<:AbstractFloat, 2}
 ) -> Tuple{Any, Any}
-

Predict means and covariances in decorrelated output space using Gaussian process models.

source
predict(
+

Predict means and covariances in decorrelated output space using Gaussian process models.

source
predict(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface,
     new_inputs::AbstractMatrix;
     multithread
 ) -> Tuple{Any, Any}
-

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source
predict(
+

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source
predict(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface,
     new_inputs::AbstractMatrix
 ) -> Tuple{Any, Any}
-

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source
predict(
+

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source
predict(
     emulator::CalibrateEmulateSample.Emulators.Emulator{FT<:AbstractFloat},
     new_inputs::AbstractArray{FT<:AbstractFloat, 2};
     transform_to_real,
     vector_rf_unstandardize,
     mlt_kwargs...
 ) -> Tuple{Any, Any}
-

Makes a prediction using the emulator on new inputs (each new inputs given as data columns). Default is to predict in the decorrelated space.

source
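
A minimal prediction sketch (the emulator and input_dim are assumed from the construction above); by default predictions are returned in the decorrelated space, and transform_to_real maps them back:

new_inputs = rand(input_dim, 10)                           # hypothetical query points, one per column
μ_dec, σ2_dec = Emulators.predict(emulator, new_inputs)    # decorrelated output space (default)
μ, Σ = Emulators.predict(emulator, new_inputs; transform_to_real = true)  # original output coordinates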
CalibrateEmulateSample.Emulators.standardizeFunction
standardize(
     outputs::AbstractVecOrMat,
     output_covs::Union{AbstractVector{<:AbstractMatrix}, AbstractVector{<:LinearAlgebra.UniformScaling}},
     factors::AbstractVector
 ) -> Tuple{Any, Union{AbstractVector{<:AbstractMatrix}, AbstractVector{<:LinearAlgebra.UniformScaling}}}
-

Standardize with a vector of factors (size equal to output dimension).

source
CalibrateEmulateSample.Emulators.reverse_standardizeFunction
reverse_standardize(
     emulator::CalibrateEmulateSample.Emulators.Emulator{FT<:AbstractFloat},
     outputs::AbstractVecOrMat,
     output_covs::AbstractVecOrMat
 ) -> Tuple{Any, Any}
-

Reverse a previous standardization with the stored vector of factors (size equal to output dimension). output_cov is a Vector of covariance matrices, such as is returned by svd_reverse_transform_mean_cov.

source
CalibrateEmulateSample.Emulators.svd_transformFunction
svd_transform(
     data::AbstractArray{FT<:AbstractFloat, 2},
     obs_noise_cov::Union{Nothing, AbstractArray{FT<:AbstractFloat, 2}};
     retained_svd_frac
 ) -> Tuple{Any, Any}
-

Apply a singular value decomposition (SVD) to the data

  • data - GP training data/targets; size output_dim × N_samples
  • obs_noise_cov - covariance of observational noise
  • retained_svd_frac - Project onto this fraction of the largest principal components. Defaults to 1.0 (no truncation).

Returns the transformed data and the decomposition, which is a matrix factorization of type LinearAlgebra.SVD.

Note: If F::SVD is the factorization object, U, S, V and Vt can be obtained via F.U, F.S, F.V and F.Vt, such that A = U * Diagonal(S) * Vt. The singular values in S are sorted in descending order.

source
CalibrateEmulateSample.Emulators.svd_reverse_transform_mean_covFunction
svd_reverse_transform_mean_cov(
+

Apply a singular value decomposition (SVD) to the data

  • data - GP training data/targets; size output_dim × N_samples
  • obs_noise_cov - covariance of observational noise
  • retained_svd_frac - Project onto this fraction of the largest principal components. Defaults to 1.0 (no truncation).

Returns the transformed data and the decomposition, which is a matrix factorization of type LinearAlgebra.SVD.

Note: If F::SVD is the factorization object, U, S, V and Vt can be obtained via F.U, F.S, F.V and F.Vt, such that A = U * Diagonal(S) * Vt. The singular values in S are sorted in descending order.

source
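
As a sketch (the training targets outputs and the noise covariance Γy are assumed to exist), the decomposition returned alongside the transformed data reconstructs obs_noise_cov when no truncation is applied:

using LinearAlgebra

transformed_data, decomposition = Emulators.svd_transform(outputs, Γy; retained_svd_frac = 1.0)
decomposition.U * Diagonal(decomposition.S) * decomposition.Vt ≈ Γy   # holds with retained_svd_frac = 1.0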
CalibrateEmulateSample.Emulators.svd_reverse_transform_mean_covFunction
svd_reverse_transform_mean_cov(
     μ::AbstractArray{FT<:AbstractFloat, 2},
     σ2::AbstractVector,
     decomposition::LinearAlgebra.SVD
 ) -> Tuple{Any, Any}
-

Transform the mean and covariance back to the original (correlated) coordinate system

  • μ - predicted mean; size output_dim × N_predicted_points.
  • σ2 - predicted variance (size output_dim × N_predicted_points), or predicted covariance (a length-N_predicted_points vector of output_dim × output_dim matrices).
  • decomposition - SVD decomposition of obs_noise_cov.

Returns the transformed mean (size output_dim × N_predicted_points) and variance. Note that transforming the variance back to the original coordinate system results in non-zero off-diagonal elements, so instead of just returning the elements on the main diagonal (i.e., the variances), we return the full covariance at each point, as a vector of length N_predicted_points, where each element is a matrix of size output_dim × output_dim.

source
+

Transform the mean and covariance back to the original (correlated) coordinate system

  • μ - predicted mean; size output_dim × N_predicted_points.
  • σ2 - predicted variance (size output_dim × N_predicted_points), or predicted covariance (a length-N_predicted_points vector of output_dim × output_dim matrices).
  • decomposition - SVD decomposition of obs_noise_cov.

Returns the transformed mean (size output_dim × N_predicted_points) and variance. Note that transforming the variance back to the original coordinate system results in non-zero off-diagonal elements, so instead of just returning the elements on the main diagonal (i.e., the variances), we return the full covariance at each point, as a vector of length N_predicted_points, where each element is a matrix of size output_dim × output_dim.

source
diff --git a/dev/API/GaussianProcess/index.html b/dev/API/GaussianProcess/index.html index e768a502..3c6e56a4 100644 --- a/dev/API/GaussianProcess/index.html +++ b/dev/API/GaussianProcess/index.html @@ -1,22 +1,22 @@ -Gaussian Process · CalibrateEmulateSample.jl

GaussianProcess

CalibrateEmulateSample.Emulators.GaussianProcessType
struct GaussianProcess{GPPackage, FT} <: CalibrateEmulateSample.Emulators.MachineLearningTool

Structure holding training input and the fitted Gaussian process regression models.

Fields

  • models::Vector{Union{Nothing, PyCall.PyObject, GaussianProcesses.GPE}}: The Gaussian Process (GP) Regression model(s) that are fitted to the given input-data pairs.

  • kernel::Union{Nothing, var"#s17", var"#s18"} where {var"#s17"<:GaussianProcesses.Kernel, var"#s18"<:PyCall.PyObject}: Kernel object.

  • noise_learn::Bool: Learn the noise with the White Noise kernel explicitly?

  • alg_reg_noise::Any: Additional observational or regularization noise used in GP algorithms.

  • prediction_type::CalibrateEmulateSample.Emulators.PredictionType: Prediction type (y to predict the data, f to predict the latent function).

source
CalibrateEmulateSample.Emulators.GaussianProcessMethod
GaussianProcess(
+Gaussian Process · CalibrateEmulateSample.jl

GaussianProcess

CalibrateEmulateSample.Emulators.GaussianProcessType
struct GaussianProcess{GPPackage, FT} <: CalibrateEmulateSample.Emulators.MachineLearningTool

Structure holding training input and the fitted Gaussian process regression models.

Fields

  • models::Vector{Union{Nothing, PyCall.PyObject, GaussianProcesses.GPE}}: The Gaussian Process (GP) Regression model(s) that are fitted to the given input-data pairs.

  • kernel::Union{Nothing, var"#s17", var"#s18"} where {var"#s17"<:GaussianProcesses.Kernel, var"#s18"<:PyCall.PyObject}: Kernel object.

  • noise_learn::Bool: Learn the noise with the White Noise kernel explicitly?

  • alg_reg_noise::Any: Additional observational or regularization noise used in GP algorithms.

  • prediction_type::CalibrateEmulateSample.Emulators.PredictionType: Prediction type (y to predict the data, f to predict the latent function).

source
CalibrateEmulateSample.Emulators.GaussianProcessMethod
GaussianProcess(
     package::AbstractFloat;
     kernel,
     noise_learn,
     alg_reg_noise,
     prediction_type
 )
-
  • package - GaussianProcessPackage object.
  • kernel - GaussianProcesses kernel object. Default is a Squared Exponential kernel.
  • noise_learn - Boolean to additionally learn white noise in decorrelated space. Default is true.
  • alg_reg_noise - Float to fix the (small) regularization parameter of algorithms when noise_learn = true
  • prediction_type - PredictionType object. The default predicts the data (YType()), not the latent function (FType()).
source
CalibrateEmulateSample.Emulators.build_models!Method
build_models!(
+
  • package - GaussianProcessPackage object.
  • kernel - GaussianProcesses kernel object. Default is a Squared Exponential kernel.
  • noise_learn - Boolean to additionally learn white noise in decorrelated space. Default is true.
  • alg_reg_noise - Float to fix the (small) regularization parameter of algorithms when noise_learn = true
  • prediction_type - PredictionType object. The default predicts the data (YType()), not the latent function (FType()).
source
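
A minimal construction sketch, assuming the GaussianProcesses.jl backend (GPJL()) and the default squared-exponential kernel:

using CalibrateEmulateSample.Emulators

gppackage = GPJL()
gauss_proc = GaussianProcess(
    gppackage;
    noise_learn = true,          # learn an additional white-noise kernel in the decorrelated space
    prediction_type = YType(),   # predict the data rather than the latent function
)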
CalibrateEmulateSample.Emulators.build_models!Method
build_models!(
     gp::CalibrateEmulateSample.Emulators.GaussianProcess{CalibrateEmulateSample.Emulators.GPJL},
     input_output_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT<:AbstractFloat}
 )
-

Method to build Gaussian process models based on the package.

source
CalibrateEmulateSample.Emulators.optimize_hyperparameters!Method
optimize_hyperparameters!(
     gp::CalibrateEmulateSample.Emulators.GaussianProcess{CalibrateEmulateSample.Emulators.GPJL},
     args...;
     kwargs...
 )
-

Optimize Gaussian process hyperparameters using the package's built-in method.

Warning: if one uses GPJL() and wishes to modify positional arguments, the first positional argument must be the Optim method (default LBFGS()).

source
GaussianProcesses.predictMethod
predict(
+

Optimize Gaussian process hyperparameters using the package's built-in method.

Warning: if one uses GPJL() and wishes to modify positional arguments, the first positional argument must be the Optim method (default LBFGS()).

source
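
For example (a sketch only; forwarding of positional arguments from the Emulator-level call down to the GP method is an assumption here):

using Optim

optimize_hyperparameters!(emulator)                 # default Optim method, LBFGS()
optimize_hyperparameters!(emulator, Optim.BFGS())   # assumed pass-through of an alternative Optim method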
GaussianProcesses.predictMethod
predict(
     gp::CalibrateEmulateSample.Emulators.GaussianProcess{CalibrateEmulateSample.Emulators.GPJL},
     new_inputs::AbstractArray{FT<:AbstractFloat, 2}
 ) -> Tuple{Any, Any}
-

Predict means and covariances in decorrelated output space using Gaussian process models.

source
+

Predict means and covariances in decorrelated output space using Gaussian process models.

source
diff --git a/dev/API/MarkovChainMonteCarlo/index.html b/dev/API/MarkovChainMonteCarlo/index.html index 35a6761f..6fa05ad5 100644 --- a/dev/API/MarkovChainMonteCarlo/index.html +++ b/dev/API/MarkovChainMonteCarlo/index.html @@ -1,5 +1,5 @@ -MarkovChainMonteCarlo · CalibrateEmulateSample.jl

MarkovChainMonteCarlo

Top-level class and methods

CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCWrapperType
struct MCMCWrapper

Top-level class holding all configuration information needed for MCMC sampling: the prior, emulated likelihood and sampling algorithm (DensityModel and Sampler, respectively, in AbstractMCMC's terminology).

Fields

  • prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution: ParameterDistribution object describing the prior distribution on parameter values.

  • log_posterior_map::AbstractMCMC.AbstractModel: AdvancedMH.DensityModel object, used to evaluate the posterior density being sampled from.

  • mh_proposal_sampler::AbstractMCMC.AbstractSampler: Object describing a MCMC sampling algorithm and its settings.

  • sample_kwargs::NamedTuple: NamedTuple of other arguments to be passed to AbstractMCMC.sample().

source
CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCWrapperMethod
MCMCWrapper(
+MarkovChainMonteCarlo · CalibrateEmulateSample.jl

MarkovChainMonteCarlo

Top-level class and methods

CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCWrapperType
struct MCMCWrapper

Top-level class holding all configuration information needed for MCMC sampling: the prior, emulated likelihood and sampling algorithm (DensityModel and Sampler, respectively, in AbstractMCMC's terminology).

Fields

  • prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution: ParameterDistribution object describing the prior distribution on parameter values.

  • log_posterior_map::AbstractMCMC.AbstractModel: AdvancedMH.DensityModel object, used to evaluate the posterior density being sampled from.

  • mh_proposal_sampler::AbstractMCMC.AbstractSampler: Object describing a MCMC sampling algorithm and its settings.

  • sample_kwargs::NamedTuple: NamedTuple of other arguments to be passed to AbstractMCMC.sample().

source
CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCWrapperMethod
MCMCWrapper(
     mcmc_alg::CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCProtocol,
     obs_sample::AbstractArray{FT<:AbstractFloat, 1},
     prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution,
@@ -8,11 +8,11 @@
     burnin,
     kwargs...
 )
-

Constructor for MCMCWrapper which performs the same standardization (SVD decorrelation) that was applied in the Emulator. It creates and wraps an instance of EmulatorPosteriorModel, for sampling from the Emulator, and MetropolisHastingsSampler, for generating the MC proposals.

  • mcmc_alg: MCMCProtocol describing the MCMC sampling algorithm to use. Currently implemented algorithms are:

    • RWMHSampling: Metropolis-Hastings sampling from a vanilla random walk with fixed stepsize.
    • pCNMHSampling: Metropolis-Hastings sampling using the preconditioned Crank-Nicholson algorithm, which has a well-behaved small-stepsize limit.
  • obs_sample: A single sample from the observations. Can, e.g., be picked from an Observation struct using get_obs_sample.

  • prior: ParameterDistribution object containing the parameters' prior distributions.

  • em: Emulator to sample from.

  • stepsize: MCMC step size, applied as a scaling to the prior covariance.

  • init_params: Starting parameter values for MCMC sampling.

  • burnin: Initial number of MCMC steps to discard from output (pre-convergence).

source
StatsBase.sampleFunction

sample([rng,] mcmc::MCMCWrapper, args...; kwargs...)

Extends the sample methods of AbstractMCMC (which extends StatsBase) to sample from the emulated posterior, using the MCMC sampling algorithm and Emulator configured in MCMCWrapper. Returns a MCMCChains.Chains object containing the samples.

Supported methods are:

  • sample([rng, ]mcmc, N; kwargs...)

    Return a MCMCChains.Chains object containing N samples from the emulated posterior.

  • sample([rng, ]mcmc, isdone; kwargs...)

    Sample from the model with the Markov chain Monte Carlo sampler until a convergence criterion isdone returns true, and return the samples. The function isdone has the signature

        isdone(rng, model, sampler, samples, state, iteration; kwargs...)

    where state and iteration are the current state and iteration of the sampler, respectively. It should return true when sampling should end, and false otherwise.

  • sample([rng, ]mcmc, parallel_type, N, nchains; kwargs...)

    Sample nchains Markov chains in parallel according to parallel_type, which may be MCMCThreads() or MCMCDistributed() for threaded and distributed sampling, respectively.

source
CalibrateEmulateSample.MarkovChainMonteCarlo.get_posteriorFunction
get_posterior(
+

Constructor for MCMCWrapper which performs the same standardization (SVD decorrelation) that was applied in the Emulator. It creates and wraps an instance of EmulatorPosteriorModel, for sampling from the Emulator, and MetropolisHastingsSampler, for generating the MC proposals.

  • mcmc_alg: MCMCProtocol describing the MCMC sampling algorithm to use. Currently implemented algorithms are:

    • RWMHSampling: Metropolis-Hastings sampling from a vanilla random walk with fixed stepsize.
    • pCNMHSampling: Metropolis-Hastings sampling using the preconditioned Crank-Nicholson algorithm, which has a well-behaved small-stepsize limit.
  • obs_sample: A single sample from the observations. Can, e.g., be picked from an Observation struct using get_obs_sample.

  • prior: ParameterDistribution object containing the parameters' prior distributions.

  • em: Emulator to sample from.

  • stepsize: MCMC step size, applied as a scaling to the prior covariance.

  • init_params: Starting parameter values for MCMC sampling.

  • burnin: Initial number of MCMC steps to discard from output (pre-convergence).

source
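
A minimal construction sketch; obs_sample, prior, emulator and a starting parameter vector u0 are assumed to come from the earlier calibrate and emulate stages:

using CalibrateEmulateSample.MarkovChainMonteCarlo

mcmc = MCMCWrapper(
    RWMHSampling(),    # or pCNMHSampling()
    obs_sample,
    prior,
    emulator;
    init_params = u0,
    burnin = 2_000,
)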
StatsBase.sampleFunction

sample([rng,] mcmc::MCMCWrapper, args...; kwargs...)

Extends the sample methods of AbstractMCMC (which extends StatsBase) to sample from the emulated posterior, using the MCMC sampling algorithm and Emulator configured in MCMCWrapper. Returns a MCMCChains.Chains object containing the samples.

Supported methods are:

  • sample([rng, ]mcmc, N; kwargs...)

    Return a MCMCChains.Chains object containing N samples from the emulated posterior.

  • sample([rng, ]mcmc, isdone; kwargs...)

    Sample from the model with the Markov chain Monte Carlo sampler until a convergence criterion isdone returns true, and return the samples. The function isdone has the signature

        isdone(rng, model, sampler, samples, state, iteration; kwargs...)

    where state and iteration are the current state and iteration of the sampler, respectively. It should return true when sampling should end, and false otherwise.

  • sample([rng, ]mcmc, parallel_type, N, nchains; kwargs...)

    Sample nchains Markov chains in parallel according to parallel_type, which may be MCMCThreads() or MCMCDistributed() for threaded and distributed sampling, respectively.

source
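
For instance (a sketch; the stepsize and discard_initial keywords are assumed to be forwarded to the sampler), a chain can be drawn and turned into a posterior distribution:

chain = sample(mcmc, 100_000; stepsize = new_step, discard_initial = 2_000)  # `new_step` from optimize_stepsize below
posterior = get_posterior(mcmc, chain)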
CalibrateEmulateSample.MarkovChainMonteCarlo.get_posteriorFunction
get_posterior(
     mcmc::CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCWrapper,
     chain::MCMCChains.Chains
 ) -> EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution
-

Returns a ParameterDistribution object corresponding to the empirical distribution of the samples in chain.

Note

This method does not currently support combining samples from multiple Chains.

source
CalibrateEmulateSample.MarkovChainMonteCarlo.optimize_stepsizeFunction
optimize_stepsize(
     rng::Random.AbstractRNG,
     mcmc::CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCWrapper;
     init_stepsize,
@@ -20,18 +20,18 @@
     max_iter,
     sample_kwargs...
 ) -> Float64
-

Uses a heuristic to return a stepsize for the mh_proposal_sampler element of MCMCWrapper which yields fast convergence of the Markov chain.

The criterion used is that Metropolis-Hastings proposals should be accepted between 15% and 35% of the time.

source
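
A usage sketch; the N keyword (number of samples drawn per trial step size) is an assumption standing in for the elided keyword in the signature above:

new_step = optimize_stepsize(mcmc; init_stepsize = 0.1, N = 2_000)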

See AbstractMCMC sampling API for background on our use of Turing.jl's AbstractMCMC API for MCMC sampling.

Sampler algorithms

CalibrateEmulateSample.MarkovChainMonteCarlo.pCNMHSamplingType
struct pCNMHSampling <: CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCProtocol

MCMCProtocol which uses Metropolis-Hastings sampling that generates proposals for new parameters according to the preconditioned Crank-Nicholson (pCN) algorithm, which is usable for MCMC in the stepsize → 0 limit, unlike the vanilla random walk. Steps are based on the covariance of the prior.

source

See AbstractMCMC sampling API for background on our use of Turing.jl's AbstractMCMC API for MCMC sampling.

Sampler algorithms

CalibrateEmulateSample.MarkovChainMonteCarlo.pCNMHSamplingType
struct pCNMHSampling <: CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCProtocol

MCMCProtocol which uses Metropolis-Hastings sampling that generates proposals for new parameters according to the preconditioned Crank-Nicholson (pCN) algorithm, which is usable for MCMC in the stepsize → 0 limit, unlike the vanilla random walk. Steps are based on the covariance of the prior.

source
CalibrateEmulateSample.MarkovChainMonteCarlo.MetropolisHastingsSamplerFunction
MetropolisHastingsSampler(
     _::CalibrateEmulateSample.MarkovChainMonteCarlo.RWMHSampling,
     prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution
 ) -> CalibrateEmulateSample.MarkovChainMonteCarlo.RWMetropolisHastings{D} where D<:(AdvancedMH.RandomWalkProposal{false})
-

Constructor for all Sampler objects, with one method for each supported MCMC algorithm.

Warning

Both currently implemented Samplers use a Gaussian approximation to the prior: specifically, new Metropolis-Hastings proposals are a scaled combination of the old parameter values and a random shift distributed as a Gaussian with the same covariance as the prior.

This suffices for our current use case, in which our priors are Gaussian (possibly in a transformed space) and we assume enough fidelity in the Emulator that inference isn't prior-dominated.

source

Emulated posterior (Model)

CalibrateEmulateSample.MarkovChainMonteCarlo.EmulatorPosteriorModelFunction
EmulatorPosteriorModel(
+

Constructor for all Sampler objects, with one method for each supported MCMC algorithm.

Warning

Both currently implemented Samplers use a Gaussian approximation to the prior: specifically, new Metropolis-Hastings proposals are a scaled combination of the old parameter values and a random shift distributed as a Gaussian with the same covariance as the prior.

This suffices for our current use case, in which our priors are Gaussian (possibly in a transformed space) and we assume enough fidelity in the Emulator that inference isn't prior-dominated.

source

Emulated posterior (Model)

CalibrateEmulateSample.MarkovChainMonteCarlo.EmulatorPosteriorModelFunction
EmulatorPosteriorModel(
     prior::EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution,
     em::CalibrateEmulateSample.Emulators.Emulator{FT<:AbstractFloat},
     obs_sample::AbstractArray{FT<:AbstractFloat, 1}
 ) -> AdvancedMH.DensityModel{F} where F<:(CalibrateEmulateSample.MarkovChainMonteCarlo.var"#1#2"{EnsembleKalmanProcesses.ParameterDistributions.ParameterDistribution{PDType, CType, ST}, CalibrateEmulateSample.Emulators.Emulator{FT}, <:AbstractVector{FT1}} where {PDType<:EnsembleKalmanProcesses.ParameterDistributions.ParameterDistributionType, CType<:EnsembleKalmanProcesses.ParameterDistributions.ConstraintType, ST<:AbstractString, FT<:AbstractFloat, FT1<:AbstractFloat})
-

Factory which constructs AdvancedMH.DensityModel objects given a prior on the model parameters (prior) and an Emulator of the log-likelihood of the data given parameters. Together these yield the log posterior density we're attempting to sample from with the MCMC, which is the role of the DensityModel class in the AbstractMCMC interface.

source

Internals - MCMC State

CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCStateType
struct MCMCState{T, L<:Real} <: AdvancedMH.AbstractTransition

Extends the AdvancedMH.Transition (which encodes the current state of the MC during sampling) with a boolean flag to record whether this state is new (arising from accepting a Metropolis-Hastings proposal) or old (from rejecting a proposal).

Fields

  • params::Any: Sampled value of the parameters at the current state of the MCMC chain.

  • log_density::Real: Log probability of params, as computed by the model using the prior.

  • accepted::Bool: Whether this state resulted from accepting a new MH proposal.

source

Internals - Other

CalibrateEmulateSample.MarkovChainMonteCarlo.to_decorrelatedFunction
to_decorrelated(
+

Factory which constructs AdvancedMH.DensityModel objects given a prior on the model parameters (prior) and an Emulator of the log-likelihood of the data given parameters. Together these yield the log posterior density we're attempting to sample from with the MCMC, which is the role of the DensityModel class in the AbstractMCMC interface.

source

Internals - MCMC State

CalibrateEmulateSample.MarkovChainMonteCarlo.MCMCStateType
struct MCMCState{T, L<:Real} <: AdvancedMH.AbstractTransition

Extends the AdvancedMH.Transition (which encodes the current state of the MC during sampling) with a boolean flag to record whether this state is new (arising from accepting a Metropolis-Hastings proposal) or old (from rejecting a proposal).

Fields

  • params::Any: Sampled value of the parameters at the current state of the MCMC chain.

  • log_density::Real: Log probability of params, as computed by the model using the prior.

  • accepted::Bool: Whether this state resulted from accepting a new MH proposal.

source

Internals - Other

+

Transform samples from the original (correlated) coordinate system to the SVD-decorrelated coordinate system used by Emulator. Used in the constructor for MCMCWrapper.

source
diff --git a/dev/API/RandomFeatures/index.html b/dev/API/RandomFeatures/index.html index 8fec24d9..9a098aac 100644 --- a/dev/API/RandomFeatures/index.html +++ b/dev/API/RandomFeatures/index.html @@ -1,18 +1,18 @@ -Random Features · CalibrateEmulateSample.jl

RandomFeatures

Kernel and Covariance structure

CalibrateEmulateSample.Emulators.LowRankFactorType
struct LowRankFactor{FT<:AbstractFloat} <: CalibrateEmulateSample.Emulators.CovarianceStructureType

Builds a covariance structure that deviates from the identity with a low-rank perturbation. This perturbation is diagonalized in the low-rank space.

source
CalibrateEmulateSample.Emulators.SeparableKernelType
struct SeparableKernel{CST1<:CalibrateEmulateSample.Emulators.CovarianceStructureType, CST2<:CalibrateEmulateSample.Emulators.CovarianceStructureType} <: CalibrateEmulateSample.Emulators.KernelStructureType

Builds a separable kernel, i.e. one that accounts for input and output covariance structure separately.

source
CalibrateEmulateSample.Emulators.NonseparableKernelType
struct NonseparableKernel{CST<:CalibrateEmulateSample.Emulators.CovarianceStructureType} <: CalibrateEmulateSample.Emulators.KernelStructureType

Builds a nonseparable kernel, i.e. one that accounts for a joint input and output covariance structure.

source
CalibrateEmulateSample.Emulators.calculate_n_hyperparametersFunction
calculate_n_hyperparameters(
+Random Features · CalibrateEmulateSample.jl

RandomFeatures

Kernel and Covariance structure

CalibrateEmulateSample.Emulators.LowRankFactorType
struct LowRankFactor{FT<:AbstractFloat} <: CalibrateEmulateSample.Emulators.CovarianceStructureType

Builds a covariance structure that deviates from the identity with a low-rank perturbation. This perturbation is diagonalized in the low-rank space.

source
CalibrateEmulateSample.Emulators.SeparableKernelType
struct SeparableKernel{CST1<:CalibrateEmulateSample.Emulators.CovarianceStructureType, CST2<:CalibrateEmulateSample.Emulators.CovarianceStructureType} <: CalibrateEmulateSample.Emulators.KernelStructureType

Builds a separable kernel, i.e. one that accounts for input and output covariance structure separately.

source
CalibrateEmulateSample.Emulators.NonseparableKernelType
struct NonseparableKernel{CST<:CalibrateEmulateSample.Emulators.CovarianceStructureType} <: CalibrateEmulateSample.Emulators.KernelStructureType

Builds a nonseparable kernel, i.e. one that accounts for a joint input and output covariance structure.

source

Scalar interface

CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterfaceType
struct ScalarRandomFeatureInterface{S<:AbstractString, RNG<:Random.AbstractRNG, KST<:CalibrateEmulateSample.Emulators.KernelStructureType} <: CalibrateEmulateSample.Emulators.RandomFeatureInterface

Structure holding the Scalar Random Feature models.

Fields

  • rfms::Vector{RandomFeatures.Methods.RandomFeatureMethod}: vector of RandomFeatureMethods, contains the feature structure, batch-sizes and regularization

  • fitted_features::Vector{RandomFeatures.Methods.Fit}: vector of Fits, containing the matrix decomposition and coefficients of RF when fitted to data

  • batch_sizes::Union{Nothing, Dict{S, Int64}} where S<:AbstractString: batch sizes

  • n_features::Union{Nothing, Int64}: n_features

  • input_dim::Int64: input dimension

  • rng::Random.AbstractRNG: choice of random number generator

  • kernel_structure::CalibrateEmulateSample.Emulators.KernelStructureType: Kernel structure type (e.g. Separable or Nonseparable)

  • feature_decomposition::AbstractString: Random Feature decomposition, choose from "svd" or "cholesky" (default)

  • optimizer_options::Dict{S} where S<:AbstractString: dictionary of options for hyperparameter optimizer

source

Scalar interface

CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterfaceType
struct ScalarRandomFeatureInterface{S<:AbstractString, RNG<:Random.AbstractRNG, KST<:CalibrateEmulateSample.Emulators.KernelStructureType} <: CalibrateEmulateSample.Emulators.RandomFeatureInterface

Structure holding the Scalar Random Feature models.

Fields

  • rfms::Vector{RandomFeatures.Methods.RandomFeatureMethod}: vector of RandomFeatureMethods, contains the feature structure, batch-sizes and regularization

  • fitted_features::Vector{RandomFeatures.Methods.Fit}: vector of Fits, containing the matrix decomposition and coefficients of RF when fitted to data

  • batch_sizes::Union{Nothing, Dict{S, Int64}} where S<:AbstractString: batch sizes

  • n_features::Union{Nothing, Int64}: n_features

  • input_dim::Int64: input dimension

  • rng::Random.AbstractRNG: choice of random number generator

  • kernel_structure::CalibrateEmulateSample.Emulators.KernelStructureType: Kernel structure type (e.g. Separable or Nonseparable)

  • feature_decomposition::AbstractString: Random Feature decomposition, choose from "svd" or "cholesky" (default)

  • optimizer_options::Dict{S} where S<:AbstractString: dictionary of options for hyperparameter optimizer

source
CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterfaceMethod
ScalarRandomFeatureInterface(
     n_features::Int64,
     input_dim::Int64;
     kernel_structure,
@@ -21,16 +21,16 @@
     feature_decomposition,
     optimizer_options
 ) -> CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface{String, Random._GLOBAL_RNG, CalibrateEmulateSample.Emulators.SeparableKernel{CST1, CalibrateEmulateSample.Emulators.OneDimFactor}} where CST1<:CalibrateEmulateSample.Emulators.CovarianceStructureType
-

Constructs a ScalarRandomFeatureInterface <: MachineLearningTool interface for the RandomFeatures.jl package for multi-input and single- (or decorrelated-)output emulators.

  • n_features - the number of random features
  • input_dim - the dimension of the input space
  • kernel_structure - a prescribed form of kernel structure
  • batch_sizes = nothing - Dictionary of batch sizes passed to the RandomFeatures.jl object (see definition there)
  • rng = Random.GLOBAL_RNG - random number generator
  • feature_decomposition = "cholesky" - choice of how to store decompositions of random features, cholesky or svd available
  • optimizer_options = nothing - Dict of options to pass into EKI optimization of hyperparameters (defaults created in ScalarRandomFeatureInterface constructor):
    • "prior": the prior for the hyperparameter optimization
    • "priorinscale": use this to tune the input prior scale
    • "n_ensemble": number of ensemble members
    • "n_iteration": number of eki iterations
    • "covsamplemultiplier": increase for more samples to estimate covariance matrix in optimization (default 10.0, minimum 0.0)
    • "scheduler": Learning rate Scheduler (a.k.a. EKP timestepper) Default: DataMisfitController
    • "tikhonov": tikhonov regularization parameter if >0
    • "inflation": additive inflation ∈ [0,1] with 0 being no inflation
    • "train_fraction": e.g. 0.8 (default) means 80:20 train - test split
    • "nfeaturesopt": fix the number of features for optimization (default n_features, as used for prediction)
    • "multithread": how to multithread. "ensemble" (default) threads across ensemble members "tullio" threads random feature matrix algebra
    • "accelerator": use EKP accelerators (default is no acceleration)
    • "verbose" => false, verbose optimizer statements
source
CalibrateEmulateSample.Emulators.build_models!Method
build_models!(
+

Constructs a ScalarRandomFeatureInterface <: MachineLearningTool interface for the RandomFeatures.jl package for multi-input and single- (or decorrelated-)output emulators.

  • n_features - the number of random features
  • input_dim - the dimension of the input space
  • kernel_structure - a prescribed form of kernel structure
  • batch_sizes = nothing - Dictionary of batch sizes passed to the RandomFeatures.jl object (see definition there)
  • rng = Random.GLOBAL_RNG - random number generator
  • feature_decomposition = "cholesky" - choice of how to store decompositions of random features, cholesky or svd available
  • optimizer_options = nothing - Dict of options to pass into EKI optimization of hyperparameters (defaults created in ScalarRandomFeatureInterface constructor):
    • "prior": the prior for the hyperparameter optimization
    • "priorinscale": use this to tune the input prior scale
    • "n_ensemble": number of ensemble members
    • "n_iteration": number of eki iterations
    • "covsamplemultiplier": increase for more samples to estimate covariance matrix in optimization (default 10.0, minimum 0.0)
    • "scheduler": Learning rate Scheduler (a.k.a. EKP timestepper) Default: DataMisfitController
    • "tikhonov": tikhonov regularization parameter if >0
    • "inflation": additive inflation ∈ [0,1] with 0 being no inflation
    • "train_fraction": e.g. 0.8 (default) means 80:20 train - test split
    • "nfeaturesopt": fix the number of features for optimization (default n_features, as used for prediction)
    • "multithread": how to multithread. "ensemble" (default) threads across ensemble members "tullio" threads random feature matrix algebra
    • "accelerator": use EKP accelerators (default is no acceleration)
    • "verbose" => false, verbose optimizer statements
source
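
A construction sketch for a 3-dimensional input space; the LowRankFactor rank, the OneDimFactor output factor, the optimizer options and the training data (input_output_pairs, Γy) are illustrative assumptions:

using CalibrateEmulateSample.Emulators

srfi = ScalarRandomFeatureInterface(
    300,   # n_features
    3;     # input_dim
    kernel_structure = SeparableKernel(LowRankFactor(2), OneDimFactor()),
    optimizer_options = Dict("n_ensemble" => 40, "n_iteration" => 10),
)
emulator = Emulator(srfi, input_output_pairs; obs_noise_cov = Γy)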
CalibrateEmulateSample.Emulators.build_models!Method
build_models!(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface,
     input_output_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT<:AbstractFloat}
 )
-

Builds the random feature method from hyperparameters. We use cosine activation functions and a Multivariate Normal distribution (from Distributions.jl) with mean M=0, and input covariance U built with the CovarianceStructureType.

source
GaussianProcesses.predictMethod
predict(
+

Builds the random feature method from hyperparameters. We use cosine activation functions and a Multivariate Normal distribution (from Distributions.jl) with mean M=0, and input covariance U built with the CovarianceStructureType.

source
GaussianProcesses.predictMethod
predict(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface,
     new_inputs::AbstractMatrix;
     multithread
 ) -> Tuple{Any, Any}
-

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source

Vector Interface

CalibrateEmulateSample.Emulators.VectorRandomFeatureInterfaceType
struct VectorRandomFeatureInterface{S<:AbstractString, RNG<:Random.AbstractRNG, KST<:CalibrateEmulateSample.Emulators.KernelStructureType} <: CalibrateEmulateSample.Emulators.RandomFeatureInterface

Structure holding the Vector Random Feature models.

Fields

  • rfms::Vector{RandomFeatures.Methods.RandomFeatureMethod}: A vector of RandomFeatureMethods, contains the feature structure, batch-sizes and regularization

  • fitted_features::Vector{RandomFeatures.Methods.Fit}: vector of Fits, containing the matrix decomposition and coefficients of RF when fitted to data

  • batch_sizes::Union{Nothing, Dict{S, Int64}} where S<:AbstractString: batch sizes

  • n_features::Union{Nothing, Int64}: number of features

  • input_dim::Int64: input dimension

  • output_dim::Int64: output_dimension

  • rng::Random.AbstractRNG: rng

  • regularization::Vector{Union{LinearAlgebra.Diagonal, LinearAlgebra.UniformScaling, Matrix}}: regularization

  • kernel_structure::CalibrateEmulateSample.Emulators.KernelStructureType: Kernel structure type (e.g. Separable or Nonseparable)

  • feature_decomposition::AbstractString: Random Feature decomposition, choose from "svd" or "cholesky" (default)

  • optimizer_options::Dict: dictionary of options for hyperparameter optimizer

source

Vector Interface

CalibrateEmulateSample.Emulators.VectorRandomFeatureInterfaceType
struct VectorRandomFeatureInterface{S<:AbstractString, RNG<:Random.AbstractRNG, KST<:CalibrateEmulateSample.Emulators.KernelStructureType} <: CalibrateEmulateSample.Emulators.RandomFeatureInterface

Structure holding the Vector Random Feature models.

Fields

  • rfms::Vector{RandomFeatures.Methods.RandomFeatureMethod}: A vector of RandomFeatureMethods, contains the feature structure, batch-sizes and regularization

  • fitted_features::Vector{RandomFeatures.Methods.Fit}: vector of Fits, containing the matrix decomposition and coefficients of RF when fitted to data

  • batch_sizes::Union{Nothing, Dict{S, Int64}} where S<:AbstractString: batch sizes

  • n_features::Union{Nothing, Int64}: number of features

  • input_dim::Int64: input dimension

  • output_dim::Int64: output_dimension

  • rng::Random.AbstractRNG: rng

  • regularization::Vector{Union{LinearAlgebra.Diagonal, LinearAlgebra.UniformScaling, Matrix}}: regularization

  • kernel_structure::CalibrateEmulateSample.Emulators.KernelStructureType: Kernel structure type (e.g. Separable or Nonseparable)

  • feature_decomposition::AbstractString: Random Feature decomposition, choose from "svd" or "cholesky" (default)

  • optimizer_options::Dict: dictionary of options for hyperparameter optimizer

source
CalibrateEmulateSample.Emulators.VectorRandomFeatureInterfaceMethod
VectorRandomFeatureInterface(
     n_features::Int64,
     input_dim::Int64,
     output_dim::Int64;
@@ -40,81 +40,81 @@
     feature_decomposition,
     optimizer_options
 ) -> CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface{String, Random._GLOBAL_RNG, CalibrateEmulateSample.Emulators.SeparableKernel{CST1, CST2}} where {CST1<:CalibrateEmulateSample.Emulators.CovarianceStructureType, CST2<:CalibrateEmulateSample.Emulators.CovarianceStructureType}
-

Constructs a VectorRandomFeatureInterface <: MachineLearningTool interface for the RandomFeatures.jl package for multi-input and multi-output emulators.

  • n_features - the number of random features
  • input_dim - the dimension of the input space
  • output_dim - the dimension of the output space
  • kernel_structure - a prescribed form of kernel structure
  • batch_sizes = nothing - Dictionary of batch sizes passed to the RandomFeatures.jl object (see definition there)
  • rng = Random.GLOBAL_RNG - random number generator
  • feature_decomposition = "cholesky" - choice of how to store decompositions of random features, cholesky or svd available
  • optimizer_options = nothing - Dict of options to pass into EKI optimization of hyperparameters (defaults created in VectorRandomFeatureInterface constructor):
    • "prior": the prior for the hyperparameter optimization
    • "priorinscale"/"prioroutscale": use these to tune the input/output prior scale.
    • "n_ensemble": number of ensemble members
    • "n_iteration": number of eki iterations
    • "scheduler": Learning rate Scheduler (a.k.a. EKP timestepper) Default: DataMisfitController
    • "covsamplemultiplier": increase for more samples to estimate covariance matrix in optimization (default 10.0, minimum 0.0)
    • "tikhonov": tikhonov regularization parameter if > 0
    • "inflation": additive inflation ∈ [0,1] with 0 being no inflation
    • "train_fraction": e.g. 0.8 (default) means 80:20 train - test split
    • "nfeaturesopt": fix the number of features for optimization (default n_features, as used for prediction)
    • "multithread": how to multithread. "ensemble" (default) threads across ensemble members "tullio" threads random feature matrix algebra
    • "accelerator": use EKP accelerators (default is no acceleration)
    • "verbose" => false, verbose optimizer statements to check convergence, priors and optimal parameters.
source
CalibrateEmulateSample.Emulators.build_models!Method
build_models!(
+

Constructs a VectorRandomFeatureInterface <: MachineLearningTool interface for the RandomFeatures.jl package for multi-input and multi-output emulators.

  • n_features - the number of random features
  • input_dim - the dimension of the input space
  • output_dim - the dimension of the output space
  • kernel_structure - a prescribed form of kernel structure
  • batch_sizes = nothing - Dictionary of batch sizes passed to the RandomFeatures.jl object (see definition there)
  • rng = Random.GLOBAL_RNG - random number generator
  • feature_decomposition = "cholesky" - choice of how to store decompositions of random features, cholesky or svd available
  • optimizer_options = nothing - Dict of options to pass into EKI optimization of hyperparameters (defaults created in VectorRandomFeatureInterface constructor):
    • "prior": the prior for the hyperparameter optimization
    • "priorinscale"/"prioroutscale": use these to tune the input/output prior scale.
    • "n_ensemble": number of ensemble members
    • "n_iteration": number of eki iterations
    • "scheduler": Learning rate Scheduler (a.k.a. EKP timestepper) Default: DataMisfitController
    • "covsamplemultiplier": increase for more samples to estimate covariance matrix in optimization (default 10.0, minimum 0.0)
    • "tikhonov": tikhonov regularization parameter if > 0
    • "inflation": additive inflation ∈ [0,1] with 0 being no inflation
    • "train_fraction": e.g. 0.8 (default) means 80:20 train - test split
    • "nfeaturesopt": fix the number of features for optimization (default n_features, as used for prediction)
    • "multithread": how to multithread. "ensemble" (default) threads across ensemble members "tullio" threads random feature matrix algebra
    • "accelerator": use EKP accelerators (default is no acceleration)
    • "verbose" => false, verbose optimizer statements to check convergence, priors and optimal parameters.
source
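
A corresponding sketch for the vector interface, mapping a 3-dimensional input to a 5-dimensional output; the kernel arguments and the training data (input_output_pairs, Γy) are again illustrative assumptions:

vrfi = VectorRandomFeatureInterface(
    500,   # n_features
    3,     # input_dim
    5;     # output_dim
    kernel_structure = NonseparableKernel(LowRankFactor(4)),
)
emulator = Emulator(vrfi, input_output_pairs; obs_noise_cov = Γy)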
CalibrateEmulateSample.Emulators.build_models!Method
build_models!(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface,
     input_output_pairs::EnsembleKalmanProcesses.DataContainers.PairedDataContainer{FT<:AbstractFloat};
     regularization_matrix
 ) -> Union{Nothing, Vector{Union{LinearAlgebra.Diagonal, LinearAlgebra.UniformScaling, Matrix}}}
-

Build Vector Random Feature model for the input-output pairs subject to regularization, and optimizes the hyperparameters with EKP.

source
GaussianProcesses.predictMethod
predict(
+

Build Vector Random Feature model for the input-output pairs subject to regularization, and optimizes the hyperparameters with EKP.

source
GaussianProcesses.predictMethod
predict(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface,
     new_inputs::AbstractMatrix
 ) -> Tuple{Any, Any}
-

Prediction of data observation (not latent function) at new inputs (passed in as columns in a matrix). That is, we add the observational noise into predictions.

source

Other utilities

Other utilities

CalibrateEmulateSample.Emulators.get_rfmsFunction
get_rfms(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Vector{RandomFeatures.Methods.RandomFeatureMethod}
-

Gets the rfms field

source
get_rfms(
+

Gets the rfms field

source
get_rfms(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Vector{RandomFeatures.Methods.RandomFeatureMethod}
-

Gets the rfms field

source
CalibrateEmulateSample.Emulators.get_fitted_featuresFunction
get_fitted_features(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Vector{RandomFeatures.Methods.Fit}
-

Gets the fitted_features field

source
get_fitted_features(
+

Gets the fitted_features field

source
get_fitted_features(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Vector{RandomFeatures.Methods.Fit}
-

Gets the fitted_features field

source
CalibrateEmulateSample.Emulators.get_batch_sizesFunction
get_batch_sizes(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Union{Nothing, Dict{S, Int64}} where S<:AbstractString
-

Gets the batch_sizes field

source
get_batch_sizes(
+

Gets the batch_sizes field

source
get_batch_sizes(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Union{Nothing, Dict{S, Int64}} where S<:AbstractString
-

Gets the batch_sizes field

source
CalibrateEmulateSample.Emulators.get_n_featuresFunction
get_n_features(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Union{Nothing, Int64}
-

Gets the n_features field

source
get_n_features(
+

Gets the n_features field

source
get_n_features(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Union{Nothing, Int64}
-

Gets the n_features field

source
CalibrateEmulateSample.Emulators.get_input_dimFunction
get_input_dim(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Int64
-

Gets the input_dim field

source
get_input_dim(
+

Gets the input_dim field

source
get_input_dim(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Int64
-

Gets the input_dim field

source
CalibrateEmulateSample.Emulators.get_rngFunction
get_rng(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Random.AbstractRNG
-

Gets the rng field

source
get_rng(
+

Gets the rng field

source
get_rng(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Random.AbstractRNG
-

Gets the rng field

source
CalibrateEmulateSample.Emulators.get_kernel_structureFunction
get_kernel_structure(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> CalibrateEmulateSample.Emulators.KernelStructureType
-

Gets the kernel_structure field

source
get_kernel_structure(
+

Gets the kernel_structure field

source
get_kernel_structure(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> CalibrateEmulateSample.Emulators.KernelStructureType
-

Gets the kernel_structure field

source
CalibrateEmulateSample.Emulators.get_feature_decompositionFunction
get_feature_decomposition(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> AbstractString
-

Gets the feature_decomposition field

source
get_feature_decomposition(
+

Gets the feature_decomposition field

source
get_feature_decomposition(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> AbstractString
-

Gets the feature_decomposition field

source
CalibrateEmulateSample.Emulators.get_optimizer_optionsFunction
get_optimizer_options(
     srfi::CalibrateEmulateSample.Emulators.ScalarRandomFeatureInterface
 ) -> Dict{S} where S<:AbstractString
-

Gets the optimizer_options field

source
get_optimizer_options(
+

Gets the optimizer_options field

source
get_optimizer_options(
     vrfi::CalibrateEmulateSample.Emulators.VectorRandomFeatureInterface
 ) -> Dict
-

Gets the optimizer_options field

source
CalibrateEmulateSample.Emulators.shrinkage_covFunction
shrinkage_cov(sample_mat::AbstractMatrix) -> Any
-

Calculate the empirical covariance, additionally applying a shrinkage operator (here the Ledoit and Wolf (2004) shrinkage operation). This is known to have better stability properties than the plain Monte Carlo estimate for low sample sizes.

source
+

Empty method, as optimization takes place within the build_models stage

source
CalibrateEmulateSample.Emulators.shrinkage_covFunction
shrinkage_cov(sample_mat::AbstractMatrix) -> Any
+

Calculate the empirical covariance, additionally applying a shrinkage operator (here the Ledoit and Wolf (2004) shrinkage operation). This is known to have better stability properties than the plain Monte Carlo estimate for low sample sizes.

source
diff --git a/dev/API/Utilities/index.html b/dev/API/Utilities/index.html index 0bf5200a..d6f285b9 100644 --- a/dev/API/Utilities/index.html +++ b/dev/API/Utilities/index.html @@ -4,8 +4,8 @@ obs::EnsembleKalmanProcesses.Observations.Observation; rng_seed ) -> Any -

Return a random sample from the observations, for use in the MCMC.

source
CalibrateEmulateSample.Utilities.get_training_pointsMethod
get_training_points(
+

Return a random sample from the observations, for use in the MCMC.

  • rng - optional RNG object used to pick random sample; defaults to Random.GLOBAL_RNG.
  • obs - Observation struct with the observations (extract will pick one of the sample observations to train).
  • rng_seed - optional kwarg; if provided, used to re-seed rng before sampling.
source
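
A usage sketch (assuming an Observation object obs already exists; the rng argument and rng_seed kwarg are as documented above):

using Random
using CalibrateEmulateSample.Utilities

rng = Random.MersenneTwister(1)
y_sample = Utilities.get_obs_sample(rng, obs)   # one randomly selected sample observation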
CalibrateEmulateSample.Utilities.get_training_pointsMethod
get_training_points(
     ekp::EnsembleKalmanProcesses.EnsembleKalmanProcess{FT, IT, P},
     train_iterations::Union{AbstractVector{IT}, IT} where IT
 ) -> EnsembleKalmanProcesses.DataContainers.PairedDataContainer
-

Extract the training points needed to train the Gaussian process regression.

  • ekp - EnsembleKalmanProcess holding the parameters and the data that were produced during the Ensemble Kalman (EK) process.
  • train_iterations - Number (or indices) EK layers/iterations to train on.
source
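
A usage sketch (assuming ekp is the EnsembleKalmanProcess returned by the calibrate stage):

using CalibrateEmulateSample.Utilities

N_iter = 5                                                        # train on the first 5 EK iterations
input_output_pairs = Utilities.get_training_points(ekp, N_iter)   # PairedDataContainer of {θ, G(θ)} pairs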
+

Extract the training points needed to train the Gaussian process regression.

source diff --git a/dev/GaussianProcessEmulator/index.html b/dev/GaussianProcessEmulator/index.html index 26bfc150..502ef9b4 100644 --- a/dev/GaussianProcessEmulator/index.html +++ b/dev/GaussianProcessEmulator/index.html @@ -25,4 +25,4 @@ noise_learn = true )

When noise_learn is true, an additional white noise kernel is added to the kernel. This white noise is present across all parameter values, including the training data. The scale parameters of the white noise kernel are learned in optimize_hyperparameters!(emulator).
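
For example (gppackage as defined above), this amounts to constructing the tool with noise_learn = true before building the Emulator:

gauss_proc = GaussianProcess(
    gppackage;
    noise_learn = true,   # adds a white-noise kernel whose scale parameters are learned
)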

You may not need to learn the noise if you already have a good estimate of the noise from your training data, and if the Gaussian process kernel is well specified. When noise_learn is false, a small additional regularization is added for stability. The default value is 1e-3 but this can be chosen through the optional argument alg_reg_noise:

gauss_proc = GaussianProcess(
     gppackage;
     noise_learn = false,
-    alg_reg_noise = 1e-3 )
+ alg_reg_noise = 1e-3 ) diff --git a/dev/calibrate/index.html b/dev/calibrate/index.html index 15f3ab4f..df758580 100644 --- a/dev/calibrate/index.html +++ b/dev/calibrate/index.html @@ -1,2 +1,2 @@ -Calibrate · CalibrateEmulateSample.jl

The Calibrate stage

Calibration of the computer model entails finding an optimal parameter $\theta^*$ that maximizes the posterior probability

\[\rho(\theta\vert y, \Gamma_y) = \dfrac{e^{-\mathcal{L}(\theta; y)}}{Z(y\vert\Gamma_y)}\rho_{\mathrm{prior}}(\theta), \qquad \mathcal{L}(\theta, y) = \langle \mathcal{G}(\theta) - y \, , \, \Gamma_y^{-1} \left ( \mathcal{G}(\theta) - y \right ) \rangle,\]

where $\mathcal{L}$ is the loss or negative log-likelihood, $Z(y\vert\Gamma)$ is a normalizing constant, $y$ represents the data, $\Gamma_y$ is the noise covariance matrix and $\rho_{\mathrm{prior}}(\theta)$ is the prior density. Calibration is performed using ensemble Kalman processes, which generate input-output pairs $\{\theta, \mathcal{G}(\theta)\}$ in high density from the prior initial guess to the found optimal parameter $\theta^*$. These input-output pairs are then used as the data to train an emulator of the forward model $\mathcal{G}$.

Ensemble Kalman Processes

Calibration can be performed using different ensemble Kalman processes: ensemble Kalman inversion (Iglesias et al, 2013), ensemble Kalman sampler (Garbuno-Inigo et al, 2020), and unscented Kalman inversion (Huang et al, 2022). All algorithms are implemented in EnsembleKalmanProcesses.jl. Documentation of each algorithm is available in the EnsembleKalmanProcesses docs.

Typical construction of the EnsembleKalmanProcess

Documentation on how to construct an EnsembleKalmanProcess from the computer model and the data can be found in the EnsembleKalmanProcesses docs.

Julia-free forward model

One draw of our approach is that it does not require the forward map to be written in Julia. To aid construction of such a workflow, EnsembleKalmanProcesses.jl provides a documented example of a BASH workflow for the sinusoid problem, with source code here. The forward map interacts with the calibration tools (EKP) only through TOML file reading and writing, and thus can be written in any language; for example, it can be used with Slurm HPC scripts, with source code here.

+Calibrate · CalibrateEmulateSample.jl

The Calibrate stage

Calibration of the computer model entails finding an optimal parameter $\theta^*$ that maximizes the posterior probability

\[\rho(\theta\vert y, \Gamma_y) = \dfrac{e^{-\mathcal{L}(\theta; y)}}{Z(y\vert\Gamma_y)}\rho_{\mathrm{prior}}(\theta), \qquad \mathcal{L}(\theta, y) = \langle \mathcal{G}(\theta) - y \, , \, \Gamma_y^{-1} \left ( \mathcal{G}(\theta) - y \right ) \rangle,\]

where $\mathcal{L}$ is the loss or negative log-likelihood, $Z(y\vert\Gamma)$ is a normalizing constant, $y$ represents the data, $\Gamma_y$ is the noise covariance matrix and $\rho_{\mathrm{prior}}(\theta)$ is the prior density. Calibration is performed using ensemble Kalman processes, which generate input-output pairs $\{\theta, \mathcal{G}(\theta)\}$ in high density from the prior initial guess to the found optimal parameter $\theta^*$. These input-output pairs are then used as the data to train an emulator of the forward model $\mathcal{G}$.

Ensemble Kalman Processes

Calibration can be performed using different ensemble Kalman processes: ensemble Kalman inversion (Iglesias et al, 2013), ensemble Kalman sampler (Garbuno-Inigo et al, 2020), and unscented Kalman inversion (Huang et al, 2022). All algorithms are implemented in EnsembleKalmanProcesses.jl. Documentation of each algorithm is available in the EnsembleKalmanProcesses docs.

Typical construction of the EnsembleKalmanProcess

Documentation on how to construct an EnsembleKalmanProcess from the computer model and the data can be found in the EnsembleKalmanProcesses docs.
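
As a minimal sketch of the typical pattern (here G is the user's forward map, and prior, y, Γy are the prior, observations, and noise covariance defined by the user; the EnsembleKalmanProcesses docs remain the authoritative reference for this interface):

using Random
using EnsembleKalmanProcesses
const EKP = EnsembleKalmanProcesses

rng = Random.MersenneTwister(42)
N_ens, N_iter = 50, 10                                        # ensemble size, number of EK iterations

initial_ensemble = EKP.construct_initial_ensemble(rng, prior, N_ens)
ekiobj = EKP.EnsembleKalmanProcess(initial_ensemble, y, Γy, EKP.Inversion())

for i in 1:N_iter
    ϕ_i = EKP.get_ϕ_final(prior, ekiobj)                      # current ensemble in constrained (physical) space
    G_ens = hcat([G(ϕ_i[:, j]) for j in 1:N_ens]...)          # evaluate the forward map member-by-member
    EKP.update_ensemble!(ekiobj, G_ens)
end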

Julia-free forward model

One draw of our approach is that it does not require the forward map to be written in Julia. To aid construction of such a workflow, EnsembleKalmanProcesses.jl provides a documented example of a BASH workflow for the sinusoid problem, with source code here. The forward map interacts with the calibration tools (EKP) only through TOML file reading and writing, and thus can be written in any language; for example, it can be used with Slurm HPC scripts, with source code here.

diff --git a/dev/contributing/index.html b/dev/contributing/index.html index 0c45feff..81196fc7 100644 --- a/dev/contributing/index.html +++ b/dev/contributing/index.html @@ -30,4 +30,4 @@ # s, squash = use commit, but meld into previous commit # # If you remove a line here THAT COMMIT WILL BE LOST. - # However, if you remove everything, the rebase will be aborted.

Then in the next screen that appears, we can just delete all messages that we do not want to show in the commit. After this is done and we are back to the console, we have to force push. We need to force push because we rewrote the local commit history.

$ git push -u origin <name_of_local_branch> --force

You can find more information about squashing here.

Unit testing

Currently a number of checks are run per commit for a given PR.

Unit tests are run against every new commit for a given PR. The status of the unit tests is not checked during the merge process, but acts as a sanity check for developers and reviewers. Depending on the content changed in the PR, CI checks that are not necessary will be skipped; for example, doc-only changes do not require the unit tests to be run.

The merge process

We ensure that all unit tests across several environments, Documentation builds, and integration tests (managed by Buildkite), pass before merging any PR into main. The integration tests currently run some of our example cases in examples/.

+ # However, if you remove everything, the rebase will be aborted.

Then in the next screen that appears, we can just delete all messages that we do not want to show in the commit. After this is done and we are back to the console, we have to force push. We need to force push because we rewrote the local commit history.

$ git push -u origin <name_of_local_branch> --force

You can find more information about squashing here.

Unit testing

Currently a number of checks are run per commit for a given PR.

Unit tests are run against every new commit for a given PR. The status of the unit tests is not checked during the merge process, but acts as a sanity check for developers and reviewers. Depending on the content changed in the PR, CI checks that are not necessary will be skipped; for example, doc-only changes do not require the unit tests to be run.

The merge process

We ensure that all unit tests across several environments, Documentation builds, and integration tests (managed by Buildkite), pass before merging any PR into main. The integration tests currently run some of our example cases in examples/.

diff --git a/dev/emulate/index.html b/dev/emulate/index.html index b9e8ef7a..d4432f55 100644 --- a/dev/emulate/index.html +++ b/dev/emulate/index.html @@ -10,4 +10,4 @@ retained_svd_frac = 0.95, )

The optional arguments above relate to the data processing.

Emulator Training

The emulator is trained when we combine the machine learning tool and the data into the Emulator above. For any machine learning tool, hyperparameters are optimized.

optimize_hyperparameters!(emulator)

For some machine learning packages, however, this may be completed automatically during construction, while for others it is not. If automatic optimization took place, the optimize_hyperparameters! line performs no new task, so it may safely be called in either case. In the Lorenz example, this line learns the hyperparameters of the Gaussian process, which depend on the choice of kernel and the choice of GP package. Predictions at new inputs can then be made using

y, cov = Emulator.predict(emulator, new_inputs)

This returns both a mean value and a covariance.
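
As a brief sketch of the shapes involved (the dimensions here are hypothetical; transform_to_real = true maps predictions back to the original output space, as used in the examples):

new_inputs = rand(2, 10)   # hypothetical: input_dim = 2, N_new = 10 query points
y_mean, y_cov = predict(emulator, new_inputs, transform_to_real = true)
# y_mean has one column per query point (output_dim × N_new); y_cov holds the corresponding
# (co)variances, whose layout depends on whether the emulator is scalar- or vector-output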

Data processing

Some effects of the following are outlined in a practical setting in the results and appendices of Howland, Dunbar, Schneider, (2022).

Diagonalization and output dimension reduction

This arises from the optional arguments

We always use singular value decomposition to diagonalize the output space, requiring the output covariance Γy. Why? If we need to train an $\mathbb{R}^{10} \to \mathbb{R}^{100}$ emulator, diagonalization allows us to instead train 100 $\mathbb{R}^{10} \to \mathbb{R}^{1}$ emulators (far cheaper).

Performance is increased further by throwing away less informative output dimensions, if 95% of the information (i.e., variance) is in the first 40 diagonalized output dimensions then setting retained_svd_frac=0.95 will train only 40 emulators.

Note

Diagonalization is an approximation. It is however a good approximation when the observational covariance varies slowly in the parameter space.

Warn

Severe approximation errors can occur if obs_noise_cov is not provided.

Normalization and standardization

This arises from the optional arguments

We normalize the input data in a standard way by centering, and scaling with the empirical covariance

To help with poor conditioning of the covariance matrix, users can also standardize each output dimension by a multiplicative factor, given by the elements of factor_vector, as sketched below.
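
A hypothetical sketch (the scales are invented, and the keyword names mirror the corresponding Emulator fields):

factor_vector = vcat(fill(100.0, 50), fill(0.1, 50))   # hypothetical characteristic scale for each output dimension

emulator = Emulator(
    gauss_proc,
    input_output_pairs;
    obs_noise_cov = Γy,
    standardize_outputs = true,
    standardize_outputs_factors = factor_vector,
)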

Modular interface

Developers may contribute new tools by performing the following

  1. Create MyMLToolName.jl, and include "MyMLToolName.jl" in Emulators.jl
  2. Create a struct MyMLTool <: MachineLearningTool, containing any arguments or optimizer options
  3. Create the following three methods to build, train, and predict with your tool (use GaussianProcess.jl as a guide)
build_models!(mlt::MyMLTool, iopairs::PairedDataContainer) -> Nothing
 optimize_hyperparameters!(mlt::MyMLTool, args...; kwargs...) -> Nothing
-function predict(mlt::MyMLTool, new_inputs::Matrix; kwargs...) -> Matrix, Union{Matrix, Array{<:Real, 3}}
On dimensions of the predict inputs and outputs

The predict method takes as input an input_dim-by-N_new matrix. It returns both a predicted mean and a predicted (co)variance at the new inputs: (i) for scalar-output methods relying on diagonalization, return output_dim-by-N_new matrices for the mean and variance; (ii) for vector-output methods, return output_dim-by-N_new for the mean and output_dim-by-output_dim-by-N_new for the covariances.

Please get in touch with our development team when contributing new statistical emulators, to help us ensure the smoothest interface with any new tools.

+function predict(mlt::MyMLTool, new_inputs::Matrix; kwargs...) -> Matrix, Union{Matrix, Array{<:Real, 3}}
On dimensions of the predict inputs and outputs

The predict method takes as input an input_dim-by-N_new matrix. It returns both a predicted mean and a predicted (co)variance at the new inputs: (i) for scalar-output methods relying on diagonalization, return output_dim-by-N_new matrices for the mean and variance; (ii) for vector-output methods, return output_dim-by-N_new for the mean and output_dim-by-output_dim-by-N_new for the covariances.
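
To make the shape of these three methods concrete, here is a schematic skeleton (illustrative only, written as if included inside Emulators.jl so that MachineLearningTool and PairedDataContainer are in scope; the hypothetical MyMLTool simply carries a vector of per-dimension models):

struct MyMLTool <: MachineLearningTool
    models::Vector{Any}          # one fitted model per (diagonalized) output dimension
end

function build_models!(mlt::MyMLTool, iopairs::PairedDataContainer)
    # fit mlt.models to the inputs and outputs stored in iopairs here
    return nothing
end

optimize_hyperparameters!(mlt::MyMLTool, args...; kwargs...) = nothing   # e.g. if tuning already happened in build_models!

function predict(mlt::MyMLTool, new_inputs::AbstractMatrix; kwargs...)
    output_dim, N_new = length(mlt.models), size(new_inputs, 2)
    μ = zeros(output_dim, N_new)    # predicted means, output_dim × N_new
    σ2 = ones(output_dim, N_new)    # predicted variances (scalar-output, diagonalized case)
    return μ, σ2
end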

Please get in touch with our development team when contributing new statistical emulators, to help us ensure the smoothest interface with any new tools.

diff --git a/dev/examples/Cloudy_example/index.html b/dev/examples/Cloudy_example/index.html index 705ffe15..a1461906 100644 --- a/dev/examples/Cloudy_example/index.html +++ b/dev/examples/Cloudy_example/index.html @@ -242,4 +242,4 @@ # Setting title and labels ax.title = param_names[idx] ax.xlabel = "Value" - ax.ylabel = "Density"

This is what the marginal distributions of the three parameters look like, for the case of the GP emulator, and in the constrained/physical space:

posterior_N0_gpjl

posterior_theta_gpjl

posterior_k_gpjl

Here, the posterior distributions are shown as orange histograms, the prior distributions are shown as grey histograms (though, with the exception of the parameter k, they are barely visible), and the true parameter values are marked as vertical purple lines.

Appendix: What Does Cloudy Do?

For the purpose of Bayesian parameter learning, the forward model can be treated as a black box that processes input parameters to yield specific outputs. However, for those who wish to learn more about the inner workings of Cloudy, we refer to this paper and offer a brief outline below:

The mathematical starting point of Cloudy is the stochastic collection equation (SCE; sometimes also called Smoluchowski equation after Marian Smoluchowski), which describes the time rate of change of $f = f(m, t)$, the mass distribution function of liquid water droplets, due to the process of collision and coalescence. The distribution function $f$ depends on droplet mass $m$ and time $t$ and is defined such that $f(m) \text{ d}m$ denotes the number of droplets with masses in the interval $[m, m + dm]$ per unit volume.

The stochastic collection equation is an integro-differential equation that can be written as

\[ \frac{\partial f(m, t)}{\partial t} = \frac{1}{2} \int_{m'=0}^{\infty} f(m', t) f(m-m', t) \mathcal{C}(m', m-m')\text{d}m' - f(m, t) \int_{m'=0}^\infty f(m', t)\mathcal{C}(m, m') \text{d}m', \]

where $\mathcal{C}(m', m'')$ is the collision-coalescence kernel, which encapsulates the physics of droplet-droplet interactions – it describes the rate at which two drops of masses $m'$ and $m''$ come into contact and coalesce into a drop of mass $m' + m''$. The first term on the right-hand side of the SCE describes the rate of increase of the number of drops having a mass $m$ due to collision and coalescence of drops of masses $m'$ and $m-m'$ (where the factor $\frac{1}{2}$ avoids double counting), while the second term describes the rate of reduction of drops of mass $m$ due to collision and coalescence of drops having a mass $m$ with other drops.

We can rewrite the SCE in terms of the moments $M_k$ of $f$, which are the prognostic variables in Cloudy. They are defined by

\[ M_k = \int_0^\infty m^k f(m, t) \text{d}m\]

The time rate of change of the k-th moment of $f$ is obtained by multiplying the SCE by $m^k$ and integrating over the entire range of droplet masses (from $m=0$ to $\infty$), which yields

\[ \frac{\partial M_k(t)}{\partial t} = \frac{1}{2}\int_0^\infty \left((m+m')^k - m^k - {m'}^k\right) \mathcal{C}(m, m')f(m, t)f(m', t) \, \text{d}m\, \text{d}m' ~~~~~~~~ (1)\]

In this example, the kernel is set to be constant – $\mathcal{C}(m', m'') = B = \text{const}$ – and the cloud droplet mass distribution is assumed to be a $\text{Gamma}(k_t, \theta_t)$ distribution, scaled by a factor $N_{0,t}$ which denotes the droplet number concentration:

\[f(m, t) = \frac{N_{0,t}}{\Gamma(k_t)\theta_t^k} m^{k_t-1} \exp{(-m/\theta_t)}\]

The parameter vector $\phi_t= [N_{0,t}, k_t, \theta_t]$ changes over time (as indicated by the subscript $t$), as the shape of the distribution evolves. In fact, there is a priori no reason to assume that the distribution would retain its Gamma shape over time, but this is a common assumption that is made in order to solve the closure problem (without this assumption, one would have to keep track of infinitely many moments of the mass distribution in order to uniquely identify the distribution $f$ at each time step, which is obviously not practicable).

For Gamma mass distribution functions, specifying the first three moments ($M_0$, $M_1$, and $M_2$) is sufficient to uniquely determine the parameter vector $\phi_t$, hence Cloudy solves equation (1) for $k = 0, 1, 2$. This mapping of the parameters of the initial cloud droplet mass distribution to the (zeroth-, first-, and second-order) moments of the distribution at a specified end time is done by DynamicalModel.jl.

+ ax.ylabel = "Density"

This is what the marginal distributions of the three parameters look like, for the case of the GP emulator, and in the constrained/physical space:

posterior_N0_gpjl

posterior_theta_gpjl

posterior_k_gpjl

Here, the posterior distributions are shown as orange histograms, the prior distributions are shown as grey histograms (though, with the exception of the parameter k, they are barely visible), and the true parameter values are marked as vertical purple lines.

Appendix: What Does Cloudy Do?

For the purpose of Bayesian parameter learning, the forward model can be treated as a black box that processes input parameters to yield specific outputs. However, for those who wish to learn more about the inner workings of Cloudy, we refer to this paper and offer a brief outline below:

The mathematical starting point of Cloudy is the stochastic collection equation (SCE; sometimes also called Smoluchowski equation after Marian Smoluchowski), which describes the time rate of change of $f = f(m, t)$, the mass distribution function of liquid water droplets, due to the process of collision and coalescence. The distribution function $f$ depends on droplet mass $m$ and time $t$ and is defined such that $f(m) \text{ d}m$ denotes the number of droplets with masses in the interval $[m, m + dm]$ per unit volume.

The stochastic collection equation is an integro-differential equation that can be written as

\[ \frac{\partial f(m, t)}{\partial t} = \frac{1}{2} \int_{m'=0}^{\infty} f(m', t) f(m-m', t) \mathcal{C}(m', m-m')\text{d}m' - f(m, t) \int_{m'=0}^\infty f(m', t)\mathcal{C}(m, m') \text{d}m', \]

where $\mathcal{C}(m', m'')$ is the collision-coalescence kernel, which encapsulates the physics of droplet-droplet interactions – it describes the rate at which two drops of masses $m'$ and $m''$ come into contact and coalesce into a drop of mass $m' + m''$. The first term on the right-hand side of the SCE describes the rate of increase of the number of drops having a mass $m$ due to collision and coalescence of drops of masses $m'$ and $m-m'$ (where the factor $\frac{1}{2}$ avoids double counting), while the second term describes the rate of reduction of drops of mass $m$ due to collision and coalescence of drops having a mass $m$ with other drops.

We can rewrite the SCE in terms of the moments $M_k$ of $f$, which are the prognostic variables in Cloudy. They are defined by

\[ M_k = \int_0^\infty m^k f(m, t) \text{d}m\]

The time rate of change of the k-th moment of $f$ is obtained by multiplying the SCE by $m^k$ and integrating over the entire range of droplet masses (from $m=0$ to $\infty$), which yields

\[ \frac{\partial M_k(t)}{\partial t} = \frac{1}{2}\int_0^\infty \left((m+m')^k - m^k - {m'}^k\right) \mathcal{C}(m, m')f(m, t)f(m', t) \, \text{d}m\, \text{d}m' ~~~~~~~~ (1)\]

In this example, the kernel is set to be constant – $\mathcal{C}(m', m'') = B = \text{const}$ – and the cloud droplet mass distribution is assumed to be a $\text{Gamma}(k_t, \theta_t)$ distribution, scaled by a factor $N_{0,t}$ which denotes the droplet number concentration:

\[f(m, t) = \frac{N_{0,t}}{\Gamma(k_t)\theta_t^k} m^{k_t-1} \exp{(-m/\theta_t)}\]

The parameter vector $\phi_t= [N_{0,t}, k_t, \theta_t]$ changes over time (as indicated by the subscript $t$), as the shape of the distribution evolves. In fact, there is a priori no reason to assume that the distribution would retain its Gamma shape over time, but this is a common assumption that is made in order to solve the closure problem (without this assumption, one would have to keep track of infinitely many moments of the mass distribution in order to uniquely identify the distribution $f$ at each time step, which is obviously not practicable).

For Gamma mass distribution functions, specifying the first three moments ($M_0$, $M_1$, and $M_2$) is sufficient to uniquely determine the parameter vector $\phi_t$, hence Cloudy solves equation (1) for $k = 0, 1, 2$. This mapping of the parameters of the initial cloud droplet mass distribution to the (zeroth-, first-, and second-order) moments of the distribution at a specified end time is done by DynamicalModel.jl.
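
For the constant kernel, the integrals in equation (1) can be evaluated in closed form, which makes the closure explicit: since $(m+m')^0 - m^0 - m'^0 = -1$, $(m+m')^1 - m - m' = 0$, and $(m+m')^2 - m^2 - m'^2 = 2mm'$, equation (1) reduces to

\[ \frac{\text{d}M_0}{\text{d}t} = -\frac{B}{2} M_0^2, \qquad \frac{\text{d}M_1}{\text{d}t} = 0, \qquad \frac{\text{d}M_2}{\text{d}t} = B M_1^2,\]

so the droplet number decreases through coalescence, the total mass $M_1$ is conserved, and the second moment grows at a rate set by the total mass.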

diff --git a/dev/examples/edmf_example/index.html b/dev/examples/edmf_example/index.html index 4f54f861..b6dddfe5 100644 --- a/dev/examples/edmf_example/index.html +++ b/dev/examples/edmf_example/index.html @@ -7,4 +7,4 @@ using CalibrateEmulateSample.ParameterDistribution posterior = load(posterior_filepath)["posterior"] posterior_samples = vcat([get_distribution(posterior)[name] for name in get_name(posterior)]...) # samples are columns

To transform these samples into physical parameter space use the following:

transformed_posterior_samples =
-mapslices(x -> transform_unconstrained_to_constrained(posterior, x), posterior_samples, dims = 1)
Computational vs Physical space

The computational $\theta$-space comprises the parameters on which the algorithms act. Statistics (e.g. mean/covariance) are most meaningful when taken in this space. The physical $\phi$-space is a (nonlinear) transformation of the computational space that applies the parameter constraints. To pass parameter values back into the forward model, one must transform them. Full details and examples can be found here

+mapslices(x -> transform_unconstrained_to_constrained(posterior, x), posterior_samples, dims = 1)
Computational vs Physical space

The computational $\theta$-space comprises the parameters on which the algorithms act. Statistics (e.g. mean/covariance) are most meaningful when taken in this space. The physical $\phi$-space is a (nonlinear) transformation of the computational space that applies the parameter constraints. To pass parameter values back into the forward model, one must transform them. Full details and examples can be found here

diff --git a/dev/examples/emulators/ishigami_3d_1d/index.html b/dev/examples/emulators/ishigami_3d_1d/index.html index 60af75ac..78c3c6a9 100644 --- a/dev/examples/emulators/ishigami_3d_1d/index.html +++ b/dev/examples/emulators/ishigami_3d_1d/index.html @@ -90,4 +90,4 @@ (std) firstorder: [0.05909336956162543, 0.11484966121124164, 0.012908533302492602] (mean) totalorder: [0.5670345355855254, 0.4716028261179354, 0.24108222433317] (std) totalorder: [0.10619345801872732, 0.1041023777237331, 0.07200225781785778] - + diff --git a/dev/examples/emulators/lorenz_integrator_3d_3d/index.html b/dev/examples/emulators/lorenz_integrator_3d_3d/index.html index 64cf1a4e..3d935d1a 100644 --- a/dev/examples/emulators/lorenz_integrator_3d_3d/index.html +++ b/dev/examples/emulators/lorenz_integrator_3d_3d/index.html @@ -79,4 +79,4 @@ u_test_tmp[:, i + 1] = rf_mean end

The other trajectories are similar. We then produce the following plots. In all figures, the results from evolving the state with the true integrator are shown in orange, and those from the emulated integrators in blue.

Gaussian Process Emulator (Sci-kit learn: GP)

For one example fit

Random Feature Emulator (RF-nosvd-nonsep)

For one example fit

-

and here are CDFs over 20 randomized trials of the random feature hyperparameter optimization

+

and here are CDFs over 20 randomized trials of the random feature hyperparameter optimization

diff --git a/dev/examples/emulators/regression_2d_2d/index.html b/dev/examples/emulators/regression_2d_2d/index.html index d79e9966..fe00b753 100644 --- a/dev/examples/emulators/regression_2d_2d/index.html +++ b/dev/examples/emulators/regression_2d_2d/index.html @@ -66,4 +66,4 @@ inputs = permutedims(hcat(X1[:], X2[:]), (2, 1))

We predict using the emulators at the new inputs, and transform_to_real inverts the data processing back to physical values

em_mean, em_cov = predict(emulator, inputs, transform_to_real = true)

We then plot the predicted mean and pointwise variances, and calculate the errors from the three highlighted cases:

Gaussian Process Emulator (Sci-kit learn: gp-skljl)

L^2 error of mean and latent truth: 0.0008042391077774167

Random Feature Emulator (rf-scalar)

L^2 error of mean and latent truth: 0.0012253119679379056

Random Feature Emulator (vector: rf-nosvd-nonsep)

L^2 error of mean and latent truth: 0.0011094292509180393
- + diff --git a/dev/examples/lorenz_example/index.html b/dev/examples/lorenz_example/index.html index eb8d9c13..24d738fd 100644 --- a/dev/examples/lorenz_example/index.html +++ b/dev/examples/lorenz_example/index.html @@ -114,4 +114,4 @@ println("True parameters: ") println(params_true) println("\nEKI results:") -println(get_ϕ_mean_final(priors, ekiobj))

The parameters and forward model outputs will be saved in parameter_storage.jld2 and data_storage.jld2, respectively, in the directory output. A scatter-plot animation of the ensemble convergence to the true parameters is also saved in that directory. These points represent the training points that are used for the emulator.

Emulate-sample

The L96 parameter estimation can be run using julia --project emulate_sample.jl

The output will provide the estimated posterior distribution over the parameters. The emulate-sample code will run for several choices of the machine learning model used in the emulation stage, including Gaussian process regression and RF, with and without singular value data decorrelation.

The sampling results from two emulators are shown below. We can see that the posterior is relatively insensitive to the choice of the machine learning emulation tool in this L96 example.

L96 CES example case: GP regression emulator (case="GP")

L96 CES example case: RF scalar emulator (case="RF-scalar")

+println(get_ϕ_mean_final(priors, ekiobj))

The parameters and forward model outputs will be saved in parameter_storage.jld2 and data_storage.jld2, respectively, in the directory output. A scatter-plot animation of the ensemble convergence to the true parameters is also saved in that directory. These points represent the training points that are used for the emulator.

Emulate-sample

The L96 parameter estimation can be run using julia --project emulate_sample.jl

The output will provide the estimated posterior distribution over the parameters. The emulate-sample code will run for several choices of the machine learning model used in the emulation stage, including Gaussian process regression and RF, with and without singular value data decorrelation.

The sampling results from two emulators are shown below. We can see that the posterior is relatively insensitive to the choice of the machine learning emulation tool in this L96 example.

L96 CES example case: GP regression emulator (case="GP")

L96 CES example case: RF scalar emulator (case="RF-scalar")

diff --git a/dev/examples/sinusoid_example/index.html b/dev/examples/sinusoid_example/index.html index 968b088f..bbafed22 100644 --- a/dev/examples/sinusoid_example/index.html +++ b/dev/examples/sinusoid_example/index.html @@ -101,4 +101,4 @@ println("Vertical shift mean: ", mean(constrained_posterior["vert_shift"]), ", std: ", std(constrained_posterior["vert_shift"]))

This gives:

parameters | mean   | std
amplitude  | 3.0382 | 0.2880
vert_shift | 6.3774 | 0.4586

This is in agreement with the true $\theta=(3.0, 7.0)$ and with the observational covariance matrix we provided $\Gamma=0.2 * I$ (i.e., a standard deviation of approx. $0.45$). CalibrateEmulateSample.jl has built-in plotting recipes to help us visualize the prior and posterior distributions. Note that these are the marginal distributions.

# We can quickly plot priors and posterior using built-in capabilities
 p = plot(prior, fill = :lightgray)
 plot!(posterior, fill = :darkblue, alpha = 0.5)
-

GP_posterior

The MCMC has learned the posterior distribution, which is much narrower than the prior. For multidimensional problems, the posterior is typically multidimensional, and marginal distribution plots do not show how parameters co-vary. We plot a 2D histogram of $\theta_1$ vs. $\theta_2$ below, with the marginal distributions on each axis.

GP_2d_posterior

Sample with Random Features

We can repeat the sampling method using the random features emulator instead of the Gaussian process and we find similar results:

parameters | mean   | std
amplitude  | 3.3210 | 0.7216
vert_shift | 6.3986 | 0.5098

RF_2d_posterior

It is reassuring to see that our uncertainty quantification methods are robust to the different emulator choices here. This is because our particular GP and RF emulators showed similar accuracy during validation. However, this result is highly sensitive to the choices of GP kernel and RF kernel structure. If you find very different posterior distributions for different emulators, it is likely that the kernel choices need be refined. The kernel choices must be flexible enough to accurately capture the relationships between the inputs and outputs. We recommend trying a variety of different emulator configurations and carefully considering emulator validation on samples that the emulator has not been trained on.

+

GP_posterior

The MCMC has learned the posterior distribution, which is much narrower than the prior. For multidimensional problems, the posterior is typically multidimensional, and marginal distribution plots do not show how parameters co-vary. We plot a 2D histogram of $\theta_1$ vs. $\theta_2$ below, with the marginal distributions on each axis.

GP_2d_posterior

Sample with Random Features

We can repeat the sampling method using the random features emulator instead of the Gaussian process and we find similar results:

parameters | mean   | std
amplitude  | 3.3210 | 0.7216
vert_shift | 6.3986 | 0.5098

RF_2d_posterior

It is reassuring to see that our uncertainty quantification methods are robust to the different emulator choices here. This is because our particular GP and RF emulators showed similar accuracy during validation. However, this result is highly sensitive to the choices of GP kernel and RF kernel structure. If you find very different posterior distributions for different emulators, it is likely that the kernel choices need be refined. The kernel choices must be flexible enough to accurately capture the relationships between the inputs and outputs. We recommend trying a variety of different emulator configurations and carefully considering emulator validation on samples that the emulator has not been trained on.

diff --git a/dev/glossary/index.html b/dev/glossary/index.html index 477e5391..50c6ca05 100644 --- a/dev/glossary/index.html +++ b/dev/glossary/index.html @@ -1,2 +1,2 @@ -Glossary · CalibrateEmulateSample.jl

Glossary

The following list includes the names and symbols of recurring concepts in CalibrateEmulateSample.jl. Some of these variables do not appear in the codebase, which relies on array programming for performance. Contributions to the codebase require following this notational convention. Similarly, if you find inconsistencies in the documentation or codebase, please report an issue on GitHub.

Name | Symbol (Theory/Docs) | Symbol (Code)
Parameter vector, Parameters (unconstrained space) | $\theta$ | θ
Parameter vector size, Number of parameters | $p$ | N_par
Ensemble size | $J$ | N_ens
Ensemble particles, members | $\theta^{(j)}$ |
Number of iterations | $N_{\rm it}$ | N_iter
Observation vector, Observations, Data vector | $y$ | y
Observation vector size, Data vector size | $d$ | N_obs
Observational noise | $\eta$ | obs_noise
Observational noise covariance | $\Gamma_y$ | obs_noise_cov
Hilbert space inner product | $\langle \phi , \Gamma^{-1} \psi \rangle$ |
Forward map | $\mathcal{G}$ | G
Dynamical model | $\Psi$ | Ψ
Transform map (constrained to unconstrained) | $\mathcal{T}$ | T
Observation map | $\mathcal{H}$ | H
Prior covariance (unconstrained space) | $\Gamma_{\theta}$ | prior_cov
Prior mean (unconstrained space) | $m_\theta$ | prior_mean
+Glossary · CalibrateEmulateSample.jl

Glossary

The following list includes the names and symbols of recurring concepts in CalibrateEmulateSample.jl. Some of these variables do not appear in the codebase, which relies on array programming for performance. Contributions to the codebase require following this notational convention. Similarly, if you find inconsistencies in the documentation or codebase, please report an issue on GitHub.

Name | Symbol (Theory/Docs) | Symbol (Code)
Parameter vector, Parameters (unconstrained space) | $\theta$ | θ
Parameter vector size, Number of parameters | $p$ | N_par
Ensemble size | $J$ | N_ens
Ensemble particles, members | $\theta^{(j)}$ |
Number of iterations | $N_{\rm it}$ | N_iter
Observation vector, Observations, Data vector | $y$ | y
Observation vector size, Data vector size | $d$ | N_obs
Observational noise | $\eta$ | obs_noise
Observational noise covariance | $\Gamma_y$ | obs_noise_cov
Hilbert space inner product | $\langle \phi , \Gamma^{-1} \psi \rangle$ |
Forward map | $\mathcal{G}$ | G
Dynamical model | $\Psi$ | Ψ
Transform map (constrained to unconstrained) | $\mathcal{T}$ | T
Observation map | $\mathcal{H}$ | H
Prior covariance (unconstrained space) | $\Gamma_{\theta}$ | prior_cov
Prior mean (unconstrained space) | $m_\theta$ | prior_mean
diff --git a/dev/index.html b/dev/index.html index 3daec4f5..75f215ab 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · CalibrateEmulateSample.jl

CalibrateEmulateSample.jl

CalibrateEmulateSample.jl solves parameter estimation problems using accelerated (and approximate) Bayesian inversion.

The framework can be applied currently to learn:

  • the joint distribution for a moderate number of parameters (<40),
  • it is not inherently restricted to unimodal distributions.

It can be used with computer models that:

  • can be noisy or chaotic,
  • are non-differentiable,
  • can only be treated as black-box (interfaced only with parameter files).

The computer model is supplied by the user, as a parameter-to-data map $\mathcal{G}(\theta): \mathbb{R}^p \rightarrow \mathbb{R}^d$. For example, $\mathcal{G}$ could be a map from any given parameter configuration $\theta$ to a collection of statistics of a dynamical system trajectory. $\mathcal{G}$ is referred to as the forward model in the Bayesian inverse problem setting.

The data produced by the forward model are compared to observations $y$, which are assumed to be corrupted by additive noise $\eta$, such that

\[y = \mathcal{G}(\theta) + \eta,\]

where the noise $\eta$ is drawn from a $d$-dimensional Gaussian with distribution $\mathcal{N}(0, \Gamma_y)$.

The inverse problem

Given an observation $y$, the computer model $\mathcal{G}$, the observational noise $\Gamma_y$, and some broad prior information on $\theta$, we return the data-informed joint (posterior) distribution for "$\theta$ given $y$".

As the name suggests, CalibrateEmulateSample.jl breaks this problem into a sequence of three steps: calibration, emulation, and sampling. A comprehensive treatment of the calibrate-emulate-sample approach to Bayesian inverse problems can be found in Cleary et al. (2020).

The three steps of the algorithm: see our walkthrough of the Sinusoid Example

Learn the vertical shift and amplitude of the signal given the noisy observation

The calibrate step of the algorithm consists of an application of Ensemble Kalman Processes, which generates input-output pairs $\{\theta, \mathcal{G}(\theta)\}$ in high density around an optimal parameter $\theta^*$. Here, $\theta$ are amplitude and vertical shift pairs, and $\mathcal{G}(\theta)$ are the resulting signal mean and range. This $\theta^*$ will be near a mode of the posterior distribution (Note: This is the only time we interface with the forward model $\mathcal{G}$).

calibrate with EKP to generate data pairs...

The emulate step takes these pairs $\{\theta, \mathcal{G}(\theta)\}$ and trains a statistical surrogate model (e.g., a Gaussian process), emulating the forward map $\mathcal{G}$.

emulate the map statistically from EKP pairs...

The sample step uses this surrogate in place of $\mathcal{G}$ in a sampling method (Markov chain Monte Carlo) to sample the posterior distribution of $\theta$.

sample the emulated map with MCMC...

Code Components

CalibrateEmulateSample.jl contains the following modules:

Module | Purpose
CalibrateEmulateSample.jl | A wrapper for the pipeline
Emulator.jl | Modular template for the emulators
GaussianProcess.jl | A Gaussian process emulator
Scalar/VectorRandomFeatureInterface.jl | A Scalar/Vector-output Random Feature emulator
MarkovChainMonteCarlo.jl | Modular template for Markov Chain Monte Carlo samplers
Utilities.jl | Helper functions

Authors

CalibrateEmulateSample.jl is being developed by the Climate Modeling Alliance.

+Home · CalibrateEmulateSample.jl

CalibrateEmulateSample.jl

CalibrateEmulateSample.jl solves parameter estimation problems using accelerated (and approximate) Bayesian inversion.

The framework can be applied currently to learn:

  • the joint distribution for a moderate number of parameters (<40),
  • it is not inherently restricted to unimodal distributions.

It can be used with computer models that:

  • can be noisy or chaotic,
  • are non-differentiable,
  • can only be treated as black-box (interfaced only with parameter files).

The computer model is supplied by the user, as a parameter-to-data map $\mathcal{G}(\theta): \mathbb{R}^p \rightarrow \mathbb{R}^d$. For example, $\mathcal{G}$ could be a map from any given parameter configuration $\theta$ to a collection of statistics of a dynamical system trajectory. $\mathcal{G}$ is referred to as the forward model in the Bayesian inverse problem setting.

The data produced by the forward model are compared to observations $y$, which are assumed to be corrupted by additive noise $\eta$, such that

\[y = \mathcal{G}(\theta) + \eta,\]

where the noise $\eta$ is drawn from a $d$-dimensional Gaussian with distribution $\mathcal{N}(0, \Gamma_y)$.

The inverse problem

Given an observation $y$, the computer model $\mathcal{G}$, the observational noise $\Gamma_y$, and some broad prior information on $\theta$, we return the data-informed joint (posterior) distribution for "$\theta$ given $y$".

As the name suggests, CalibrateEmulateSample.jl breaks this problem into a sequence of three steps: calibration, emulation, and sampling. A comprehensive treatment of the calibrate-emulate-sample approach to Bayesian inverse problems can be found in Cleary et al. (2020).

The three steps of the algorithm: see our walkthrough of the Sinusoid Example

Learn the vertical shift and amplitude of the signal given the noisy observation

The calibrate step of the algorithm consists of an application of Ensemble Kalman Processes, which generates input-output pairs $\{\theta, \mathcal{G}(\theta)\}$ in high density around an optimal parameter $\theta^*$. Here, $\theta$ are amplitude and vertical shift pairs, and $\mathcal{G}(\theta)$ are the resulting signal mean and range. This $\theta^*$ will be near a mode of the posterior distribution (Note: This is the only time we interface with the forward model $\mathcal{G}$).

calibrate with EKP to generate data pairs...

The emulate step takes these pairs $\{\theta, \mathcal{G}(\theta)\}$ and trains a statistical surrogate model (e.g., a Gaussian process), emulating the forward map $\mathcal{G}$.

emulate the map statistically from EKP pairs...

The sample step uses this surrogate in place of $\mathcal{G}$ in a sampling method (Markov chain Monte Carlo) to sample the posterior distribution of $\theta$.

sample the emulated map with MCMC...

Code Components

CalibrateEmulateSample.jl contains the following modules:

Module | Purpose
CalibrateEmulateSample.jl | A wrapper for the pipeline
Emulator.jl | Modular template for the emulators
GaussianProcess.jl | A Gaussian process emulator
Scalar/VectorRandomFeatureInterface.jl | A Scalar/Vector-output Random Feature emulator
MarkovChainMonteCarlo.jl | Modular template for Markov Chain Monte Carlo samplers
Utilities.jl | Helper functions

Authors

CalibrateEmulateSample.jl is being developed by the Climate Modeling Alliance.

diff --git a/dev/installation_instructions/index.html b/dev/installation_instructions/index.html index 668926d2..57d3e408 100644 --- a/dev/installation_instructions/index.html +++ b/dev/installation_instructions/index.html @@ -8,4 +8,4 @@ > PYTHON="" julia --project -e 'using Conda; Conda.add("scikit-learn=1.1.1")'

See the PyCall.jl documentation for more information about how to configure the local Julia / Conda / Python environment.

To test that the package is working:

> julia --project -e 'using Pkg; Pkg.test()'

Building the documentation locally

You need to first build the top-level project before building the documentation:

cd CalibrateEmulateSample.jl
 julia --project -e 'using Pkg; Pkg.instantiate()'

Then you can build the project documentation under the docs/ sub-project:

julia --project=docs/ -e 'using Pkg; Pkg.instantiate()'
-julia --project=docs/ docs/make.jl

The locally rendered HTML documentation can be viewed at docs/build/index.html. Occasional figures may only be viewable in the online documentation due to the fancy-url package.

+julia --project=docs/ docs/make.jl

The locally rendered HTML documentation can be viewed at docs/build/index.html. Occasional figures may only be viewable in the online documentation due to the fancy-url package.

diff --git a/dev/random_feature_emulator/index.html b/dev/random_feature_emulator/index.html index 3b1bedf2..9f926be7 100644 --- a/dev/random_feature_emulator/index.html +++ b/dev/random_feature_emulator/index.html @@ -99,4 +99,4 @@ build_default_prior(input_dim, output_dim, vector_general_kernel) # builds a 2-entry distribution # 781875-dim unbounded distribution 'full_cholesky' -# 1-dim positive distribution `sigma`

See the API for more details.

+# 1-dim positive distribution `sigma`

See the API for more details.

diff --git a/dev/sample/index.html b/dev/sample/index.html index cb41fd68..91946da7 100644 --- a/dev/sample/index.html +++ b/dev/sample/index.html @@ -24,4 +24,4 @@ # chain = sample(rng, mcmc, 100_000; stepsize = new_step) display(chain) # diagnostics -plot(chain) # plots samples over iteration and PDFs for each parameter

Internals: Transitions

Implementing MCMC involves defining states and transitions of a Markov process (whose stationary distribution is what we seek to sample from). AbstractMCMC's terminology is a bit confusing for the MH case; states of the chain are described by Transition objects, which contain the current sample (and other information like its log-probability).

AdvancedMH defines an AbstractTransition base class for use with its methods; we implement our own child class, MCMCState, in order to record statistics on the MH acceptance ratio.

Internals: Markov steps

Markov transitions of the chain are defined by overloading AbstractMCMC's step method, which takes the Sampler and current Transition and implements the Sampler's logic to return an updated Transition representing the chain's new state (actually, a pair of Transitions, for cases where the Sampler doesn't obey detailed balance; this isn't relevant for us).

For example, in Metropolis-Hastings sampling this is where we draw a proposal sample and accept or reject it according to the MH criterion. AdvancedMH implements this here; we re-implement this method because 1) we need to record whether a proposal was accepted or rejected, and 2) our calls to propose() are stepsize-dependent.

+plot(chain) # plots samples over iteration and PDFs for each parameter

Internals: Transitions

Implementing MCMC involves defining states and transitions of a Markov process (whose stationary distribution is what we seek to sample from). AbstractMCMC's terminology is a bit confusing for the MH case; states of the chain are described by Transition objects, which contain the current sample (and other information like its log-probability).

AdvancedMH defines an AbstractTransition base class for use with its methods; we implement our own child class, MCMCState, in order to record statistics on the MH acceptance ratio.
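
As a purely illustrative sketch (the real definition lives in MarkovChainMonteCarlo.jl), such a state type needs little more than the sample, its log-probability, and an acceptance flag:

using AdvancedMH

struct MyMCMCState{T, L<:Real} <: AdvancedMH.AbstractTransition
    params::T         # current position in parameter space
    log_density::L    # log-probability of the current position
    accepted::Bool    # whether the proposal that produced this state was accepted
end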

Internals: Markov steps

Markov transitions of the chain are defined by overloading AbstractMCMC's step method, which takes the Sampler and current Transition and implements the Sampler's logic to return an updated Transition representing the chain's new state (actually, a pair of Transitions, for cases where the Sampler doesn't obey detailed balance; this isn't relevant for us).

For example, in Metropolis-Hastings sampling this is where we draw a proposal sample and accept or reject it according to the MH criterion. AdvancedMH implements this here; we re-implement this method because 1) we need to record whether a proposal was accepted or rejected, and 2) our calls to propose() are stepsize-dependent.