kliff.uq

class kliff.uq.MCMC(loss: Loss, nwalkers: Optional[int] = None, logprior_fn: Optional[Callable] = None, logprior_args: Optional[tuple] = None, sampler: Optional[str] = 'ptemcee', **kwargs)

MCMC sampler class for Bayesian uncertainty quantification.

This is a wrapper over PtemceeSampler and EmceeSampler. Currently, only these two samplers are implemented.

Parameters
  • loss (Loss) – Loss function class from Loss.

  • nwalkers (Optional[int]) – Number of walkers to simulate. The minimum number of walkers is twice the number of parameters. It defaults to this minimum value.

  • logprior_fn (Optional[Callable]) – A function that evaluates the logarithm of the prior distribution. The prior doesn’t need to be normalized. It defaults to a uniform prior over a finite range.

  • logprior_args (Optional[tuple]) – Additional positional arguments for logprior_fn. If the default logprior_fn is used, the boundaries of the uniform prior can be specified here.

  • sampler (Optional[str] or sampler instance) – Specifies the MCMC sampler to use. The value can be one of the strings "ptemcee" (the default) or "emcee", or a sampler class instance. If "ptemcee" or "emcee" is given, the corresponding internal sampler class will be used.

  • **kwargs (Optional[dict]) – Additional keyword arguments for ptemcee.Sampler or emcee.EnsembleSampler.
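
To illustrate the kind of callable that logprior_fn expects, here is a minimal sketch of an unnormalized uniform log-prior over a finite box. The name uniform_logprior_sketch and the (ndim, 2) layout of bounds are hypothetical choices made for this example; they are not taken from the KLIFF source, and the built-in default prior may be implemented differently.

```python
import numpy as np


def uniform_logprior_sketch(x, bounds):
    """Unnormalized log-prior: 0 inside the box, -inf outside.

    x      : parameter vector, shape (ndim,)
    bounds : array-like of (lower, upper) pairs, shape (ndim, 2)
             (hypothetical layout chosen for this sketch)
    """
    x = np.asarray(x, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    lower, upper = bounds[:, 0], bounds[:, 1]
    # The prior is flat, so only membership in the box matters.
    inside = np.all((x > lower) & (x < upper))
    return 0.0 if inside else -np.inf


bounds = [(0.0, 1.0), (0.0, 2.0)]
inside_value = uniform_logprior_sketch([0.5, 1.0], bounds)
outside_value = uniform_logprior_sketch([1.5, 1.0], bounds)
```

Assuming the sampler calls the prior as logprior_fn(x, *logprior_args), such a function could be passed as logprior_fn=uniform_logprior_sketch together with logprior_args=(bounds,). With two parameters, the minimum number of walkers stated above would be four.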

builtin_samplers = ['ptemcee', 'emcee']

kliff.uq.get_T0(loss)

Compute the natural temperature. The minimum loss is the loss value at the optimal parameters.

Parameters

loss (Loss) – Loss function class from Loss.

Returns

Value of the natural temperature.

Return type

float
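
The description above does not state the formula. A definition of the natural temperature that is common in the Bayesian potential-fitting literature is T0 = 2 C0 / N, where C0 is the minimum loss and N is the number of parameters. The sketch below assumes that definition; the function name and arguments are hypothetical and it does not operate on a Loss instance the way get_T0 does.

```python
def natural_temperature_sketch(min_loss, n_params):
    """Return T0 = 2 * C0 / N (assumed definition, see lead-in).

    min_loss : loss value at the optimal parameters (C0)
    n_params : number of fitted parameters (N)
    """
    if n_params <= 0:
        raise ValueError("n_params must be a positive integer")
    return 2.0 * min_loss / n_params


# Example: a minimum loss of 3.0 with 6 parameters.
t0 = natural_temperature_sketch(3.0, 6)
```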

kliff.uq.mser(chain, dmin=1, dstep=10, dmax=-1, full_output=False)

Estimate the equilibration time using the marginal standard error rule (MSER). This is done by calculating the squared standard error (SE) of chain_d, where chain_d contains the last n - d elements of the chain (n is the total number of iterations in the chain), for progressively larger values of d, starting from dmin up to dmax in increments of dstep. The squared SE values are stored in a list. The function then searches for the minimum element in the list and returns the index of that element.

Parameters
  • chain (1D np.ndarray) – Array containing the time series.

  • dmin (int) – Index where to start the search in the time series.

  • dstep (int) – Step size by which the search index is incremented.

  • dmax (int) – Index where to stop the search in the time series.

  • full_output (bool) – A flag to return the list of squared standard errors.

Returns

dstar – Estimate of the equilibration time using MSER. If full_output=True, then a dictionary containing the estimated equilibration time and the list of squared standard errors will be returned.

Return type

int or dict
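
The procedure described above can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the KLIFF source: the name mser_sketch is hypothetical, a negative dmax is assumed to count from the end of the chain, and the sketch returns the truncation point d itself rather than a position in the list.

```python
import numpy as np


def mser_sketch(chain, dmin=0, dstep=10, dmax=-1):
    """Return (dstar, se2_list) for a 1D chain using the MSER heuristic."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    # Assumption of this sketch: a negative dmax counts from the chain's end.
    stop = dmax if dmax >= 0 else n + dmax
    candidates = list(range(dmin, stop, dstep))
    if not candidates:
        raise ValueError("no truncation points to test; check dmin/dmax")

    se2_list = []
    for d in candidates:
        tail = chain[d:]  # the last n - d samples
        # Squared standard error of the mean of the truncated chain.
        se2_list.append(tail.var() / tail.size)

    dstar = candidates[int(np.argmin(se2_list))]
    return dstar, se2_list


# Synthetic chain: a 50-step linear transient, then a stationary pattern.
transient = np.linspace(50.0, 0.0, 50, endpoint=False)
stationary = np.tile([1.0, -1.0], 100)
chain = np.concatenate([transient, stationary])

dstar, se2 = mser_sketch(chain, dmin=0, dstep=10)
```

For this synthetic chain the squared SE is smallest when exactly the 50 transient samples are discarded. On real chains the estimate is noisier, which is one reason to limit how far the search runs with dmax.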

kliff.uq.autocorr(chain, *args, **kwargs)

Use emcee package to estimate the autocorrelation length.

Parameters
  • chain (np.ndarray (nwalkers, nsteps, ndim,)) – Chains from the MCMC simulation. The shape of the chains needs to be (nsteps, nwalkers, ndim). Note that the burn-in time needs to be discarded prior to this calculation.

  • args – Additional positional arguments of emcee.autocorr.integrated_time.

  • kwargs – Additional keyword arguments of emcee.autocorr.integrated_time.

Returns

Estimate of the autocorrelation length for each parameter.

Return type

float or array

kliff.uq.rhat(chain, time_axis=1, return_WB=False)

Compute the value of \hat{r} proposed by Brooks and Gelman [BrooksGelman1998]. If the samples come from a PTMCMC simulation, the chain must come from a single temperature only.

Parameters
  • chain (ndarray) – The MCMC chain as an ndarray, preferably with the shape (nwalkers, nsteps, ndims). The shape can also be (nsteps, nwalkers, ndims), in which case the argument time_axis needs to be set to 0.

  • time_axis (int (optional)) – Axis in which the time series is stored (0 or 1). For emcee results, the time series is stored in axis 0, but for ptemcee for a given temperature, the time axis is 1.

  • return_WB (bool (optional)) – A flag to return covariance matrices within and between chains.

Returns

  • r (float) – The value of rhat.

  • W, B (2d ndarray) – Matrices of covariance within and between the chains.

References

BrooksGelman1998

Brooks, S.P., Gelman, A., 1998. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7, 434–455. https://doi.org/10.1080/10618600.1998.10474787