kliff.uq

class kliff.uq.MCMC(loss: Loss, nwalkers: Optional[int] = None, logprior_fn: Optional[Callable] = None, logprior_args: Optional[tuple] = None, sampler: Optional[str] = 'ptemcee', **kwargs)

MCMC sampler class for Bayesian uncertainty quantification.

This is a wrapper over PtemceeSampler and EmceeSampler. Currently, only these two samplers are implemented.

Parameters

loss (Loss) – Loss function class from Loss.
nwalkers (Optional[int]) – Number of walkers to simulate. The minimum number of walkers is twice the number of parameters. It defaults to this minimum value.
logprior_fn (Optional[Callable]) – A function that evaluates the logarithm of the prior distribution. The prior doesn’t need to be normalized. It defaults to a uniform prior over a finite range.
logprior_args (Optional[tuple]) – Additional positional arguments of the logprior_fn. If the default logprior_fn is used, then the boundaries of the uniform prior can be specified here.
sampler (Optional[str] or sampler instance) – An argument that specifies the MCMC sampler to use. The value can be one of the strings "ptemcee" (the default value) or "emcee", or a sampler class instance. If "ptemcee" or "emcee" is given, the respective internal sampler class will be used.
**kwargs (Optional[dict]) – Additional keyword arguments for ptemcee.Sampler or emcee.EnsembleSampler.

builtin_samplers = ['ptemcee', 'emcee']
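
To make the default prior more concrete, here is a minimal sketch of a log-prior that is uniform over a finite box and that could, in principle, be supplied as logprior_fn. The function name logprior_uniform and the (x, bounds) signature are assumptions made for illustration; the exact calling convention expected by MCMC is not stated above and should be checked against the library source.

```python
import numpy as np


def logprior_uniform(x, bounds):
    """Unnormalized log of a uniform prior over a finite box (hypothetical helper).

    x      : parameter vector, shape (ndim,)
    bounds : array of shape (ndim, 2) holding (lower, upper) per parameter
    """
    x = np.asarray(x, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    lower, upper = bounds[:, 0], bounds[:, 1]
    # Inside the box the unnormalized log-prior is constant (0 here);
    # outside it the prior density is zero, so its logarithm is -inf.
    inside = np.all((x > lower) & (x < upper))
    return 0.0 if inside else -np.inf


bounds = np.array([[-1.0, 1.0], [0.0, 10.0]])
inside_value = logprior_uniform([0.5, 2.0], bounds)
outside_value = logprior_uniform([1.5, 2.0], bounds)
```

Because the prior does not need to be normalized, returning a constant inside the box is sufficient for sampling.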

kliff.uq.get_T0(loss)

Compute the natural temperature from the minimum loss, where the minimum loss is the loss value at the optimal parameters.

Parameters: loss (Loss) – Loss function class from Loss.
Returns: Value of the natural temperature.
Return type: float
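
The formula behind the natural temperature is not stated above. A definition often used in Bayesian fitting of interatomic potentials is T0 = 2 C0 / N, where C0 is the minimum loss and N is the number of parameters. The sketch below assumes that definition; the function name natural_temperature and its arguments are hypothetical and do not mirror the signature of get_T0, which takes a Loss instance.

```python
def natural_temperature(min_loss, n_params):
    """Assumed definition: T0 = 2 * C0 / N (not confirmed by the text above)."""
    if n_params <= 0:
        raise ValueError("n_params must be a positive integer")
    return 2.0 * min_loss / n_params


# Example: a minimum loss of 3.0 with 4 fitted parameters.
t0 = natural_temperature(min_loss=3.0, n_params=4)
```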

kliff.uq.mser(chain, dmin=1, dstep=10, dmax=-1, full_output=False)

Estimate the equilibration time using the marginal standard error rule (MSER). This is done by calculating the squared standard error (SE) of chain_d, where chain_d contains the last $n-d$ elements of the chain (n is the total number of iterations for each chain), for progressively larger d values, starting from dmin up to dmax, incremented by dstep. The squared SE values are stored in a list. The function then searches for the minimum element in the list and returns the index of that element.

Parameters

chain (1D np.ndarray) – Array containing the time series.
dmin (int) – Index where to start the search in the time series.
dstep (int) – Step size by which the search index is incremented.
dmax (int) – Index where to stop the search in the time series.
full_output (bool) – A flag to return the list of squared standard errors.

Returns

dstar – Estimate of the equilibration time using MSER. If full_output=True, then a dictionary containing the estimated equilibration time and the list of squared standard errors will be returned.

Return type

int or dict
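
The procedure described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library implementation: the name mser_sketch is hypothetical, a negative dmax is assumed to count from the end of the chain, the squared standard error is taken as the biased sample variance of the truncated chain divided by its length, and the function returns the truncation point d itself.

```python
import numpy as np


def mser_sketch(chain, dmin=1, dstep=10, dmax=-1):
    """Return (dstar, se2): the truncation point minimizing the squared SE."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    # Assumption: a negative dmax is interpreted relative to the chain length.
    stop = dmax if dmax >= 0 else n + dmax
    candidates = np.arange(dmin, stop, dstep)
    # Squared standard error of the last n - d elements for each candidate d.
    se2 = np.array([np.var(chain[d:]) / (n - d) for d in candidates])
    dstar = int(candidates[np.argmin(se2)])
    return dstar, se2


# Synthetic series: a decaying transient of 200 steps followed by stationary noise.
rng = np.random.default_rng(0)
transient = np.linspace(10.0, 0.0, 200)
series = np.concatenate([transient, np.zeros(800)]) + 0.1 * rng.normal(size=1000)
dstar, se2 = mser_sketch(series)
```

For this synthetic series the estimate should land near the end of the transient, since truncating earlier inflates the variance and truncating later shrinks the sample size.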

kliff.uq.autocorr(chain, *args, **kwargs)

Use emcee package to estimate the autocorrelation length.

Parameters

chain (np.ndarray (nsteps, nwalkers, ndim)) – Chains from the MCMC simulation. The shape of the chains needs to be (nsteps, nwalkers, ndim). Note that the burn-in time needs to be discarded prior to this calculation.
args – Additional positional arguments of emcee.autocorr.integrated_time.
kwargs – Additional keyword arguments of emcee.autocorr.integrated_time.

Returns

Estimate of the autocorrelation length for each parameter.

Return type

float or array
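
Since the required layout is (nsteps, nwalkers, ndim) and the burn-in must be removed first, a chain stored walker-first needs a small amount of preparation. The sketch below shows one way to do this with NumPy; the array raw, the sizes, and the burn-in length are made-up values for illustration only.

```python
import numpy as np

nwalkers, nsteps, ndim = 8, 500, 3
rng = np.random.default_rng(1)
# Stand-in for samples stored walker-first, with shape (nwalkers, nsteps, ndim).
raw = rng.normal(size=(nwalkers, nsteps, ndim))

burnin = 100  # hypothetical burn-in length, e.g. an equilibration-time estimate

# Put the time axis first, then drop the burn-in steps.
chain = np.swapaxes(raw, 0, 1)[burnin:]
```

The resulting chain has shape (nsteps - burnin, nwalkers, ndim) and could then be passed to autocorr.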

kliff.uq.rhat(chain, time_axis=1, return_WB=False)

Compute the value of $\hat{r}$ proposed by Brooks and Gelman [BrooksGelman1998]. If the samples come from a PTMCMC simulation, then the chain needs to come from a single temperature only.

Parameters

chain (ndarray) – The MCMC chain as an ndarray, preferably with the shape (nwalkers, nsteps, ndims). The shape can also be (nsteps, nwalkers, ndims), in which case the argument time_axis needs to be set to 0.
time_axis (int (optional)) – Axis in which the time series is stored (0 or 1). For emcee results, the time series is stored in axis 0, but for ptemcee for a given temperature, the time axis is 1.
return_WB (bool (optional)) – A flag to return covariance matrices within and between chains.

Returns

r (float) – The value of rhat.
W, B (2d ndarray) – Matrices of covariance within and between the chains.

References

BrooksGelman1998: Brooks, S.P., Gelman, A., 1998. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7, 434–455. https://doi.org/10.1080/10618600.1998.10474787
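
As an illustration of the quantity involved, the following NumPy sketch computes the multivariate potential scale reduction factor in the form given by Brooks and Gelman, $\hat{R} = \frac{n-1}{n} + \frac{m+1}{m}\lambda_1$, where $\lambda_1$ is the largest eigenvalue of $W^{-1}B/n$, m is the number of chains, and n is the number of steps. The function name rhat_sketch is hypothetical, and whether rhat follows exactly this form is an assumption to be checked against the library source.

```python
import numpy as np


def rhat_sketch(chain, time_axis=1):
    """Multivariate PSRF for a chain of shape (nwalkers, nsteps, ndim)."""
    chain = np.asarray(chain, dtype=float)
    if time_axis == 0:
        # Convert (nsteps, nwalkers, ndim) to (nwalkers, nsteps, ndim).
        chain = np.swapaxes(chain, 0, 1)
    m, n, _ = chain.shape

    # Within-chain covariance W, averaged over all chains.
    chain_means = chain.mean(axis=1)                      # shape (m, ndim)
    centered = chain - chain_means[:, None, :]
    W = np.einsum("mni,mnj->ij", centered, centered) / (m * (n - 1))

    # Between-chain covariance divided by n: covariance of the chain means.
    dev = chain_means - chain_means.mean(axis=0)
    B_over_n = dev.T @ dev / (m - 1)

    # Largest eigenvalue of W^{-1} (B / n).
    lambda1 = np.max(np.linalg.eigvals(np.linalg.solve(W, B_over_n)).real)
    return (n - 1) / n + (m + 1) / m * lambda1


rng = np.random.default_rng(2)
mixed = rng.normal(size=(4, 2000, 2))       # four chains sampling one distribution
r_mixed = rhat_sketch(mixed)

shifted = mixed.copy()
shifted[0, :, 0] += 5.0                     # one chain stuck in a different region
r_shifted = rhat_sketch(shifted)
```

Well-mixed chains give a value close to 1, while a chain stuck away from the others pushes the value well above 1.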