kliff.uq

class kliff.uq.MCMC(loss: Loss, nwalkers: int | None = None, logprior_fn: Callable | None = None, logprior_args: tuple | None = None, sampler: str | None = 'ptemcee', **kwargs)[source]

MCMC sampler class for Bayesian uncertainty quantification.

This is a wrapper over PtemceeSampler and EmceeSampler. Currently, only these two samplers are implemented.

Parameters:
  • loss – Loss function class from Loss.

  • nwalkers – Number of walkers to simulate. The minimum number of walkers is twice the number of parameters. It defaults to this minimum value.

  • logprior_fn – A function that evaluates the logarithm of the prior distribution. The prior doesn’t need to be normalized. It defaults to a uniform prior over a finite range.

  • logprior_args – Additional positional arguments of the logprior_fn. If the default logprior_fn is used, then the boundaries of the uniform prior can be specified here.

  • sampler – An argument that specifies the MCMC sampler to use. The value can be one of the strings "ptemcee" (the default value) or "emcee", or a sampler class instance. If "ptemcee" or "emcee" is given, the respective internal sampler class will be used.

  • **kwargs – Additional keyword arguments for ptemcee.Sampler or emcee.EnsembleSampler.
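
As an illustration of a custom log-prior, the sketch below defines an unnormalized Gaussian log-prior confined to a finite box. It assumes the sampler calls the function as logprior_fn(x, *logprior_args). That calling convention, the function name gaussian_box_logprior, and the arguments mean, sigma, and bounds are assumptions made for this example and are not taken from the KLIFF source.

```python
import math


def gaussian_box_logprior(x, mean, sigma, bounds):
    """Unnormalized log-prior: independent Gaussians truncated to a box.

    Hypothetical signature: x is the parameter vector, and the remaining
    arguments are what would be passed through logprior_args.
    """
    logp = 0.0
    for xi, mu, sig, (lo, hi) in zip(x, mean, sigma, bounds):
        if not lo <= xi <= hi:
            # Zero prior probability outside the box.
            return -math.inf
        # Constant normalization terms are dropped on purpose.
        logp += -0.5 * ((xi - mu) / sig) ** 2
    return logp


# Extra positional arguments, in the order the function expects them.
logprior_args = ([1.0, 2.0], [0.5, 0.5], [(0.0, 3.0), (0.0, 4.0)])

inside = gaussian_box_logprior([1.0, 2.0], *logprior_args)
outside = gaussian_box_logprior([5.0, 2.0], *logprior_args)
```

Because the prior does not need to be normalized, only differences in the returned values matter. Returning negative infinity outside the box rejects any proposal that leaves the allowed range.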

builtin_samplers = ['ptemcee', 'emcee']
kliff.uq.get_T0(loss)[source]

Compute the natural temperature.

The minimum loss is the loss value at the optimal parameters.

Parameters:

loss (Loss) – Loss function class from Loss.

Return type:

float

Returns:

Value of the natural temperature.
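
For orientation, a natural temperature of this kind is commonly defined as T0 = 2 C0 / N, where C0 is the minimum loss and N is the number of parameters. The sketch below assumes that definition, which the text above does not confirm, and the helper name natural_temperature is hypothetical.

```python
def natural_temperature(min_loss, nparams):
    """Return 2 * C0 / N (assumed definition, not taken from KLIFF)."""
    if nparams <= 0:
        raise ValueError("nparams must be positive")
    return 2.0 * min_loss / nparams


T0 = natural_temperature(min_loss=3.0, nparams=4)
```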

kliff.uq.mser(chain, dmin=1, dstep=10, dmax=-1, full_output=False)[source]

Estimate the equilibration time using marginal standard error rule (MSER).

This is done by calculating the squared standard error (SE) of chain_d, where chain_d contains the last n-d elements of the chain (n is the total number of iterations for each chain), for progressively larger d values, starting from dmin up to dmax, incremented by dstep. The squared SE values are stored in a list. Then we search for the minimum element in the list and return the index of that element.

Parameters:
  • chain (ndarray) – (nsteps,) Array containing the time series.

  • dmin (Optional[int]) – Index where to start the search in the time series.

  • dstep (Optional[int]) – Step size by which the search index is incremented.

  • dmax (Optional[int]) – Index where to stop the search in the time series.

  • full_output (Optional[bool]) – A flag to return the list of squared standard error.

Return type:

Union[int, dict]

Returns:

Estimate of the equilibration time using MSER. If full_output=True, then a dictionary containing the estimated equilibration time and the list of squared standard errors will be returned.
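
The search described above can be sketched with the standard library alone. This is a minimal illustration, not the KLIFF implementation: the name mser_sketch is hypothetical, the squared standard error is taken to be the sample variance divided by the number of retained points, and a negative dmax is assumed to count from the end of the chain.

```python
from statistics import variance


def mser_sketch(chain, dmin=1, dstep=10, dmax=-1):
    """Return the truncation point d that minimizes the squared standard error.

    Assumption: a negative dmax counts from the end of the chain, as a
    Python negative index would.
    """
    n = len(chain)
    if dmax < 0:
        dmax = n + dmax
    best_d, best_se2 = None, float("inf")
    for d in range(dmin, dmax, dstep):
        tail = chain[d:]  # the last n - d elements
        if len(tail) < 2:
            break  # a variance needs at least two points
        se2 = variance(tail) / len(tail)  # squared standard error of the mean
        if se2 < best_se2:
            best_d, best_se2 = d, se2
    return best_d


# Toy chain: a decaying transient of 20 steps, then a stationary signal.
transient = [10.0 - 0.5 * i for i in range(20)]
stationary = [0.1, -0.1] * 90
d_star = mser_sketch(transient + stationary, dmin=1, dstep=1)
```

In this toy chain the minimum falls where the transient ends. Discarding more samples than that only shrinks the sample size and raises the standard error again.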

kliff.uq.autocorr(chain, *args, **kwargs)[source]

Use emcee package to estimate the autocorrelation length.

Parameters:
  • chain (ndarray) – (nwalkers, nsteps, ndim,) Chains from the MCMC simulation. Note that the burn-in time needs to be discarded prior to this calculation.

  • args – Additional positional arguments of emcee.autocorr.integrated_time.

  • kwargs – Additional keyword arguments of emcee.autocorr.integrated_time.

Return type:

ndarray

Returns:

Estimate of the autocorrelation length for each parameter.

kliff.uq.rhat(chain, time_axis=1, return_WB=False)[source]

Compute the value of \hat{r} proposed by Brooks and Gelman [BrooksGelman1998].

If the samples come from a PTMCMC simulation, then the chain needs to be from a single temperature only.

Parameters:
  • chain (ndarray) – The MCMC chain as an ndarray, preferably with the shape (nwalkers, nsteps, ndim,). The shape can also be (nsteps, nwalkers, ndim,), in which case the argument time_axis needs to be set to 0.

  • time_axis (Optional[int]) – Axis in which the time series is stored (0 or 1). For emcee results, the time series is stored in axis 0, but for ptemcee for a given temperature, the time axis is 1.

  • return_WB (Optional[bool]) – A flag to return covariance matrices within and between chains.

Return type:

Union[float, Tuple[float, ndarray, ndarray]]

Returns:

The value of rhat. If return_WB=True, also returns the covariance matrices within and between the chains.
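
To make the statistic concrete, the sketch below evaluates the scalar special case of the Brooks and Gelman multivariate statistic, \hat{R} = (n-1)/n + ((m+1)/m) \lambda_1, where \lambda_1 is the largest eigenvalue of W^{-1} B / n. For a single parameter, \lambda_1 reduces to (B/n)/W. It is an illustration under that assumption, not the KLIFF implementation, and the name rhat_univariate is hypothetical.

```python
from statistics import mean, variance


def rhat_univariate(chains):
    """Potential scale reduction factor for one scalar parameter.

    chains is a list of m walkers, each a list of n samples.
    """
    m = len(chains)
    n = len(chains[0])
    # W: average of the within-chain sample variances.
    W = mean([variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    B_over_n = variance([mean(c) for c in chains])
    return (n - 1) / n + (m + 1) / m * B_over_n / W


# Two walkers exploring the same region, and two that never overlap.
mixed = [[0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0]]
separated = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]
r_mixed = rhat_univariate(mixed)
r_separated = rhat_univariate(separated)
```

Walkers that sample the same region give a value close to or below one, while walkers stuck in different regions give a value far above one, which signals that the chains have not converged.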

References

[BrooksGelman1998]

Brooks, S.P., Gelman, A., 1998. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7, 434–455. https://doi.org/10.1080/10618600.1998.10474787

class kliff.uq.Bootstrap(loss: Loss, seed: int | None = 1717, *args, **kwargs)[source]

Bootstrap sampler class for uncertainty quantification.

This is a wrapper over BootstrapEmpiricalModel and BootstrapNeuralNetworkModel that provides a unified interface. You can also use the two classes directly.

Parameters:
class kliff.uq.BootstrapEmpiricalModel(loss, seed=1717)[source]

Bootstrap sampler class for empirical, physics-based potentials.

Parameters:
  • loss (Loss) – Loss function class instance from Loss.

  • seed (Optional[int]) – Random number generator seed.

generate_bootstrap_compute_arguments(nsamples, bootstrap_cas_generator_fn=None, **kwargs)[source]

Generate bootstrap compute arguments samples.

If this function is called multiple times, say K times, then it generates K × nsamples bootstrap compute arguments samples in total. That is, consecutive calls of this function append the newly generated compute arguments samples.

Parameters:
  • nsamples (int) – Number of bootstrap samples to generate.

  • bootstrap_cas_generator_fn (Optional[Callable]) – A function to generate bootstrap compute argument samples. The default function combines the compute arguments across all calculators and samples with replacement from the combined list. Another possible convention is to sample with replacement from the compute arguments list of each calculator separately, in which case a custom function needs to be defined and used. The required argument for a custom generator function is the requested number of samples.

  • kwargs – Additional keyword arguments to bootstrap_cas_generator_fn.
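
The two sampling conventions mentioned above can be contrasted with a small standalone sketch. Only the leading nsamples argument comes from the description. The function names, the pool and rng arguments, and the nested-list return layout are hypothetical, so the interface KLIFF actually expects from a generator function may differ.

```python
import random


def combined_generator(nsamples, pool, rng):
    """Sample with replacement from one pool that merges all calculators."""
    return [rng.choices(pool, k=len(pool)) for _ in range(nsamples)]


def per_calculator_generator(nsamples, pools, rng):
    """Sample with replacement within each calculator's own list."""
    return [
        [rng.choices(pool, k=len(pool)) for pool in pools]
        for _ in range(nsamples)
    ]


rng = random.Random(1717)
calc_a = ["a0", "a1", "a2"]  # stand-ins for compute arguments
calc_b = ["b0", "b1"]

combined = combined_generator(4, calc_a + calc_b, rng)
separate = per_calculator_generator(4, [calc_a, calc_b], rng)
```

With the combined convention, the number of compute arguments drawn for each calculator can vary from sample to sample. The per-calculator convention keeps each calculator's count fixed.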

save_bootstrap_compute_arguments(filename)[source]

Export the generated bootstrap compute arguments as a json file.

The json file will contain the identifier of the compute arguments for each sample.

Parameters:

filename (Union[Path, str]) – Where to export the bootstrap compute arguments samples.

load_bootstrap_compute_arguments(filename)[source]

Load the bootstrap compute arguments from a json file.

If a list of bootstrap compute arguments samples exists prior to this function call, then the samples read from this file will be appended to the old list.

Parameters:

filename (Union[Path, str]) – Name or path of json file to read.

Return type:

dict

Returns:

Dictionary read from the json file.

run(min_kwargs=None, initial_guess=None, residual_fn_list=None, callback=None)[source]

Iterate over the generated bootstrap compute arguments samples and train the potential using each compute arguments sample.

Parameters:
  • min_kwargs (Optional[dict]) – Keyword arguments for minimize().

  • initial_guess (Optional[ndarray]) – (ndim,) Initial guess of parameters to use for the minimization. It is recommended to use the same values as used in the training process if such step is done prior to running bootstrap.

  • residual_fn_list (Optional[List]) – List of residual functions to use in each calculator. Currently, this only affects the case when multiple calculators are used. If there is only a single calculator, this argument can be ignored.

  • callback (Optional[Callable]) – Called after each iteration. The arguments for this function are the bootstrap instance and the output of minimize(). This function can also be used to break the run, by returning boolean True.

Return type:

ndarray

Returns:

(nsamples, ndim,) Parameter samples from bootstrapping.

Raises:
  • BootstrapError – If no bootstrap compute arguments have been generated prior to calling this method.

  • ValueError – If the calculators use neither the energy nor forces.
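
The callback contract described above (two arguments, with a True return value breaking the run) can be illustrated with a small closure. The factory name make_stop_after is hypothetical, and the final line only simulates the calls that run() would make.

```python
def make_stop_after(max_iterations):
    """Build a callback that returns True once max_iterations have run."""
    state = {"count": 0}

    def callback(bootstrap, minimize_output):
        # Both arguments are accepted but unused in this sketch.
        state["count"] += 1
        return state["count"] >= max_iterations

    return callback


callback = make_stop_after(3)
# Simulate three iterations with placeholder arguments.
signals = [callback(None, None) for _ in range(3)]
```

A callback of this shape could equally inspect the optimizer output, for example to stop once a fit fails to converge.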

restore_loss()[source]

Restore the loss function: revert the compute arguments and the parameters to their original state.

class kliff.uq.BootstrapNeuralNetworkModel(loss, seed=1717, orig_state_filename='orig_model.pkl')[source]

Bootstrap sampler class for neural network potentials.

Parameters:
  • loss (Loss) – Loss function class instance from Loss.

  • seed (Optional[int]) – Random number generator seed.

  • orig_state_filename (Union[str, Path, None]) – Name of the file in which the initial state of the model prior to bootstrapping will be stored. This is to use at the end of the bootstrap run to reset the model to the initial state.

generate_bootstrap_compute_arguments(nsamples, bootstrap_cas_generator_fn=None, **kwargs)[source]

Generate bootstrap compute arguments samples.

If this function is called multiple times, say K times, then it generates K × nsamples bootstrap compute arguments samples in total. That is, consecutive calls of this function append the newly generated compute arguments samples.

Parameters:
  • nsamples (int) – Number of bootstrap samples to generate.

  • bootstrap_cas_generator_fn (Optional[Callable]) – A function to generate bootstrap compute argument samples. The default function combines the compute arguments across all calculators and samples with replacement from the combined list. Another possible convention is to sample with replacement from the compute arguments list of each calculator separately, in which case a custom function needs to be defined and used.

  • kwargs – Additional keyword arguments to bootstrap_cas_generator_fn.

save_bootstrap_compute_arguments(filename)[source]

Export the generated bootstrap compute arguments as a json file.

The json file will contain the identifier of the compute arguments for each sample.

Parameters:

filename (Union[Path, str]) – Where to export the bootstrap compute arguments samples.

load_bootstrap_compute_arguments(filename)[source]

Load the bootstrap compute arguments from a json file.

If a list of bootstrap compute arguments samples exists prior to this function call, then the samples read from this file will be appended to the old list.

Parameters:

filename (Union[Path, str]) – Name or path of json file to read.

Return type:

dict

Returns:

Dictionary read from the json file.

run(min_kwargs=None, callback=None)[source]

Iterate over the generated bootstrap compute arguments samples and train the potential using each compute arguments sample.

Parameters:
  • min_kwargs (Optional[dict]) – Keyword arguments for minimize().

  • callback (Optional[Callable]) – Called after each iteration. The arguments for this function are the bootstrap instance and the output of minimize(). This function can also be used to break the run, by returning boolean True.

Return type:

ndarray

Returns:

(nsamples, ndim,) Parameter samples from bootstrapping.

Raises:

BootstrapError – If no bootstrap compute arguments have been generated prior to calling this method.

restore_loss()[source]

Restore the loss function: revert the compute arguments and the parameters to their original state.