kliff.uq¶
- class kliff.uq.MCMC(loss: Loss, nwalkers: int | None = None, logprior_fn: Callable | None = None, logprior_args: tuple | None = None, sampler: str | None = 'ptemcee', **kwargs)[source]¶
MCMC sampler class for Bayesian uncertainty quantification.
This is a wrapper over PtemceeSampler and EmceeSampler. Currently, only these two samplers are implemented.
- Parameters:
loss – Loss function class from Loss.
nwalkers – Number of walkers to simulate. The minimum number of walkers is twice the number of parameters. It defaults to this minimum value.
logprior_fn – A function that evaluates the logarithm of the prior distribution. The prior doesn’t need to be normalized. It defaults to a uniform prior over a finite range.
logprior_args – Additional positional arguments of the logprior_fn. If the default logprior_fn is used, then the boundaries of the uniform prior can be specified here.
sampler – An argument that specifies the MCMC sampler to use. The value can be one of the strings "ptemcee" (the default value) or "emcee", or a sampler class instance. If "ptemcee" or "emcee" is given, the respective internal sampler class will be used.
**kwargs – Additional keyword arguments for ptemcee.Sampler or emcee.EnsembleSampler.
- builtin_samplers = ['ptemcee', 'emcee']¶
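As an illustration of what a custom logprior_fn might look like, here is a minimal sketch of an unnormalized uniform log-prior over a box. The function name log_uniform_prior and the (ndim, 2) layout of bounds are hypothetical choices for this example, not the library's own default.

```python
import numpy as np


def log_uniform_prior(x, bounds):
    """Unnormalized log-density of a uniform prior over a box.

    x: (ndim,) parameter vector.
    bounds: (ndim, 2) array of [lower, upper] pairs (hypothetical layout).
    """
    x = np.asarray(x, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    inside = np.all((x >= bounds[:, 0]) & (x <= bounds[:, 1]))
    # Constant (0.0) inside the box and -inf outside; no normalization needed.
    return 0.0 if inside else -np.inf


bounds = np.array([[-1.0, 1.0], [0.0, 10.0]])
lp_inside = log_uniform_prior([0.5, 3.0], bounds)
lp_outside = log_uniform_prior([2.0, 3.0], bounds)
```

A function of this shape could be supplied as logprior_fn with logprior_args=(bounds,), assuming the sampler calls it as logprior_fn(x, *logprior_args).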
- kliff.uq.get_T0(loss)[source]¶
Compute the natural temperature.
The minimum loss is the loss value at the optimal parameters.
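The formula itself is not stated on this page. The sketch below assumes the definition T0 = 2 C0 / N, where C0 is the minimum loss and N is the number of parameters. Both this definition and the helper name natural_temperature are assumptions made for illustration.

```python
def natural_temperature(min_loss, nparams):
    """Return 2 * C0 / N (assumed definition, not taken from this page)."""
    if nparams <= 0:
        raise ValueError("nparams must be positive")
    return 2.0 * min_loss / nparams


# Example: a minimum loss of 3.0 with 6 parameters.
T0 = natural_temperature(min_loss=3.0, nparams=6)
```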
- kliff.uq.mser(chain, dmin=1, dstep=10, dmax=-1, full_output=False)[source]¶
Estimate the equilibration time using the marginal standard error rule (MSER).
This is done by calculating the squared standard error (SE) of chain_d, where chain_d contains the last n - d elements of the chain (n is the total number of iterations for each chain), for progressively larger d values, starting from dmin up to dmax, incremented by dstep. The SE values are stored in a list. Then we search for the minimum element in the list and return the index of that element.
- Parameters:
chain (ndarray) – (nsteps,) Array containing the time series.
dmin (Optional[int]) – Index where to start the search in the time series.
dstep (Optional[int]) – Step by which the search index is incremented.
dmax (Optional[int]) – Index where to stop the search in the time series.
full_output (Optional[bool]) – A flag to return the list of squared standard errors.
- Return type:
Union[int,dict]
- Returns:
Estimate of the equilibration time using MSER. If full_output=True, then a dictionary containing the estimated equilibration time and the list of squared standard errors will be returned.
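The search described above can be sketched in a few lines of NumPy. This is an independent illustration, not the library's implementation: the name mser_sketch is hypothetical, a negative dmax is assumed to count from the end of the chain, and the sketch returns the d value at the minimum rather than a position in the SE list.

```python
import numpy as np


def mser_sketch(chain, dmin=1, dstep=10, dmax=-1):
    """Return (d_star, se2): the truncation point minimizing the squared SE."""
    chain = np.asarray(chain, dtype=float)
    n = len(chain)
    if dmax < 0:
        dmax = n + dmax  # assumption: negative dmax is relative to the end
    d_values = np.arange(dmin, dmax, dstep)
    # Squared standard error of the truncated chain: variance / sample count.
    se2 = np.array([np.var(chain[d:]) / (n - d) for d in d_values])
    return int(d_values[np.argmin(se2)]), se2


# Deterministic toy chain: 100 transient steps followed by a stationary signal.
transient = np.full(100, 50.0)
stationary = np.where(np.arange(900) % 2 == 0, 1.0, -1.0)
chain = np.concatenate([transient, stationary])
d_star, se2 = mser_sketch(chain)
```

For this toy chain the minimum falls at the first searched index past the transient, since truncating further only reduces the number of samples without lowering the variance.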
- kliff.uq.autocorr(chain, *args, **kwargs)[source]¶
Use the emcee package to estimate the autocorrelation length.
- Parameters:
chain (ndarray) – (nwalkers, nsteps, ndim,) Chains from the MCMC simulation. Note that the burn-in time needs to be discarded prior to this calculation.
args – Additional positional arguments of emcee.autocorr.integrated_time.
kwargs – Additional keyword arguments of emcee.autocorr.integrated_time.
- Return type:
ndarray
- Returns:
Estimate of the autocorrelation length for each parameter.
- kliff.uq.rhat(chain, time_axis=1, return_WB=False)[source]¶
Compute the value of \hat{R}^p, the multivariate potential scale reduction factor proposed by Brooks and Gelman [BrooksGelman1998].
If the samples come from a PTMCMC simulation, then the chain needs to be from only one of the temperatures.
- Parameters:
chain (ndarray) – The MCMC chain as an ndarray, preferably with the shape (nwalkers, nsteps, ndim,). The shape can also be (nsteps, nwalkers, ndim,), in which case the argument time_axis needs to be set to 0.
time_axis (Optional[int]) – Axis in which the time series is stored (0 or 1). For emcee results, the time series is stored in axis 0, but for ptemcee results at a given temperature, the time axis is 1.
return_WB (Optional[bool]) – A flag to return covariance matrices within and between chains.
- Return type:
Union[float,Tuple[float,ndarray,ndarray]]
- Returns:
The value of rhat. If return_WB=True, also returns the matrices of covariance within and between the chains.
References
[BrooksGelman1998] Brooks, S.P., Gelman, A., 1998. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7, 434–455. https://doi.org/10.1080/10618600.1998.10474787
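For orientation, the Brooks and Gelman statistic can be sketched with NumPy as follows. With m chains of n steps each, W is the mean within-chain covariance, B/n is the covariance of the chain means, and the statistic is (n - 1)/n + (m + 1)/m times the largest eigenvalue of W^{-1} B/n. The function name rhat_sketch is hypothetical, the sketch assumes the (nwalkers, nsteps, ndim) layout with ndim >= 2, and it is not the library's implementation.

```python
import numpy as np


def rhat_sketch(chain):
    """Multivariate PSRF for a chain of shape (nwalkers, nsteps, ndim)."""
    chain = np.asarray(chain, dtype=float)
    m, n, _ = chain.shape
    # W: average of the within-chain covariance matrices.
    W = np.mean([np.cov(c, rowvar=False) for c in chain], axis=0)
    # B/n: covariance of the per-chain means.
    B_over_n = np.cov(chain.mean(axis=1), rowvar=False)
    # Largest eigenvalue of W^{-1} (B/n).
    lam = np.max(np.linalg.eigvals(np.linalg.solve(W, B_over_n)).real)
    return (n - 1) / n + (m + 1) / m * lam


rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
# Two identical walkers: no between-chain variance.
r_same = rhat_sketch(np.stack([base, base]))
# Second walker shifted far from the first: large between-chain variance.
r_shifted = rhat_sketch(np.stack([base, base + 5.0]))
```

When the chain means coincide, the between-chain term vanishes and the value reduces to (n - 1)/n. Chains that sit in different regions give a value well above 1, which signals a lack of convergence.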
- class kliff.uq.Bootstrap(loss: Loss, seed: int | None = 1717, *args, **kwargs)[source]¶
Bootstrap sampler class for uncertainty quantification.
This is a wrapper over BootstrapEmpiricalModel and BootstrapNeuralNetworkModel to provide a unified interface. You can also use the two classes directly.
- Parameters:
loss – Loss function class instance from Loss.
seed – Random number generator seed.
args – Additional positional arguments for instantiating BootstrapEmpiricalModel or BootstrapNeuralNetworkModel.
kwargs – Additional keyword arguments for instantiating BootstrapEmpiricalModel or BootstrapNeuralNetworkModel.
- class kliff.uq.BootstrapEmpiricalModel(loss, seed=1717)[source]¶
Bootstrap sampler class for empirical, physics-based potentials.
- Parameters:
- generate_bootstrap_compute_arguments(nsamples, bootstrap_cas_generator_fn=None, **kwargs)[source]¶
Generate bootstrap compute arguments samples.
If this function is called multiple times, say K times, then it will generate K × nsamples bootstrap compute arguments samples in total. That is, consecutive calls of this function will append the generated compute arguments samples.
- Parameters:
nsamples (int) – Number of bootstrap samples to generate.
bootstrap_cas_generator_fn (Optional[Callable]) – A function to generate bootstrap compute argument samples. The default function combines the compute arguments across all calculators and samples with replacement from the combined list. Another possible convention is to sample with replacement from the compute arguments list of each calculator separately, in which case a custom function needs to be defined and used. The required argument for a custom generator function is the requested number of samples.
kwargs – Additional keyword arguments to bootstrap_cas_generator_fn.
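The default convention described above (pool everything, then resample with replacement) can be sketched as follows. The function name sample_combined, the string identifiers, and the list-of-lists return value are hypothetical; the actual generator's signature and return type are not specified on this page.

```python
import numpy as np


def sample_combined(cas_per_calculator, nsamples, rng):
    """Draw nsamples bootstrap samples from the pooled compute arguments."""
    # Pool the compute arguments of all calculators into a single list.
    combined = [ca for cas in cas_per_calculator for ca in cas]
    n = len(combined)
    samples = []
    for _ in range(nsamples):
        # Sample n indices with replacement from the pooled list.
        idx = rng.integers(0, n, size=n)
        samples.append([combined[i] for i in idx])
    return samples


rng = np.random.default_rng(1717)
cas = [["config_a", "config_b", "config_c"], ["config_d", "config_e"]]
samples = sample_combined(cas, nsamples=4, rng=rng)
```

Each bootstrap sample has the same size as the pooled list, and a given compute argument may appear several times or not at all.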
- save_bootstrap_compute_arguments(filename)[source]¶
Export the generated bootstrap compute arguments as a json file.
The json file will contain the identifier of the compute arguments for each sample.
- Parameters:
filename (Union[Path,str]) – Where to export the bootstrap compute arguments samples.
- load_bootstrap_compute_arguments(filename)[source]¶
Load the bootstrap compute arguments from a json file.
If a list of bootstrap compute arguments samples exists prior to this function call, then the samples read from this file will be appended to the old list.
- Parameters:
filename (Union[Path,str]) – Name or path of json file to read.
- Return type:
dict
- Returns:
Dictionary read from the json file.
- run(min_kwargs=None, initial_guess=None, residual_fn_list=None, callback=None)[source]¶
Iterate over the generated bootstrap compute arguments samples and train the potential using each compute arguments sample.
- Parameters:
min_kwargs (Optional[dict]) – Keyword arguments for minimize().
initial_guess (Optional[ndarray]) – (ndim,) Initial guess of parameters to use for the minimization. It is recommended to use the same values as used in the training process if such a step is done prior to running bootstrap.
residual_fn_list (Optional[List]) – List of residual functions to use in each calculator. Currently, this only affects the case when multiple calculators are used. If there is only a single calculator, this argument can be ignored.
callback (Optional[Callable]) – Called after each iteration. The arguments for this function are the bootstrap instance and the output of minimize(). This function can also be used to break the run by returning boolean True.
- Return type:
ndarray
- Returns:
(nsamples, ndim,) Parameter samples from bootstrapping.
- Raises:
BootstrapError – If no bootstrap compute arguments have been generated prior to calling this method.
ValueError – If the calculators use neither the energy nor forces.
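The callback contract (called after each iteration with the bootstrap instance and the minimizer output, stopping the run when it returns True) can be illustrated with a small stand-in. FakeBootstrap, its results attribute, and the dictionary used in place of the minimize() output are all hypothetical and only mimic the control flow described above.

```python
class FakeBootstrap:
    """Hypothetical stand-in that only mimics the callback contract."""

    def __init__(self, nsamples):
        self.nsamples = nsamples
        self.results = []

    def run(self, callback=None):
        for i in range(self.nsamples):
            # Placeholder for the output of minimize() on sample i.
            opt_result = {"iteration": i, "x": [float(i)]}
            self.results.append(opt_result["x"])
            # Returning boolean True from the callback breaks the run.
            if callback is not None and callback(self, opt_result) is True:
                break
        return self.results


def stop_after_three(bootstrap, opt_result):
    """Stop once three parameter samples have been collected."""
    return len(bootstrap.results) >= 3


boot = FakeBootstrap(nsamples=10)
collected = boot.run(callback=stop_after_three)
```

A callback of this form is one way to cap the run length or to log intermediate results without modifying the sampler itself.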
- class kliff.uq.BootstrapNeuralNetworkModel(loss, seed=1717, orig_state_filename='orig_model.pkl')[source]¶
Bootstrap sampler class for neural network potentials.
- Parameters:
seed (Optional[int]) – Random number generator seed.
orig_state_filename (Union[str,Path,None]) – Name of the file in which the initial state of the model prior to bootstrapping will be stored. This file is used at the end of the bootstrap run to reset the model to its initial state.
- generate_bootstrap_compute_arguments(nsamples, bootstrap_cas_generator_fn=None, **kwargs)[source]¶
Generate bootstrap compute arguments samples.
If this function is called multiple times, say K times, then it will generate K × nsamples bootstrap compute arguments samples in total. That is, consecutive calls of this function will append the generated compute arguments samples.
- Parameters:
nsamples (int) – Number of bootstrap samples to generate.
bootstrap_cas_generator_fn (Optional[Callable]) – A function to generate bootstrap compute argument samples. The default function combines the compute arguments across all calculators and samples with replacement from the combined list. Another possible convention is to sample with replacement from the compute arguments list of each calculator separately, in which case a custom function needs to be defined and used.
kwargs – Additional keyword arguments to bootstrap_cas_generator_fn.
- save_bootstrap_compute_arguments(filename)[source]¶
Export the generated bootstrap compute arguments as a json file.
The json file will contain the identifier of the compute arguments for each sample.
- Parameters:
filename (Union[Path,str]) – Where to export the bootstrap compute arguments samples.
- load_bootstrap_compute_arguments(filename)[source]¶
Load the bootstrap compute arguments from a json file.
If a list of bootstrap compute arguments samples exists prior to this function call, then the samples read from this file will be appended to the old list.
- Parameters:
filename (Union[Path,str]) – Name or path of json file to read.
- Return type:
dict
- Returns:
Dictionary read from the json file.
- run(min_kwargs=None, callback=None)[source]¶
Iterate over the generated bootstrap compute arguments samples and train the potential using each compute arguments sample.
- Parameters:
min_kwargs (Optional[dict]) – Keyword arguments for minimize().
callback (Optional[Callable]) – Called after each iteration. The arguments for this function are the bootstrap instance and the output of minimize(). This function can also be used to break the run by returning boolean True.
- Return type:
ndarray
- Returns:
(nsamples, ndim,) Parameter samples from bootstrapping.
- Raises:
BootstrapError – If no bootstrap compute arguments have been generated prior to calling this method.