kliff.descriptors
- class kliff.descriptors.Descriptor(cut_dists, cut_name, hyperparams, normalize=True, dtype=numpy.float32)
Base class of atomic environment descriptors.
Process dataset to generate fingerprints. This is the base class for all descriptors, so it should not be used directly. Instead, use descriptors built on top of it, such as SymmetryFunction and Bispectrum, to transform the atomic environment information into fingerprints.
- Parameters
cut_dists (Dict[str, float]) – Cutoff distances, with keys of the form A-B, where A and B are species strings, and float values. Example: cut_dists = {'C-C': 5.0}
cut_name (str) – Name of the cutoff function, such as cos, P3, and P7.
hyperparams (Union[Dict, str]) – A dictionary of the hyperparameters of the descriptor, or a string selecting a predefined set of hyperparameters.
normalize (bool) – If True, the fingerprints are centered and normalized: zeta = (zeta - mean(zeta)) / stdev(zeta)
dtype (np.dtype) – Data type of the generated fingerprints, such as np.float32 and np.float64.
- size
int – Length of the fingerprint vector.
- mean
list – Mean of the fingerprints.
- stdev
list – Standard deviation of the fingerprints.
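The centering and scaling applied when normalize=True, zeta = (zeta - mean(zeta)) / stdev(zeta), can be illustrated with a minimal sketch in plain Python. The helper name normalize_fingerprints is hypothetical and not part of KLIFF, and the use of the population standard deviation is an assumption made only for this illustration.

```python
import statistics


def normalize_fingerprints(zeta):
    """Center and scale each fingerprint component (column) of zeta.

    zeta is a list of rows, one row per atom.
    Hypothetical helper for illustration; not part of the KLIFF API.
    """
    columns = list(zip(*zeta))
    mean = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation; the library may differ.
    # A constant column (stdev == 0) would need special handling.
    stdev = [statistics.pstdev(col) for col in columns]
    normalized = [
        [(value - m) / s for value, m, s in zip(row, mean, stdev)]
        for row in zeta
    ]
    return normalized, mean, stdev


# Three atoms, two fingerprint components each.
zeta = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized, mean, stdev = normalize_fingerprints(zeta)
```

After this transformation, every column of the normalized fingerprints has zero mean and unit standard deviation.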
- generate_fingerprints(configs, fit_forces=False, fit_stress=False, fingerprints_filename='fingerprints.pkl', fingerprints_mean_stdev_filename=None, use_welford_method=False, nprocs=1)
Convert all configurations to their fingerprints.
- Parameters
configs (List[Configuration]) – Dataset configurations.
fit_forces (bool) – Whether to compute the gradient of the fingerprints w.r.t. atomic coordinates so as to compute forces.
fit_stress (bool) – Whether to compute the gradient of the fingerprints w.r.t. atomic coordinates so as to compute stress.
use_welford_method (bool) – Whether to compute the mean and standard deviation using the Welford method, which is memory efficient. See https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
fingerprints_filename (Union[Path, str]) – Path of the pickle file to dump the fingerprints to.
fingerprints_mean_stdev_filename (Union[str, Path, None]) – Path of the pickle file to dump the mean and standard deviation of the fingerprints to. Ignored if normalize=False for the descriptor.
nprocs (int) – Number of processes used to generate the fingerprints. If 1, run in serial mode; otherwise, nprocs processes are forked via multiprocessing to do the work.
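The Welford method mentioned for use_welford_method updates the mean and the sum of squared deviations one value at a time, so the full set of fingerprints never has to be held in memory. The following is a minimal sketch of that algorithm for a single fingerprint component; the class name RunningStats is hypothetical, the code is not KLIFF's implementation, and the population form of the standard deviation is an assumption.

```python
import math


class RunningStats:
    """Welford's online algorithm for the mean and standard deviation.

    Hypothetical helper for illustration; not part of the KLIFF API.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the deviation from both the old and the updated mean.
        self.m2 += delta * (value - self.mean)

    def stdev(self):
        # Assumption: population standard deviation (divide by count).
        if self.count == 0:
            return float("nan")
        return math.sqrt(self.m2 / self.count)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
```

In practice one such accumulator would be kept per fingerprint component and fed one configuration at a time.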
- transform(conf, fit_forces=False, fit_stress=False)
Transform atomic coordinates to atomic environment descriptor values.
- Parameters
conf (Configuration) – Atomic configuration.
fit_forces (bool) – Whether to fit forces, which requires the gradients of the fingerprints w.r.t. atomic coordinates.
fit_stress (bool) – Whether to fit stress, which requires the gradients of the fingerprints w.r.t. atomic coordinates.
- Returns
zeta – Descriptor values. 2D array with shape (num_atoms, num_descriptors), where num_atoms is the number of atoms in the configuration, and num_descriptors is the size of the descriptor vector (depending on the choice of the hyperparameters).
dzeta_dr – Gradient of the descriptor w.r.t. atomic coordinates. 4D array if fit_forces is True, otherwise None. Shape: (num_atoms, num_descriptors, num_atoms, 3), where num_atoms and num_descriptors have the same meaning as described in zeta, and 3 denotes the three Cartesian coordinates.
dzeta_ds – Gradient of the descriptor w.r.t. the virial stress components. 3D array of shape (num_atoms, num_descriptors, 6), where num_atoms and num_descriptors have the same meaning as described in zeta, and 6 denotes the virial stress components in Voigt notation, see https://en.wikipedia.org/wiki/Voigt_notation
- class kliff.descriptors.SymmetryFunction(cut_dists, cut_name, hyperparams, normalize=True, dtype=numpy.float32)
Atom-centered symmetry functions descriptor as discussed in [Behler2011].
- Parameters
cut_dists (dict) – Cutoff distances, with keys of the form A-B, where A and B are atomic species strings, and float values.
cut_name (str) – Name of the cutoff function.
hyperparams (dict or str) – A dictionary of the hyperparameters that define the descriptor. Two predefined sets of hyperparameters can be selected by setting hyperparams='set51' or hyperparams='set30', which are taken from [Artrith2012] and [Artrith2013], respectively. To see what they are, one can do:
>>> from kliff.descriptors import SymmetryFunction
>>> cut_name = 'cos'  # just for init purpose
>>> cut_dists = {'C-C': 5.}  # just for init purpose
>>> hyperparams = 'set51'
>>> desc = SymmetryFunction(cut_dists, cut_name, hyperparams)
>>> desc.get_hyperparams()
normalize (bool, optional) – If True, the fingerprints are centered and normalized according to: zeta = (zeta - mean(zeta)) / stdev(zeta)
dtype (np.dtype, optional) – Data type of the generated fingerprints, such as np.float32 and np.float64.
Example
If the set51 or set30 hyperparams are used, the cutoff distances should be given in Angstrom.
>>> cut_name = 'cos'
>>> cut_dists = {'C-C': 5., 'C-H': 4.5, 'H-H': 4.0}
>>> hyperparams = 'set51'
>>> desc = SymmetryFunction(cut_dists, cut_name, hyperparams)
You can provide your own hyperparams as a dictionary:
>>> cut_name = 'cos'
>>> cut_dists = {'C-C': 5., 'C-H': 4.5, 'H-H': 4.0}
>>> hyperparams = {'g1': None,
...                'g2': [{'eta': 0.1, 'Rs': 0.2}, {'eta': 0.3, 'Rs': 0.4}],
...                'g3': [{'kappa': 0.1}, {'kappa': 0.2}, {'kappa': 0.3}]}
>>> desc = SymmetryFunction(cut_dists, cut_name, hyperparams)
References
- Behler2011
J. Behler, “Atom-centered symmetry functions for constructing high-dimensional neural network potentials,” J. Chem. Phys. 134, 074106 (2011).
- Artrith2012
N. Artrith and J. Behler. “High-dimensional neural network potentials for metal surfaces: A prototype study for copper.” Physical Review B 85, no. 4 (2012): 045439.
- Artrith2013
N. Artrith, B. Hiller, and J. Behler. “Neural network potentials for metals and oxides–First applications to copper clusters at zinc oxide.” physica status solidi (b) 250, no. 6 (2013): 1191-1203.
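To give a sense of what a g2 entry such as {'eta': 0.1, 'Rs': 0.2} parameterizes, the sketch below evaluates the radial symmetry function of [Behler2011], G2 = sum_j exp(-eta * (r_ij - Rs)^2) * fc(r_ij), with the cosine cutoff fc(r) = 0.5 * (cos(pi * r / rcut) + 1) for r < rcut and 0 otherwise. The function names cos_cutoff and g2, and the neighbor distances, are made up for illustration; this is not KLIFF's implementation, and unit conventions are not addressed here.

```python
import math


def cos_cutoff(r, rcut):
    """Cosine cutoff function: decays smoothly from 1 at r = 0 to 0 at rcut."""
    if r >= rcut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rcut) + 1.0)


def g2(distances, eta, Rs, rcut):
    """Radial symmetry function G2 for one atom.

    distances holds the distances from the central atom to its neighbors.
    Illustrative helper only; not part of the KLIFF API.
    """
    return sum(
        math.exp(-eta * (r - Rs) ** 2) * cos_cutoff(r, rcut)
        for r in distances
    )


# Hypothetical neighbor distances; the last one lies beyond the cutoff.
neighbor_distances = [1.0, 1.5, 6.0]
value = g2(neighbor_distances, eta=0.1, Rs=0.2, rcut=5.0)
```

Neighbors beyond the cutoff distance contribute nothing, which is why the cutoff distances in cut_dists determine the size of the atomic environment the descriptor sees.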
- transform(conf, fit_forces=False, fit_stress=False)
Transform atomic coordinates to atomic environment descriptor values.
- Parameters
conf (Configuration) – A configuration of atoms.
fit_forces (bool, optional) – Whether to compute the gradient of descriptor values w.r.t. atomic coordinates so as to compute forces.
fit_stress (bool, optional) – Whether to compute the gradient of descriptor values w.r.t. atomic coordinates so as to compute stress.
- Returns
zeta (2D array) – Descriptor values, one row per atom. zeta has shape (num_atoms, num_descriptors), where num_atoms is the number of atoms in the configuration, and num_descriptors is the size of the descriptor vector (depending on the choice of hyperparameters).
dzetadr_forces (3D array if fit_forces is True, otherwise None) – Gradient of descriptor values w.r.t. atomic coordinates for forces computation. dzetadr_forces has shape (num_atoms, num_descriptors, num_atoms*DIM), where num_atoms and num_descriptors have the same meaning as described in zeta, and DIM = 3 denotes the three Cartesian coordinates.
dzetadr_stress (3D array if fit_stress is True, otherwise None) – Gradient of descriptor values w.r.t. atomic coordinates for stress computation. dzetadr_stress has shape (num_atoms, num_descriptors, 6), where num_atoms and num_descriptors have the same meaning as described in zeta. The last dimension holds the six components associated with the virial stress, in the order 11, 22, 33, 23, 31, 12.
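The component order 11, 22, 33, 23, 31, 12 used for the last dimension of dzetadr_stress can be made concrete with a small sketch that flattens a symmetric 3x3 tensor into six components. The names VOIGT_INDEX_PAIRS and to_voigt are hypothetical and the tensor values are made up; the sketch only illustrates the index ordering, not how KLIFF computes the stress.

```python
# Zero-based (row, column) pairs in the order 11, 22, 33, 23, 31, 12.
VOIGT_INDEX_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (2, 0), (0, 1)]


def to_voigt(tensor):
    """Flatten a symmetric 3x3 tensor into six components.

    Illustrative helper only; not part of the KLIFF API.
    """
    return [tensor[i][j] for i, j in VOIGT_INDEX_PAIRS]


# Made-up symmetric tensor whose entries encode their position
# in the six-component ordering.
stress = [
    [1.0, 6.0, 5.0],
    [6.0, 2.0, 4.0],
    [5.0, 4.0, 3.0],
]
voigt = to_voigt(stress)
```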
- write_kim_params(path, fname='descriptor.params')
Write descriptor info for KIM model.
- Parameters
path – Directory path to write the file to.
fname – Name of the file.
- generate_fingerprints(configs, fit_forces=False, fit_stress=False, fingerprints_filename='fingerprints.pkl', fingerprints_mean_stdev_filename=None, use_welford_method=False, nprocs=1)
Convert all configurations to their fingerprints.
- Parameters
configs (List[Configuration]) – Dataset configurations.
fit_forces (bool) – Whether to compute the gradient of the fingerprints w.r.t. atomic coordinates so as to compute forces.
fit_stress (bool) – Whether to compute the gradient of the fingerprints w.r.t. atomic coordinates so as to compute stress.
use_welford_method (bool) – Whether to compute the mean and standard deviation using the Welford method, which is memory efficient. See https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
fingerprints_filename (Union[Path, str]) – Path of the pickle file to dump the fingerprints to.
fingerprints_mean_stdev_filename (Union[str, Path, None]) – Path of the pickle file to dump the mean and standard deviation of the fingerprints to. Ignored if normalize=False for the descriptor.
nprocs (int) – Number of processes used to generate the fingerprints. If 1, run in serial mode; otherwise, nprocs processes are forked via multiprocessing to do the work.
- get_cutoff()
Return the name and values of the cutoff.
- get_dtype()
Return the data type of the fingerprints.
- get_mean()
Return a list of the mean of the fingerprints.
- get_stdev()
Return a list of the standard deviation of the fingerprints.
- load_state_dict(data)
Load state dict of a descriptor.
- Parameters
data (Dict[str, Any]) – State dict to load.
- state_dict()
Return the state dict of the descriptor.
- Return type
Dict[str, Any]
- class kliff.descriptors.Bispectrum(cut_dists, cut_name=None, hyperparams=None, normalize=True, dtype=numpy.float32)
Bispectrum descriptor.
Process dataset to generate fingerprints using the Bispectrum descriptor as discussed in [Bartok2010] and [Thompson2015].
- Parameters
cut_dists (dict) – Cutoff distances, with keys of the form A-B, where A and B are atomic species strings, and float values.
cut_name (str) – Name of the cutoff function.
hyperparams (dict) – A dictionary of the hyperparameters of the descriptor.
normalize (bool, optional) – If True, the fingerprints are centered and normalized according to: zeta = (zeta - mean(zeta)) / stdev(zeta)
dtype (np.dtype) – Data type of the generated fingerprints, such as np.float32 and np.float64.
Example
>>> from kliff.descriptors import Bispectrum
>>> cut_name = 'cos'
>>> cut_dists = {'C-C': 5.0, 'C-H': 4.5, 'H-H': 4.0}
>>> hyperparams = {'jmax': 4, 'weight': {'C': 1.0, 'H': 1.0}}
>>> desc = Bispectrum(cut_dists, cut_name, hyperparams)
References
- Bartok2010
Bartók, Albert P., Mike C. Payne, Risi Kondor, and Gábor Csányi. “Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons.” Physical review letters 104, no. 13 (2010): 136403.
- Thompson2015
Thompson, Aidan P., Laura P. Swiler, Christian R. Trott, Stephen M. Foiles, and Garritt J. Tucker. “Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials.” Journal of Computational Physics 285 (2015): 316-330.
- transform(conf, grad=False)
Transform atomic coordinates to atomic environment descriptor values.
- Parameters
conf – Atomic configuration.
grad – Whether to compute the gradients of the fingerprints w.r.t. atomic coordinates.
- Returns
zeta – Descriptor values. 2D array with shape (num_atoms, num_descriptors), where num_atoms is the number of atoms in the configuration, and num_descriptors is the size of the descriptor vector (depending on the choice of the hyperparameters).
dzeta_dr – Gradient of the descriptor w.r.t. atomic coordinates. 4D array if grad is True, otherwise None. Shape: (num_atoms, num_descriptors, num_atoms, 3), where num_atoms and num_descriptors have the same meaning as described in zeta, and 3 denotes the three Cartesian coordinates.
dzeta_ds – Gradient of the descriptor w.r.t. the virial stress components. 3D array of shape (num_atoms, num_descriptors, 6), where num_atoms and num_descriptors have the same meaning as described in zeta, and 6 denotes the virial stress components in Voigt notation, see https://en.wikipedia.org/wiki/Voigt_notation
- update_hyperparams(params)
Update the hyperparameters based on the input at initialization.
- generate_fingerprints(configs, fit_forces=False, fit_stress=False, fingerprints_filename='fingerprints.pkl', fingerprints_mean_stdev_filename=None, use_welford_method=False, nprocs=1)
Convert all configurations to their fingerprints.
- Parameters
configs (List[Configuration]) – Dataset configurations.
fit_forces (bool) – Whether to compute the gradient of the fingerprints w.r.t. atomic coordinates so as to compute forces.
fit_stress (bool) – Whether to compute the gradient of the fingerprints w.r.t. atomic coordinates so as to compute stress.
use_welford_method (bool) – Whether to compute the mean and standard deviation using the Welford method, which is memory efficient. See https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
fingerprints_filename (Union[Path, str]) – Path of the pickle file to dump the fingerprints to.
fingerprints_mean_stdev_filename (Union[str, Path, None]) – Path of the pickle file to dump the mean and standard deviation of the fingerprints to. Ignored if normalize=False for the descriptor.
nprocs (int) – Number of processes used to generate the fingerprints. If 1, run in serial mode; otherwise, nprocs processes are forked via multiprocessing to do the work.
- get_cutoff()
Return the name and values of the cutoff.
- get_dtype()
Return the data type of the fingerprints.
- get_hyperparams()
Return the hyperparameters of the descriptor.
- get_mean()
Return a list of the mean of the fingerprints.
- get_stdev()
Return a list of the standard deviation of the fingerprints.
- load_state_dict(data)
Load state dict of a descriptor.
- Parameters
data (Dict[str, Any]) – State dict to load.
- state_dict()
Return the state dict of the descriptor.
- Return type
Dict[str, Any]
- write_kim_params(path, fname='descriptor.params')
Write descriptor info for KIM model.
- Parameters
path (Union[Path, str]) – Directory path to write the file to.
fname (str) – Name of the file.