kliff.legacy.descriptors.descriptor

class kliff.legacy.descriptors.descriptor.Descriptor(cut_dists, cut_name, hyperparams, normalize=True, dtype=<class 'numpy.float32'>)[source]

Base class of atomic environment descriptors.

Process dataset to generate fingerprints. This is the base class for all descriptors, so it should not be used directly. Instead, use a descriptor built on top of it, such as SymmetryFunction or Bispectrum, to transform the atomic environment information into fingerprints.

Parameters:
  • cut_dists (Dict[str, float]) – Cutoff distances, keyed by species pair in the form A-B, where A and B are species strings; each value is a float. Example: cut_dists = {'C-C': 5.0}

  • cut_name (str) – Name of the cutoff function, such as cos, P3, and P7.

  • hyperparams (Union[Dict, str]) – A dictionary of the hyperparams of the descriptor or a string to select the predefined hyperparams.

  • normalize (bool) – If True, the fingerprints are centered and normalized: zeta = (zeta - mean(zeta)) / stdev(zeta)

  • dtype (np.dtype) – Data type of the generated fingerprints, such as np.float32 and np.float64.

Attributes:
  • size (int) – Length of the fingerprint vector.

  • mean (list) – Mean of the fingerprints.

  • stdev (list) – Standard deviation of the fingerprints.
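The normalization formula given for normalize can be illustrated with a small standalone sketch. It is not KLIFF's implementation: the function name normalize_fingerprints is hypothetical, and the sketch assumes that the mean and standard deviation are taken per descriptor component over all atoms and that the population standard deviation is used, neither of which is stated in the text above.

```python
import math


def normalize_fingerprints(zeta):
    """Center and scale each column of a 2D list (rows: atoms, columns: components).

    Hypothetical helper for illustration only; not part of KLIFF.
    """
    n_rows = len(zeta)
    n_cols = len(zeta[0])
    # Per-component mean over all atoms.
    mean = [sum(row[j] for row in zeta) / n_rows for j in range(n_cols)]
    # Per-component population standard deviation (an assumption of this sketch).
    stdev = [
        math.sqrt(sum((row[j] - mean[j]) ** 2 for row in zeta) / n_rows)
        for j in range(n_cols)
    ]
    # zeta = (zeta - mean(zeta)) / stdev(zeta); no guard against stdev == 0 here.
    normalized = [
        [(row[j] - mean[j]) / stdev[j] for j in range(n_cols)] for row in zeta
    ]
    return normalized, mean, stdev


zeta = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized, mean, stdev = normalize_fingerprints(zeta)
```

After normalization each column has zero mean and unit standard deviation, so components with very different magnitudes end up on a comparable scale.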

generate_fingerprints(configs, fit_forces=False, fit_stress=False, fingerprints_filename='fingerprints.pkl', fingerprints_mean_stdev_filename=None, use_welford_method=False, nprocs=1)[source]

Convert all configurations to their fingerprints.

Parameters:
  • configs (List[Configuration]) – Dataset configurations

  • fit_forces (bool) – Whether to compute the gradient of fingerprints w.r.t. atomic coordinates so as to compute forces.

  • fit_stress (bool) – Whether to compute the gradient of fingerprints w.r.t. atomic coordinates so as to compute stress.

  • use_welford_method (bool) – Whether to compute mean and standard deviation using the Welford method, which is memory efficient. See https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

  • fingerprints_filename (Union[Path, str]) – Path to dump fingerprints to a pickle file.

  • fingerprints_mean_stdev_filename (Union[str, Path, None]) – Path to dump the mean and standard deviation of the fingerprints as a pickle file. If normalize=False for the descriptor, this is ignored.

  • nprocs (int) – Number of processes used to generate the fingerprints. If 1, run in serial mode, otherwise nprocs processes will be forked via multiprocessing to do the work.
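For readers unfamiliar with the Welford method mentioned under use_welford_method, the following is a minimal sketch of the algorithm for a stream of scalars, following the linked Wikipedia article. The class name RunningStats is hypothetical, and how KLIFF applies the method to fingerprint arrays is not shown here. The point of the method is that values are consumed one at a time, so the full data set never needs to be held in memory.

```python
import math


class RunningStats:
    """Welford's online algorithm for the mean and variance of a scalar stream.

    Hypothetical helper for illustration only; not part of KLIFF.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the deviation from the old mean and from the updated mean.
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        # Population standard deviation (an assumption of this sketch).
        return math.sqrt(self.m2 / self.count)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
```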

transform(conf, fit_forces=False, fit_stress=False)[source]

Transform atomic coords to atomic environment descriptor values.

Parameters:
  • conf (Configuration) – atomic configuration

  • fit_forces (bool) – Whether to fit forces, so as to compute gradients of fingerprints w.r.t. coords

  • fit_stress (bool) – Whether to fit stress, so as to compute gradients of fingerprints w.r.t. coords

Returns:
  • zeta – Descriptor values. 2D array with shape (num_atoms, num_descriptors), where num_atoms is the number of atoms in the configuration and num_descriptors is the size of the descriptor vector (which depends on the choice of hyperparameters).

  • dzeta_dr – Gradient of the descriptor w.r.t. atomic coordinates. 4D array if fit_forces is True, otherwise None. Shape: (num_atoms, num_descriptors, num_atoms, 3), where num_atoms and num_descriptors have the same meanings as for zeta, and 3 denotes the three Cartesian coordinates.

  • dzeta_ds – Gradient of the descriptor w.r.t. the virial stress components. 3D array of shape (num_atoms, num_descriptors, 6), where num_atoms and num_descriptors have the same meanings as for zeta, and 6 denotes the virial stress components in Voigt notation; see https://en.wikipedia.org/wiki/Voigt_notation
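As a reminder of what the trailing dimension of 6 refers to, the sketch below maps a symmetric 3x3 tensor to its six Voigt components using the conventional ordering xx, yy, zz, yz, xz, xy from the linked article. The helper name to_voigt is hypothetical, and it is an assumption that KLIFF orders the components the same way.

```python
def to_voigt(tensor):
    """Map a symmetric 3x3 tensor (nested lists) to its 6 Voigt components.

    Hypothetical helper for illustration only; not part of KLIFF.
    """
    # Conventional Voigt ordering: xx, yy, zz, yz, xz, xy.
    voigt_index_pairs = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    return [tensor[i][j] for i, j in voigt_index_pairs]


stress = [
    [1.0, 6.0, 5.0],
    [6.0, 2.0, 4.0],
    [5.0, 4.0, 3.0],
]
voigt = to_voigt(stress)
```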

write_kim_params(path, fname='descriptor.params')[source]

Write descriptor info for KIM model.

Parameters:
  • path (Union[Path, str]) – Path of the directory in which to write the file.

  • fname (str) – Name of the file.

get_size()[source]

Return the size of the descriptor vector.

get_mean()[source]

Return a list of the mean of the fingerprints.

get_stdev()[source]

Return a list of the standard deviation of the fingerprints.

get_dtype()[source]

Return the data type of the fingerprints.

get_cutoff()[source]

Return the name and values of cutoff.

get_hyperparams()[source]

Return the hyperparameters of descriptors.

state_dict()[source]

Return the state dict of the descriptor.

Return type:

Dict[str, Any]

load_state_dict(data)[source]

Load state dict of a descriptor.

Parameters:

data (Dict[str, Any]) – state dict to load.

kliff.legacy.descriptors.descriptor.load_fingerprints(path)[source]

Read preprocessed fingerprints from file.

This is the reverse operation of Descriptor._dump_fingerprints.

Parameters:

path (Union[Path, str]) – Path to the pickled data file.

Returns:

Fingerprints

kliff.legacy.descriptors.descriptor.generate_full_cutoff(cutoff)[source]

Generate a full binary cutoff dictionary.

For species pair S1-S2 in the cutoff dictionary, add key S2-S1 to it, with the same value as S1-S2.

Parameters:

cutoff – Cutoff dictionary with key of the form A-B where A and B are atomic species, and value should be a float.

Returns:

A dictionary with all combinations of species pairs as keys.

Example:

    >>> cutoff = {'C-C': 4.0, 'C-H': 3.5}
    >>> generate_full_cutoff(cutoff)
    {'C-C': 4.0, 'C-H': 3.5, 'H-C': 3.5}
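The behavior described for generate_full_cutoff can be mimicked with a few lines of plain Python. This is a standalone sketch of the documented behavior, not KLIFF's source; the name full_cutoff_sketch is hypothetical, and the sketch assumes every key contains exactly one hyphen.

```python
def full_cutoff_sketch(cutoff):
    """Add the reversed key S2-S1 for every S1-S2 pair, with the same value.

    Hypothetical re-implementation for illustration only; not part of KLIFF.
    """
    full = dict(cutoff)  # copy so the input dictionary is left untouched
    for key, value in cutoff.items():
        s1, s2 = key.split("-")
        full[f"{s2}-{s1}"] = value
    return full


result = full_cutoff_sketch({"C-C": 4.0, "C-H": 3.5})
```

For a same-species pair such as C-C the reversed key is identical to the original, so no extra entry is created.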

kliff.legacy.descriptors.descriptor.generate_unique_cutoff_pairs(cutoff)[source]

Generate a cutoff dictionary with unique species pairs.

For species pair S1-S2 in the cutoff dictionary, remove key S2-S1 from it if S1 is different from S2.

Parameters:

cutoff – Cutoff dictionary with key of the form A-B where A and B are atomic species, and value should be a float.

Returns:

A dictionary with unique species pair as keys.

Example:

    >>> cutoff = {'C-C': 4.0, 'C-H': 3.5, 'H-C': 3.5}
    >>> generate_unique_cutoff_pairs(cutoff)
    {'C-C': 4.0, 'C-H': 3.5}
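A standalone sketch of the documented behavior of generate_unique_cutoff_pairs follows. It is not KLIFF's source; the name unique_cutoff_pairs_sketch is hypothetical, and keeping whichever of S1-S2 and S2-S1 appears first is an assumption of this sketch.

```python
def unique_cutoff_pairs_sketch(cutoff):
    """Keep only one of S1-S2 and S2-S1 for each species pair.

    Hypothetical re-implementation for illustration only; not part of KLIFF.
    """
    unique = {}
    for key, value in cutoff.items():
        s1, s2 = key.split("-")
        # Skip this key if its reversed counterpart has already been kept.
        if f"{s2}-{s1}" in unique:
            continue
        unique[key] = value
    return unique


result = unique_cutoff_pairs_sketch({"C-C": 4.0, "C-H": 3.5, "H-C": 3.5})
```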

kliff.legacy.descriptors.descriptor.generate_species_code(cutoff)[source]

Generate species code info from cutoff dictionary.

Parameters:

cutoff – Cutoff dictionary with key of the form A-B where A and B are atomic species, and value should be a float.

Returns:

species_code – A dictionary mapping each species that appears in the cutoff keys to an integer code (starting from 0).

Example:

    >>> cutoff = {'C-C': 4.0, 'C-H': 3.5}
    >>> generate_species_code(cutoff)
    {'C': 0, 'H': 1}
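The species-to-code mapping can be sketched in plain Python as below. This is not KLIFF's source; the name species_code_sketch is hypothetical, and assigning codes in alphabetical order of the species is an assumption that happens to reproduce the example above.

```python
def species_code_sketch(cutoff):
    """Map every species found in the cutoff keys to an integer code from 0.

    Hypothetical re-implementation for illustration only; not part of KLIFF.
    """
    species = set()
    for key in cutoff:
        species.update(key.split("-"))
    # Alphabetical ordering is an assumption of this sketch.
    return {name: code for code, name in enumerate(sorted(species))}


result = species_code_sketch({"C-C": 4.0, "C-H": 3.5})
```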

exception kliff.legacy.descriptors.descriptor.DescriptorError(msg)[source]