Theory

A parametric potential typically takes the form

\mathcal{V} = \mathcal{V}(\bm r_1,\dots,\bm r_{N_a}, Z_1,\dots,Z_{N_a}; \bm\theta)

where \bm r_1,\dots,\bm r_{N_a} and Z_1,\dots,Z_{N_a} are the coordinates and species of a system of N_a atoms, respectively, and \bm\theta denotes a set of fitting parameters. For notational simplicity, in the following discussion, we assume that the atomic species information is implicitly carried by the coordinates and thus we can exclude Z from the functional form, and use \bm R to denote the coordinates of all atoms in the configuration. Then we have

\mathcal{V} = \mathcal{V}(\bm R; \bm\theta).

A potential parameterization process is typically formulated as a weighted least-squares minimization problem, where we adjust the potential parameters \bm\theta so as to reproduce a training set of reference data obtained from experiments and/or first-principles computations. Mathematically, we hope to minimize a loss function

\mathcal{L}(\bm\theta) = \frac{1}{2} \sum_{i=1}^{N_p}
\|w_i (\bm p_i(\mathcal{V}(\bm R_i; \bm\theta)) - \bm q_i) \| ^2

with respect to \bm\theta, where \{\bm q_1,\dots, \bm q_{N_p}\} is a training set of N_p reference data, \bm p_i is the corresponding prediction for \bm q_i computed from the potential (as indicated by its argument), \|\cdot\| denotes the L_2 norm, and w_i is the weight for the i-th data point. We call

\bm u = \bm p(\mathcal{V}(\bm R; \bm\theta)) - \bm q

the residual function that characterizes the difference between the potential predictions and the reference data for a set of properties.
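To make the loss and the residual concrete, here is a minimal sketch in plain Python. The function names and the numbers are hypothetical and chosen only for illustration; each data point is assumed to be a flat list of property values, with one scalar weight per data point as in the formula above.

```python
def residual(prediction, reference):
    """Return u = p - q, component by component."""
    if len(prediction) != len(reference):
        raise ValueError("prediction and reference must have the same length")
    return [p - q for p, q in zip(prediction, reference)]


def loss(predictions, references, weights):
    """Return 0.5 * sum_i || w_i * (p_i - q_i) ||^2 over all data points."""
    total = 0.0
    for p_i, q_i, w_i in zip(predictions, references, weights):
        u_i = residual(p_i, q_i)
        # Squared L2 norm of the weighted residual of data point i.
        total += 0.5 * sum((w_i * u) ** 2 for u in u_i)
    return total


# Two made-up data points, each holding two properties.
predictions = [[1.0, 2.0], [0.5, 0.0]]
references = [[1.0, 1.0], [0.0, 0.0]]
weights = [1.0, 2.0]
value = loss(predictions, references, weights)
```

Because the weight multiplies the residual before the norm is taken, doubling w_i quadruples the contribution of data point i to the loss.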

Generally speaking, \bm q can be a collection of any material properties considered important for a given application, such as the cohesive energy, equilibrium lattice constant, and elastic constants of a given crystal phase. These material properties can be obtained from experiments and/or first-principles calculations. Nowadays, however, most potentials are trained using the force-matching scheme, where the potential is fitted to a large set of forces on atoms (and/or energies and stresses) obtained from first-principles calculations for a set of atomic configurations. This is especially true for machine learning potentials, which need a large training set: it is impractical to collect a sufficient number of material properties to fill it.

The reference \bm q and the prediction \bm p are typically represented as vectors such that q[m] is the m-th reference property and p[m] is the corresponding m-th prediction obtained from the potential. If we fit a potential to the energy and forces of a configuration, then \bm q is a vector of size 1+3N_a, where N_a is the number of atoms in the configuration, with

\begin{aligned}
q[0] &= E_\text{ref}, \\
q[1] &= f_\text{ref}^{0, x}, \quad
q[2] = f_\text{ref}^{0, y}, \quad
q[3] = f_\text{ref}^{0, z}, \\
q[4] &= f_\text{ref}^{1, x}, \quad
q[5] = f_\text{ref}^{1, y}, \quad
q[6] = f_\text{ref}^{1, z}, \\
&\cdots \\
q[3N_a-2] &= f_\text{ref}^{N_a-1, x}, \quad
q[3N_a-1] = f_\text{ref}^{N_a-1, y}, \quad
q[3N_a] = f_\text{ref}^{N_a-1, z},
\end{aligned}

where E_\text{ref} is the reference energy, and f_\text{ref}^{i, x}, f_\text{ref}^{i, y}, and f_\text{ref}^{i, z} denote the x-, y-, and z-components of the reference force on atom i, respectively, with atoms numbered from 0. In other words, the energy is component 0 of \bm q, the force on the first atom occupies components 1 to 3, the force on the second atom occupies the next three components, and so on until the forces on all atoms are placed in \bm q. The prediction vector \bm p is constructed in the same fashion, and the residual vector is then computed from the two.
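The layout described above can be sketched in a few lines of plain Python. The helper name `build_vector` and all numerical values are hypothetical; the sketch only assumes that forces are given as one (x, y, z) triple per atom, with atom 0 first.

```python
def build_vector(energy, forces):
    """Flatten an energy and per-atom forces into a list of length 1 + 3*N_a.

    `forces` is a list of N_a (fx, fy, fz) triples, atom 0 first.
    """
    vec = [energy]                 # component 0: energy
    for fx, fy, fz in forces:      # components 1 + 3*i to 3 + 3*i: force on atom i
        vec.extend([fx, fy, fz])
    return vec


# Made-up two-atom configuration, for illustration only.
E_ref = -3.0
F_ref = [(0.1, 0.0, -0.2), (-0.1, 0.0, 0.2)]
E_pred = -2.5
F_pred = [(0.0, 0.0, -0.2), (-0.1, 0.1, 0.2)]

q = build_vector(E_ref, F_ref)     # reference vector
p = build_vector(E_pred, F_pred)   # prediction vector, same layout
u = [p_m - q_m for p_m, q_m in zip(p, q)]  # residual vector
```

For N_a = 2 the vectors have 1 + 3 * 2 = 7 components, and the z-component of the force on the last atom sits at index 3N_a = 6.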

Note

We use boldface with a subscript to denote a data point (e.g. \bm q_i means the i-th data point in the training set), and normal text with square brackets to denote a component of a data point (e.g. q[m] indicates the m-th component of a general data point \bm q).

If stress is also used in the fitting, the six components of the reference Voigt stress, \sigma_{xx}, \sigma_{yy}, \sigma_{zz}, \sigma_{yz}, \sigma_{xy}, \sigma_{xz}, are appended after the forces in q[3N_a+1] to q[3N_a+6], and p[3N_a+1] to p[3N_a+6] hold the corresponding predictions computed from the potential.
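As a quick check of where the stress components land, the sketch below appends six stress values after the forces. The helper name and the values are hypothetical, and the stress list is a placeholder assumed to follow the component order listed in the text. With the energy at index 0 and 3N_a force components after it, the first stress component falls at index 1 + 3N_a.

```python
def build_vector_with_stress(energy, forces, stress_voigt):
    """Energy, then 3*N_a force components, then 6 stress components."""
    if len(stress_voigt) != 6:
        raise ValueError("expected 6 Voigt stress components")
    vec = [energy]
    for force in forces:
        vec.extend(force)
    vec.extend(stress_voigt)
    return vec


n_atoms = 2
forces = [(0.1, 0.0, -0.2), (-0.1, 0.0, 0.2)]
stress = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # placeholder values, not real data
q = build_vector_with_stress(-3.0, forces, stress)

stress_start = 1 + 3 * n_atoms           # index of the first stress component
stored_stress = q[stress_start:stress_start + 6]
```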

The objective of the parameterization process is to find the set of potential parameters \bm\theta that reproduces the reference data as closely as possible.
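As an illustration of the whole process, the sketch below fits a single parameter of a toy potential to synthetic reference energies. Everything in it is an assumption made for the example: the harmonic pair potential, the fixed distance `R0`, the synthetic data, and the choice of Gauss-Newton as the minimizer. It is not meant to describe how any particular fitting code is implemented.

```python
R0 = 1.0  # fixed equilibrium distance (assumed here, not fitted)


def energy(r, k):
    """Toy harmonic pair potential V(r; k) = 0.5 * k * (r - R0)**2."""
    return 0.5 * k * (r - R0) ** 2


def loss(k, distances, references, weights):
    """Weighted least-squares loss for scalar energy data points."""
    return 0.5 * sum((w * (energy(r, k) - q)) ** 2
                     for r, q, w in zip(distances, references, weights))


def gauss_newton_step(k, distances, references, weights):
    """One Gauss-Newton update of the single parameter k."""
    grad = 0.0
    curvature = 0.0
    for r, q, w in zip(distances, references, weights):
        u = energy(r, k) - q           # residual of this data point
        dp_dk = 0.5 * (r - R0) ** 2    # derivative of the prediction w.r.t. k
        grad += w * w * u * dp_dk      # gradient of the loss
        curvature += (w * dp_dk) ** 2  # Gauss-Newton approximation of the Hessian
    return k - grad / curvature


distances = [0.8, 0.9, 1.1, 1.2]
k_true = 5.0
references = [energy(r, k_true) for r in distances]  # synthetic "reference" data
weights = [1.0] * len(distances)

k_start = 1.0
k = k_start
for _ in range(3):
    k = gauss_newton_step(k, distances, references, weights)
```

Because the toy prediction is linear in k, a single Gauss-Newton step already reaches the minimum; the loop is kept only to show the iterative form that a nonlinear potential would need.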