Train a Stillinger-Weber potential
==================================

In this tutorial, we train a Stillinger-Weber (SW) potential for silicon that is
archived on OpenKIM.

Before training the SW model, let's first make sure it is installed.
If you haven't already, follow ``installation`` to install ``kim-api``,
``kimpy``, and ``openkim-models``. Then run
``$ kim-api-collections-management list`` and make sure
``SW_StillingerWeber_1985_Si__MO_405512056662_006`` is listed in one of the
collections.

.. tip::

    If you see ``SW_StillingerWeber_1985_Si__MO_405512056662_005`` (note the
    last three digits), you need to change
    ``model = KIMModel(model_name="SW_StillingerWeber_1985_Si__MO_405512056662_006")``
    to the corresponding model name in your installation.

We are going to create potentials for diamond silicon and fit them to a
training set of energies and forces. The training set consists of compressed
and stretched diamond silicon structures, as well as configurations drawn from
molecular dynamics trajectories at different temperatures.
Download the training set ``Si_training_set.tar.gz``.
(It will be automatically downloaded if not present.)
The data is stored in **extended xyz** format; see ``doc.dataset`` for more
information about this format.

.. warning::

    The ``Si_training_set`` is just a toy data set meant to demonstrate how to
    use KLIFF to train potentials. It should not be used to train any potential
    for real simulations.

.. warning::

    The regression ``calculator`` and ``loss`` modules are now part of the
    ``legacy`` module.

Let's first import the modules that will be used in this example.

.. code-block:: python

    from kliff.legacy.calculators import Calculator
    from kliff.dataset import Dataset
    from kliff.dataset.weight import Weight
    from kliff.legacy.loss import Loss
    from kliff.models import KIMModel
    from kliff.utils import download_dataset

Model
-----

We first create a KIM model for the SW potential and print out all the
available parameters that can be optimized (we call these the
``model parameters``). Continuing in our Python script, we write:

.. code-block:: python

    model = KIMModel(model_name="SW_StillingerWeber_1985_Si__MO_405512056662_006")
    model.echo_model_params()

.. parsed-literal::

    #================================================================================
    # Available parameters to optimize (In MODEL SPACE).
    # Model: SW_StillingerWeber_1985_Si__MO_405512056662_006
    #================================================================================

    name: A
    value: [15.28484792]
    size: 1

    name: B
    value: [0.60222456]
    size: 1

    name: p
    value: [4.]
    size: 1

    name: q
    value: [0.]
    size: 1

    name: sigma
    value: [2.0951]
    size: 1

    name: gamma
    value: [2.51412]
    size: 1

    name: cutoff
    value: [3.77118]
    size: 1

    name: lambda
    value: [45.5322]
    size: 1

    name: costheta0
    value: [-0.33333333]
    size: 1

    #================================================================================
    # Following parameters have transformation objects attached,
    # Parameter value in PARAM SPACE:
    #================================================================================

The output is generated by the last line, and it tells us the ``name``,
``value``, and ``size`` of each parameter.

.. tip::

    You can provide a ``path`` argument to the method
    ``echo_model_params(path)`` to write the available parameter information
    to the file indicated by ``path``.

.. warning::

    The available parameter information can also be obtained using the
    **kliff** ``cmdlntool``:
    ``$ kliff model --echo-params SW_StillingerWeber_1985_Si__MO_405512056662_006``

Now that we know what parameters are available for fitting, we can optimize all
or a subset of them to reproduce the training set.

.. code-block:: python

    model.set_opt_params(
        A=[[5.0, 1.0, 20]], B=[["default"]], sigma=[[2.0951, "fix"]], gamma=[[1.5]]
    )
    model.echo_opt_params()

.. parsed-literal::

    Parameter:A : [5.]
    Parameter:B : [0.60222456]
    Parameter:gamma : [1.5]

Here, we specify four parameters of the SW model: ``A``, ``B``, and ``gamma``
are optimized, while ``sigma`` is held fixed at the given value.
The information for each fitting parameter should be provided as a list of
lists, where the size of the outer list should be equal to the ``size`` of the
parameter given by ``model.echo_model_params()``. For each inner list, you can
provide one, two, or three items.

- One item. You can use a numerical value (e.g. ``gamma``) to provide an
  initial guess of the parameter. Alternatively, the string ``'default'`` can
  be provided to use the default value in the model (e.g. ``B``).

- Two items. The first item should be a numerical value and the second item
  should be the string ``'fix'`` (e.g. ``sigma``), which tells KLIFF to use the
  value for the parameter but not to optimize it.

- Three items. The first item can be a numerical value or the string
  ``'default'``, with the same meaning as in the one-item case. The second and
  third items are the lower and upper bounds for the parameter, respectively
  (e.g. ``A``). A bound can be given as a numerical value or ``None``; the
  latter indicates that no bound is applied.

The call to ``model.echo_opt_params()`` prints out the fitting parameters that
we ask KLIFF to optimize, each shown with its current value.

.. tip::

    The parameters that are not included as fitting parameters are fixed to
    their default values in the model during the optimization.

Training set
------------

KLIFF provides a :class:`~kliff.dataset.Dataset` to deal with the training data
(and possibly test data). Additionally, we define the ``energy_weight`` and
``forces_weight`` corresponding to each configuration using
:class:`~kliff.dataset.weight.Weight`. In this example, we set
``energy_weight`` to ``1.0`` and ``forces_weight`` to ``0.1``. For the silicon
training set, we can read and process the files by:

.. code-block:: python

    dataset_path = download_dataset(dataset_name="Si_training_set")
    weight = Weight(energy_weight=1.0, forces_weight=0.1)
    tset = Dataset.from_path(dataset_path, weight)
    configs = tset.get_configs()

The ``configs`` in the last line is a list of
:class:`~kliff.dataset.Configuration`. Each configuration is an internal
representation of a processed **extended xyz** file, holding the species,
coordinates, energy, forces, and other related information of a system of
atoms.

Calculator
----------

:class:`~kliff.legacy.calculator.Calculator` is the central agent that
exchanges information and orchestrates the operation of the fitting process. It
calls the model to compute the energy and forces and provides this information
to the Loss function (`discussed below <#loss-function>`__) to compute the
loss. It also grabs the parameters from the optimizer and updates the
parameters stored in the model, so that the up-to-date parameters are used the
next time the model is evaluated to compute the energy and forces. The
calculator can be created by:

.. code-block:: python

    calc = Calculator(model)
    _ = calc.create(configs)

.. parsed-literal::

    2025-05-16 21:17:01.932 | INFO | kliff.legacy.calculators.calculator:create:107 - Create calculator for 1000 configurations.

where ``calc.create(configs)`` does some initialization for each configuration
in the training set, such as creating the neighbor list.

Loss function
-------------

KLIFF uses a loss function to quantify the difference between the training set
data and the potential's predictions, and uses minimization algorithms to
reduce the loss as much as possible. KLIFF provides a large number of
minimization algorithms by interacting with SciPy. For physics-motivated
potentials, any algorithm listed under ``scipy.optimize.minimize`` and
``scipy.optimize.least_squares`` can be used. In the following code snippet, we
create a loss of energy and forces and use ``2`` processors to calculate the
loss.
The ``L-BFGS-B`` minimization algorithm is applied to minimize the loss, and
the minimization is allowed to run for a maximum of 100 iterations.

.. code-block:: python

    steps = 100
    loss = Loss(calc, nprocs=2)
    # Uncomment the next line to run the minimization.
    # loss.minimize(method="L-BFGS-B", options={"disp": True, "maxiter": steps})

Note that the ``loss.minimize`` call is commented out in this snippet;
uncomment it to run the optimization. The minimization stops after running for
27 steps. After the minimization, it is a good idea to save the model, which
can be loaded later for retraining or for function evaluations. If you are
satisfied with the fitted model, you can also write it as a KIM model that can
be used with LAMMPS, GULP, ASE, etc. via the kim-api.

.. code-block:: python

    model.echo_opt_params()
    model.save("kliff_model.yaml")
    model.write_kim_model()
    # model.load("kliff_model.yaml")

.. parsed-literal::

    2025-05-16 21:17:01.991 | INFO | kliff.models.kim:write_kim_model:657 - KLIFF trained model write to `/home/amit/Projects/COLABFIT/kliff/kliff/docs/source/tutorials/SW_StillingerWeber_1985_Si__MO_405512056662_006_kliff_trained`
    Parameter:A : [5.]
    Parameter:B : [0.60222456]
    Parameter:gamma : [1.5]

The first line of the above code prints the fitting parameters. The output
shown here still lists the initial guesses, since it was generated without
running the minimization. After a minimization run, a comparison with the
original values printed by ``model.echo_model_params()`` shows that we recover
the original parameters quite reasonably. The second line saves the fitted
model to a file named ``kliff_model.yaml`` on the disk, and the third line
writes out a KIM potential named
``SW_StillingerWeber_1985_Si__MO_405512056662_006_kliff_trained``.

For information about how to load a saved model, see
`Save and load a model <./../howto/install_kim_model.rst#_install_model>`__.
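To make the role of the energy and forces weights in a loss more concrete, here
is a minimal, self-contained sketch of a generic weighted sum-of-squares loss.
It is not KLIFF's implementation: the function name ``weighted_loss``, the
dictionary layout of each configuration, and the factor of one half are
assumptions made purely for illustration. Only the weight values (``1.0`` for
energy and ``0.1`` for forces) are taken from this tutorial.

```python
# Illustrative sketch only: a generic weighted least-squares loss.
# All names below are hypothetical and are NOT part of the KLIFF API.


def weighted_loss(configs, energy_weight=1.0, forces_weight=0.1):
    """Return 0.5 * sum of weighted squared residuals over all configurations.

    Each configuration is a dict with:
      "e_ref", "e_pred": reference and predicted energy (floats)
      "f_ref", "f_pred": reference and predicted force components (flat lists)
    """
    total = 0.0
    for c in configs:
        # Energy contribution: weight times squared energy error.
        energy_error = c["e_pred"] - c["e_ref"]
        total += energy_weight * energy_error ** 2

        # Forces contribution: weight times sum of squared component errors.
        force_sq = sum((fp - fr) ** 2 for fp, fr in zip(c["f_pred"], c["f_ref"]))
        total += forces_weight * force_sq
    return 0.5 * total


# A single made-up configuration with an energy error of 1.0 and two force
# components that are each off by 1.0.
example = [
    {
        "e_ref": -4.0,
        "e_pred": -3.0,
        "f_ref": [0.0, 0.0, 0.0],
        "f_pred": [1.0, 0.0, -1.0],
    }
]

loss_value = weighted_loss(example)
print(loss_value)
```

With these weights, an energy error counts ten times more than a force
component error of the same magnitude. In the tutorial itself, this bookkeeping
is handled by ``Loss`` together with the calculator.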