Example: Training a Descriptor-based Potential
Let us define a key-value dict directly and train a simple descriptor-based Si potential.
Step 0: Get the dataset
!wget https://raw.githubusercontent.com/openkim/kliff/main/examples/Si_training_set_4_configs.tar.gz
!tar -xvf Si_training_set_4_configs.tar.gz
Si_training_set_4_configs/
Si_training_set_4_configs/Si_alat5.431_scale0.005_perturb1.xyz
Si_training_set_4_configs/Si_alat5.409_scale0.005_perturb1.xyz
Si_training_set_4_configs/Si_alat5.442_scale0.005_perturb1.xyz
Si_training_set_4_configs/Si_alat5.420_scale0.005_perturb1.xyz
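Outside a notebook, the same download-and-extract step can be sketched in plain Python with the standard library. The helper name fetch_and_extract is an illustrative assumption and not part of KLIFF; the URL is the one used above.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = (
    "https://raw.githubusercontent.com/openkim/kliff/main/examples/"
    "Si_training_set_4_configs.tar.gz"
)


def fetch_and_extract(url, dest="."):
    """Download a .tar.gz archive into `dest` and unpack it there (hypothetical helper)."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    # urlretrieve overwrites an earlier copy instead of creating a "*.tar.gz.1" duplicate.
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # Return the names of the extracted configuration files.
    return sorted(p.name for p in dest.rglob("*.xyz"))


# Example call (needs network access):
# print(fetch_and_extract(URL))
```

Unlike repeated wget calls, this overwrites a previously downloaded archive, so the extracted files always come from the latest download.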
Step 1: workspace config
Create a workspace folder named DNN_train_example; the trainer writes its outputs there. The random seed is fixed so that the run can be repeated.
workspace = {"name": "DNN_train_example", "random_seed": 12345}
Step 2: define the dataset
dataset = {"type": "path", "path": "Si_training_set_4_configs", "shuffle": True}
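Before handing the directory to the trainer, it can help to confirm what it contains. The following is a minimal sketch, assuming each file follows the XYZ convention of giving the atom count on its first line; the helper name count_atoms is hypothetical and not a KLIFF function.

```python
from pathlib import Path


def count_atoms(dataset_dir):
    """Map each .xyz file name in `dataset_dir` to the atom count on its first line."""
    counts = {}
    for xyz in sorted(Path(dataset_dir).glob("*.xyz")):
        with open(xyz) as fh:
            counts[xyz.name] = int(fh.readline().split()[0])
    return counts


dataset_dir = Path("Si_training_set_4_configs")
if dataset_dir.is_dir():  # only report if the archive has been extracted
    for name, n_atoms in count_atoms(dataset_dir).items():
        print(f"{name}: {n_atoms} atoms")
```

With the archive from Step 0 extracted, this should list four files, matching the train and validation sizes used later (3 + 1).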
Step 3: model
We will use a simple fully connected neural network with tanh non-linearities. Its input width is 51, matching the dimension of the descriptor selected in Step 4, and it contains two hidden layers of width 50, i.e.
import torch
import torch.nn as nn
torch.set_default_dtype(torch.double) # default float = double
torch_model = nn.Sequential(nn.Linear(51, 50), nn.Tanh(), nn.Linear(50, 50), nn.Tanh(), nn.Linear(50, 1))
torch_model
model = {"name": "MY_ML_MODEL"}
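As a quick sanity check, one can push a random batch through the network and count its parameters. This sketch rebuilds the same nn.Sequential so that it runs on its own; the batch size of 8 is arbitrary.

```python
import torch
import torch.nn as nn

torch.set_default_dtype(torch.double)  # match the double precision used above

net = nn.Sequential(
    nn.Linear(51, 50), nn.Tanh(),
    nn.Linear(50, 50), nn.Tanh(),
    nn.Linear(50, 1),
)

x = torch.randn(8, 51)  # 8 stand-in descriptor vectors of length 51
y = net(x)              # one scalar output per input row
n_params = sum(p.numel() for p in net.parameters())
print(y.shape, n_params)
```

The network has 2600 + 2550 + 51 = 5201 trainable parameters, which is small enough to train quickly on the four-configuration dataset.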
Step 4: select appropriate configuration transforms
Let us use the default set51 hyperparameters of the Behler symmetry functions as the configuration transform descriptor.
transforms = {
    "configuration": {
        "name": "Descriptor",
        "kwargs": {
            "cutoff": 4.0,
            "species": ["Si"],
            "descriptor": "SymmetryFunctions",
            "hyperparameters": "set51",
        },
    }
}
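To give a feel for what such a descriptor computes, here is a minimal sketch of one radial Behler-Parrinello symmetry function (commonly written G2) with a cosine cutoff. The eta and r_s values are illustrative assumptions and are not the set51 hyperparameters; the function names are hypothetical rather than KLIFF APIs.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def g2(distances, eta, r_s, r_c):
    """Radial symmetry function for one atom, given its neighbor distances."""
    return sum(
        math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c) for r in distances
    )


# Four neighbors at the diamond-lattice nearest-neighbor distance a * sqrt(3) / 4,
# using a = 5.431 (the lattice constant in one of the training file names).
r_nn = 5.431 * math.sqrt(3) / 4
print(g2([r_nn] * 4, eta=0.5, r_s=0.0, r_c=4.0))
```

A full descriptor stacks many such values, computed with different hyperparameters and including angular terms, into one fixed-length vector per atom; here that vector has 51 entries.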
Step 5: training
Let's train it using the Adam optimizer, with a validation-to-train split of 1:3 (one validation configuration and three training configurations).
training = {
    "loss": {
        "function": "MSE",
        "weights": {
            "config": 1.0,
            "energy": 1.0,
            "forces": 10.0,
        },
    },
    "optimizer": {
        "name": "Adam",
        "learning_rate": 1e-3,
    },
    "training_dataset": {
        "train_size": 3,
    },
    "validation_dataset": {
        "val_size": 1,
    },
    "batch_size": 1,
    "epochs": 10,
}
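The loss weights above can be read as relative penalties on energy and force errors. The sketch below shows one plausible way such weights might combine into a single per-configuration MSE loss. It is an assumption for illustration only, and KLIFF's actual reduction (for example, how errors are summed or averaged) may differ.

```python
def weighted_mse(e_pred, e_ref, f_pred, f_ref,
                 w_config=1.0, w_energy=1.0, w_forces=10.0):
    """Weighted squared-error loss for one configuration (illustrative only).

    e_pred, e_ref: scalar energies.
    f_pred, f_ref: flat lists of force components of equal length.
    """
    energy_term = (e_pred - e_ref) ** 2
    # Mean squared error over force components (an assumed reduction).
    forces_term = sum((p - r) ** 2 for p, r in zip(f_pred, f_ref)) / len(f_ref)
    return w_config * (w_energy * energy_term + w_forces * forces_term)


loss = weighted_mse(
    e_pred=-4.0, e_ref=-4.5,
    f_pred=[0.1, 0.0, -0.2], f_ref=[0.0, 0.0, 0.0],
)
print(loss)
```

With a force weight of 10.0, force errors contribute ten times as much per unit of squared error as energy errors in this sketch.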
Step 6: (Optional) export the model
export = {"model_path": "./", "model_name": "MyDNN__MO_111111111111_000"}  # the name can be anything, but a KIM-API-qualified name is more convenient
Step 7: Put it all together, and pass to the trainer
training_manifest = {
"workspace": workspace,
"model": model,
"dataset": dataset,
"transforms": transforms,
"training": training,
"export": export
}
from kliff.trainer.torch_trainer import DNNTrainer
trainer = DNNTrainer(training_manifest, model=torch_model)
trainer.train()
trainer.save_kim_model()
2025-03-05 11:55:01.129 | INFO | kliff.trainer.base_trainer:initialize:343 - Seed set to 12345.
2025-03-05 11:55:01.131 | INFO | kliff.trainer.base_trainer:setup_workspace:390 - Either a fresh run or resume is not requested. Starting a new run.
2025-03-05 11:55:01.131 | INFO | kliff.trainer.base_trainer:initialize:346 - Workspace set to DNN_train_example/MY_ML_MODEL_2025-03-05-11-55-01.
2025-03-05 11:55:01.133 | INFO | kliff.dataset.dataset:add_weights:1126 - No explicit weights provided.
2025-03-05 11:55:01.134 | INFO | kliff.dataset.dataset:add_weights:1131 - Weights set to the same value for all configurations.
2025-03-05 11:55:01.134 | INFO | kliff.trainer.base_trainer:initialize:349 - Dataset loaded.
2025-03-05 11:55:01.135 | INFO | kliff.trainer.base_trainer:setup_dataset_split:601 - Training dataset size: 3
2025-03-05 11:55:01.135 | INFO | kliff.trainer.base_trainer:setup_dataset_split:609 - Validation dataset size: 1
2025-03-05 11:55:01.136 | INFO | kliff.trainer.base_trainer:initialize:354 - Train and validation datasets set up.
2025-03-05 11:55:01.137 | INFO | kliff.trainer.base_trainer:initialize:358 - Model loaded.
2025-03-05 11:55:01.138 | INFO | kliff.trainer.base_trainer:initialize:363 - Optimizer loaded.
2025-03-05 11:55:01.143 | INFO | kliff.trainer.base_trainer:save_config:475 - Configuration saved in DNN_train_example/MY_ML_MODEL_2025-03-05-11-55-01/f7607ea9bb9b8339abcb90454f6ecb43.yaml.
2025-03-05 11:55:01.170 | INFO | kliff.dataset.dataset:check_properties_consistency:1261 - Consistent properties: ['energy', 'forces'], stored in metadata key: `consistent_properties`
2025-03-05 11:55:01.179 | INFO | kliff.dataset.dataset:check_properties_consistency:1261 - Consistent properties: ['energy', 'forces'], stored in metadata key: `consistent_properties`
2025-03-05 11:55:01.550 | INFO | kliff.trainer.torch_trainer:train:507 - Epoch 0 completed. val loss: 76995.86237589743
2025-03-05 11:55:01.553 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 0 completed. Train loss: 242421.30496552895
2025-03-05 11:55:01.836 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 1 completed. Train loss: 225440.8494130551
2025-03-05 11:55:02.099 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 2 completed. Train loss: 209060.9601532494
2025-03-05 11:55:02.365 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 3 completed. Train loss: 192890.04531135847
2025-03-05 11:55:02.630 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 4 completed. Train loss: 176637.89002333782
2025-03-05 11:55:02.915 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 5 completed. Train loss: 160081.0169738328
2025-03-05 11:55:03.182 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 6 completed. Train loss: 142972.0737350749
2025-03-05 11:55:03.444 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 7 completed. Train loss: 125384.63352492588
2025-03-05 11:55:03.705 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 8 completed. Train loss: 107469.39302393713
2025-03-05 11:55:03.967 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 9 completed. Train loss: 89547.26232292764
/opt/mambaforge/mambaforge/envs/colabfit/lib/python3.9/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2025-03-05 11:55:05.823 | INFO | kliff.trainer.torch_trainer:save_kim_model:599 - KIM model saved at ./MyDNN__MO_000000000000_000
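The per-epoch losses in the log can be pulled out with a short script, for instance to plot a training curve. This is a sketch that assumes the log format shown above; the regular expression and the function name parse_losses are illustrative and not part of KLIFF.

```python
import re

LOSS_RE = re.compile(r"Epoch (\d+) completed\. (val|Train) loss: ([0-9.eE+-]+)")


def parse_losses(lines):
    """Return {"train": {epoch: loss}, "val": {epoch: loss}} from log lines."""
    losses = {"train": {}, "val": {}}
    for line in lines:
        match = LOSS_RE.search(line)
        if match:
            epoch, kind, value = match.groups()
            losses[kind.lower()][int(epoch)] = float(value)
    return losses


# Three lines copied from the log above.
sample = [
    "2025-03-05 11:55:01.550 | INFO | kliff.trainer.torch_trainer:train:507 - Epoch 0 completed. val loss: 76995.86237589743",
    "2025-03-05 11:55:01.553 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 0 completed. Train loss: 242421.30496552895",
    "2025-03-05 11:55:03.967 | INFO | kliff.trainer.torch_trainer:train:513 - Epoch 9 completed. Train loss: 89547.26232292764",
]
losses = parse_losses(sample)
print(losses)
```

In this run the training loss falls from about 2.4e5 at epoch 0 to about 9.0e4 at epoch 9 and is still decreasing steadily, which suggests that more than ten epochs would be needed for a usable potential.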
To execute this model you need to install libtorch, the C++ API for PyTorch. Details on how to install it and run these ML models are provided in the following sections.