Transforms¶
The transforms module is a collection of commonly used functions for changing, or transforming, datasets and parameters. It is divided into:
Coordinate transforms: Mapping the coordinates of a configuration to invariant representations, which can be used in ML models.
Descriptors
Radial Graphs
Properties: Transform properties associated with the configurations. These transforms often take the complete dataset as input and aggregate statistics of a property over the entire dataset before applying a transformation such as normalization.
Parameters: Transform the parameter space to enable better sampling/training. For now, this is only available for physics-based models. [ref]
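As a rough illustration of why a property transform needs to see the whole dataset first, the sketch below aggregates the mean and standard deviation of a list of energies and only then normalizes each value. It is plain Python under stated assumptions: the names fit_normalizer, normalize, and denormalize are hypothetical and are not part of KLIFF's API, and the energies are made-up numbers.

```python
import statistics


def fit_normalizer(values):
    """Aggregate dataset-wide statistics: mean and population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean, std


def normalize(value, mean, std):
    """Map a raw property value to zero-mean, unit-variance scale."""
    return (value - mean) / std


def denormalize(value, mean, std):
    """Inverse transform, e.g. to report model predictions in original units."""
    return value * std + mean


# Hypothetical per-configuration energies (illustrative values only)
energies = [-4.0, -2.0, 0.0, 2.0, 4.0]

# Step 1: statistics over the entire dataset
mean, std = fit_normalizer(energies)

# Step 2: apply the transform to every entry, and invert it again
scaled = [normalize(e, mean, std) for e in energies]
restored = [denormalize(s, mean, std) for s in scaled]
```

The two-step shape (fit on the dataset, then transform each entry) is the point of the sketch. A real implementation would also need to handle a zero standard deviation.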
Configuration Transforms¶
Descriptor¶
The Descriptors module bridges the libdescriptor library with KLIFF’s data structures (i.e., Configuration, NeighborList). It provides:
show_available_descriptors(): A helper function that prints all descriptor names.
Descriptor:
Takes a cutoff, species, descriptor name, and hyperparameters.
Computes descriptors (forward) and their derivatives w.r.t. atomic coordinates (backward).
Can store results directly in the Configuration object’s fingerprint.
default_hyperparams: A module containing a collection of sane defaults for different descriptors.
Tip
This module relies on the optional dependency libdescriptor, which for now can be installed with conda install ipcamit::libdescriptor.
from kliff.transforms.configuration_transforms.descriptors import show_available_descriptors
show_available_descriptors()
from kliff.transforms.configuration_transforms.descriptors import Descriptor
from kliff.transforms.configuration_transforms.default_hyperparams import symmetry_functions_set30
desc = Descriptor(cutoff=3.77,
                  species=["Si"],
                  descriptor="SymmetryFunctions",
                  hyperparameters=symmetry_functions_set30())
This Descriptor module is designed as a thin wrapper over the libdescriptor library. It provides forward and backward functions for computing the descriptors and the vector-Jacobian products needed for their gradients. Below is a brief overview of how a typical ML potential evaluates forces, and how this is achieved in KLIFF.
Theory of ML with descriptors¶
Descriptors ($\zeta$) are used in machine learning to transform raw input features ($\mathbf{x}$) into a higher-dimensional representation that captures more complex patterns and relationships. This transformation is particularly useful in various applications, including molecular dynamics, materials science, and geometric deep learning.
Forward Pass¶
Descriptor Calculation:
The input features $\mathbf{x}$ (e.g., atomic coordinates, molecular structures) are mapped to a higher-dimensional space using a function $f$.
The output of this mapping is the descriptor $\zeta$:
$$\zeta = f(\mathbf{x})$$
Model Prediction:
The descriptor $\zeta$ is then used as input to a machine learning model (e.g., neural network) to make predictions $\mathbf{y}$:
$$\mathbf{y} = \text{Model}(\zeta)$$
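The forward pass can be sketched with a toy descriptor and a toy model. This is a minimal illustration in plain Python, not libdescriptor's or KLIFF's implementation: the Gaussian radial features, the widths in ETAS, the linear model, and the names toy_descriptor and toy_model are all assumptions made for the example.

```python
import math

ETAS = (0.5, 1.0, 2.0)  # hypothetical Gaussian widths, one feature per width


def toy_descriptor(coords):
    """Return one row of radial features per atom.

    zeta[i][k] = sum over j != i of exp(-ETAS[k] * |r_i - r_j|^2)
    The features depend only on interatomic distances, so they are
    invariant to translations and rotations of the whole configuration.
    """
    zeta = []
    for i, ri in enumerate(coords):
        row = [0.0] * len(ETAS)
        for j, rj in enumerate(coords):
            if i == j:
                continue
            r2 = sum((a - b) ** 2 for a, b in zip(ri, rj))
            for k, eta in enumerate(ETAS):
                row[k] += math.exp(-eta * r2)
        zeta.append(row)
    return zeta


def toy_model(zeta, weights):
    """Linear stand-in for an ML model: weighted sum of all atomic features."""
    return sum(w * z for row in zeta for w, z in zip(weights, row))


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]
weights = (0.3, -0.2, 0.1)

# Descriptor calculation followed by model prediction
zeta = toy_descriptor(coords)
energy = toy_model(zeta, weights)

# Translating every atom by the same vector leaves the descriptor unchanged
shifted = [(x + 2.0, y - 1.0, z + 0.5) for x, y, z in coords]
zeta_shifted = toy_descriptor(shifted)
```

The translated copy is included only to show the invariance that makes such representations useful as model inputs.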
Backward Pass¶
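Loss Calculation:
A loss function $\mathcal{L}$ measures the difference between the model’s predictions $\mathbf{y}$ and the ground truth $\mathbf{y}_{\text{true}}$:
$$\mathcal{L} = \text{Loss}(\mathbf{y}, \mathbf{y}_{\text{true}})$$
Derivative of Loss with Respect to Descriptors:
During backpropagation, the first step is to compute the derivative of the loss with respect to the descriptors:
$$\frac{\partial \mathcal{L}}{\partial \zeta}$$
Vector-Jacobian Product:
The next step is to compute the derivative of the descriptors with respect to the input coordinates $\mathbf{x}$. This is represented by the Jacobian matrix:
$$\mathbf{J} = \frac{\partial \zeta}{\partial \mathbf{x}}$$
To efficiently compute the gradient of the loss with respect to the input $\mathbf{x}$, we use the vector-Jacobian product:
$$\frac{\partial \mathcal{L}}{\partial \mathbf{x}} = \frac{\partial \mathcal{L}}{\partial \zeta} \cdot \mathbf{J}$$
Gradient Flow:
The gradients are then used to update the model parameters $\theta$ during optimization (e.g., gradient descent):
$$\theta \leftarrow \theta - \eta \frac{\partial \mathcal{L}}{\partial \theta}$$
where $\eta$ is the learning rate.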
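The vector-Jacobian product step can be checked numerically on a small example. The sketch below is an assumption-laden toy in plain Python: the three-component descriptor function, the squared-error loss, and the target values are invented for illustration and have nothing to do with the descriptors libdescriptor provides. It contracts the derivative of the loss with respect to the descriptor against the Jacobian and compares the result with central finite differences.

```python
import math


def descriptor(x):
    """Toy map from 2 inputs to 3 descriptor components."""
    return [x[0] ** 2, x[0] * x[1], math.sin(x[1])]


def jacobian(x):
    """Analytic Jacobian: J[k][j] = d zeta_k / d x_j (3 rows, 2 columns)."""
    return [
        [2.0 * x[0], 0.0],
        [x[1], x[0]],
        [0.0, math.cos(x[1])],
    ]


def loss(zeta, target):
    """Squared-error loss between descriptor and a made-up target."""
    return 0.5 * sum((z - t) ** 2 for z, t in zip(zeta, target))


def vjp(dL_dzeta, J):
    """Vector-Jacobian product: (dL/dzeta) . J, one entry per input."""
    n_inputs = len(J[0])
    return [sum(v * row[j] for v, row in zip(dL_dzeta, J)) for j in range(n_inputs)]


def numerical_gradient(x, target, h=1e-6):
    """Central finite differences of the loss with respect to each input."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        diff = loss(descriptor(xp), target) - loss(descriptor(xm), target)
        grad.append(diff / (2.0 * h))
    return grad


x = [0.7, -0.4]
target = [1.0, 0.0, 0.5]

zeta = descriptor(x)
dL_dzeta = [z - t for z, t in zip(zeta, target)]  # derivative of the loss above
dL_dx = vjp(dL_dzeta, jacobian(x))                # chain rule via the VJP
dL_dx_fd = numerical_gradient(x, target)          # independent check
```

In practice the Jacobian is rarely formed explicitly as it is here; a backward function typically accumulates the product directly, which is what makes the vector-Jacobian formulation efficient.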
Forces¶
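Forces for an ML model can be evaluated similarly, as the negative gradient of the predicted energy $E$ with respect to the coordinates $\mathbf{x}$:
$$\mathbf{F} = -\frac{\partial E}{\partial \mathbf{x}} = -\frac{\partial E}{\partial \zeta} \cdot \frac{\partial \zeta}{\partial \mathbf{x}}$$
See example below.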
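Before the KLIFF example, the same force evaluation can be sketched on a toy system in plain Python. Everything here is an assumption for illustration: the single Gaussian radial feature per atom, the width ETA, and the energy defined as the sum of squared features. The functions forward and backward only mirror the terminology used above; they are not KLIFF's Descriptor methods.

```python
import math

ETA = 0.8  # hypothetical Gaussian width


def forward(coords):
    """zeta[i] = sum over j != i of exp(-ETA * |r_i - r_j|^2)."""
    zeta = [0.0] * len(coords)
    for i, ri in enumerate(coords):
        for j, rj in enumerate(coords):
            if i != j:
                r2 = sum((a - b) ** 2 for a, b in zip(ri, rj))
                zeta[i] += math.exp(-ETA * r2)
    return zeta


def backward(coords, dE_dzeta):
    """Vector-Jacobian product: returns dE/dr as one 3-vector per atom."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, ri in enumerate(coords):
        for j, rj in enumerate(coords):
            if i == j:
                continue
            diff = [a - b for a, b in zip(ri, rj)]
            r2 = sum(d * d for d in diff)
            pref = -2.0 * ETA * math.exp(-ETA * r2) * dE_dzeta[i]
            for c in range(3):
                grad[i][c] += pref * diff[c]  # contribution of d zeta_i / d r_i
                grad[j][c] -= pref * diff[c]  # contribution of d zeta_i / d r_j
    return grad


def energy(coords):
    """Toy model: E = sum_i zeta_i^2, so dE/dzeta_i = 2 * zeta_i."""
    return sum(z ** 2 for z in forward(coords))


coords = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.2, 0.9, 0.3]]

zeta = forward(coords)
dE_dzeta = [2.0 * z for z in zeta]

# Forces are the negative of the vector-Jacobian product
forces = [[-g for g in row] for row in backward(coords, dE_dzeta)]
```

Because the toy energy depends only on interatomic distances, the forces on all atoms sum to zero, which is a quick sanity check for any backward implementation.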
KLIFF Descriptor backward and forward¶
# generate Si configuration
from ase.build import bulk
from kliff.dataset import Configuration
import numpy as np
Si_diamond = bulk("Si", a=5.44)
Si_config = Configuration.from_ase_atoms(Si_diamond)
# FORWARD: generating the descriptor $\zeta$
zeta = desc.forward(Si_config)
# BACKWARD: vector-Jacobian product against an arbitrary vector (standing in for \partial E/\partial \zeta)
dE_dZeta = np.random.random(zeta.shape)
forces = -desc.backward(Si_config, dE_dZeta=dE_dZeta)
print(forces)
[[-0. -0. -0.]
[-0. -0. -0.]]
Radial Graphs¶
Similarly, users can generate radial graphs for graph neural networks.
from kliff.transforms.configuration_transforms.graphs import RadialGraph
graph_generator = RadialGraph(species=["Si"], cutoff=3.77, n_layers=1)
# dummy energy and forces, needed for evaluation
Si_config._energy = 0.0
Si_config._forces = np.zeros_like(Si_config.coords)
print(graph_generator.forward(Si_config))
PyGGraph(energy=0.0, forces=[2, 3], n_layers=1, coords=[54, 3], images=[54], species=[54], z=[54], cell=[9], contributions=[54], num_nodes=54, idx=-1, edge_index0=[2, 14])
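To make the idea of a radial graph concrete, the sketch below builds an edge list by connecting every pair of atoms closer than a cutoff. It is a simplified illustration in plain Python and not how RadialGraph is implemented: the function radial_edges is hypothetical, and periodic images, which the KLIFF output above clearly includes, are ignored. The result follows the two-row source/target layout suggested by the edge_index0=[2, 14] entry in that output.

```python
import math


def radial_edges(coords, cutoff):
    """Return [sources, targets] for all ordered pairs (i, j) within the cutoff.

    Each undirected neighbor pair appears twice, once in each direction.
    Periodic boundary conditions are deliberately not handled here.
    """
    src, dst = [], []
    for i, ri in enumerate(coords):
        for j, rj in enumerate(coords):
            if i != j and math.dist(ri, rj) <= cutoff:
                src.append(i)
                dst.append(j)
    return [src, dst]


# Four atoms: 0-1 and 0-2 are within 3.0, 1-2 is not, atom 3 is isolated
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.5, 0.0), (6.0, 6.0, 6.0)]
edge_index = radial_edges(coords, cutoff=3.0)
print(edge_index)  # [[0, 0, 1, 2], [1, 2, 0, 0]]
```

The double loop scales quadratically with the number of atoms; production code would normally rely on a neighbor list instead.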