Transforms¶

transforms is a collection of commonly used functions for changing, or transforming, datasets and parameters. The module is divided into:

Coordinate transforms: Mapping the coordinates of a configuration to invariant representations, which can be used in ML models.
- Descriptors
- Radial Graphs
Properties: Transform properties associated with the configurations. These transforms often take a complete dataset as input and aggregate statistics of a property over the entire dataset before applying transformations such as normalization.
Parameters: Transform the parameter space to enable better sampling/training. Only available for physics-based models for now. [ref]
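
To make the property-transform idea concrete, here is a minimal NumPy sketch of dataset-wide normalization. It is not KLIFF's API: the energies are made-up values, and `normalize` and `denormalize` are hypothetical helper names used only for illustration.

```python
import numpy as np

# Hypothetical per-configuration energies (eV); illustrative values only.
energies = np.array([-10.2, -9.8, -10.5, -9.9])

# Statistics are aggregated over the whole dataset first ...
mean = energies.mean()
std = energies.std()


def normalize(values):
    """Map raw property values to zero mean and unit variance."""
    return (values - mean) / std


def denormalize(values):
    """Inverse transform, back to the original units."""
    return values * std + mean


# ... and only then applied to each value.
normalized = normalize(energies)
restored = denormalize(normalized)
```

Keeping the inverse transform next to the forward one matters in practice, since model predictions made in the normalized space have to be mapped back to physical units.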

Configuration Transforms¶

Descriptor¶

The Descriptors module bridges the libdescriptor library with KLIFF’s data structures (i.e., Configuration, NeighborList). It provides:

show_available_descriptors(): A helper function that prints all descriptor names.
Descriptor:
- Takes a cutoff, species, descriptor name, and hyperparameters.
- Computes descriptors (forward) and their derivatives w.r.t. atomic coordinates (backward).
- Can store results directly in the Configuration object’s fingerprint.
default_hyperparams: A module containing a collection of sane defaults for different descriptors.

Tip

This module relies on the optional dependency libdescriptor, which for now can be installed with conda install ipcamit::libdescriptor.

from kliff.transforms.configuration_transforms.descriptors import show_available_descriptors
show_available_descriptors()

from kliff.transforms.configuration_transforms.descriptors import Descriptor
from kliff.transforms.configuration_transforms.default_hyperparams import symmetry_functions_set30

desc = Descriptor(cutoff=3.77, 
                  species=["Si"], 
                  descriptor="SymmetryFunctions", 
                  hyperparameters=symmetry_functions_set30())

This Descriptor module is designed as a thin wrapper over the libdescriptor library. It provides forward and backward functions for computing the descriptors and their vector-Jacobian products, which are needed for gradients. Below is a brief overview of how a typical ML potential evaluates forces, and how this is achieved in KLIFF.

Theory of ML with descriptors¶

Descriptors ( $\mathbf{\zeta}$ ) are used in machine learning to transform raw input features ( $\mathbf{x}$ ) into a higher-dimensional representation that captures more complex patterns and relationships. This transformation is particularly useful in various applications, including molecular dynamics, material science, and geometric deep learning.

Forward Pass¶

Descriptor Calculation
- The input features $x$ (e.g., atomic coordinates, molecular structures) are mapped to a higher-dimensional space using a function $F$ .
- The output of this mapping is the descriptor $\mathbf{\zeta}$ :

$\mathbf{\zeta} = F(\mathbf{x})$

Model Prediction:
- The descriptor $\zeta$ is then used as input to a machine learning model (e.g., neural network) to make predictions:

$y = \text{ML Model}(\mathbf{\zeta})$
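
The two forward-pass steps can be sketched with NumPy as follows. Both functions are assumptions made for illustration: `toy_descriptor` is a hypothetical stand-in for $F$ (per-atom sums of Gaussians of interatomic distances, not a libdescriptor descriptor), and `toy_model` is a linear stand-in for the ML model.

```python
import numpy as np


def toy_descriptor(x, etas=(0.5, 1.0, 2.0)):
    """Toy F(x): per-atom sums of Gaussians of interatomic distances."""
    diff = x[:, None, :] - x[None, :, :]      # pairwise displacement, (n, n, 3)
    r2 = (diff ** 2).sum(axis=-1)             # squared distances, (n, n)
    mask = ~np.eye(len(x), dtype=bool)        # exclude self-pairs
    columns = [(np.exp(-eta * r2) * mask).sum(axis=1) for eta in etas]
    return np.stack(columns, axis=1)          # (n_atoms, n_descriptors)


def toy_model(zeta, weights):
    """Stand-in ML model: linear in the descriptors, summed over atoms."""
    return float((zeta @ weights).sum())


x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.5, 0.0]])
weights = np.array([0.3, -0.2, 0.1])

zeta = toy_descriptor(x)      # zeta = F(x)
y = toy_model(zeta, weights)  # y = ML Model(zeta)

# Rotate by 90 degrees about z and translate: the descriptor is unchanged.
rot = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
x_moved = x @ rot.T + np.array([1.0, -2.0, 0.5])
zeta_moved = toy_descriptor(x_moved)
```

Because the toy descriptor depends only on interatomic distances, `zeta_moved` matches `zeta`, which is the invariance property that makes descriptors useful as model inputs.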

Backward Pass¶

Loss Calculation:
- A loss function measures the difference between the model’s predictions and the ground truth:

$\mathcal{L} = \text{Loss}(y, \text{ground truth})$

Derivative of Loss with Respect to Descriptors:
- During backpropagation, the first step is to compute the derivative of the loss with respect to the descriptors:

$\frac{\partial \mathcal{L}}{\partial \mathbf{\zeta}} = \nabla_\mathbf{\zeta} \mathcal{L}$

Vector-Jacobian Product:
- The next step is to compute the derivative of the descriptors with respect to the input coordinates $\mathbf{x}$ . This is represented by the Jacobian matrix:

$J = \frac{\partial \mathbf{\zeta}}{\partial \mathbf{x}} = \nabla_\mathbf{x} F(\mathbf{x})$

To efficiently compute the gradient of the loss with respect to the input $\mathbf{x}$ , we use the vector-Jacobian product, which contracts over the descriptor index:

$\frac{\partial \mathcal{L}}{\partial \mathbf{x}} = \frac{\partial \mathcal{L}}{\partial \mathbf{\zeta}} \cdot J$
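
As a small numerical check of the chain rule, the sketch below uses a hypothetical two-component map `F` with a hand-written Jacobian `J[a, i]` $= \partial \zeta_a / \partial x_i$ and a squared-error loss. All names and values are illustrative assumptions. The vector-Jacobian product contracts over the descriptor index `a` and is compared against central finite differences of the loss.

```python
import numpy as np


def F(x):
    """Toy descriptor map from 3 inputs to 2 descriptors."""
    return np.array([x[0] ** 2 + x[1] ** 2 + x[2] ** 2,
                     x[0] * x[1] + np.sin(x[2])])


def jacobian(x):
    """Analytic Jacobian J[a, i] = d zeta_a / d x_i, shape (2, 3)."""
    return np.array([[2 * x[0], 2 * x[1], 2 * x[2]],
                     [x[1], x[0], np.cos(x[2])]])


def loss(x, target):
    """Squared-error loss on the descriptors."""
    return 0.5 * np.sum((F(x) - target) ** 2)


x = np.array([0.3, -1.2, 0.7])
target = np.array([1.0, 0.5])

dL_dzeta = F(x) - target          # dL/dzeta, shape (2,)
dL_dx = dL_dzeta @ jacobian(x)    # vector-Jacobian product, shape (3,)

# Reference: central finite differences of the loss with respect to x.
eps = 1e-6
dL_dx_fd = np.array([
    (loss(x + eps * e, target) - loss(x - eps * e, target)) / (2 * eps)
    for e in np.eye(3)
])
```

In a real descriptor the Jacobian is large and sparse, so libraries typically evaluate the product directly rather than forming `J` as done here.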

Gradient Flow:
- The gradients of the loss with respect to the model parameters $\theta$ are then used to update those parameters during optimization (e.g., gradient descent):

$\theta \leftarrow \theta - \eta \frac{\partial \mathcal{L}}{\partial \theta}$

where $\eta$ is the learning rate.
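
A minimal sketch of gradient descent with learning rate $\eta$, assuming a linear model on randomly generated descriptors. The data, the weights `w`, and the helpers `loss` and `grad` are all hypothetical and chosen only to show the update rule.

```python
import numpy as np

rng = np.random.default_rng(0)
zeta = rng.normal(size=(20, 3))        # hypothetical descriptors, 20 samples
true_w = np.array([0.5, -1.0, 2.0])    # weights used to generate the targets
y_true = zeta @ true_w


def loss(w):
    """Mean squared error of the linear model y = zeta @ w."""
    return 0.5 * np.mean((zeta @ w - y_true) ** 2)


def grad(w):
    """Gradient of the loss with respect to the model weights."""
    return zeta.T @ (zeta @ w - y_true) / len(y_true)


w = np.zeros(3)
eta = 0.1                              # learning rate
initial_loss = loss(w)
for _ in range(1000):
    w = w - eta * grad(w)              # gradient descent update
final_loss = loss(w)
```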

Forces¶

Forces for an ML model can be evaluated similarly, as the negative gradient of the predicted energy $E$ with respect to the coordinates:

$\mathbf{\mathcal{F}} = - \frac{\partial E}{\partial \mathbf{\zeta}} \cdot \frac{\partial \mathbf{\zeta}}{\partial \mathbf{x}}$
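
The force expression can be verified on a toy system. The sketch below assumes a single Gaussian radial descriptor per atom and a made-up energy $E = \sum_i \tanh(\zeta_i)$; neither is a KLIFF model. Forces from the chain rule are compared with central finite differences of the energy.

```python
import numpy as np

ETA = 0.8  # hypothetical Gaussian width parameter


def descriptor(x):
    """zeta_i = sum_{j != i} exp(-ETA * |x_i - x_j|^2), plus intermediates."""
    diff = x[:, None, :] - x[None, :, :]
    g = np.exp(-ETA * (diff ** 2).sum(axis=-1))
    np.fill_diagonal(g, 0.0)
    return g.sum(axis=1), g, diff


def energy(x):
    zeta, _, _ = descriptor(x)
    return np.tanh(zeta).sum()


def forces(x):
    zeta, g, diff = descriptor(x)
    n = len(x)
    # J[i, k, :] = d zeta_i / d x_k; off-diagonal terms first.
    J = 2 * ETA * g[:, :, None] * diff
    # Diagonal terms are minus the sum over neighbors.
    J[np.arange(n), np.arange(n)] = -J.sum(axis=1)
    dE_dzeta = 1.0 - np.tanh(zeta) ** 2
    return -np.einsum("i,ikc->kc", dE_dzeta, J)


x = np.array([[0.0, 0.0, 0.0], [1.1, 0.2, 0.0], [0.3, 1.4, 0.5]])
f = forces(x)

# Reference: central finite differences of the energy.
eps = 1e-6
f_fd = np.zeros_like(x)
for k in range(x.shape[0]):
    for c in range(3):
        dx = np.zeros_like(x)
        dx[k, c] = eps
        f_fd[k, c] = -(energy(x + dx) - energy(x - dx)) / (2 * eps)
```

Since the toy energy depends only on interatomic distances, the forces also sum to zero over all atoms, which is a useful sanity check on any force implementation.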

See example below.

KLIFF Descriptor `backward` and `forward`¶

# generate Si configuration
from ase.build import bulk
from kliff.dataset import Configuration
import numpy as np

Si_diamond = bulk("Si", a=5.44)
Si_config = Configuration.from_ase_atoms(Si_diamond)

# FORWARD: generating the descriptor $\zeta$
zeta = desc.forward(Si_config)

# BACKWARD: vector-jacobian product against arbitrary vector (\partial L/\partial \zeta)
dE_dZeta = np.random.random(zeta.shape)

forces = - desc.backward(Si_config, dE_dZeta=dE_dZeta)
print(forces)

[[-0. -0. -0.]
 [-0. -0. -0.]]

Radial Graphs¶

Similarly, users can generate radial graphs for graph neural networks.

from kliff.transforms.configuration_transforms.graphs import RadialGraph

graph_generator = RadialGraph(species=["Si"], cutoff=3.77, n_layers=1)

# dummy energy, needed for eval
Si_config._energy = 0.0
Si_config._forces = np.zeros_like(Si_config.coords)

print(graph_generator.forward(Si_config))

PyGGraph(energy=0.0, forces=[2, 3], n_layers=1, coords=[54, 3], images=[54], species=[54], z=[54], cell=[9], contributions=[54], num_nodes=54, idx=-1, edge_index0=[2, 14])
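
To illustrate the idea behind a radial graph, the following NumPy sketch connects every pair of atoms closer than a cutoff. It is not KLIFF's implementation: `radial_edges` is a hypothetical helper, the coordinates are made up, and periodic boundary conditions are ignored for brevity.

```python
import numpy as np


def radial_edges(coords, cutoff):
    """Return a (2, n_edges) array of directed edges i -> j with |r_i - r_j| < cutoff."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep pairs within the cutoff, excluding self-loops.
    mask = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    src, dst = np.nonzero(mask)
    return np.stack([src, dst])


# Three atoms close together and one isolated atom far away.
coords = np.array([[0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0],
                   [0.0, 3.0, 0.0],
                   [10.0, 0.0, 0.0]])
edge_index = radial_edges(coords, cutoff=3.77)
```

Each undirected neighbor pair appears twice, once per direction, which is the usual convention for message-passing networks.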
