Train a linear regression potential
In this tutorial, we train a linear regression model on descriptors computed with symmetry functions.
from kliff.legacy.calculators import CalculatorTorch
from kliff.dataset import Dataset
from kliff.legacy.descriptors import SymmetryFunction
from kliff.models import LinearRegression
from kliff.utils import download_dataset
descriptor = SymmetryFunction(
    cut_name="cos", cut_dists={"Si-Si": 5.0}, hyperparams="set30", normalize=True
)
model = LinearRegression(descriptor)
# training set
dataset_path = download_dataset(dataset_name="Si_training_set")
dataset_path = dataset_path.joinpath("varying_alat")
tset = Dataset.from_path(dataset_path)
configs = tset.get_configs()
# calculator
calc = CalculatorTorch(model)
calc.create(configs, reuse=False)
2025-05-16 21:19:50.351 | INFO | kliff.dataset.dataset:add_weights:1128 - No explicit weights provided.
2025-05-16 21:19:50.352 | INFO | kliff.legacy.calculators.calculator_torch:_get_device:592 - Training on cpu
2025-05-16 21:19:50.353 | INFO | kliff.legacy.descriptors.descriptor:generate_fingerprints:103 - Start computing mean and stdev of fingerprints.
2025-05-16 21:20:04.086 | INFO | kliff.legacy.descriptors.descriptor:generate_fingerprints:120 - Finish computing mean and stdev of fingerprints.
2025-05-16 21:20:04.087 | INFO | kliff.legacy.descriptors.descriptor:generate_fingerprints:128 - Fingerprints mean and stdev saved to fingerprints_mean_and_stdev.pkl.
2025-05-16 21:20:04.088 | INFO | kliff.legacy.descriptors.descriptor:_dump_fingerprints:163 - Pickling fingerprints to fingerprints.pkl
2025-05-16 21:20:04.092 | INFO | kliff.legacy.descriptors.descriptor:_dump_fingerprints:175 - Processing configuration: 0.
2025-05-16 21:20:04.224 | INFO | kliff.legacy.descriptors.descriptor:_dump_fingerprints:175 - Processing configuration: 100.
2025-05-16 21:20:04.355 | INFO | kliff.legacy.descriptors.descriptor:_dump_fingerprints:175 - Processing configuration: 200.
2025-05-16 21:20:04.491 | INFO | kliff.legacy.descriptors.descriptor:_dump_fingerprints:175 - Processing configuration: 300.
2025-05-16 21:20:04.647 | INFO | kliff.legacy.descriptors.descriptor:_dump_fingerprints:218 - Pickle 400 configurations finished.
We could train the linear regression model by minimizing a loss function, as
discussed in the neural network tutorial. However, a linear regression
model has an analytic (closed-form) solution, so we can fit the model directly
instead of optimizing it iteratively. This is done by calling the fit()
function of its calculator.
# fit the model
calc.fit()
# save model
model.save("linear_model.pkl")
2025-05-16 21:20:05.178 | INFO | kliff.models.linear_regression:fit:42 - Finished fitting model "LinearRegression"
Finished fitting model "LinearRegression"
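To illustrate what "analytic solution" means in the discussion of fit() above, here is a minimal NumPy sketch of closed-form least squares on synthetic data. It is not KLIFF's implementation, and it does not use KLIFF descriptors: the function name `fit_linear_least_squares`, the random feature matrix `X`, and the values of `true_w` and `true_b` are all hypothetical, chosen only to show that the weights come from a single linear solve rather than an iterative optimizer.

```python
import numpy as np


def fit_linear_least_squares(features, targets):
    """Return (weights, bias) minimizing ||features @ w + b - targets||^2."""
    # Append a column of ones so the bias is learned as one extra weight.
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    # lstsq solves the least-squares problem directly, with no training loop.
    solution, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return solution[:-1], solution[-1]


# Synthetic stand-in for descriptor values (assumption: 50 samples, 3 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.5, -2.0, 0.5])
true_b = 0.25
y = X @ true_w + true_b  # noise-free targets, so the fit should be exact

w, b = fit_linear_least_squares(X, y)
```

Because the targets here are noise-free, the recovered `w` and `b` match `true_w` and `true_b` to numerical precision. With real data the same solve returns the weights that minimize the squared error.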