
Commit

Release commit for v0.1.3
Mingjian Wen committed Aug 19, 2019
1 parent 70ddab9 commit 6dd4eb5
Showing 16 changed files with 134 additions and 55 deletions.
Binary file not shown.
Binary file not shown.
6 changes: 3 additions & 3 deletions docs/source/auto_examples/example_kim_SW_Si.ipynb
@@ -22,7 +22,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Before getting started to train the SW model, let's first install the SW # model::\n\n $ kim-api-collections-management install user SW_StillingerWeber_1985_Si__MO_405512056662_005\n\n.. seealso::\n This installs the model and its driver into the ``User Collection``. See\n `install_model` for more information about installing KIM models.\n\nWe are going to create potentials for diamond silicon, and fit the potentials to a\ntraining set of energies and forces consisting of compressed and stretched diamond\nsilicon structures, as well as configurations drawn from molecular dynamics trajectories\nat different temperatures.\nDownload the training set :download:`Si_training_set.tar.gz\n<https://raw.githubusercontent.com/mjwen/kliff/master/examples/Si_training_set.tar.gz>`\nand extract the tarball: ``$ tar xzf Si_training_set.tar.gz``. The data is stored in\n**extended xyz** format, and see `doc.dataset` for more information of this format.\n\n<div class=\"alert alert-danger\"><h4>Warning</h4><p>The ``Si_training_set`` is just a toy data set for the purpose to demonstrate how to\n use KLIFF to train potentials. It should not be used to train any potential for real\n simulations.</p></div>\n\nLet's first import the modules that will be used in this example.\n\n"
"Before getting started to train the SW model, let's first install the SW model::\n\n $ kim-api-collections-management install user SW_StillingerWeber_1985_Si__MO_405512056662_005\n\n.. seealso::\n This installs the model and its driver into the ``User Collection``. See\n `install_model` for more information about installing KIM models.\n\nWe are going to create potentials for diamond silicon, and fit the potentials to a\ntraining set of energies and forces consisting of compressed and stretched diamond\nsilicon structures, as well as configurations drawn from molecular dynamics trajectories\nat different temperatures.\nDownload the training set :download:`Si_training_set.tar.gz\n<https://raw.githubusercontent.com/mjwen/kliff/master/examples/Si_training_set.tar.gz>`\nand extract the tarball: ``$ tar xzf Si_training_set.tar.gz``. The data is stored in\n**extended xyz** format, and see `doc.dataset` for more information of this format.\n\n<div class=\"alert alert-danger\"><h4>Warning</h4><p>The ``Si_training_set`` is just a toy data set for the purpose to demonstrate how to\n use KLIFF to train potentials. It should not be used to train any potential for real\n simulations.</p></div>\n\nLet's first import the modules that will be used in this example.\n\n"
]
},
{
@@ -112,7 +112,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"where ``calc.create(configs)`` does some initializations for each each\nconfiguration in the training set, such as creating the neighbor list.\n\n\nLoss function\n-------------\n\nKLIFF uses a loss function to quantify the difference between the training set data and\npotential predictions and uses minimization algorithms to reduce the loss as much as\npossible. KLIFF provides a large number of minimization algorithms by interacting with\nSciPy_. For physics-motivated potentials, any algorithm listed on\n`scipy.optimize.minimize`_ and `scipy.optimize.least_squares`_ can be used. In the\nfollowing code snippet, we create a loss of energy and forces, where the residual\nfunction uses an ``energy_weight`` of ``1.0`` and a ``forces_weight`` of ``0.1``, and\n``2`` processors will be used to calculate the loss. The ``L-BFGS-B`` minimization\nalgorithm is applied to minimize the loss, and the minimization is allowed to run for a\na max number of 100 iterations.\n\n"
"where ``calc.create(configs)`` does some initializations for each\nconfiguration in the training set, such as creating the neighbor list.\n\n\nLoss function\n-------------\n\nKLIFF uses a loss function to quantify the difference between the training set data and\npotential predictions and uses minimization algorithms to reduce the loss as much as\npossible. KLIFF provides a large number of minimization algorithms by interacting with\nSciPy_. For physics-motivated potentials, any algorithm listed on\n`scipy.optimize.minimize`_ and `scipy.optimize.least_squares`_ can be used. In the\nfollowing code snippet, we create a loss of energy and forces, where the residual\nfunction uses an ``energy_weight`` of ``1.0`` and a ``forces_weight`` of ``0.1``, and\n``2`` processors will be used to calculate the loss. The ``L-BFGS-B`` minimization\nalgorithm is applied to minimize the loss, and the minimization is allowed to run for\na max number of 100 iterations.\n\n"
]
},
{
@@ -141,7 +141,7 @@
},
"outputs": [],
"source": [
"model.echo_fitting_params()\nmodel.save('kliff_model.pkl')\nmodel.write_kim_model()"
"model.echo_fitting_params()\nmodel.save('kliff_model.pkl')\nmodel.write_kim_model()\nmodel.load('kliff_model.pkl')"
]
},
{
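
The hunks above fix the stray ``#`` in the model-install sentence. For orientation, a minimal sketch of how the installed SW model and the downloaded training set come together in this tutorial (assuming the KLIFF 0.1.x-style imports used in these examples; names and import paths may differ in other releases):

    # Sketch only: load the Si training set and attach the installed SW KIM model.
    from kliff.calculators import Calculator
    from kliff.dataset import Dataset
    from kliff.models import KIM

    # Read the extended-xyz configurations extracted from Si_training_set.tar.gz.
    tset = Dataset()
    tset.read('Si_training_set')
    configs = tset.get_configs()

    # Wrap the installed KIM model so its parameters can be fitted.
    model = KIM(model_name='SW_StillingerWeber_1985_Si__MO_405512056662_005')
    model.echo_model_params()

    # The calculator connects the model to the training configurations
    # (``calc.create`` builds neighbor lists, as described in the text above).
    calc = Calculator(model)
    calc.create(configs)
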
7 changes: 4 additions & 3 deletions docs/source/auto_examples/example_kim_SW_Si.py
@@ -10,7 +10,7 @@


##########################################################################################
# Before getting started to train the SW model, let's first install the SW # model::
# Before getting started to train the SW model, let's first install the SW model::
#
# $ kim-api-collections-management install user SW_StillingerWeber_1985_Si__MO_405512056662_005
#
@@ -137,7 +137,7 @@


##########################################################################################
# where ``calc.create(configs)`` does some initializations for each each
# where ``calc.create(configs)`` does some initializations for each
# configuration in the training set, such as creating the neighbor list.
#
#
@@ -152,7 +152,7 @@
# following code snippet, we create a loss of energy and forces, where the residual
# function uses an ``energy_weight`` of ``1.0`` and a ``forces_weight`` of ``0.1``, and
# ``2`` processors will be used to calculate the loss. The ``L-BFGS-B`` minimization
# algorithm is applied to minimize the loss, and the minimization is allowed to run for a
# algorithm is applied to minimize the loss, and the minimization is allowed to run for
# a max number of 100 iterations.

steps = 100
@@ -170,6 +170,7 @@
model.echo_fitting_params()
model.save('kliff_model.pkl')
model.write_kim_model()
model.load('kliff_model.pkl')


##########################################################################################
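
For reference, the loss construction and minimization that the corrected paragraph describes can be sketched as follows (assuming the ``Loss`` interface used in the KLIFF tutorials; keyword names may vary between releases, and ``calc`` is the Calculator built from the model and the training configurations):

    from kliff.loss import Loss

    # Energy and forces residuals enter the loss with the weights quoted in the text,
    # and the loss is evaluated with 2 processes.
    steps = 100
    loss = Loss(calc,
                residual_data={'energy_weight': 1.0, 'forces_weight': 0.1},
                nprocs=2)

    # L-BFGS-B, capped at a maximum of 100 iterations.
    result = loss.minimize(method='L-BFGS-B',
                           options={'disp': True, 'maxiter': steps})
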
2 changes: 1 addition & 1 deletion docs/source/auto_examples/example_kim_SW_Si.py.md5
@@ -1 +1 @@
80a3877e619c7934efd28c5c0aa08d96
f1d3c76dd1fd0038dfa916a3643e927f
22 changes: 18 additions & 4 deletions docs/source/auto_examples/example_kim_SW_Si.rst
@@ -15,7 +15,7 @@ Train a Stillinger-Weber potential
In this tutorial, we train a Stillinger-Weber (SW) potential for silicon that is archived
on OpenKIM_.

Before getting started to train the SW model, let's first install the SW # model::
Before getting started to train the SW model, let's first install the SW model::

$ kim-api-collections-management install user SW_StillingerWeber_1985_Si__MO_405512056662_005

@@ -245,6 +245,14 @@ test data). For the silicon training set, we can read and process the files by:
.. rst-class:: sphx-glr-script-out

Out:

.. code-block:: none
1000 configurations read from "Si_training_set"
The ``configs`` in the last line is a list of :class:`~kliff.dataset.Configuration`.
@@ -278,7 +286,7 @@
where ``calc.create(configs)`` does some initializations for each each
where ``calc.create(configs)`` does some initializations for each
configuration in the training set, such as creating the neighbor list.


@@ -293,7 +301,7 @@ SciPy_. For physics-motivated potentials, any algorithm listed on
following code snippet, we create a loss of energy and forces, where the residual
function uses an ``energy_weight`` of ``1.0`` and a ``forces_weight`` of ``0.1``, and
``2`` processors will be used to calculate the loss. The ``L-BFGS-B`` minimization
algorithm is applied to minimize the loss, and the minimization is allowed to run for a
algorithm is applied to minimize the loss, and the minimization is allowed to run for
a max number of 100 iterations.


@@ -316,8 +324,12 @@

.. code-block:: none
Start minimization using method: L-BFGS-B.
Running in multiprocessing mode with 2 processes.
/Users/Wenz/Applications/kliff/kliff/log.py:45: Warning: "mpi4y" detected. If you try to run in MPI mode, you should execute your code via "mpiexec" (or "mpirun"). If not, ignore this message.
warnings.warn(message, category=warning_category)
Finish minimization using method: L-BFGS-B.
@@ -333,6 +345,7 @@ that can be used with LAMMPS_, GULP_, ASE_, etc. via the kim-api_.
model.echo_fitting_params()
model.save('kliff_model.pkl')
model.write_kim_model()
model.load('kliff_model.pkl')
@@ -361,6 +374,7 @@ that can be used with LAMMPS_, GULP_, ASE_, etc. via the kim-api_.
gamma 1
2.2014621875873330e+00
KLIFF trained model write to "/Users/Wenz/Applications/kliff/examples/SW_StillingerWeber_1985_Si__MO_405512056662_005_kliff_trained"
@@ -386,7 +400,7 @@ parameters quite reasonably. The second line saves the fitted model to a file na

.. rst-class:: sphx-glr-timing

**Total running time of the script:** ( 2 minutes 43.602 seconds)
**Total running time of the script:** ( 2 minutes 19.769 seconds)


.. _sphx_glr_download_auto_examples_example_kim_SW_Si.py:
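
The log above shows ``write_kim_model`` writing the trained item ``SW_StillingerWeber_1985_Si__MO_405512056662_005_kliff_trained``, which can then be driven from LAMMPS, GULP, ASE, etc. via the kim-api. As a rough illustration only (it assumes ASE's KIM calculator and ``kimpy`` are installed, and that the exported item has been installed into a KIM API collection):

    # Sketch: evaluate the trained KIM item on bulk diamond silicon from ASE.
    from ase.build import bulk
    from ase.calculators.kim import KIM

    atoms = bulk('Si', 'diamond', a=5.43)  # lattice constant in Angstrom (illustrative)
    atoms.calc = KIM('SW_StillingerWeber_1985_Si__MO_405512056662_005_kliff_trained')

    print('energy (eV):', atoms.get_potential_energy())
    print('forces shape:', atoms.get_forces().shape)
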
14 changes: 7 additions & 7 deletions docs/source/auto_examples/example_nn_Si.ipynb
@@ -40,7 +40,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Model\n-----\n\nFor a NN model, we need to specify the descriptor that transforms atomic environment\ninformation to the fingerprints, which the NN model uses as the input. Here, we use the\nsymmetry functions proposed by by Behler and coworkers.\n\n"
"Model\n-----\n\nFor a NN model, we need to specify the descriptor that transforms atomic environment\ninformation to the fingerprints, which the NN model uses as the input. Here, we use the\nsymmetry functions proposed by Behler and coworkers.\n\n"
]
},
{
@@ -58,7 +58,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``cut_name`` and ``cut_dists`` tells the descriptor what type of cutoff function to\nuse and what the cutoff distances are. ``hyperparams`` specifies the set of\nhyperparameters used in the symmetry function descriptor. If you prefer, you can provide\na dictionary of your own hyperparameters. And finally, ``normalize`` informs that the\ngenerated fingerprints should be normalized by first subtracting the mean and then\ndividing the standard deviation. This normalization typically makes it easier to\noptimize NN model.\n\nWe can then build the NN model on top of the descriptor.\n\n"
"The ``cut_name`` and ``cut_dists`` tell the descriptor what type of cutoff function to\nuse and what the cutoff distances are. ``hyperparams`` specifies the set of\nhyperparameters used in the symmetry function descriptor. If you prefer, you can provide\na dictionary of your own hyperparameters. And finally, ``normalize`` informs that the\ngenerated fingerprints should be normalized by first subtracting the mean and then\ndividing the standard deviation. This normalization typically makes it easier to\noptimize NN model.\n\nWe can then build the NN model on top of the descriptor.\n\n"
]
},
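
For reference, such a descriptor is typically constructed along these lines (a sketch assuming KLIFF's ``SymmetryFunction`` with the built-in ``set51`` hyperparameter preset and a 5.0 Angstrom Si-Si cutoff; the exact values used in this example are not shown in this diff):

    from kliff.descriptors import SymmetryFunction

    # Behler-style symmetry functions with a cosine cutoff for the Si-Si pair;
    # ``normalize=True`` standardizes the fingerprints (subtract mean, divide by std).
    descriptor = SymmetryFunction(
        cut_name='cos',
        cut_dists={'Si-Si': 5.0},
        hyperparams='set51',
        normalize=True,
    )
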
{
@@ -69,14 +69,14 @@
},
"outputs": [],
"source": [
"N1 = 10\nN2 = 10\nmodel = NeuralNetwork(descriptor)\nmodel.add_layers(\n # first hidden layer\n nn.Linear(descriptor.get_size(), N1),\n nn.Tanh(),\n # second hidden layer\n nn.Linear(N1, N2),\n nn.Tanh(),\n # output layer\n nn.Linear(N2, 1),\n)\nmodel.set_save_metadata(prefix='./my_kliff_model', start=5, frequency=2)"
"N1 = 10\nN2 = 10\nmodel = NeuralNetwork(descriptor)\nmodel.add_layers(\n # first hidden layer\n nn.Linear(descriptor.get_size(), N1),\n nn.Tanh(),\n # second hidden layer\n nn.Linear(N1, N2),\n nn.Tanh(),\n # output layer\n nn.Linear(N2, 1),\n)\nmodel.set_save_metadata(prefix='./kliff_saved_model', start=5, frequency=2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the above code, we build a NN model with an input layer, two hidden layer, and an\noutput layer. The ``descriptor`` carries the information of the input layer, so it is\nnot needed to be specified explicitly. For each hidden layer, we first do a linear\ntransformation using ``nn.Linear(size_in, size_out)`` (essentially carrying out $y\n= xW+b$, where $W$ is the weight matrix of size ``size_in`` by ``size_out``, and\n$b$ is a vector of size ``size_out``. Then we apply the hyperbolic tangent\nactivation function ``nn.Tanh()`` to the output of the Linear layer (i.e. $y$) so\nas to add the nonlinearity. We use a Linear layer for the output layer as well, but\nunlike the hidden layer, no activation function is applied here. The input size\n``size_in`` of the first hidden layer must be the size of the descriptor, which is\nobtained using ``descriptor.get_size()``. For all other layers (hidden or output), the\ninput size must be equal to the output size of the previous layer. The ``out_size`` of\nthe output layer much be 1 such that the output of the NN model is gives the energy of\natom.\n\nThe ``set_save_metadata`` function call informs where to save intermediate models during\nthe optimization (discussed below), and what the starting epoch and how often to save\nthe model.\n\n\nTraining set and calculator\n---------------------------\n\nThe training set and the calculator are the same as explained in `tut_kim_sw`. The\nonly difference is that we need use the\n:mod:`~kliff.calculators.CalculatorTorch()`, which is targeted for the NN model.\nAlso, its ``create()`` method takes an argument ``reuse`` to inform whether to reuse the\nfingerprints generated from the descriptor if it is present.\n\n"
"In the above code, we build a NN model with an input layer, two hidden layer, and an\noutput layer. The ``descriptor`` carries the information of the input layer, so it is\nnot needed to be specified explicitly. For each hidden layer, we first do a linear\ntransformation using ``nn.Linear(size_in, size_out)`` (essentially carrying out $y\n= xW+b$, where $W$ is the weight matrix of size ``size_in`` by ``size_out``, and\n$b$ is a vector of size ``size_out``. Then we apply the hyperbolic tangent\nactivation function ``nn.Tanh()`` to the output of the Linear layer (i.e. $y$) so\nas to add the nonlinearity. We use a Linear layer for the output layer as well, but\nunlike the hidden layer, no activation function is applied here. The input size\n``size_in`` of the first hidden layer must be the size of the descriptor, which is\nobtained using ``descriptor.get_size()``. For all other layers (hidden or output), the\ninput size must be equal to the output size of the previous layer. The ``out_size`` of\nthe output layer must be 1 such that the output of the NN model gives the energy of the\natom.\n\nThe ``set_save_metadata`` function call informs where to save intermediate models during\nthe optimization (discussed below), and what the starting epoch and how often to save\nthe model.\n\n\nTraining set and calculator\n---------------------------\n\nThe training set and the calculator are the same as explained in `tut_kim_sw`. The\nonly difference is that we need to use the\n:mod:`~kliff.calculators.CalculatorTorch()`, which is targeted for the NN model.\nAlso, its ``create()`` method takes an argument ``reuse`` to inform whether to reuse the\nfingerprints generated from the descriptor if it is present.\n\n"
]
},
{
@@ -87,14 +87,14 @@
},
"outputs": [],
"source": [
"# training set\ndataset_name = 'Si_training_set/varying_alat'\ntset = Dataset()\ntset.read(dataset_name)\nconfigs = tset.get_configs()\nprint('Number of configurations:', len(configs))\n\n# calculator\ncalc = CalculatorTorch(model)\ncalc.create(configs, reuse=True)"
"# training set\ndataset_name = 'Si_training_set/varying_alat'\ntset = Dataset()\ntset.read(dataset_name)\nconfigs = tset.get_configs()\n\n# calculator\ncalc = CalculatorTorch(model)\ncalc.create(configs, reuse=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Loss function\n-------------\n\nKLIFF uses a loss function to quantify the difference between the training data and\npotential predictions and uses minimization algorithms to reduce the loss as much as\npossible. In the following code snippet, we create a loss function that uses the\n``Adam`` optimizer to minimize it. The Adam optimizer supports minimization using\n`mini-batches` of data, and here we use ``100`` configurations in each minimization step\n(the training set has a total of 400 configurations as can be seen above), and run\nthrough the training set for ``10`` epochs. The learning rate ``lr`` used here is\n``0.01``, and typically, one may need to play with this to find an acceptable one that\ndrives the loss down in a reasonable time.\n\n"
"Loss function\n-------------\n\nKLIFF uses a loss function to quantify the difference between the training data and\npotential predictions and uses minimization algorithms to reduce the loss as much as\npossible. In the following code snippet, we create a loss function that uses the\n``Adam`` optimizer to minimize it. The Adam optimizer supports minimization using\n`mini-batches` of data, and here we use ``100`` configurations in each minimization step\n(the training set has a total of 400 configurations as can be seen above), and run\nthrough the training set for ``10`` epochs. The learning rate ``lr`` used here is\n``0.001``, and typically, one may need to play with this to find an acceptable one that\ndrives the loss down in a reasonable time.\n\n"
]
},
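
A sketch of the Adam-based minimization the corrected text describes (assuming the ``Loss.minimize`` signature used with KLIFF's Torch calculator; argument names such as ``num_epochs``, ``batch_size``, and ``lr`` may differ between releases, and ``calc`` is the ``CalculatorTorch`` created earlier in the example):

    from kliff.loss import Loss

    # Mini-batches of 100 configurations, 10 passes over the training set, lr = 0.001.
    loss = Loss(calc)
    result = loss.minimize(method='Adam', num_epochs=10, batch_size=100, lr=0.001)
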
{
@@ -123,7 +123,7 @@
},
"outputs": [],
"source": [
"model.save('./saved_model.pkl')\nmodel.write_kim_model()"
"model.save('./final_model.pkl')\nloss.save_optimizer_stat('./optimizer_stat.pkl')\n\nmodel.write_kim_model()"
]
}
],