ENH: Add framework for property model parameter generation #251

Merged
60 commits
ea75218
Refactor Redlich-Kister model candidate generation to operate on a si…
bocklund Sep 11, 2023
b0d62ac
More model building refactoring to not automatically make feature set…
bocklund Sep 12, 2023
b49bb14
WIP: unary V0+VA modeling and binary+ternary V0 modeling in notebooks
bocklund Nov 15, 2023
02ed1f1
Refactoring
bocklund Nov 25, 2023
f5daf47
Add some commented todos for shift_reference_state
bocklund Nov 25, 2023
119a989
Cleanup/rewrite get_data_quantities for VA parameters
bocklund Nov 25, 2023
1003d0d
Ensure support for higher order interaction parameters
bocklund Nov 25, 2023
7556299
Support VM_MIX!
bocklund Nov 25, 2023
3a25c59
Test of VM(T) data, seems to work
bocklund Nov 25, 2023
2a6ef16
Cleanups for get_data_quantities for VA params
bocklund Nov 25, 2023
f3c98d2
Working elastic constant fitting
bocklund Nov 25, 2023
5264543
Add TODO comment
bocklund Nov 25, 2023
3912c2f
Cleanup elastic notebook
bocklund Nov 26, 2023
074abf9
ESPEI tests and notebook passing
bocklund Nov 26, 2023
5ae9e49
Add modified version of _get_sample_condition_dicts
bocklund Nov 27, 2023
2b62e3e
Move fitting steps and description to new notebook that works
bocklund Nov 27, 2023
2166440
WIP: refactor: delete get_data_quantities shift_reference_state funct…
bocklund Nov 27, 2023
c6b87d1
WIP: more refactoring fit_formation_energy
bocklund Nov 27, 2023
e09ea49
Every day i'm factoring (tests passing)
bocklund Nov 27, 2023
566af22
Refactor: shared binary and ternary interaction code
bocklund Nov 28, 2023
a3ec172
Unify endmember and interaction parameter insertion
bocklund Nov 28, 2023
a619092
move insertion to fit_parameters
bocklund Nov 28, 2023
21a3bd9
WIP: be able to insert parameters each step
bocklund Nov 28, 2023
debc5bc
Tweaks to accept non G parameter types
bocklund Nov 28, 2023
91c4f29
Fix to replace database symbols in case they slip in
bocklund Nov 28, 2023
e47d1a6
Update notebook with the new fit_parameters!
bocklund Nov 28, 2023
53dc18d
Support fully qualified ModelFittingDescription in schema
bocklund Nov 28, 2023
052b098
Sketch out some failing tests to get working
bocklund Nov 28, 2023
dfb116a
Delete old notebooks
bocklund Nov 28, 2023
b06d720
Add local directory to path to import qualified objects
bocklund Dec 1, 2023
8580d15
Remove elastic model and fitting description
bocklund Dec 1, 2023
a2e52fa
Add failing test_G_lattice_stabilities_do_not_prevent_fitting_other_p…
bocklund Dec 1, 2023
b030e1f
Fix for preventing generating duplicate parameters
bocklund Dec 1, 2023
94cd1c0
Code cleanup and move full enumeration of candidate models down to de…
bocklund Dec 1, 2023
d6d6699
implement another test
bocklund Dec 1, 2023
4ed24d6
more debug logging
bocklund Dec 1, 2023
9ba1611
Implement another test
bocklund Dec 1, 2023
27c6b6b
test tweaks
bocklund Dec 1, 2023
f876308
Delete redundant test
bocklund Dec 1, 2023
b4ac1b0
Implement test for normalizing per mole of formula units
bocklund Dec 2, 2023
3555f31
Implement binary/ternary V0/VA absolute/mix tests
bocklund Dec 2, 2023
62ad29e
Remove testing notebook
bocklund Dec 2, 2023
100d13b
Rename molar volume fitting description
bocklund Dec 2, 2023
f3f28d1
CALPHAD -> Calphad in docs
bocklund Dec 3, 2023
e8376a2
Tutorial and input docs writeup
bocklund Dec 3, 2023
cb6a491
TODO cleanup
bocklund Dec 3, 2023
433be56
get_data_quantities -> get_response_vector
bocklund Dec 3, 2023
9692b7d
Remove transform_data a a public api for FittingStep
bocklund Dec 3, 2023
4225686
rename AbstractRKMPropertyStep to AbstractLinearPropertyStep
bocklund Dec 3, 2023
827e5a2
refactor FittingStep.transform_feature for Gibbs params
bocklund Dec 3, 2023
11ea7c2
Move/rename _get_sample_condition_dicts, delete espei.parameter_selec…
bocklund Dec 3, 2023
2741986
fitting_descriptions cleanup
bocklund Dec 3, 2023
bcfdb8a
Add typing in fitting_steps
bocklund Dec 3, 2023
898854f
Cleanup some TODOs
bocklund Dec 3, 2023
ea83636
TODO and commenting cleanup
bocklund Dec 3, 2023
3d5a7f2
Remove Lu database
bocklund Dec 3, 2023
a9aab48
add normalization support to parameters
bocklund Dec 5, 2023
6f7d895
add mixing and absolute value test for V0
bocklund Dec 7, 2023
34e3cf4
Follow links in dataset recursive glob
bocklund Dec 5, 2023
aaad2c0
Fix link for Marker 2018
bocklund Jan 11, 2024
24 changes: 16 additions & 8 deletions docs/api/espei.parameter_selection.rst
@@ -4,6 +4,22 @@ espei.parameter\_selection package
Submodules
----------

+espei.parameter\_selection.fitting\_descriptions module
+-------------------------------------------------------
+
+.. automodule:: espei.parameter_selection.fitting_descriptions
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+espei.parameter\_selection.fitting\_steps module
+------------------------------------------------
+
+.. automodule:: espei.parameter_selection.fitting_steps
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
espei.parameter\_selection.model\_building module
-------------------------------------------------

@@ -28,14 +44,6 @@ espei.parameter\_selection.selection module
:undoc-members:
:show-inheritance:

-espei.parameter\_selection.utils module
----------------------------------------
-
-.. automodule:: espei.parameter_selection.utils
-    :members:
-    :undoc-members:
-    :show-inheritance:
-
Module contents
---------------

6 changes: 3 additions & 3 deletions docs/cu-mg-example.rst
@@ -38,7 +38,7 @@ ESPEI-datasets repository so that others may benefit from this data as you have.
You may then add your name to the CONTRIBUTORS file as described in the README.


-Phases and CALPHAD models
+Phases and Calphad models
=========================

The Cu-Mg system contains five stable phases: Liquid, disordered fcc and hcp,
@@ -200,7 +200,7 @@ MCMC optimization
With the data in the CU-MG input data, ESPEI generated 18 parameters to fit. For
systems with more components, solution phases, and input data, many more
parameters could be required to describe the thermodynamics of the specific
-system well. Because they describe Gibbs free energies, parameters in CALPHAD
+system well. Because they describe Gibbs free energies, parameters in Calphad
models are highly correlated in both single-phase descriptions and for
describing equilibria between phases. For large systems, global numerical
optimization of many parameters simultaneously is computationally intractable.
@@ -391,7 +391,7 @@ the diagonal and covariances between them under the diagonal. A more
circular covariance means that parameters are not correlated to each
other, while elongated shapes indicate that the two parameters are
correlated. Strongly correlated parameters are expected for some
-parameters in CALPHAD models within phases or for phases in equilibrium,
+parameters in Calphad models within phases or for phases in equilibrium,
because increasing one parameter while decreasing another would give a
similar error.

6 changes: 3 additions & 3 deletions docs/design.rst
@@ -12,8 +12,8 @@ The goal is to make it clear how different modules in ESPEI fit together and whe

ESPEI provides tools to

-1. Parameterize CALPHAD models by optimizing the compromise between model accuracy and complexity. We typically call this parameter generation or model selection.
-2. Fit parameterized CALPHAD models to thermochemical and phase boundary data or other custom data with uncertainty quantification via Markov chain Monte Carlo
+1. Parameterize Calphad models by optimizing the compromise between model accuracy and complexity. We typically call this parameter generation or model selection.
+2. Fit parameterized Calphad models to thermochemical and phase boundary data or other custom data with uncertainty quantification via Markov chain Monte Carlo

API
---
@@ -47,7 +47,7 @@ Parameter selection
-------------------

Parameter selection goes through the ``generate_parameters`` function in the ``espei.paramselect`` module.
-The goal of parameter selection is go through each phase (one at a time) and fit a CALPHAD model to the data.
+The goal of parameter selection is go through each phase (one at a time) and fit a Calphad model to the data.

For each phase, the endmembers are fit first, followed by binary and ternary interactions.
For each individual endmember or interaction to fit, a series of candidate models are generated that have increasing
16 changes: 9 additions & 7 deletions docs/index.rst
@@ -11,8 +11,8 @@

\part{Introduction}

-ESPEI, or Extensible Self-optimizing Phase Equilibria Infrastructure, is a tool for creating CALPHAD databases and evaluating the uncertainty of CALPHAD models.
-The purpose of ESPEI is to be both a user tool for fitting state-of-the-art CALPHAD-type models and to be a research platform for developing methods for fitting and uncertainty quantification.
+ESPEI, or Extensible Self-optimizing Phase Equilibria Infrastructure, is a tool for creating Calphad databases and evaluating the uncertainty of Calphad models.
+The purpose of ESPEI is to be both a user tool for fitting state-of-the-art Calphad-type models and to be a research platform for developing methods for fitting and uncertainty quantification.
ESPEI uses `pycalphad`_ for the thermodynamic backend and supports fitting adjustable parameters for any pycalphad model.

ESPEI is developed in the open on `GitHub <https://github.com/PhasesResearchLab/ESPEI>`_.
@@ -26,16 +26,16 @@ What does ESPEI do?
Parameter generation
~~~~~~~~~~~~~~~~~~~~

-ESPEI can be used to generate model parameters for CALPHAD models of the Gibbs energy that follow the temperature-dependent power series expansion of the Gibbs energy within the compound energy formalism (CEF) for endmembers and for binary and ternary Redlich-Kister interaction parameters with Muggianu extrapolation.
-This parameter generation step augments the CALPHAD modeler by providing tools for data-driven model selection, rather than relying on a modeler's intuition alone.
+ESPEI can be used to generate model parameters for Calphad models of the Gibbs energy that follow the temperature-dependent power series expansion of the Gibbs energy within the compound energy formalism (CEF) for endmembers and for binary and ternary Redlich-Kister interaction parameters with Muggianu extrapolation.
+This parameter generation step augments the Calphad modeler by providing tools for data-driven model selection, rather than relying on a modeler's intuition alone.
Model generation is based on a linear regression of enthalpy, entropy, and heat capacity data (see :ref:`non-equilibrium thermochemical data <non_equilibrium_thermochemical_data>`), using the corrected Akaike Information Criterion (AICc) to prevent overfitting.
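
As a rough illustration of how AICc trades accuracy against complexity, the sketch below scores least-squares candidates with the standard small-sample formula. The function name ``aicc``, the candidate list, and the residual values are made up for illustration; this is not ESPEI's implementation.

```python
import math


def aicc(rss: float, n: int, k: int) -> float:
    """AICc for a least-squares fit with n data points and k parameters.

    Uses the Gaussian-error form AIC = n*ln(RSS/n) + 2k (constant terms
    dropped) plus the small-sample correction 2k(k+1)/(n-k-1).
    """
    if n - k - 1 <= 0:
        return math.inf  # too few data points for this many parameters
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


# Hypothetical candidates: (number of parameters, residual sum of squares)
candidates = [(1, 4.0), (2, 1.0), (3, 0.95)]
n_points = 10

scores = {k: aicc(rss, n_points, k) for k, rss in candidates}
best_k = min(scores, key=scores.get)  # lowest AICc wins
print(best_k)
```

In this made-up case the third parameter barely lowers the residual, so its complexity penalty outweighs the gain and the two-parameter candidate is preferred.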

Optimization and uncertainty quantification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-ESPEI can optimize and quantify the uncertainty of CALPHAD model parameters to thermochemical and :ref:`phase boundary data <phase_boundary_data>`.
+ESPEI can optimize and quantify the uncertainty of Calphad model parameters to thermochemical and :ref:`phase boundary data <phase_boundary_data>`.
Optimization and uncertainty quantification are performed using a Bayesian ensemble Markov Chain Monte Carlo (MCMC) method.
-Any CALPHAD database can be used, including databases generated by ESPEI or starting from an existing CALPHAD database.
+Any Calphad database can be used, including databases generated by ESPEI or starting from an existing Calphad database.

ESPEI supports all models supported by pycalphad.
User-developed models that are compatible with pycalphad can be used without making any modifications to ESPEI's code.
@@ -57,7 +57,7 @@ The name ESPEI and early concept were developed by [Shang2010]_ under the superv
After developing `pycalphad`_, Richard Otis and Zi-Kui Liu reimagined the concept and wrote
`pycalphad-fitting`_ (used in [Otis2016]_ and [Otis2017]_), which formed the nucleus for the present version of ESPEI ([Bocklund2019]_).

-Details on the implementation of ESPEI can be found in the following publications:
+Details on the implementation of ESPEI can be found in the following publications:

- B\. Bocklund *et al.*, MRS Communications 9(2) (2019) 1–10. doi:`10.1557/mrc.2019.59 <https://doi.org/10.1557/mrc.2019.59>`_.
- B\. Bocklund, Ph.D. Dissertation (Chapter 3), The Pennsylvania State University (2021), https://etda.libraries.psu.edu/catalog/21192bjb54
@@ -122,6 +122,7 @@ Documentation


cu-mg-example
+tutorial_gen_custom_mod_params

.. raw:: latex

@@ -222,6 +223,7 @@ References
.. [Coughanowr1991] Coughanowr *et al.*, Assessment of the Cu-Mg system. Zeitschrift für Met. 82, 574–581 (1991).
.. [Dinsdale1991] Dinsdale, Calphad 15(4) (1991) 317-425, doi:`10.1016/0364-5916(91)90030-N <https://doi.org/10.1016/0364-5916(91)90030-N>`_
.. [Lukas2007] Lukas, Fries, and Sundman, Computational Thermodynamics: The Calphad Method. (Cambridge University Press, 2007). doi:`10.1017/CBO9780511804137 <https://doi.org/10.1017/CBO9780511804137>`_
+.. [Marker2018] Marker *et al.*, Computational Materials Science 142 (2018) 215-226. doi:`10.1016/j.commatsci.2017.10.016 <https://doi.org/10.1016/j.commatsci.2017.10.016>`_
.. [Otis2016] Otis, Ph.D. Dissertation, The Pennsylvania State University (2016). https://etda.libraries.psu.edu/catalog/s1784k73d
.. [Otis2017] Otis *et al.*, JOM 69 (2017) doi:`10.1007/s11837-017-2318-6 <http://doi.org/10.1007/s11837-017-2318-6>`_
.. [Roslyakova2016] Roslyakova *et al.*, Calphad 55 (2016) 165–180. doi:`10.1016/j.calphad.2016.09.001 <https://doi.org/10.1016/j.calphad.2016.09.001>`_
6 changes: 3 additions & 3 deletions docs/input_data.rst
@@ -35,7 +35,7 @@ To check the datasets at path ``my-input-data/`` you can run ``espei --check-dat
Phase Descriptions
==================

-The JSON file for describing CALPHAD phases is conceptually similar to a setup file in Thermo-Calc's PARROT module.
+The JSON file for describing Calphad phases is conceptually similar to a setup file in Thermo-Calc's PARROT module.
At the top of the file there is the ``refdata`` key that describes which reference state you would like to choose.
Currently the reference states are strings referring to dictionaries in ``pycalphad.refdata``; only ``"SGTE91"`` is implemented.

@@ -131,7 +131,7 @@ Two examples follow. The first dataset has some data for the formation heat capa
* The ``conditions`` describe temperatures (``T``) and pressures (``P``) as either scalars or one-dimensional lists.
* The type of quantity is expressed using the ``output`` key. This can in principle be any thermodynamic quantity, but currently only ``CPM*``, ``SM*``, and ``HM*`` (where ``*`` is either nothing, ``_MIX`` or ``_FORM``) are supported. Support for changing reference states is planned but not yet implemented, so all thermodynamic quantities must be formation quantities (e.g. ``HM_FORM`` or ``HM_MIX``, etc.). This is tracked by :issue:`85` on GitHub.
* ``values`` is a 3-dimensional array where each value is the ``output`` for a specific condition of pressure, temperature, and sublattice configurations from outside to inside. In other words, the shape of the array must be ``(len(P), len(T), len(subl_config))``. In the example below, the shape of the ``values`` array is (1, 12, 1) as there is one pressure scalar, one sublattice configuration, and 12 temperatures.
-* There is also a key, ``excluded_model_contributions``, which will make those contributions of pycalphad's ``Model`` not be fit to when doing parameter selection or MCMC. This is useful for cases where the type of data used does not include some specific ``Model`` contributions that parameters may already exist for. For example, DFT formation energies do not include ideal mixing or (CALPHAD-type) magnetic model contributions, but formation energies from experiments would include these contributions so experimental formation energies should not be excluded.
+* There is also a key, ``excluded_model_contributions``, which will make those contributions of pycalphad's ``Model`` not be fit to when doing parameter selection or MCMC. This is useful for cases where the type of data used does not include some specific ``Model`` contributions that parameters may already exist for. For example, DFT formation energies do not include ideal mixing or (Calphad-type) magnetic model contributions, but formation energies from experiments would include these contributions so experimental formation energies should not be excluded.
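
The shape rule for ``values`` described in the list above can be checked with a few lines of plain Python. This is only a sketch: the helper names ``nested_shape`` and ``as_list``, the flat ``subl_config`` key, and the numbers are placeholders chosen for illustration, not ESPEI's dataset schema or API.

```python
def nested_shape(values):
    """Return the shape of a rectangular nested list, e.g. [[[1.0]]] -> (1, 1, 1)."""
    shape = []
    level = values
    while isinstance(level, list):
        shape.append(len(level))
        level = level[0] if level else None
    return tuple(shape)


def as_list(condition):
    """Treat a scalar condition as a one-element list."""
    return condition if isinstance(condition, list) else [condition]


# Hypothetical dataset fragment: one pressure, 12 temperatures, one configuration.
temperatures = list(range(300, 1500, 100))
dataset = {
    "conditions": {"P": 101325, "T": temperatures},
    "subl_config": [["CU", "MG"]],  # placeholder key and contents, illustration only
    "values": [[[0.0] for _ in temperatures]],
}

expected = (
    len(as_list(dataset["conditions"]["P"])),
    len(as_list(dataset["conditions"]["T"])),
    len(dataset["subl_config"]),
)
actual = nested_shape(dataset["values"])
print(actual, actual == expected)
```

The outermost list indexes pressure, the middle one temperature, and the innermost one the sublattice configuration, matching the "outside to inside" ordering.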

.. code-block:: JSON

@@ -359,7 +359,7 @@ Tags are a flexible method to adjust many ESPEI datasets simultaneously and driv
Each dataset can have a ``"tags"`` key, with a corresponding value of a list of tags, e.g. ``["dft"]``.
Any tag modifications present in the input YAML file are applied to the datasets before ESPEI is run.

-They can be used in many creative ways, but some suggested ways include to add weights or to exclude model contributions, e.g. for DFT data that should not have contributions for a CALPHAD magnetic model or ideal mixing energy.
+They can be used in many creative ways, but some suggested ways include to add weights or to exclude model contributions, e.g. for DFT data that should not have contributions for a Calphad magnetic model or ideal mixing energy.
An example of using the tags in an input file looks like:

.. code-block:: JSON
4 changes: 2 additions & 2 deletions docs/quickstart.rst
@@ -4,7 +4,7 @@ Quickstart
ESPEI has two different fitting modes: parameter generation and Bayesian parameter estimation, which uses Markov Chain Monte Carlo (MCMC).
You can run either of these modes or both of them sequentially.

-To run either of the modes, you need to have a phase models file that describes the phases in the system using the standard CALPHAD approach within the compound energy formalism.
+To run either of the modes, you need to have a phase models file that describes the phases in the system using the standard Calphad approach within the compound energy formalism.
You also need to describe the data that ESPEI should fit to.
You will need single-phase and multi-phase data for a full run.
Fit settings and all datasets are stored as JSON files and described in detail at the :ref:`Input data` page.
@@ -151,7 +151,7 @@ You can install git using ``conda install git`` on Windows.
Q: I have a large database, can I use ESPEI to optimize parameters in only a subsystem?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-A: Yes, if you have a multicomponent CALPHAD database, but want to optimize or
+A: Yes, if you have a multicomponent Calphad database, but want to optimize or
determine the uncertainty for a constituent unary, binary or ternary subsystem
that you have data for, you can do that without any extra effort.

2 changes: 1 addition & 1 deletion docs/recipes.rst
@@ -263,7 +263,7 @@ the diagonal and covariances between them under the diagonal. A more
circular covariance means that parameters are not correlated to each
other, while elongated shapes indicate that the two parameters are
correlated. Strongly correlated parameters are expected for some
-parameters in CALPHAD models within phases or for phases in equilibrium,
+parameters in Calphad models within phases or for phases in equilibrium,
because increasing one parameter while decreasing another would give a
similar likelihood.

6 changes: 3 additions & 3 deletions docs/specifying_priors.rst
@@ -28,18 +28,18 @@ There is also a special (improper) ``zero`` prior that always gives :math:`\ln p
Each ``scipy.stats`` prior is typically specified using several keyword argument
parameters, e.g. ``loc`` and ``scale``, which have special meaning for the
different distribution functions.
-In order to be flexible to specifying these arguments when the CALPHAD
+In order to be flexible to specifying these arguments when the Calphad
parameters they will be used for are not known beforehand, ESPEI uses a small
language to specify how the distribution hyperparameters can be set relative to
-the CALPHAD parameters.
+the Calphad parameters.

Basically, the ``PriorSpec`` objects are created with the name of the distribution
and the hyperparameters that are modified with
one of the modifier types: ``absolute``, ``relative``, ``shift_absolute``, or ``shift_relative``.
For example, the ``loc`` parameter might become ``loc_relative`` and ``scale`` might
become ``scale_shift_relative``.

-Here are some examples of how the modifier parameters of value ``v`` modify the hyperparameters when given a CALPHAD parameter of value ``p``:
+Here are some examples of how the modifier parameters of value ``v`` modify the hyperparameters when given a Calphad parameter of value ``p``:

* ``_absolute=v`` always takes the exact value passed in, ``v``; ``loc_absolute=-20`` gives a value of ``loc=-20``.
* ``_relative=v`` gives ``v*p``; ``scale_relative=0.1`` with ``p=10000`` gives a value of ``scale=10000*0.1=1000``.
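
The two modifiers visible in this hunk can be sketched in a few lines. The function ``apply_modifier`` is a hypothetical helper written for illustration, not part of ESPEI's ``PriorSpec`` API, and it deliberately covers only ``absolute`` and ``relative`` because the definitions of the ``shift_`` variants are not shown here.

```python
def apply_modifier(modifier: str, v: float, p: float) -> float:
    """Turn a modifier value v into a hyperparameter, given a parameter value p.

    Hypothetical helper covering only the two modifiers described above.
    """
    if modifier == "absolute":
        return v  # use v exactly, ignoring p
    if modifier == "relative":
        return v * p  # scale v by the parameter value
    raise ValueError(f"modifier not covered by this sketch: {modifier!r}")


loc = apply_modifier("absolute", -20, p=10000)    # loc_absolute=-20
scale = apply_modifier("relative", 0.1, p=10000)  # scale_relative=0.1
print(loc, scale)
```

Expressing hyperparameters relative to ``p`` is what lets one prior specification be reused across parameters whose magnitudes are not known in advance.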