QC model method contributions #160

Draft
wants to merge 41 commits into
base: develop
d436e95
add main and auxiliary basis set quantities to AtomCenteredBasisSet
EBB2675 Sep 24, 2024
022c1fa
formatting
EBB2675 Sep 24, 2024
bf1f6f4
Merge branch 'develop' into coupled_cluster
EBB2675 Sep 25, 2024
af88624
add a draft for AtomCenteredFunction
EBB2675 Sep 25, 2024
0a9d98b
add a quantity for number of primitives
EBB2675 Sep 25, 2024
a204ec4
add atoms_state reference to AtomCenteredBasisSet
EBB2675 Sep 26, 2024
a325753
Merge remote-tracking branch 'origin/develop' into 130-atom-centered-…
EBB2675 Oct 1, 2024
8866ab7
assign JSON format to basis set
EBB2675 Oct 2, 2024
c7b078a
add type and auxiliary_type quantities to AtomCenteredBasisSet
EBB2675 Oct 8, 2024
1208e88
reformatted basis_set.py
EBB2675 Oct 8, 2024
fbc13f9
add NAO and point charges to basis set types
EBB2675 Oct 8, 2024
ddab789
add cECPs and pointcharges to AtomCenteredBasisSet
EBB2675 Oct 9, 2024
8c6ae1a
fix point charge Quantity type
EBB2675 Oct 11, 2024
c94e995
a bit of a cleanup
EBB2675 Oct 14, 2024
a007e1e
merge develop
EBB2675 Oct 18, 2024
62de243
move GTOIntegralDecomposition to NumericalSettings
EBB2675 Oct 24, 2024
c12adac
Merge branch 'develop' into 130-atom-centered-basis-set
EBB2675 Nov 19, 2024
52d6b9a
merge develop
EBB2675 Nov 19, 2024
646430c
add Mesh, NumericalIntegration and MolecularHamiltonianSubTerms
EBB2675 Nov 19, 2024
cca109f
minor adjustments to Mesh and NumericalIntegration
EBB2675 Nov 19, 2024
42f4f38
add integration_thresh and weight_cutoff to NumericalIntegration
EBB2675 Nov 20, 2024
0f92eb9
check whether n_primitive matches the lengths of exponents and contra…
EBB2675 Nov 20, 2024
e8fb5ef
add tests for AtomCenteredBasisSet and AtomCenteredFunction
EBB2675 Nov 20, 2024
4826f0c
add tests for Mesh and NumericalIntegration
EBB2675 Nov 20, 2024
c5141ab
modify Mesh
EBB2675 Nov 20, 2024
23d7615
MEnum for MolecularHamiltonianContributions
EBB2675 Nov 21, 2024
9ef13ea
remove contributions
EBB2675 Nov 21, 2024
80c8b64
add a normalizer function for the AtomCenteredFunction to handle comb…
EBB2675 Nov 28, 2024
4ac1076
fix test_basis_set.py
EBB2675 Dec 4, 2024
10e782c
add OrbitalLocalization to numerical_settings.py
EBB2675 Dec 4, 2024
5c15e97
add method to LocalCorrelation
EBB2675 Dec 4, 2024
2fd6398
add total_charge and total_spin to ModelSystem
EBB2675 Dec 5, 2024
c9d6118
add a simple HF class
EBB2675 Dec 10, 2024
816c7ba
a placeholder for MO and LCAO
EBB2675 Dec 11, 2024
09b0dfc
add MEnum for GTOIntegralDecomposition
EBB2675 Jan 2, 2025
271f69c
modify reference determinants in HF base class
EBB2675 Jan 2, 2025
d90eac1
fix ROKS issue, minor modifications in model_method
EBB2675 Jan 13, 2025
95e319c
enhance LocalCorrelation and CoupledCluster
EBB2675 Jan 14, 2025
6f3e30b
add a placeholder MolecularOrbitalsState
EBB2675 Jan 15, 2025
92fd95e
improved class descriptions, new MolecularOrbitals class, improved Ha…
EBB2675 Jan 28, 2025
b0eb159
minor modifications
EBB2675 Jan 28, 2025
102 changes: 102 additions & 0 deletions src/nomad_simulations/schema_packages/atoms_state.py
@@ -641,3 +641,105 @@ def normalize(self, archive: 'EntryArchive', logger: 'BoundLogger') -> None:
self.chemical_symbol = self.resolve_chemical_symbol(logger=logger)
if self.atomic_number is None:
self.atomic_number = self.resolve_atomic_number(logger=logger)


class MolecularOrbitals(Entity):
"""
This class stores all molecular orbitals (MO) in a single container, with each Quantity using
arrays indexed by mo_num and ao_num.

Comparison to TREXIO:
- mo/type -> mo_type
- mo/num -> mo_num
- mo/coefficient -> coefficient
- mo/coefficient_im -> coefficient_im
- mo/symmetry -> symmetry
- mo/occupation -> occupation
- mo/energy -> energy
- mo/spin -> spin

"""

mo_num = Quantity(
type=np.int32,
description="""
Number of molecular orbitals.
""",
)

ao_num = Quantity(
type=np.int32,
description="""
Number of atomic orbitals or basis functions (often needed for coefficient shape).
Corresponds to the 'ao.num' dimension in TREXIO.
""",
)

mo_type = Quantity(
type=str,
shape=['mo_num'],
description="""
Type of each molecular orbital, e.g. 'canonical' or 'localized'.
In CASSCF calculations, the orbital subspaces can differ in nature, e.g.:
Internal orbitals : canonical
Active orbitals : natural
Virtual orbitals : canonical
""",
)

coefficient = Quantity(
type=np.float64,
shape=['mo_num', 'ao_num'],
description="""
Real part of the MO coefficients. The shape is
[mo_num, ao_num]: each row corresponds to one MO and each column
to one atomic orbital (or basis function).
""",
)

coefficient_im = Quantity(
type=np.float64,
shape=['mo_num', 'ao_num'],
description="""
Imaginary part of the MO coefficients. The shape is
[mo_num, ao_num]. This array may be omitted or set to zero if the orbitals
are purely real.
""",
)

symmetry = Quantity(
type=str,
shape=['mo_num'],
description="""
Symmetry label for each MO, e.g. group-theory labels or
simpler 'sigma', 'pi', 'delta'.
""",
)

occupation = Quantity(
type=np.float64,
shape=['mo_num'],
description="""
Occupation numbers for each MO. Typically 0 or 2 for closed-shell systems;
fractional values in [0, 2] can occur in open-shell systems or multi-reference calculations.
""",
)

energy = Quantity(
type=np.float64,
shape=['mo_num'],
description="""
Orbital energies for each MO.
""",
)

spin = Quantity(
type=np.int32,
shape=['mo_num'],
description="""
Spin channel for each MO if this is an unrestricted open-shell set.
Typically 0 for alpha, 1 for beta.
""",
)
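
A minimal NumPy sketch of the array layout described by `MolecularOrbitals`, assuming plain arrays rather than NOMAD `Quantity` objects. The sizes and numbers are made up for illustration; only the `[mo_num, ao_num]` shape convention and the separate real and imaginary parts come from the class above.

```python
import numpy as np

mo_num, ao_num = 2, 3  # illustrative sizes, not taken from a real calculation

# Real and imaginary parts are stored separately, one row per MO.
coefficient = np.array([[0.6, 0.6, 0.0],
                        [0.7, -0.7, 0.1]])
coefficient_im = np.zeros((mo_num, ao_num))  # purely real orbitals

# Recombine into one complex matrix when a consumer needs it.
c = coefficient + 1j * coefficient_im

# Per-MO quantities (occupation, energy, spin, ...) all have length mo_num.
occupation = np.array([2.0, 0.0])
n_electrons = occupation.sum()
```

Keeping the two parts as separate real arrays mirrors the TREXIO mapping listed in the docstring, so a reader can treat `coefficient_im` as optional.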
249 changes: 237 additions & 12 deletions src/nomad_simulations/schema_packages/basis_set.py
@@ -13,7 +13,7 @@
from nomad import utils
from nomad.datamodel.data import ArchiveSection
from nomad.datamodel.metainfo.annotations import ELNAnnotation
from nomad.metainfo import MEnum, Quantity, SubSection
from nomad.metainfo import JSON, MEnum, Quantity, SubSection
from nomad.units import ureg

from nomad_simulations.schema_packages.atoms_state import AtomsState
@@ -187,28 +187,253 @@ def normalize(self, archive: 'EntryArchive', logger: 'BoundLogger') -> None:

class AtomCenteredFunction(ArchiveSection):
"""
Specifies a single function (term) in an atom-centered basis set.
Specifies a single contracted basis function in an atom-centered basis set.

In many quantum-chemistry codes, an atom-centered basis set is composed of
several "shells," each shell containing one or more basis functions of a certain
angular momentum. For instance, a shell of p-type orbitals (L=1) typically
consists of 3 degenerate functions (p_x, p_y, p_z) if `harmonic_type='cartesian'`
or 3 spherical harmonics if `harmonic_type='spherical'`.

A single "atom-centered function" can be a linear combination of multiple
primitive Gaussians (or Slater-type orbitals, STOs).
In practice, these contract together to form the final basis function used by
the SCF or post-SCF method. Often, each contraction is labeled by its
angular momentum (e.g., s, p, d, f) and a set of exponents and coefficients.

**References**:
- T. Helgaker, P. Jørgensen, J. Olsen, *Molecular Electronic-Structure Theory*, Wiley (2000).
- F. Jensen, *Introduction to Computational Chemistry*, 2nd ed., Wiley (2007).
- J. B. Foresman, Æ. Frisch, *Exploring Chemistry with Electronic Structure Methods*, Gaussian Inc.
"""

pass
harmonic_type = Quantity(
type=MEnum(
'spherical',
'cartesian',
),
default='spherical',
description="""
Specifies whether the basis functions are expanded in **spherical** (pure)
harmonics or **cartesian** harmonics. Many modern quantum-chemistry codes
default to *spherical harmonics* for d, f, g..., which eliminates the
redundant functions found in the cartesian sets.

- `'spherical'` : (2l+1) functions for a shell of angular momentum l
- `'cartesian'` : (l+1)(l+2)/2 functions for that shell (extra functions appear)
""",
)
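
The two counting formulas in the `harmonic_type` description can be checked with a short sketch. The helper name `n_functions` is hypothetical and not part of the schema; it only evaluates `2l+1` and `(l+1)(l+2)/2`.

```python
def n_functions(l: int, harmonic_type: str = "spherical") -> int:
    """Number of functions in one shell of angular momentum l."""
    if l < 0:
        raise ValueError("l must be non-negative")
    if harmonic_type == "spherical":
        return 2 * l + 1
    if harmonic_type == "cartesian":
        return (l + 1) * (l + 2) // 2
    raise ValueError(f"unknown harmonic_type: {harmonic_type!r}")


def n_redundant(l: int) -> int:
    """Extra cartesian functions relative to the spherical set: l(l-1)/2."""
    return n_functions(l, "cartesian") - n_functions(l, "spherical")


for l, label in enumerate("spdfg"):
    print(label, n_functions(l), n_functions(l, "cartesian"), n_redundant(l))
```

The difference is zero for s and p shells, which is why the choice only matters from d shells upward.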

function_type = Quantity(
type=MEnum(
's',
'p',
'd',
'f',
'g',
'h',
'i',
'j',
'k',
'l',
'sp',
'spd',
'spdf',
),
description="""
Symbolic label for the **angular momentum** of this contracted function.
Typical single-letter labels:
- 's' => L=0
- 'p' => L=1
- 'd' => L=2
- 'f' => L=3
- 'g' => L=4
- 'h', 'i', etc. => still higher angular momenta
Combined labels like 'sp' or 'spdf' indicate a **combined shell** in which
multiple angular momenta share exponents. For example, in some older Pople
basis sets, an 'sp' shell has an s- and p-type function sharing the same
exponents but different contraction coefficients.
""",
)

n_primitive = Quantity(
type=np.int32,
description="""
Number of **primitive** functions in this contracted basis function.
For example, in a contracted Gaussian-type orbital (GTO) approach, each basis
function might be built from a sum of `n_primitive` Gaussians with different
exponents, each scaled by a contraction coefficient.
""",
)

exponents = Quantity(
type=np.float32,
shape=['n_primitive'],
description="""
The **exponents** of each primitive basis function.
In a Gaussian basis set (GTO), these are the alpha_i in
exp(-alpha_i * r^2). In a Slater-type basis (STO), they are the zeta_i in
exp(-zeta_i * r). Typically sorted from largest to smallest.
""",
)

contraction_coefficients = Quantity(
type=np.float32,
shape=['*'], # Flexible shape to handle combined types (e.g. SP, SPD..)
description="""
The **contraction coefficients** associated with each primitive exponent.
In the simplest case (pure s- or p-function), this array has length
equal to `n_primitive`. For combined shells, the length is
`len(function_type) * n_primitive` (e.g. `2 * n_primitive` for 'sp'),
with separate coefficients for each angular-momentum part.
""",
)
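
To make the relation between `exponents` and `contraction_coefficients` concrete, here is a hedged sketch that evaluates the radial part of a contracted Gaussian, sum_i c_i * exp(-alpha_i * r^2). The function name `contracted_gto_radial` is hypothetical, and the sketch deliberately omits the angular factor and the primitive normalization constants. The sample input is the commonly tabulated STO-3G hydrogen 1s contraction, used here only as example data.

```python
import numpy as np


def contracted_gto_radial(r, exponents, coefficients):
    """Evaluate sum_i c_i * exp(-alpha_i * r**2) at each radius in r.

    No angular part and no normalization: this is only the bare contraction.
    """
    r = np.atleast_1d(np.asarray(r, dtype=float))
    exponents = np.asarray(exponents, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    if exponents.shape != coefficients.shape:
        raise ValueError("exponents and coefficients must have the same length")
    # Rows index radii, columns index primitives.
    primitives = np.exp(-np.outer(r**2, exponents))
    return primitives @ coefficients


# Sample data: STO-3G hydrogen 1s (three primitives, largest exponent first).
alphas = [3.42525091, 0.62391373, 0.16885540]
coeffs = [0.15432897, 0.53532814, 0.44463454]

values = contracted_gto_radial([0.0, 0.5, 1.0], alphas, coeffs)
```

At `r = 0` every primitive equals 1, so the result reduces to the plain sum of the coefficients, which gives a quick sanity check on the array pairing.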

point_charge = Quantity(
type=np.float32,
description="""
If using a basis function that explicitly includes a point-charge or an
ECP-like pseudo-component, this field can store that net charge.
Typically 0 for standard GTO or STO expansions.
Some extended basis concepts (or embedded charges) might set it differently.
""",
)

def normalize(self, archive: 'EntryArchive', logger: 'BoundLogger') -> None:
"""
Validates the input data
and resolves combined types like SP, SPD, SPDF, etc.

Raises ValueError: If the data is inconsistent (e.g., mismatch in exponents and coefficients).
"""
super().normalize(archive, logger)

# TODO: design system for writing basis functions like gaussian or slater orbitals
# Validate number of primitives
if self.n_primitive is not None:
if self.exponents is not None and len(self.exponents) != self.n_primitive:
raise ValueError(
f'Mismatch in number of exponents: expected {self.n_primitive}, '
f'found {len(self.exponents)}.'
)

# For combined shells (like 'sp', 'spd', etc.), check that the coefficient array has exactly num_types * n_primitive entries
if self.function_type and len(self.function_type) > 1:
num_types = len(self.function_type) # For SP: 2, SPD: 3, etc.
if self.contraction_coefficients is not None:
expected_coeffs = num_types * self.n_primitive
if len(self.contraction_coefficients) != expected_coeffs:
raise ValueError(
f'Mismatch in contraction coefficients for {self.function_type} type: '
f'expected {expected_coeffs}, found {len(self.contraction_coefficients)}.'
)

# Split coefficients into separate lists for each type
self.coefficient_sets = {
t: self.contraction_coefficients[i::num_types]
for i, t in enumerate(self.function_type)
}

# Debug: Log split coefficients
for t, coeffs in self.coefficient_sets.items():
logger.info(f'{t}-type coefficients: {coeffs}')
else:
logger.warning(
f'No contraction coefficients provided for {self.function_type} type.'
)

# For single types, ensure coefficients match primitives
elif self.contraction_coefficients is not None:
if len(self.contraction_coefficients) != self.n_primitive:
raise ValueError(
f'Mismatch in contraction coefficients: expected {self.n_primitive}, '
f'found {len(self.contraction_coefficients)}.'
)
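
The combined-shell branch of `normalize` can be exercised outside NOMAD with a small standalone sketch. The function name `split_combined_coefficients` is hypothetical; the length check and the strided split `[i::num_types]` follow the code above, and the input numbers are invented.

```python
import numpy as np


def split_combined_coefficients(function_type, n_primitive, coefficients):
    """Split a flat coefficient array into one array per angular momentum.

    The strided slice assumes an interleaved layout, e.g. for 'sp':
    [s_1, p_1, s_2, p_2, ...].
    """
    coefficients = np.asarray(coefficients, dtype=float)
    num_types = len(function_type)
    expected = num_types * n_primitive
    if coefficients.size != expected:
        raise ValueError(
            f"Mismatch in contraction coefficients for {function_type} type: "
            f"expected {expected}, found {coefficients.size}."
        )
    return {t: coefficients[i::num_types] for i, t in enumerate(function_type)}


sets = split_combined_coefficients("sp", 3, [0.1, 0.9, 0.2, 0.8, 0.3, 0.7])
```

With this slicing, `sets["s"]` holds elements 0, 2, 4 and `sets["p"]` holds elements 1, 3, 5. A parser that writes a blocked layout (all s coefficients followed by all p coefficients) would need a different split, so the expected ordering is worth stating in the `contraction_coefficients` description.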


class AtomCenteredBasisSet(BasisSetComponent):
"""
Defines an atom-centered basis set.
Defines an **atom-centered basis set** for quantum chemistry calculations.
Unlike plane-wave methods, these expansions are typically built around each atom's
position, using either:
- Slater-type orbitals (STO)
- Gaussian-type orbitals (GTO)
- Numerical atomic orbitals (NAO)
- Effective-core potentials or point-charges (PC, cECP, etc.)

This section references multiple `AtomCenteredFunction` objects, each describing
a single contracted function or shell. Additionally, one can label the overall
basis set name (e.g., "cc-pVTZ", "def2-SVP", "6-31G**") and specify the high-level
role of the basis set in the calculation.

**Common examples**:
- **Pople basis** (3-21G, 6-31G(d), 6-311++G(2df,2pd), etc.)
- **Dunning correlation-consistent** (cc-pVDZ, cc-pVTZ, aug-cc-pVTZ, etc.)
- **Slater basis** used in ADF, for instance
- **ECP** (Effective Core Potential) expansions like LANL2DZ or SDD for transition metals

**References**:
- F. Jensen, *Introduction to Computational Chemistry*, 2nd ed., Wiley (2007).
- A. Szabo, N. S. Ostlund, *Modern Quantum Chemistry*, McGraw-Hill (1989).
- T. H. Dunning Jr., J. Chem. Phys. 90, 1007 (1989) for correlation-consistent basis sets.
"""

basis_set = Quantity(
type=str,
description="""
**Name** or label of the basis set as recognized by the code or standard
library. Examples: "6-31G*", "cc-pVTZ", "def2-SVP", "STO-3G", "LANL2DZ" (ECP).
""",
)

type = Quantity(
type=MEnum(
'STO', # Slater-type orbitals
'GTO', # Gaussian-type orbitals
'NAO', # Numerical atomic orbitals
'cECP', # Capped effective core potentials
'PC', # Point charges
),
description="""
The **functional form** of the basis set:
- 'STO': Slater-type orbitals
- 'GTO': Gaussian-type orbitals
- 'NAO': Numerical atomic orbitals
- 'cECP': Some variant of a "capped" or shape-consistent ECP
- 'PC': Point charges (or ghost basis centers)

If a code uses a mixture (e.g., GTO + ECP), the components can be stored
as separate `AtomCenteredBasisSet` sections, or as a single section if the code treats them as one basis internally.
""",
)

role = Quantity(
type=MEnum(
'orbital',
'auxiliary_scf',
'auxiliary_post_hf',
'cabs',
),
description="""
The role of this basis set in the calculation:
- 'orbital': main orbital basis for the SCF
- 'auxiliary_scf': used for RI-J or density fitting in SCF
- 'auxiliary_post_hf': used in MP2, CC, etc.
- 'cabs': complementary auxiliary basis for explicitly correlated (F12) methods.
""",
)

total_number_of_basis_functions = Quantity(
type=np.int32,
description="""
The **total** number of contracted basis functions in this entire set.
This is the sum of the per-shell function counts, `(2l+1)` for spherical or
`(l+1)(l+2)/2` for cartesian harmonics, over all shells on all relevant atoms
(within the scope of this section).
""",
)
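
As a sketch of how `total_number_of_basis_functions` could be derived from shell labels, the snippet below sums per-shell sizes and treats a combined label such as 'sp' as one shell per letter. The names `ANGULAR_MOMENTUM`, `shell_size` and `total_basis_functions` are hypothetical, and the shell list is an assumed 6-31G(d)-style layout for a single carbon atom, not data from this PR.

```python
ANGULAR_MOMENTUM = {"s": 0, "p": 1, "d": 2, "f": 3, "g": 4, "h": 5}


def shell_size(function_type: str, harmonic_type: str = "spherical") -> int:
    """Number of basis functions contributed by one (possibly combined) shell."""
    if harmonic_type not in ("spherical", "cartesian"):
        raise ValueError(f"unknown harmonic_type: {harmonic_type!r}")
    size = 0
    for letter in function_type:  # 'sp' contributes an s part and a p part
        l = ANGULAR_MOMENTUM[letter]
        if harmonic_type == "spherical":
            size += 2 * l + 1
        else:
            size += (l + 1) * (l + 2) // 2
    return size


def total_basis_functions(shells, harmonic_type: str = "spherical") -> int:
    """Sum the shell sizes over every shell in the list."""
    return sum(shell_size(s, harmonic_type) for s in shells)


# Assumed layout for one carbon atom: core s, two valence sp shells, one d shell.
shells = ["s", "sp", "sp", "d"]
n_spherical = total_basis_functions(shells, "spherical")
n_cartesian = total_basis_functions(shells, "cartesian")
```

For this shell list the spherical count is 14 and the cartesian count is 15; the single extra function comes from the d shell.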

functional_composition = SubSection(
sub_section=AtomCenteredFunction.m_def, repeats=True
) # TODO change name

def normalize(self, archive: 'EntryArchive', logger: 'BoundLogger') -> None:
super().normalize(archive, logger)
# self.name = self.m_def.name
# TODO: set name based on basis functions
# ? use basis set names from Basis Set Exchange
)


class APWBaseOrbital(ArchiveSection):