Improve unphysical (greater than 1) occupancy handling in CifParser and add missing site label if not check_occu #3819
@@ -735,10 +735,10 @@ def test_bad_cif(self):
    filepath = f"{TEST_FILES_DIR}/cif/bad_occu.cif"
    parser = CifParser(filepath)
    with pytest.raises(
        ValueError, match="No structure parsed for section 1 in CIF.\nSpecies occupancies sum to more than 1!"
The previous error message may be misleading: this fails because the occupancy is greater than the tolerance, not because it is greater than 1.
@@ -1196,8 +1203,7 @@ def parse_structures(
        "in the CIF file as is. If you want the primitive cell, please set primitive=True explicitly.",
        UserWarning,
    )
    if not check_occu:  # added in https://github.com/materialsproject/pymatgen/pull/2836
        warnings.warn("Structures with unphysical site occupancies are not compatible with many pymatgen features.")
This warning should not be raised merely because check_occu is False; it should check the occupancy and warn only if the occupancy is actually "unphysical".
The following code already does this, so I would suggest removing this warning:
Lines 1011 to 1018 in 2e1c301
if any(occu > 1 for occu in sum_occu):
    msg = (
        f"Some occupancies ({sum_occu}) sum to > 1! If they are within "
        "the occupancy_tolerance, they will be rescaled. "
        f"The current occupancy_tolerance is set to: {self._occupancy_tolerance}"
    )
    warnings.warn(msg)
    self.warnings.append(msg)
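A minimal sketch of the behavior suggested here, where the warning depends on the occupancies themselves and not on the check_occu flag. The helper name warn_if_unphysical and its signature are hypothetical, chosen for illustration only; they are not part of pymatgen.

```python
import warnings


def warn_if_unphysical(sum_occu, occupancy_tolerance=1.0):
    """Warn only when at least one site's total occupancy exceeds 1.

    Hypothetical helper for illustration; not part of pymatgen.
    Returns the warning message, or None if every site is physical.
    """
    if not any(occu > 1 for occu in sum_occu):
        return None  # nothing unphysical, so stay silent
    msg = (
        f"Some occupancies ({sum_occu}) sum to > 1! "
        f"The current occupancy_tolerance is set to: {occupancy_tolerance}"
    )
    warnings.warn(msg)
    return msg
```

With this shape, a CIF whose sites all sum to at most 1 produces no warning, whatever value check_occu has.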
@janosh, can you please review (and comment on) this? Thanks.
@@ -1149,7 +1149,10 @@ def get_matching_coord(
    all_species_noedit = all_species.copy()  # save copy before scaling in case of check_occu=False, used below
    for idx, species in enumerate(all_species):
        total_occu = sum(species.values())
        if 1 < total_occu <= self._occupancy_tolerance:
        if check_occu and total_occu > self._occupancy_tolerance:
            raise ValueError(f"Occupancy {total_occu} exceeded tolerance.")
This would help capture the occupancy > tolerance error.
Without this, an occupancy greater than the tolerance would not be scaled and would be passed directly into Structure. The exception raised by Structure (because of the unphysical occupancy) would then be suppressed and replaced with a general message that doesn't show the reason for the failure:
Lines 1291 to 1301 in bb68c78
for idx, data in enumerate(self._cif.data.values()):
    try:
        if struct := self._get_structure(data, primitive, symmetrized, check_occu=check_occu):
            structures.append(struct)
    except (KeyError, ValueError) as exc:
        msg = f"No structure parsed for section {idx + 1} in CIF.\n{exc}"
        if on_error == "raise":
            raise ValueError(msg) from exc
        if on_error == "warn":
            warnings.warn(msg)
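The wrapping pattern in the snippet above can be reproduced in isolation to see where the original reason ends up. This is a sketch only: parse_sections, sections and parse_one are hypothetical stand-ins, not pymatgen API. The final "Invalid CIF file with no structures!" check is modeled on the traceback reported in #3816.

```python
import warnings


def parse_sections(sections, parse_one, on_error="warn"):
    """Collect parsed results, wrapping per-section failures.

    Hypothetical stand-in for the loop in parse_structures; not pymatgen API.
    """
    results = []
    for idx, data in enumerate(sections):
        try:
            if result := parse_one(data):
                results.append(result)
        except (KeyError, ValueError) as exc:
            # The inner message is embedded in the wrapper, and `from exc`
            # keeps the original exception reachable via __cause__.
            msg = f"No structure parsed for section {idx + 1} in CIF.\n{exc}"
            if on_error == "raise":
                raise ValueError(msg) from exc
            if on_error == "warn":
                warnings.warn(msg)
    if not results:
        # With on_error="warn" or "ignore", this generic error is all that is raised.
        raise ValueError("Invalid CIF file with no structures!")
    return results
```

With on_error="raise" the inner message is part of the raised error; with "warn" it only appears in a warning, and the exception the caller sees is the generic one. That is why a descriptive inner message matters.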
For example, consider the following error message reported in #3816:
Some occupancies ([2.0, 2.0, 2.0, 1.0, 1.0]) sum to > 1! If they are within the occupancy_tolerance, they will be rescaled. The current occupancy_tolerance is set to: 1.0
No structure parsed for section 1 in CIF.
...
in CifParser.parse_structures(self, primitive, symmetrized, check_occu, on_error)
1219 warnings.warn("Issues encountered while parsing CIF: " + "\n".join(self.warnings))
1221 if len(structures) == 0:
-> 1222 raise ValueError("Invalid CIF file with no structures!")
1223 return structures
ValueError: Invalid CIF file with no structures!
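The check proposed in the diff can be sketched as a standalone function. This is an illustration under assumptions: check_and_rescale is a hypothetical name, species is a plain dict mapping element symbols to occupancies, and rescaling by dividing by the total is assumed here rather than copied from pymatgen.

```python
def check_and_rescale(species, occupancy_tolerance=1.0, check_occu=True):
    """Rescale occupancies within tolerance; raise when beyond it.

    Hypothetical helper for illustration; not part of pymatgen.
    """
    total_occu = sum(species.values())
    if check_occu and total_occu > occupancy_tolerance:
        # Fail early with the real reason, before Structure sees the site.
        raise ValueError(f"Occupancy {total_occu} exceeded tolerance.")
    if 1 < total_occu <= occupancy_tolerance:
        # Assumed rescaling: normalize so the site sums to exactly 1.
        return {sp: occu / total_occu for sp, occu in species.items()}
    # Either physical already, or check_occu=False: pass through unchanged.
    return dict(species)
```

For the occupancies in the report above (2.0 with a tolerance of 1.0), this raises "Occupancy 2.0 exceeded tolerance." when check_occu is True and returns the site unchanged when it is False.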
Summary
To fix #3816.
- Raise ValueError if occupancy exceeds occupancy_tolerance and if check_occu, with a more descriptive error message.
- [...] occupancy_tolerance) in cif when not check_occu in 7cf5ba9; a warning would still be raised.
- Add missing site label if not check_occu by 05c11d4.
A more descriptive error message for unphysical sites
With the current implementation (with check_occu = False), a ValueError for occupancy beyond the tolerance is raised by the checker inside Structure (reported in #3816), with a misleading error message (Some occupancies ([2.0, 2.0, 2.0, 1.0, 1.0]) sum to > 1). The condition that matters is not > 1 but > occupancy_tolerance.
Clarify check_occu
pymatgen/pymatgen/io/cif.py
Lines 1263 to 1266 in bb68c78
The name of check_occu does not reflect its functionality. With the current implementation, occupancy is checked regardless of the value of check_occu.
pymatgen/pymatgen/io/cif.py
Lines 1067 to 1081 in bb68c78
We would need to clarify its behavior in the docstring.