Validation #52
Replies: 2 comments 1 reply
-
I guess we should think through the invariants we might want to check. Agree that a full grid check probably doesn't fit in the attrs paradigm: attrs validation is probably best for predicates on the variable alone, like string format checking or bounds checking for scalars. Shape checking for arrays is awkward; usually the array needs to fit some dimension(s) of the grid, but it may just need to match another variable somewhere in the simulation. I think we can use attrs converters instead of validators for this.
Yeah, maybe by default validation could be disabled for init/access and deferred to write-time. Maybe eager/lazy/no validation could be controlled by a global flopy setting or a per-simulation setting.
-
Trying to think through first principles. With a typed object model we impose some structure on the domain. Then we have constraints on a component class that its definition can't capture, i.e. assert X for all packages Y. At an even finer level, we have parameter validation: predicates on individual variables. Then there is the "when": validations can take place on init, and/or on modification, and/or at the user's request. A constraint can be placed on an object's type or on its value. Python being dynamically typed, we can do type checking at runtime.
The beartype people make the point that dynamic typing is (at least sometimes) a feature, not a bug, in Python, so favor runtime checking over strict static checking. That said, most code should still pass static checks. beartype has a validation framework; if we are using it to check types, why not use it to check values too, for consistency's sake? However, some field-scoped constraints need the component (or even the simulation) context, e.g. to assert an array's shape with reference to a dimension defined in the same or another component. attrs converters seem natural to use here, since we may also want to expand a scalar value provided as convenient shorthand to the requested shape. If we can define a "field validation" as a pure function of the value to check, and shift any context-awareness to a conversion step, we could use beartype for all field-scoped validations, extending the concept of "type" to include properties derivable from the runtime value. Component-level validations would probably still need separate handling; it seems like they won't fit either the attrs or beartype paradigm.
-
MODFLOW validates models before running them, but we foresee aiding the user earlier, by validating input as the user provides it. There are multiple moments where validation can be useful.
Since we were thinking of using attrs as our data class package, we can leverage the validation functionality that is built into it. But it is likely that not all checks fit within this structure: attrs states that initializers should remain simple, so I'm not sure we should validate a whole data grid with attrs the moment we read it, especially since we want to do lazy loading as much as possible with packages like Dask or XArray. I do think that validation of simpler parameters can already help a lot, especially input types and perhaps some extra range checks.
It would also be possible to disable the checks while creating and setting attributes, and only run input validation manually via attr.validate(). We could give the user a simple handle to let them validate their model, e.g. model.validate(). There are some built-in validators that help set up simple checks, also for lists and dicts.