-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
136-feature: Allow import/export from JSON #333
136-feature: Allow import/export from JSON #333
Conversation
This commit adds support for pydantic in CPTData by checking if pydantic is installed and using it if available. If pydantic is not installed, the dataclass module is used instead.
src/pygef/cpt.py
Outdated
from datetime import date | ||
from enum import Enum | ||
from typing import Any, List | ||
|
||
import polars as pl | ||
|
||
# check if pydantic is installed | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tversteeg I tried to keep pydantic as an optional dependency because I wasn't sure if you were willing to accept pydantic as an additional dependency.
Pydantic offers a near drop-in replacement for the builtins dataclass
decorator, so thought this might be an elegant solution. Upon running the tests there are so validation errors where some fields were None
while this was not allowed according to the type annotations,
Still need a some work of course to convert the polars
DataFrame to json, but wanted to get a quick check on how you feel about using pydantic in this manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this solution, it looks elegent. Although I wonder if it needs to be optional, is there a convincing argument against using pydantic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for me, I use Pydantic all over the place so would recommend it :)
But I think it will introduce a small overhead due to data validation step introduced by Pydantic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that overhead also prevents some nasty bugs and user errors, so for me that's probably worth it. @RDWimmers @tlukkezen, thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, lets add pydantic as an additional dependency.
- added config class and custum json_encoder for polars DataFrames - added `computed_field` decorator to `property` attributes
Waited for the release of pydantic v2, which is now implemented in Rust and is significantly faster than v1.x eg, below throws a validation error since date is not optional. class CPTData(BaseModel):
....
research_report_date: date
...
# shim.py:
def gef_cpt_to_cpt_data(gef_cpt: _GefCpt) -> CPTData:
....
kwargs["research_report_date"] = None
....
return CPTData(**kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to implement v2 of pydantic
@@ -91,25 +90,35 @@ def __post_init__(self): | |||
# bypass FrozenInstanceError | |||
object.__setattr__(self, "data", df) | |||
|
|||
class Config: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BaseConfig is deprecated. Use the pydantic.ConfigDict
instead.
@@ -167,25 +166,35 @@ def __post_init__(self): | |||
# bypass FrozenInstanceError | |||
object.__setattr__(self, "data", df) | |||
|
|||
class Config: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BaseConfig is deprecated. Use the pydantic.ConfigDict
instead.
We can set the validation errors to None, looking at the minimal available attributes of the GEF and XML definitions there are not mandatory. |
Adding support for exporting and import data from json.
As discussed in #136, try to achieve this using Pydantic