JSON is a popular message interchange format employed in API design for its simplicity, readability, flexibility and wide support. However, json.dump and json.load offer no direct support when working with Python data classes employing type annotations. This package offers services for working with strongly-typed Python classes: serializing objects to JSON, deserializing JSON to objects, and producing a JSON schema that matches the data class, e.g. to be used in an OpenAPI specification.
This package offers the following services:
- Generate a JSON object from a Python object (object_to_json)
- Parse a JSON object into a Python object (json_to_object)
- Generate a JSON schema from a Python type (classdef_to_schema and type_to_schema)
- Validate a JSON object against a Python type (validate_object)
In the context of this package, a JSON object is the (intermediate) Python object representation produced by json.loads from a JSON string. In contrast, a JSON string is the string representation generated by json.dumps from the (intermediate) Python object representation.
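The distinction between the two terms can be illustrated with the standard library alone. The payload below is made up for illustration and is not taken from the package.

```python
import json

# A JSON string: plain text as it travels over the wire.
json_string = '{"name": "example", "tags": ["a", "b"], "count": 3}'

# A JSON object, in this package's terminology: the intermediate Python
# representation built from dict, list, str, int, float, bool and None.
json_object = json.loads(json_string)

# Converting back produces a JSON string again.
round_trip = json.dumps(json_object)
```

The functions of this package operate on the intermediate representation (json_object above), so converting to and from text remains the job of json.loads and json.dumps.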
- Writing a cloud function (lambda) that communicates with JSON messages received as HTTP payload or websocket text messages
- Verifying if an API endpoint receives well-formed input
- Generating a type schema for an OpenAPI specification to impose constraints on what messages an API can receive
- Parsing JSON configuration files into a Python object
Consider the following class definition:
from dataclasses import dataclass

@dataclass
class SimpleObjectExample:
    bool_value: bool = True
    int_value: int = 23
    float_value: float = 4.5
    str_value: str = "string"
First, we serialize the object to JSON with
source = SimpleObjectExample()
json_repr = object_to_json(source)
Here, the variable json_repr has the value:
{'bool_value': True, 'int_value': 23, 'float_value': 4.5, 'str_value': 'string'}
Next, we restore the object from JSON with
target = json_to_object(SimpleObjectExample, json_repr)
Here, target holds the restored data class object:
SimpleObjectExample(bool_value=True, int_value=23, float_value=4.5, str_value='string')
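For a flat data class with primitive fields, the round trip resembles what can be done with the standard library. The following is a rough, stdlib-only approximation and not the package's implementation; the helper names flat_to_json and flat_from_json are hypothetical.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class SimpleObjectExample:
    bool_value: bool = True
    int_value: int = 23
    float_value: float = 4.5
    str_value: str = "string"


def flat_to_json(obj):
    # Hypothetical helper: only handles data classes with primitive fields.
    return asdict(obj)


def flat_from_json(cls, data):
    # Hypothetical helper: check each value against the annotated field type
    # before constructing. Note that isinstance() accepts a bool for an int
    # field, because bool is a subclass of int in Python.
    for field in fields(cls):
        if not isinstance(data[field.name], field.type):
            raise TypeError(f"{field.name}: expected {field.type.__name__}")
    return cls(**data)


source = SimpleObjectExample()
json_repr = flat_to_json(source)
target = flat_from_json(SimpleObjectExample, json_repr)
```

Nested data classes, containers, enumerations and date types need recursive handling, which is where a dedicated serializer becomes useful.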
We can also produce the JSON schema corresponding to the Python class:
json_schema = json.dumps(classdef_to_schema(SimpleObjectExample), indent=4)
which yields
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "bool_value": {
            "type": "boolean",
            "default": true
        },
        "int_value": {
            "type": "integer",
            "default": 23
        },
        "float_value": {
            "type": "number",
            "default": 4.5
        },
        "str_value": {
            "type": "string",
            "default": "string"
        }
    },
    "additionalProperties": false,
    "required": [
        "bool_value",
        "int_value",
        "float_value",
        "str_value"
    ]
}
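To make the structure of this schema easier to follow, here is a minimal stdlib-only sketch that derives the same dictionary for a flat data class with primitive fields. It is an illustration under that assumption, not the package's implementation; sketch_classdef_to_schema is a hypothetical name.

```python
import json
from dataclasses import MISSING, dataclass, fields


@dataclass
class SimpleObjectExample:
    bool_value: bool = True
    int_value: int = 23
    float_value: float = 4.5
    str_value: str = "string"


# Primitive Python types and their JSON schema type names.
_SIMPLE = {bool: "boolean", int: "integer", float: "number", str: "string"}


def sketch_classdef_to_schema(cls):
    # Hypothetical helper: supports primitive field types only.
    properties = {}
    for field in fields(cls):
        prop = {"type": _SIMPLE[field.type]}
        if field.default is not MISSING:
            prop["default"] = field.default
        properties[field.name] = prop
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": properties,
        "additionalProperties": False,
        # Mirrors the output above, which lists every field as required.
        "required": [field.name for field in fields(cls)],
    }


schema = sketch_classdef_to_schema(SimpleObjectExample)
json_schema = json.dumps(schema, indent=4)
```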
The following table shows the conversion types the package employs:
Python type | JSON schema type | Behavior |
---|---|---|
None | null | |
bool | boolean | |
int | integer | |
float | number | |
str | string | |
bytes | string | represented with Base64 content encoding |
datetime | string | constrained to match ISO 8601 format 2018-11-13T20:20:39+00:00 |
date | string | constrained to match ISO 8601 format 2018-11-13 |
time | string | constrained to match ISO 8601 format 20:20:39+00:00 |
Enum | value type | stores the enumeration value type (typically integer or string) |
List[T] | array | recursive in T |
Dict[K, V] | object | recursive in V, keys are coerced into string |
Dict[Enum, V] | object | recursive in V, keys are of enumeration value type |
Set[T] | array | recursive in T, container has uniqueness constraint |
Tuple[T1, T2, ...] | array | array has fixed length, each element has specific type |
data class | object | iterates over fields of data class |
named tuple | object | iterates over fields of named tuple |
Any | object | iterates over dir(obj) |
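Several rows of the table correspond to conversions that are available in the standard library. The snippet below shows those conversions in isolation; it does not call the package, and the Suit enumeration is a hypothetical example.

```python
import base64
import datetime
import enum


class Suit(enum.Enum):
    # Hypothetical enumeration with string values.
    HEARTS = "H"
    SPADES = "S"


# bytes -> string with Base64 content encoding
encoded = base64.b64encode(b"hello").decode("ascii")

# datetime, date, time -> ISO 8601 strings
moment = datetime.datetime(2018, 11, 13, 20, 20, 39, tzinfo=datetime.timezone.utc)
stamp = moment.isoformat()          # 2018-11-13T20:20:39+00:00
day = moment.date().isoformat()     # 2018-11-13
clock = moment.timetz().isoformat() # 20:20:39+00:00

# Enum -> the underlying enumeration value
suit = Suit.HEARTS.value

# set -> array (sorted here only to make the order deterministic)
items = sorted({3, 1, 2})
```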
Simple types:
Python type | JSON schema |
---|---|
bool | {"type": "boolean"} |
int | {"type": "integer"} |
float | {"type": "number"} |
str | {"type": "string"} |
bytes | {"type": "string", "contentEncoding": "base64"} |
Enumeration types:
class Side(enum.Enum):
    LEFT = "L"
    RIGHT = "R"
{"enum": ["L", "R"], "type": "string"}
Container types:
Python type | JSON schema |
---|---|
List[int] | {"type": "array", "items": {"type": "integer"}} |
Dict[str, int] | {"type": "object", "additionalProperties": {"type": "integer"}} |
Set[int] | {"type": "array", "items": {"type": "integer"}, "uniqueItems": true} |
Tuple[int, str] | {"type": "array", "minItems": 2, "maxItems": 2, "prefixItems": [{"type": "integer"}, {"type": "string"}]} |
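The mappings in these tables can be reproduced with a small recursive function over type annotations. This is a sketch for the types listed above only, written against the standard typing module; sketch_type_to_schema is a hypothetical name and the package's own type_to_schema may work differently.

```python
import enum
import typing
from typing import Dict, List, Set, Tuple

_SIMPLE = {bool: "boolean", int: "integer", float: "number", str: "string"}


class Side(enum.Enum):
    LEFT = "L"
    RIGHT = "R"


def sketch_type_to_schema(tp):
    # Hypothetical helper covering only the types shown in the tables.
    if tp in _SIMPLE:
        return {"type": _SIMPLE[tp]}
    if tp is bytes:
        return {"type": "string", "contentEncoding": "base64"}
    if isinstance(tp, type) and issubclass(tp, enum.Enum):
        values = [member.value for member in tp]
        return {"enum": values, "type": _SIMPLE[type(values[0])]}
    origin = typing.get_origin(tp)
    args = typing.get_args(tp)
    if origin is list:
        return {"type": "array", "items": sketch_type_to_schema(args[0])}
    if origin is set:
        return {
            "type": "array",
            "items": sketch_type_to_schema(args[0]),
            "uniqueItems": True,
        }
    if origin is dict:
        return {
            "type": "object",
            "additionalProperties": sketch_type_to_schema(args[1]),
        }
    if origin is tuple:
        return {
            "type": "array",
            "minItems": len(args),
            "maxItems": len(args),
            "prefixItems": [sketch_type_to_schema(arg) for arg in args],
        }
    raise TypeError(f"unsupported type: {tp!r}")
```

For instance, sketch_type_to_schema(Dict[str, List[int]]) nests the array schema inside additionalProperties, which is what "recursive in V" means in the conversion table.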
If a composite object (e.g. a dataclass or a plain Python class) has a to_json member function, then this function is invoked to produce a JSON object representation from an instance.
If a composite object has a from_json class function (a.k.a. @classmethod), then this function is invoked, passing the JSON object as an argument, to produce an instance of the corresponding type.
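A class that opts into this hook mechanism might look like the following. The Temperature class and the two dispatcher functions are hypothetical and only illustrate the lookup described above; they are not the package's code.

```python
from dataclasses import dataclass


@dataclass
class Temperature:
    """Hypothetical class serialized as a compact string such as "21.5C"."""

    degrees: float

    def to_json(self):
        # Produce the JSON object representation of this instance.
        return f"{self.degrees}C"

    @classmethod
    def from_json(cls, data):
        # Build an instance from the JSON object representation.
        return cls(float(data.rstrip("C")))


def sketch_serialize(obj):
    # Hypothetical dispatcher: use the object's own hook if it has one.
    hook = getattr(obj, "to_json", None)
    if callable(hook):
        return hook()
    raise TypeError(f"no to_json hook on {type(obj).__name__}")


def sketch_deserialize(cls, data):
    # Hypothetical dispatcher: use the class-level hook if it has one.
    hook = getattr(cls, "from_json", None)
    if callable(hook):
        return hook(data)
    raise TypeError(f"no from_json hook on {cls.__name__}")


serialized = sketch_serialize(Temperature(21.5))
restored = sketch_deserialize(Temperature, serialized)
```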
It is possible to declare custom types when generating a JSON schema. For example, the following class definition has the annotation @json_schema_type, which will register a JSON schema subtype definition under the path #/definitions/AzureBlob, which will be referenced later with $ref:
_regexp_azure_url = re.compile(
    r"^https?://([^.]+)\.blob\.core\.windows\.net/([^/]+)/(.*)$")

@dataclass
@json_schema_type(
    schema={
        "type": "object",
        "properties": {
            "mimeType": {"type": "string"},
            "blob": {
                "type": "string",
                "pattern": _regexp_azure_url.pattern,
            },
        },
        "required": ["mimeType", "blob"],
        "additionalProperties": False,
    }
)
class AzureBlob(Blob):
    ...
You can use @json_schema_type without the schema parameter to register the type name but have the schema definition automatically derived from the Python type. This is useful if the type is reused across the type hierarchy:
@json_schema_type
class Image:
    ...

class Study:
    left: Image
    right: Image
Here, the two properties of Study (left and right) will refer to the same subtype #/definitions/Image.
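The idea of a shared definition can be sketched with a small registry. Everything below is a stand-in: sketch_schema_type is a hypothetical decorator, the width and height fields of Image are invented for the example, and the exact layout the package emits may differ.

```python
import typing

_SIMPLE = {bool: "boolean", int: "integer", float: "number", str: "string"}
_registry = {}


def sketch_schema_type(cls):
    # Hypothetical stand-in for a decorator that registers a type by name.
    _registry[cls.__name__] = cls
    return cls


@sketch_schema_type
class Image:
    width: int   # hypothetical field
    height: int  # hypothetical field


class Study:
    left: Image
    right: Image


def _property_schema(tp):
    if tp in _SIMPLE:
        return {"type": _SIMPLE[tp]}
    if _registry.get(tp.__name__) is tp:
        # Registered types are referenced rather than expanded in place.
        return {"$ref": f"#/definitions/{tp.__name__}"}
    raise TypeError(f"unsupported type: {tp!r}")


def sketch_class_schema(cls):
    hints = typing.get_type_hints(cls)
    return {
        "type": "object",
        "properties": {name: _property_schema(tp) for name, tp in hints.items()},
    }


schema = sketch_class_schema(Study)
schema["definitions"] = {
    name: sketch_class_schema(cls) for name, cls in _registry.items()
}
```

Because both properties point at the same $ref, the definition of Image appears once in the schema regardless of how many times the type is used.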
If a Python class has a property whose name carries a trailing underscore (_), as PEP 8 recommends to avoid a conflict with a Python keyword (e.g. for or in), the JSON property name omits the underscore, both when reading from and when writing to JSON.
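A name mapping of this kind can be sketched with the standard keyword module. The two helper names are hypothetical, and the sketch assumes the underscore is dropped only when the remainder is a Python keyword; the text above does not say whether the package applies that check.

```python
import keyword


def python_to_json_name(name):
    # Assumption: strip one trailing underscore only if what remains
    # is a reserved Python keyword such as "for" or "in".
    if name.endswith("_") and keyword.iskeyword(name[:-1]):
        return name[:-1]
    return name


def json_to_python_name(name):
    # Reverse direction: a JSON property named after a keyword needs
    # the trailing underscore to be a valid Python attribute name.
    if keyword.iskeyword(name):
        return name + "_"
    return name


json_names = [python_to_json_name(n) for n in ("for_", "in_", "value")]
python_names = [json_to_python_name(n) for n in ("for", "in", "value")]
```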