xarray-schema

Schema validation for Xarray

installation

Install xarray-schema from PyPI:

pip install xarray-schema

Or install it with conda from the conda-forge channel:

conda install -c conda-forge xarray-schema

Or install it from source:

pip install git+https://github.com/xarray-contrib/xarray-schema

usage

Xarray-schema's API is modeled after Pandera. The DataArraySchema and DatasetSchema objects both have .validate() methods.

The basic usage is as follows:

import numpy as np
import xarray as xr
from xarray_schema import DataArraySchema, DatasetSchema, CoordsSchema

da = xr.DataArray(np.ones(4, dtype='i4'), dims=['x'], name='foo')

schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, ), dims=['x'])

schema.validate(da)

You can also use it to validate a Dataset like so:

schema_ds = DatasetSchema({'foo': schema})

schema_ds.validate(da.to_dataset())

Each component of the Xarray data model is implemented as a standalone class:

from xarray_schema.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema
)

# example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None))  # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None))  # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunksSchema({'x': None, 'y': -1})  # None is used as a wildcard, -1 is used as
array_type_schema = ArrayTypeSchema(np.ndarray)  # lowercase name avoids shadowing the class

# Example usage
dtype_schema.validate(da.dtype)

# Each schema object can be exported to JSON
dtype_json = dtype_schema.to_json()
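
To make the wildcard idea in the dims and shape examples above concrete, here is a minimal pure-Python sketch of how a `None` entry could be matched against an actual value. The function name `matches_with_wildcards` is hypothetical and is not part of xarray-schema; the sketch also assumes that expected and actual sequences must have the same length, which may differ from what the library does internally.

```python
from typing import Optional, Sequence


def matches_with_wildcards(
    expected: Sequence[Optional[object]], actual: Sequence[object]
) -> bool:
    """Return True if `actual` matches `expected`, treating None as a wildcard.

    Hypothetical helper for illustration only; not the xarray-schema implementation.
    """
    # Assumption: a length mismatch is always a failure.
    if len(expected) != len(actual):
        return False
    # A None entry accepts anything; every other entry must compare equal.
    return all(e is None or e == a for e, a in zip(expected, actual))


# Dims: the third dimension may have any name.
print(matches_with_wildcards(("x", "y", None), ("x", "y", "time")))
# Shape: the third axis may have any length, but the second must be 10.
print(matches_with_wildcards((5, 10, None), (5, 11, 3)))
```

The first call matches because only the wildcard position differs; the second does not, because the mismatch is in a fixed position.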

roadmap

This is a very early prototype of a library. Some key things are missing:

  1. Exceptions: Pandera accumulates schema exceptions and reports them all at once. Currently, we eagerly raise a SchemaError as soon as one is found.
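
The difference between eager and accumulated error reporting can be sketched in plain Python. Everything below is hypothetical and defined locally for illustration: the `SchemaError` and `SchemaErrors` classes, the check functions, and the dict standing in for an array's metadata are assumptions, not xarray-schema or Pandera APIs.

```python
from typing import Callable, Dict, List


class SchemaError(Exception):
    """Local stand-in for a single validation failure (hypothetical)."""


class SchemaErrors(Exception):
    """Hypothetical container that carries every failure found in one pass."""

    def __init__(self, errors: List[SchemaError]):
        super().__init__("; ".join(str(e) for e in errors))
        self.errors = errors


Check = Callable[[Dict[str, object]], None]


def check_name(meta: Dict[str, object]) -> None:
    if meta.get("name") != "foo":
        raise SchemaError(f"name mismatch: got {meta.get('name')!r}")


def check_dims(meta: Dict[str, object]) -> None:
    if meta.get("dims") != ("x",):
        raise SchemaError(f"dims mismatch: got {meta.get('dims')!r}")


def validate_eager(meta: Dict[str, object], checks: List[Check]) -> None:
    """Stop at the first failing check (the behavior described as current)."""
    for check in checks:
        check(meta)


def validate_accumulating(meta: Dict[str, object], checks: List[Check]) -> None:
    """Run every check, then raise once with all failures collected."""
    errors: List[SchemaError] = []
    for check in checks:
        try:
            check(meta)
        except SchemaError as err:
            errors.append(err)
    if errors:
        raise SchemaErrors(errors)


bad = {"name": "bar", "dims": ("y",)}
try:
    validate_accumulating(bad, [check_name, check_dims])
except SchemaErrors as err:
    print(len(err.errors), "problems:", err)
```

With the eager variant, only the name mismatch would surface for `bad`; the accumulating variant reports both the name and the dims mismatch in one exception.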

license

All the code in this repository is MIT licensed.

history

This project was originally developed at CarbonPlan. It was transferred to the xarray-contrib organization in August 2022.