Tools for building Python scripts and applications leveraging the OCSF.
If you just want to use this library as a CLI tool, install it with pip or poetry and try the following commands:
python -m ocsf.compile path/to/ocsf-schema
python -m ocsf.compare my-schema-export.json path/to/ocsf-schema
python -m ocsf.schema 1.2.0
python -m ocsf.validate.compatibility path/to/ocsf-schema 1.2.0
This project began with two goals:
- Provide the OCSF community with a validator that tests for breaking changes in ocsf-schema PRs.
- Begin to provide the OCSF community with more composable tools and libraries, as well as approachable reference implementations of OCSF-related functions, in order to make OCSF more "hackable."
The scope of this project may grow to include things like a reference implementation of an OCSF schema compiler.
The project targets Python 3.11 for a balance of capability and availability.
The root-level package, ocsf, is a namespace package so that other repositories and artifacts can also use the ocsf namespace.
This library is divided into several discrete packages.
The ocsf.util package provides the get_schema function. This function leverages the functionality in the ocsf.schema and ocsf.api packages (below) to easily build an OCSF schema from a file on disk, a working copy of an OCSF repository, or from the API.
schema = get_schema("1.1.0")
schema = get_schema("./1.3.0-dev.json")
schema = get_schema("path/to/ocsf-schema")
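To illustrate how one function can accept all three kinds of argument, here is a minimal sketch of the dispatch decision. The classify_source helper and its rules are assumptions made for illustration; they are not part of ocsf-lib, and the real get_schema may decide differently.

```python
import re

# Hypothetical helper, not part of ocsf-lib: guesses which kind of source a
# get_schema-style argument refers to.
# Matches versions such as "1.1.0" or "1.3.0-dev".
_VERSION = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.]+)?$")


def classify_source(source: str) -> str:
    """Return "version", "file", or "repository" for a schema source string."""
    if _VERSION.match(source):
        # Looks like a version number, so it would be fetched from the API.
        return "version"
    if source.endswith(".json"):
        # Looks like an exported schema file on disk.
        return "file"
    # Anything else is treated as a path to a repository working copy.
    return "repository"
```

With these rules, the three calls above would be classified as a version, a file, and a repository respectively.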
The ocsf.schema package contains Python data classes that represent an OCSF schema as returned by the OCSF server's API endpoints. See the ocsf.schema.model module for the data model definitions.
It also includes utilities to parse the schema from a JSON string or file.
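As a rough picture of what "data classes plus a JSON parser" can look like, here is a self-contained sketch. The class names (SketchSchema, SketchEvent, SketchAttr), the parse_schema function, and the handful of fields shown are simplified stand-ins chosen for illustration; the real definitions live in ocsf.schema.model and are considerably richer.

```python
import json
from dataclasses import dataclass, field


# Simplified, hypothetical stand-ins for the real data model.
@dataclass
class SketchAttr:
    type: str
    requirement: str = "optional"


@dataclass
class SketchEvent:
    name: str
    attributes: dict[str, SketchAttr] = field(default_factory=dict)


@dataclass
class SketchSchema:
    version: str
    classes: dict[str, SketchEvent] = field(default_factory=dict)


def parse_schema(text: str) -> SketchSchema:
    """Build a SketchSchema from a JSON string (assumed layout, see above)."""
    raw = json.loads(text)
    classes: dict[str, SketchEvent] = {}
    for key, event in raw.get("classes", {}).items():
        attrs = {
            attr_name: SketchAttr(
                type=attr["type"],
                requirement=attr.get("requirement", "optional"),
            )
            for attr_name, attr in event.get("attributes", {}).items()
        }
        classes[key] = SketchEvent(name=event.get("name", key), attributes=attrs)
    return SketchSchema(version=raw["version"], classes=classes)
```

Typed classes like these are what let a type checker catch a misspelled field before the code ever runs.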
The ocsf.repository package contains a typed Python representation of a working copy of an OCSF schema repository. Said another way, it represents the OCSF metaschema and repository contents in Python.
It also includes the read_repo function to read a repository from disk.
The ocsf.compile package "compiles" the OCSF schema from a repository just as the OCSF server does (with very few exceptions). It is meant to provide:
- An easy to use CLI tool to compile a repository into a single JSON schema file.
- A reference implementation for others looking to better understand OCSF compilation or to create their own compiler.
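For readers new to the idea, one small part of compilation can be sketched as resolving inheritance: a definition that extends another picks up its parent's attributes. The toy below assumes definitions are plain dictionaries with optional "extends" and "attributes" keys. It is not how ocsf.compile is implemented, it does not detect inheritance cycles, and it ignores everything else a real compiler handles (profiles, extensions, includes, and so on).

```python
def resolve_extends(definitions: dict[str, dict]) -> dict[str, dict]:
    """Flatten single inheritance: children inherit and may override attributes.

    Toy illustration only; assumes "extends" names another key in definitions.
    """
    resolved: dict[str, dict] = {}

    def resolve(name: str) -> dict:
        if name in resolved:
            return resolved[name]
        definition = definitions[name]
        attributes: dict = {}
        parent = definition.get("extends")
        if parent is not None:
            # Start from the fully resolved parent attributes.
            attributes.update(resolve(parent)["attributes"])
        # The child's own attributes win over inherited ones.
        attributes.update(definition.get("attributes", {}))
        resolved[name] = {"name": name, "attributes": attributes}
        return resolved[name]

    for name in definitions:
        resolve(name)
    return resolved
```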
The ocsf.api package exports an OcsfApiClient, which is a lightweight HTTP client that can retrieve a version of the schema over HTTP and cache it on the local filesystem. It uses the export/schema, api/versions, api/profiles, and api/extensions endpoints of the OCSF server.
The ocsf.compare package compares two versions of the OCSF schema and generates a type-safe difference. Its aim is to make schema comparisons easy to work with.
This package grew out of a library used internally at Query. The original is used extensively to manage upgrading Query's data model to newer versions of OCSF, as well as to build adapters between different OCSF flavors (like AWS Security Lake on rc2 and Query on 1.1).
There is a very simple __main__ implementation to demonstrate the comparison. You can use it as follows:
$ poetry run python -m ocsf.compare 1.0.0 1.2.0
The comparison API is straightforward. Want to look for removed events?
diff = compare(get_schema("1.0.0"), get_schema("1.1.0"))
for name, event in diff.classes.items():
    if isinstance(event, Removal):
        print(f"Oh no, we've lost {name}!")
Or changed data types?
diff = compare(get_schema("1.0.0"), get_schema("1.1.0"))
for name, event in diff.classes.items():
    if isinstance(event, ChangedEvent):
        for attr_name, attr in event.attributes.items():
            if isinstance(attr, ChangedAttr):
                if isinstance(attr.type, Change):
                    print(f"Who changed this data type? {name}.{attr_name}")
Or new objects?
diff = compare(get_schema("1.0.0"), get_schema("1.1.0"))
for name, obj in diff.objects.items():
    if isinstance(obj, Addition):
        print(f"A new object {name} has been discovered!")
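The isinstance checks above work because every entry in the difference is one of a small set of types. The self-contained sketch below shows that general pattern on plain dictionaries. The Added, Removed, Changed, and Unchanged classes and the diff_mapping function are stand-ins written for this example; they are deliberately named differently from the library's own types and do not reproduce its API.

```python
from dataclasses import dataclass
from typing import Any, Union


# Stand-in difference types for illustration only.
@dataclass
class Added:
    after: Any


@dataclass
class Removed:
    before: Any


@dataclass
class Changed:
    before: Any
    after: Any


@dataclass
class Unchanged:
    pass


Difference = Union[Added, Removed, Changed, Unchanged]


def diff_mapping(old: dict[str, Any], new: dict[str, Any]) -> dict[str, Difference]:
    """Classify every key found in either mapping."""
    result: dict[str, Difference] = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            result[key] = Added(new[key])
        elif key not in new:
            result[key] = Removed(old[key])
        elif old[key] != new[key]:
            result[key] = Changed(old[key], new[key])
        else:
            result[key] = Unchanged()
    return result
```

Because each outcome is its own type, a type checker can narrow the value inside each isinstance branch.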
The ocsf.validate.framework package provides a lightweight framework for validators. It was inspired by the needs of ocsf-validator, which may be ported to this framework in the future.
The ocsf.validate.compatibility package provides a backwards compatibility validator for OCSF schemata. It compares two OCSF schemata and reports any breaking changes between the old and new versions.
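To make "breaking change" concrete, here is a minimal sketch that compares two attribute-name-to-type mappings. It assumes, purely for illustration, that removing an attribute or changing its type is breaking while adding one is not. The find_breaking_changes function is hypothetical, and the rules the real validator applies may be different and more extensive.

```python
def find_breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Report removed attributes and changed types (assumed breaking rules)."""
    findings: list[str] = []
    for name, old_type in sorted(old.items()):
        if name not in new:
            findings.append(f"removed attribute: {name}")
        elif new[name] != old_type:
            findings.append(f"changed type: {name} ({old_type} -> {new[name]})")
    # Attributes that exist only in `new` are additions and are not reported.
    return findings
```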
The easiest way to install ocsf-lib is from PyPI using pip or poetry:
$ pip install ocsf-lib
If you want to work with the source, the recommended installation is with asdf and poetry.
$ asdf install
$ poetry install
This project uses ruff for formatting and linting, pyright for type checking, and pytest as its test runner.
Before submitting a PR, make sure you've run the following:
$ poetry run ruff format
$ poetry run ruff check
$ poetry run pyright
$ poetry run pytest
With great effort, this library passes pyright's strict mode type checking. Keep it that way! The OCSF schema is big, and even the metaschema is a lot to hold in your head. Having the type checker identify mistakes for you can be very helpful.
There is one cast used from the concrete ChangedModel types (ChangedSchema, ChangedAttr, etc.) in the compare package to the generic type. For the life of me, I can't figure it out. I blame pyright but it's probably my own fault.
Running unit tests:
$ poetry run pytest -m "not integration"
Running integration tests:
$ poetry run pytest -m integration
NOTE: Some of the integration tests require an OCSF server instance, and are using the public instance at https://schema.ocsf.io. This should probably use a local instance of the OCSF server instead.