Classifiers

The ability to subset and manipulate the content of the knowledge base to meet the diverse needs of its users is crucial. Some users may need to query the entire content, while others may only require a specific subset of resources that meet certain criteria. This section outlines the essential mechanisms and attributes for classifying or categorizing resources when querying the knowledge base. Ideally, these should be reflected in some form through the API to provide flexibility and accessibility for all users.

OpenAPI

Available classifiers

The following attributes are currently available to filter or classify OpenAPI resources:

class: available in the Postgres resource view and the metadata and can be used to distinguish between OpenAPI2 and OpenAPI3 resources
collection: the harvester used to collect this resource. Current options are 'kin' and 'postman_apis'
validity: a true/false boolean attribute that can be found in the resource metadata (isValid)
size: expressed in bytes and available in the Postgres resource view and the metadata
version: available in the JSON metadata under the

Custom search criteria can naturally be expressed directly in Postgres queries, and many of our research questions can technically be used as classifiers. However, not all of these options can realistically be exposed as API parameters.
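As an illustration of how the available classifiers could combine into a single filter, here is a minimal Python sketch. The `Resource` record, its field names, the `filter_resources` function, and the sample data are all hypothetical; they only mirror the attributes listed above and do not describe the actual Postgres view or API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Resource:
    # Hypothetical record shape; fields mirror the attributes listed above.
    name: str
    resource_class: str  # "OpenAPI2" or "OpenAPI3"
    collection: str      # "kin" or "postman_apis"
    is_valid: bool       # isValid in the resource metadata
    size: int            # size in bytes


def filter_resources(
    resources: Iterable[Resource],
    resource_class: Optional[str] = None,
    collection: Optional[str] = None,
    is_valid: Optional[bool] = None,
    max_size: Optional[int] = None,
) -> List[Resource]:
    """Keep the resources that match every classifier that is not None."""
    selected = []
    for r in resources:
        if resource_class is not None and r.resource_class != resource_class:
            continue
        if collection is not None and r.collection != collection:
            continue
        if is_valid is not None and r.is_valid != is_valid:
            continue
        if max_size is not None and r.size > max_size:
            continue
        selected.append(r)
    return selected


# Made-up sample records, for illustration only.
SAMPLE = [
    Resource("a", "OpenAPI3", "kin", True, 12_000),
    Resource("b", "OpenAPI2", "postman_apis", True, 4_500),
    Resource("c", "OpenAPI3", "postman_apis", False, 90_000),
]

# Valid OpenAPI3 resources from any collection.
valid_v3 = filter_resources(SAMPLE, resource_class="OpenAPI3", is_valid=True)
print([r.name for r in valid_v3])
```

Leaving a parameter as `None` means "do not filter on this classifier", which is one simple way optional API query parameters could behave.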

Spectral-based classifiers

We are currently exploring the use of Spectral rules to expand the classification of resources. Essentially, any set of Spectral rules can be applied to evaluate OpenAPI resources, resulting in a pass/fail status and a score for the entire set or at the individual rule level. The resulting report is stored in the database as a JSON-formatted resource attachment, which can then be used as a query filter or as an analytical dimension. This not only enables us to compute statistics on API compliance with policies and best practices, but also to use the ruleset as a classifier that matches a specific definition.

A ruleset can be generic (e.g. a common definition or public policy) or reflect the perspective or definition of specific organizations or individuals. For example, we could have multiple rulesets qualifying an API as 'popular', 'real', or 'valid'.

Note that for this to work generically, each ruleset (and rule within that set) must have a clear and unique identifier. These should be stored in our GitHub repo.
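The following Python sketch shows one way a stored report could be reduced to a pass/fail status and a score. It assumes the report is a JSON array of findings in the shape produced by the Spectral CLI's JSON formatter, where each finding carries a `code` (the rule identifier) and a numeric `severity` (0 = error, 1 = warning, 2 = info, 3 = hint). The scoring formula (the share of rules with no error-level finding), the `summarize_report` function, and the sample data are assumptions made for illustration, not the project's actual method.

```python
import json
from typing import Dict, List

ERROR = 0  # Spectral severity levels: 0=error, 1=warn, 2=info, 3=hint


def summarize_report(report: List[dict], rule_ids: List[str]) -> Dict:
    """Reduce a list of findings to per-rule and overall pass/fail plus a score.

    A rule fails if at least one error-level finding carries its identifier.
    The score is the fraction of rules in the ruleset that passed (assumption).
    """
    failed = {f["code"] for f in report if f.get("severity") == ERROR}
    per_rule = {rule: rule not in failed for rule in rule_ids}
    passed_count = sum(per_rule.values())
    score = passed_count / len(rule_ids) if rule_ids else 1.0
    return {
        "passed": passed_count == len(rule_ids),
        "score": score,
        "rules": per_rule,
    }


# Illustrative ruleset and report; the findings are made up.
RULE_IDS = ["operation-operationId", "info-contact", "no-eval-in-markdown", "oas3-schema"]
REPORT_JSON = """
[
  {"code": "operation-operationId", "severity": 0, "message": "missing operationId"},
  {"code": "info-contact", "severity": 1, "message": "missing contact object"}
]
"""

summary = summarize_report(json.loads(REPORT_JSON), RULE_IDS)
print(summary["passed"], summary["score"])  # False 0.75
```

Because the summary is keyed by rule identifier, it only stays meaningful across rulesets if those identifiers are unique, which is why the identifiers need to be managed in one place.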

Inferred classifiers

There are many other classifiers we would like to associate with the OpenAPI resources, and we need to investigate how they can be implemented. These include:

provenance: who is behind this API, who are the custodians
industry classification (e.g. using NAICS)
sector: public, academic, private
consumers: who are the user communities
concepts: what is the API about, keywords, subjects, etc.
lifecycle stage: where is this API in the producer and consumer lifecycle
operational: is it up and running somewhere, and is it public or private
language: which language does the API speak (ISO 639-1)
