Description
This issue discusses the current LD support in pygeoapi, and whether and how it could be improved.
Current State
Currently pygeoapi supports both JSON and JSON-LD encodings. The JSON-LD encoding injects @context, @type and @id into the following resources:
- Landing Page and Collections Page: pygeoapi/pygeoapi/linked_data.py, line 46 in 46eaaf5
  def jsonldify(func: Callable) -> Callable:
- Collection Page: pygeoapi/pygeoapi/linked_data.py, line 118 in 46eaaf5
  def jsonldify_collection(cls, collection: dict, locale_: str) -> dict:
- Items Page: pygeoapi/pygeoapi/linked_data.py, line 176 in 46eaaf5
  def geojson2jsonld(cls, data: dict, dataset: str,
- Item Page: pygeoapi/pygeoapi/linked_data.py, line 269 in 46eaaf5
  def jsonldify_geometry(feature: dict) -> None:
This is an example of an Item Page: https://demo.pygeoapi.io/master/collections/obs/items/371?f=jsonld
The item page also adds a GeoSPARQL (gsp) endpoint.
Limitations/Issues
The default schema is https://schema.org/ and the default gsp is http://www.opengis.net/ont/geosparql#. It is possible to link properties of the features to terms from the vocabulary.
It is possible to add other vocabularies, but not to override the default vocabulary; the way multiple vocabularies are handled may cause conflicts with unexpected results. As an example, in this pygeoapi config we define a datetime property based on DCAT:
linked-data:
    context:
        - schema: https://www.w3.org/ns/dcat#
          stn_id: schema:identifier
          datetime:
              "@id": schema:startDate
              "@type": schema:DateTime
This is a snippet of the resulting JSON-LD definition of a feature:
{
  "@context": [
    {
      "schema": "https://schema.org/",
      "gsp": "http://www.opengis.net/ont/geosparql#",
      "type": "@type"
    },
    {
      "schema": "https://www.w3.org/ns/dcat#",
      "stn_id": "schema:identifier",
      "datetime": {
        "@id": "schema:startDate",
        "@type": "schema:DateTime"
      }
    }
  ],
  "type": "schema:Place",
  "id": 371,
  "linked_data": {
    "context": [
      {
        "schema": "https://www.w3.org/ns/dcat#",
        "stn_id": "schema:identifier",
        "datetime": {
          "@id": "schema:startDate",
          "@type": "schema:DateTime"
        }
      }
    ]
  },
  "stn_id": 35,
  "datetime": "2001-10-30T14:24:55Z",
  "value": 89.9,
Within the @context array the schema prefix is defined twice, and the last definition takes precedence. As a result, schema:Place is resolved against DCAT (https://www.w3.org/ns/dcat#Place) rather than schema.org. Since dcat#Place does not exist, that link is broken. As the schema.org vocabulary is no longer referenced, the webpage will also not be parsed correctly by the Google search engine crawler.
This could be fixed by having different prefixes for each object within the @context. These behaviours should also be more clearly explained in the documentation.
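The prefix clash and the proposed fix can be illustrated with a small sketch. It is a simplified model of JSON-LD context processing (contexts applied in order, later definitions overriding earlier ones), not pygeoapi code; the resolve helper and the dcat prefix name are assumptions made for illustration.

```python
def resolve(term: str, contexts: list[dict]) -> str:
    """Expand a compact IRI such as 'schema:Place' against a list of contexts.

    Simplified model: contexts are merged in order, so a later definition
    of a prefix replaces an earlier one.
    """
    merged: dict = {}
    for ctx in contexts:
        merged.update(ctx)  # later entries override earlier ones
    prefix, _, suffix = term.partition(":")
    base = merged.get(prefix)
    if not isinstance(base, str):
        return term  # unknown prefix: leave the term untouched
    return base + suffix


# Contexts as in the feature above: the user context redefines "schema".
conflicting = [
    {"schema": "https://schema.org/", "gsp": "http://www.opengis.net/ont/geosparql#"},
    {"schema": "https://www.w3.org/ns/dcat#"},
]

# Same vocabularies, but with a distinct (hypothetical) "dcat" prefix.
distinct = [
    {"schema": "https://schema.org/", "gsp": "http://www.opengis.net/ont/geosparql#"},
    {"dcat": "https://www.w3.org/ns/dcat#"},
]

broken = resolve("schema:Place", conflicting)
fixed = resolve("schema:Place", distinct)
start = resolve("dcat:startDate", distinct)

print(broken)  # https://www.w3.org/ns/dcat#Place
print(fixed)   # https://schema.org/Place
print(start)   # https://www.w3.org/ns/dcat#startDate
```

With distinct prefixes, schema:Place keeps pointing at schema.org while the DCAT terms remain available under their own prefix.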
Another objective, widely discussed in the context of SDI visibility, is to provide structured data to Google so that it can generate rich search results. This objective is articulated in the pygeoapi documentation. However, when testing the landing page and other resources with Google's tools, they report that no items are detected. This usually means that the JSON-LD is syntactically correct, but does not match any of the specific rich result types that Google Search currently supports.
Verbatim JSON-LD injection feature #2171
This PR enables injecting JSON-LD context into both JSON and JSON-LD Features and STAC items:
- It only acts upon those specific resources.
- It does not update the current LD functionality; when it is switched on the current LD functionality is switched off, which means we can only have one approach or the other.
- It affects both the JSON and JSON-LD encodings, which become equal.
This approach differs from the current one:
- It supports only one context, which replaces the default one. The vocabularies, as well as the types are all defined in this context.
- It allows defining a SPARQL fallback endpoint within the linked_data structure.
- It supports replacing the id field with the URL of the item.
As an example, a context is injected into this snippet that defines a feature:
{
  "@context": [
    "https://ogcincubator.github.io/bblocks-examples/build/annotated/bbr/examples/observation/vectorObservationFeature/context.jsonld"
  ],
  "id": "https://defs-dev.opengis.net/bblocks-pygeoapi/collections/ogc.bbr.examples.observation.vectorObservationFeature/items/vector-obs-1",
  "type": "Feature",
  "geometry": {
    "type": "LineString",
    "coordinates": [
      [
        -111.67183507997295,
        40.056709946862874
      ],
      [
        -111.71,
        40.156709946862875
      ]
    ]
  },
The linked_data structure is defined like this, reflecting all the options:
"linked_data": {
  "inject_verbatim_context": true,
  "replace_id_field": "id",
  "context": [
    "https://ogcincubator.github.io/bblocks-examples/build/annotated/bbr/examples/observation/vectorObservationFeature/context.jsonld"
  ],
  "fallback_sparql_endpoint": "https://defs-hosted.opengis.net/fuseki-hosted/query"
}
And this is the pygeoapi config that produced this result:
linked-data:
    inject_verbatim_context: true
    replace_id_field: id
    context:
        - https://ogcincubator.github.io/bblocks-examples/build/annotated/bbr/examples/observation/vectorObservationFeature/context.jsonld
    fallback_sparql_endpoint: https://defs-hosted.opengis.net/fuseki-hosted/query
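To make the effect of these options concrete, here is a minimal sketch of what they imply for a single feature. It is not the code from the PR: the function name inject_context and its item_url parameter are hypothetical, and the behaviour is inferred only from the option names and the example output above.

```python
from copy import deepcopy


def inject_context(feature: dict, linked_data: dict, item_url: str) -> dict:
    """Hypothetical helper: return a copy of `feature` with the configured
    context injected and, if requested, the id field replaced by the item URL."""
    if not linked_data.get("inject_verbatim_context"):
        return feature  # option switched off: leave the feature as it is

    # Put @context first, followed by a copy of the original feature members.
    result = {"@context": list(linked_data.get("context", []))}
    result.update(deepcopy(feature))

    # Replace the configured id field with the URL of the item, if present.
    id_field = linked_data.get("replace_id_field")
    if id_field and id_field in result:
        result[id_field] = item_url
    return result


config = {
    "inject_verbatim_context": True,
    "replace_id_field": "id",
    "context": ["https://example.org/context.jsonld"],  # placeholder URL
}
feature = {"id": "vector-obs-1", "type": "Feature"}
url = "https://example.org/collections/obs/items/vector-obs-1"  # placeholder URL

out = inject_context(feature, config, url)
print(out)
```

The input feature is left unmodified, so the same source data can still be served without the injected context when the option is switched off.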
This approach seems more powerful than the current one, as it moves the context to a separate file where everything can be defined, including the use of multiple vocabularies. In this way, the JSON-LD injected into the response is restricted to a single @context entry.
On the downside, it requires the user to create a context file, even in the simple case where they only want to map properties to schema.org. However, this task could be made easier by reusing the existing OGC building blocks, provided this is thoroughly explained in the documentation.
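For the simple schema.org case, the context file can be quite small. The sketch below generates one from a property mapping; the build_context helper is hypothetical, and the stn_id and datetime mappings are taken from the config example earlier in this issue.

```python
import json


def build_context(mappings: dict[str, str], vocab: str = "https://schema.org/") -> dict:
    """Hypothetical helper: build a minimal JSON-LD context document that maps
    feature properties to terms of a single vocabulary."""
    context = {"schema": vocab}
    for prop, term in mappings.items():
        context[prop] = f"schema:{term}"
    return {"@context": context}


doc = build_context({"stn_id": "identifier", "datetime": "startDate"})
text = json.dumps(doc, indent=2)
print(text)
```

The printed JSON could be saved as a context.jsonld file, published at a stable URL, and referenced from the context list in the pygeoapi config.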