diff --git a/doc/release-notes/10542-signposting.md b/doc/release-notes/10542-signposting.md new file mode 100644 index 00000000000..f847e6ba94a --- /dev/null +++ b/doc/release-notes/10542-signposting.md @@ -0,0 +1,11 @@ +# Signposting Output Now Contains Links to All Dataset Metadata Export Formats + +When Signposting was added in Dataverse 5.14 (#8981), it only provided links for the `schema.org` metadata export format. + +The output of HEAD, GET, and the Signposting "linkset" API have all been updated to include links to all available dataset metadata export formats (including any external exporters, such as Croissant, that have been enabled). + +This provides a lightweight machine-readable way to first retrieve a list of links (via an HTTP HEAD request, for example) to each available metadata export format and then follow up with a request for the export format of interest. + +In addition, the content type for the `schema.org` dataset metadata export format has been corrected. It was `application/json` and now it is `application/ld+json`. + +See also [the docs](https://preview.guides.gdcc.io/en/develop/api/native-api.html#retrieve-signposting-information) and #10542. diff --git a/doc/sphinx-guides/source/admin/discoverability.rst b/doc/sphinx-guides/source/admin/discoverability.rst index 19ef7250a29..22ff66246f0 100644 --- a/doc/sphinx-guides/source/admin/discoverability.rst +++ b/doc/sphinx-guides/source/admin/discoverability.rst @@ -51,7 +51,7 @@ The Dataverse team has been working with Google on both formats. Google has `ind Signposting +++++++++++ -The Dataverse software supports `Signposting `_. This allows machines to request more information about a dataset through the `Link `_ HTTP header. +The Dataverse software supports `Signposting `_. This allows machines to request more information about a dataset through the `Link `_ HTTP header. Links to all enabled metadata export formats are given. See :ref:`metadata-export-formats` for a list. 
There are 2 Signposting profile levels, level 1 and level 2. In this implementation, * Level 1 links are shown `as recommended `_ in the "Link" diff --git a/doc/sphinx-guides/source/api/changelog.rst b/doc/sphinx-guides/source/api/changelog.rst index 162574e7799..888b3319889 100644 --- a/doc/sphinx-guides/source/api/changelog.rst +++ b/doc/sphinx-guides/source/api/changelog.rst @@ -9,12 +9,11 @@ This API changelog is experimental and we would love feedback on its usefulness. v6.6 ---- - - **/api/metadatablocks** is no longer returning duplicated metadata properties and does not omit metadata properties when called. +- The content type for the ``schema.org`` dataset metadata export format has been corrected. It was ``application/json`` and now it is ``application/ld+json``. See also :ref:`export-dataset-metadata-api`. v6.5 ---- - - **/api/datasets/{identifier}/links**: The response from :ref:`list-collections-linked-from-dataset` has been improved to provide a more structured (but backward-incompatible) JSON response. v6.4 diff --git a/doc/sphinx-guides/source/api/native-api.rst b/doc/sphinx-guides/source/api/native-api.rst index cf088963b7d..88f929f0969 100644 --- a/doc/sphinx-guides/source/api/native-api.rst +++ b/doc/sphinx-guides/source/api/native-api.rst @@ -1345,6 +1345,8 @@ Export Metadata of a Dataset in Various Formats |CORS| Export the metadata of the current published version of a dataset in various formats. +To get a list of available formats, see :ref:`available-exporters` and :ref:`get-export-formats`. + See also :ref:`batch-exports-through-the-api` and the note below: .. code-block:: bash @@ -1361,9 +1363,30 @@ The fully expanded example above (without environment variables) looks like this curl "https://demo.dataverse.org/api/datasets/export?exporter=ddi&persistentId=doi:10.5072/FK2/J8SJZB" -.. 
note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, ``schema.org`` , ``OAI_ORE`` , ``Datacite``, ``oai_datacite`` and ``dataverse_json``. Descriptive names can be found under :ref:`metadata-export-formats` in the User Guide. +.. _available-exporters: + +Available Dataset Metadata Exporters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following dataset metadata exporters ship with Dataverse: + +- ``Datacite`` +- ``dataverse_json`` +- ``dcterms`` +- ``ddi`` +- ``oai_datacite`` +- ``oai_dc`` +- ``oai_ddi`` +- ``OAI_ORE`` +- ``schema.org`` + +These are the strings to pass as ``$METADATA_FORMAT`` in the examples above. Descriptive names for each format can be found under :ref:`metadata-export-formats` in the User Guide. + +Additional exporters can be enabled, as described under :ref:`external-exporters` in the Installation Guide. The machine-readable name/identifier for each external exporter can be found under :ref:`inventory-of-external-exporters`. If you are interested in creating your own exporter, see :doc:`/developers/metadataexport`. + +To discover the machine-readable names of exporters (e.g. ``ddi``) that have been enabled on the installation of Dataverse you are using, see :ref:`get-export-formats`. Alternatively, you can use the Signposting "linkset" API documented under :ref:`signposting-api`. -.. note:: Additional exporters can be enabled, as described under :ref:`external-exporters` in the Installation Guide. To discover the machine-readable name of each exporter (e.g. ``ddi``), check :ref:`inventory-of-external-exporters` or ``getFormatName`` in the exporter's source code. +To discover the machine-readable names of exporters generally, check :ref:`inventory-of-external-exporters` or ``getFormatName`` in the exporter's source code. 
Schema.org JSON-LD ^^^^^^^^^^^^^^^^^^ @@ -1377,6 +1400,8 @@ Both forms are valid according to Google's Structured Data Testing Tool at https The standard has further evolved into a format called Croissant. For details, see :ref:`schema.org-head` in the Admin Guide. +The ``schema.org`` format changed after Dataverse 6.4 as well. Previously its content type was "application/json" but now it is "application/ld+json". + List Files in a Dataset ~~~~~~~~~~~~~~~~~~~~~~~ @@ -2943,15 +2968,23 @@ Retrieve Signposting Information Dataverse supports :ref:`discovery-sign-posting` as a discovery mechanism. Signposting involves the addition of a `Link `__ HTTP header providing summary information on GET and HEAD requests to retrieve the dataset page and a separate /linkset API call to retrieve additional information. -Here is an example of a "Link" header: +Signposting Link HTTP Header +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Here is an example of an HTTP "Link" header from a GET or HEAD request for a dataset landing page: -``Link: ;rel="cite-as", ;rel="describedby";type="application/vnd.citationstyles.csl+json",;rel="describedby";type="application/ld+json", ;rel="type",;rel="type", ;rel="license", ; rel="linkset";type="application/linkset+json"`` +``Link: ;rel="cite-as", ;rel="describedby";type="application/vnd.citationstyles.csl+json",;rel="describedby";type="application/json",;rel="describedby";type="application/xml",;rel="describedby";type="application/xml",;rel="describedby";type="application/xml",;rel="describedby";type="application/ld+json",;rel="describedby";type="application/xml",;rel="describedby";type="application/xml",;rel="describedby";type="text/html",;rel="describedby";type="application/json",;rel="describedby";type="application/xml", ;rel="type",;rel="type", ;rel="license", ; rel="linkset";type="application/linkset+json"`` -The URL for linkset information is discoverable under the ``rel="linkset";type="application/linkset+json`` entry in the "Link" header, such as in the 
example above. +The URL for linkset information (described below) is discoverable under the ``rel="linkset";type="application/linkset+json"`` entry in the "Link" header, such as in the example above. + +Signposting Linkset API Endpoint +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The reponse includes a JSON object conforming to the `Signposting `__ specification. As part of this conformance, unlike most Dataverse API responses, the output is not wrapped in a ``{"status":"OK","data":{`` object. Signposting is not supported for draft dataset versions. +Like :ref:`get-export-formats`, this API can be used to get URLs to dataset metadata export formats, but with URLs for the dataset in question. + .. code-block:: bash export SERVER_URL=https://demo.dataverse.org @@ -4890,12 +4923,14 @@ The fully expanded example above (without environment variables) looks like this curl "https://demo.dataverse.org/api/info/settings/:MaxEmbargoDurationInMonths" -Get Export Formats -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.. _get-export-formats: + +Get Dataset Metadata Export Formats +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Get the available export formats, including custom formats. +Get the available dataset metadata export formats, including formats from external exporters (see :ref:`available-exporters`). -The response contains an object with available format names as keys, and as values an object with the following properties: +The response contains a JSON object with the available format names as keys (these can be passed to :ref:`export-dataset-metadata-api`) and, as values, objects with the following properties: * ``displayName`` * ``mediaType`` diff --git a/doc/sphinx-guides/source/user/dataset-management.rst b/doc/sphinx-guides/source/user/dataset-management.rst index b3a14554b40..d1acb3294fc 100755 --- a/doc/sphinx-guides/source/user/dataset-management.rst +++ b/doc/sphinx-guides/source/user/dataset-management.rst @@ -43,6 +43,8 @@ Additional formats can be enabled. 
See :ref:`inventory-of-external-exporters` in Each of these metadata exports contains the metadata of the most recently published version of the dataset. +For each dataset, links to each enabled metadata format are available programmatically via Signposting. For details, see :ref:`discovery-sign-posting` in the Admin Guide and :ref:`signposting-api` in the API Guide. + .. _adding-new-dataset: Adding a New Dataset diff --git a/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java b/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java index 0c4b39fd641..d4f2f95389f 100644 --- a/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java +++ b/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java @@ -111,7 +111,11 @@ public Boolean isAvailableToUsers() { @Override public String getMediaType() { - return MediaType.APPLICATION_JSON; + /** + * Changed from "application/json" to "application/ld+json" because + * that's what Signposting expects. 
+ */ + return "application/ld+json"; } } diff --git a/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java b/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java index b6f8870aa2d..8bebcf4d438 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java @@ -16,6 +16,7 @@ Two configurable options allow changing the limit for the number of authors or d import edu.harvard.iq.dataverse.*; import edu.harvard.iq.dataverse.dataset.DatasetUtil; +import edu.harvard.iq.dataverse.export.ExportService; import jakarta.json.Json; import jakarta.json.JsonArrayBuilder; import jakarta.json.JsonObjectBuilder; @@ -28,6 +29,8 @@ Two configurable options allow changing the limit for the number of authors or d import java.util.logging.Logger; import static edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder.jsonObjectBuilder; +import io.gdcc.spi.export.ExportException; +import io.gdcc.spi.export.Exporter; public class SignpostingResources { private static final Logger logger = Logger.getLogger(SignpostingResources.class.getCanonicalName()); @@ -72,8 +75,17 @@ public String getLinks() { } String describedby = "<" + ds.getGlobalId().asURL().toString() + ">;rel=\"describedby\"" + ";type=\"" + "application/vnd.citationstyles.csl+json\""; - describedby += ",<" + systemConfig.getDataverseSiteUrl() + "/api/datasets/export?exporter=schema.org&persistentId=" - + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier() + ">;rel=\"describedby\"" + ";type=\"application/ld+json\""; + ExportService instance = ExportService.getInstance(); + for (String[] labels : instance.getExportersLabels()) { + String formatName = labels[1]; + Exporter exporter; + try { + exporter = ExportService.getInstance().getExporter(formatName); + describedby += ",<" + getExporterUrl(formatName, ds) + ">;rel=\"describedby\"" + ";type=\"" + exporter.getMediaType() + "\""; + } catch 
(ExportException ex) { + logger.warning("Could not look up exporter based on " + formatName + ". Exception: " + ex); + } + } valueList.add(describedby); String type = ";rel=\"type\""; @@ -85,7 +97,7 @@ public String getLinks() { String linkset = "<" + systemConfig.getDataverseSiteUrl() + "/api/datasets/:persistentId/versions/" + workingDatasetVersion.getVersionNumber() + "." + workingDatasetVersion.getMinorVersionNumber() - + "/linkset?persistentId=" + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier() + "> ; rel=\"linkset\";type=\"application/linkset+json\""; + + "/linkset?persistentId=" + ds.getGlobalId().asString() + "> ; rel=\"linkset\";type=\"application/linkset+json\""; valueList.add(linkset); logger.fine(String.format("valueList is: %s", valueList)); @@ -95,7 +107,7 @@ public String getLinks() { public JsonArrayBuilder getJsonLinkset() { Dataset ds = workingDatasetVersion.getDataset(); GlobalId gid = ds.getGlobalId(); - String landingPage = systemConfig.getDataverseSiteUrl() + "/dataset.xhtml?persistentId=" + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier(); + String landingPage = systemConfig.getDataverseSiteUrl() + "/dataset.xhtml?persistentId=" + ds.getGlobalId().asString(); JsonArrayBuilder authors = getJsonAuthors(getAuthorURLs(false)); JsonArrayBuilder items = getJsonItems(); @@ -112,15 +124,24 @@ public JsonArrayBuilder getJsonLinkset() { ) ); - mediaTypes.add( - jsonObjectBuilder().add( - "href", - systemConfig.getDataverseSiteUrl() + "/api/datasets/export?exporter=schema.org&persistentId=" + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier() - ).add( - "type", - "application/ld+json" - ) - ); + ExportService instance = ExportService.getInstance(); + for (String[] labels : instance.getExportersLabels()) { + String formatName = labels[1]; + Exporter exporter; + try { + exporter = ExportService.getInstance().getExporter(formatName); + mediaTypes.add( + jsonObjectBuilder().add( + "href", 
getExporterUrl(formatName, ds) + ).add( + "type", + exporter.getMediaType() + ) + ); + } catch (ExportException ex) { + logger.warning("Could not look up exporter based on " + formatName + ". Exception: " + ex); + } + } JsonArrayBuilder linksetJsonObj = Json.createArrayBuilder(); JsonObjectBuilder mandatory; @@ -274,4 +295,9 @@ private String getPublicDownloadUrl(DataFile dataFile) { return FileUtil.getPublicDownloadUrl(systemConfig.getDataverseSiteUrl(), ((gid != null) ? gid.asString() : null), dataFile.getId()); } + + private String getExporterUrl(String formatName, Dataset ds) { + return systemConfig.getDataverseSiteUrl() + + "/api/datasets/export?exporter=" + formatName + "&persistentId=" + ds.getGlobalId().asString(); + } } diff --git a/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java b/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java index 75f514f3398..679d1e79aa1 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java @@ -16,6 +16,8 @@ import java.util.regex.Pattern; import jakarta.json.JsonObject; +import static org.hamcrest.CoreMatchers.endsWith; +import static org.hamcrest.CoreMatchers.is; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; @@ -56,8 +58,21 @@ public void testSignposting() { String datasetLandingPage = RestAssured.baseURI + "/dataset.xhtml?persistentId=" + datasetPid; System.out.println("Checking dataset landing page for Signposting: " + datasetLandingPage); Response getHtml = given().get(datasetLandingPage); + getHtml.then().assertThat() + .statusCode(OK.getStatusCode()) + .header("Link", endsWith("linkset?persistentId=" + datasetPid + "> ; rel=\"linkset\";type=\"application/linkset+json\"")); System.out.println("Link header: " + getHtml.getHeader("Link")); + if (false) { + // Split on commas to make the output more readable. 
+ System.out.println("---"); + String header = getHtml.getHeader("Link"); + for (String string : header.split(",")) { + System.out.println(string + ","); + } + System.out.println("returning early..."); + return; + } getHtml.then().assertThat().statusCode(OK.getStatusCode()); @@ -67,6 +82,8 @@ public void testSignposting() { assertTrue(linkHeader.contains(datasetPid)); assertTrue(linkHeader.contains("cite-as")); assertTrue(linkHeader.contains("describedby")); + // Make sure we get more exporters besides just "schema.org". + assertTrue(linkHeader.contains("oai_datacite")); Response headHtml = given().head(datasetLandingPage); @@ -76,6 +93,7 @@ public void testSignposting() { // Make sure there's Signposting stuff in the "Link" header such as // the dataset PID, cite-as, etc. + // TODO: The comment above is a repeat and so are some of the assertions below. Consolidate? linkHeader = getHtml.getHeader("Link"); assertTrue(linkHeader.contains(datasetPid)); assertTrue(linkHeader.contains("cite-as")); @@ -90,8 +108,15 @@ public void testSignposting() { System.out.println("Linkset URL: " + linksetUrl); Response linksetResponse = given().accept(ContentType.JSON).get(linksetUrl); + linksetResponse.prettyPrint(); + linksetResponse.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("linkset[0].anchor", endsWith("/dataset.xhtml?persistentId=" + datasetPid)) + .body("linkset[0].license.href", is("http://creativecommons.org/publicdomain/zero/1.0")) + .body("linkset[0].describedby[1].href", endsWith("persistentId=" + datasetPid)); String responseString = linksetResponse.getBody().asString(); + System.out.println("response string: " + responseString); JsonObject data = JsonUtil.getJsonObject(responseString); JsonObject lso = data.getJsonArray("linkset").getJsonObject(0); @@ -107,6 +132,13 @@ public void testSignposting() { Pattern exporterPattern = Pattern.compile("[<\\[][^()\\[\\]]*?exporter=schema.org[^()\\[\\]]*[>\\]]"); Matcher exporterMatcher = 
exporterPattern.matcher(linkHeader); exporterMatcher.find(); + // TODO: make an assertion + //assertTrue(exporterMatcher.find()); + + // Test another + Pattern exporterPattern2 = Pattern.compile("exporter=oai_datacite"); + Matcher exporterMatcher2 = exporterPattern2.matcher(linkHeader); + assertTrue(exporterMatcher2.find()); Response exportDataset = UtilIT.exportDataset(datasetPid, "schema.org"); exportDataset.prettyPrint(); diff --git a/src/test/resources/json/export-formats.json b/src/test/resources/json/export-formats.json index b4dc0168629..65fc746ee23 100644 --- a/src/test/resources/json/export-formats.json +++ b/src/test/resources/json/export-formats.json @@ -36,7 +36,7 @@ }, "schema.org": { "displayName": "Schema.org JSON-LD", - "mediaType": "application/json", + "mediaType": "application/ld+json", "isHarvestable": false, "isVisibleInUserInterface": true },
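
For readers who want to consume the expanded "Link" header from a client, the following is a minimal sketch (not part of this change set) of how the `describedby` entries and their media types might be extracted. The class name `SignpostingLinkHeader`, its `Link` record, and the sample header in `main` are hypothetical; the sample URLs only mirror the shape produced by `getExporterUrl` above. The parsing is deliberately simplified: it assumes targets never contain `>`, parameter values are double-quoted, and `rel` holds a single value. It requires Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical client-side helper; not a Dataverse class.
public class SignpostingLinkHeader {

    /** One Link header entry: target URL plus its "rel" and "type" parameters (null if absent). */
    public record Link(String href, String rel, String type) {}

    // A target in angle brackets, followed by its parameters up to the next "<".
    private static final Pattern ENTRY = Pattern.compile("<([^>]*)>([^<]*)");
    private static final Pattern REL = Pattern.compile("(?:^|[;\\s])rel\\s*=\\s*\"([^\"]*)\"");
    private static final Pattern TYPE = Pattern.compile("(?:^|[;\\s])type\\s*=\\s*\"([^\"]*)\"");

    /** Parses the value of a Link header (without the "Link:" prefix) into entries. */
    public static List<Link> parse(String headerValue) {
        List<Link> links = new ArrayList<>();
        Matcher entry = ENTRY.matcher(headerValue);
        while (entry.find()) {
            String params = entry.group(2);
            links.add(new Link(entry.group(1), first(REL, params), first(TYPE, params)));
        }
        return links;
    }

    /** Returns only the entries whose rel is exactly "describedby". */
    public static List<Link> describedBy(String headerValue) {
        List<Link> result = new ArrayList<>();
        for (Link link : parse(headerValue)) {
            if ("describedby".equals(link.rel())) {
                result.add(link);
            }
        }
        return result;
    }

    // Returns the first captured group of the pattern in the text, or null if there is no match.
    private static String first(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up sample header for illustration only.
        String sample = "<https://doi.org/10.5072/FK2/J8SJZB>;rel=\"cite-as\", "
                + "<https://demo.dataverse.org/api/datasets/export?exporter=schema.org"
                + "&persistentId=doi:10.5072/FK2/J8SJZB>;rel=\"describedby\";type=\"application/ld+json\","
                + "<https://demo.dataverse.org/api/datasets/export?exporter=oai_datacite"
                + "&persistentId=doi:10.5072/FK2/J8SJZB>;rel=\"describedby\";type=\"application/xml\"";
        for (Link link : describedBy(sample)) {
            System.out.println(link.type() + " -> " + link.href());
        }
    }
}
```

A client could pick the entry whose `href` contains the exporter name of interest (for example `exporter=oai_datacite`) and then request that URL. Matching on the exporter name rather than on `type` is probably safer, since several exporters share `application/xml`.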