Merge pull request #11001 from IQSS/10519-dataset-types
allow links between dataset types and metadata blocks
ofahimIQSS authored Feb 5, 2025
2 parents aebec05 + ce03b30 commit 9ee8754
Showing 14 changed files with 569 additions and 53 deletions.
12 changes: 12 additions & 0 deletions doc/release-notes/10519-dataset-types.md
@@ -0,0 +1,12 @@
## Dataset Types can be linked to Metadata Blocks

Metadata blocks (e.g. "CodeMeta") can now be linked to dataset types (e.g. "software") using new superuser APIs.

This will have the following effects for the APIs used by the new Dataverse UI ( https://github.com/IQSS/dataverse-frontend ):

- The list of fields shown when creating a dataset will include fields marked as "displayoncreate" (in the tsv/database) for metadata blocks (e.g. "CodeMeta") that are linked to the dataset type (e.g. "software") that is passed to the API.
- The metadata blocks shown when editing a dataset will include metadata blocks (e.g. "CodeMeta") that are linked to the dataset type (e.g. "software") that is passed to the API.

Primarily to support automated tests of the behavior above, a [displayOnCreate](https://dataverse-guide--11001.org.readthedocs.build/en/11001/api/native-api.html#set-displayoncreate-for-a-dataset-field) API endpoint has been added.

For more information, see the guides ([overview](https://dataverse-guide--11001.org.readthedocs.build/en/11001/user/dataset-management.html#dataset-types), [new APIs](https://dataverse-guide--11001.org.readthedocs.build/en/11001/api/native-api.html#link-dataset-type-with-metadata-blocks)), #10519 and #11001.
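
The selection rule described in the bullets above can be sketched in a few lines of Python. This is an illustrative in-memory model, not Dataverse code: the `Field` and `Block` classes, the `fields_for_create` function, and the field names and flags are all hypothetical. The sketch also ignores input levels and fields required by the installation, which the real query takes into account.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Field:
    name: str
    display_on_create: bool = False


@dataclass
class Block:
    name: str
    fields: list = field(default_factory=list)


def fields_for_create(block, collection_blocks, type_links, dataset_type=None):
    """Return names of fields from `block` to show on the create form.

    collection_blocks: names of blocks enabled on the collection.
    type_links: mapping of dataset type name -> set of linked block names.
    """
    enabled_in_collection = block.name in collection_blocks
    linked_to_type = (
        dataset_type is not None
        and block.name in type_links.get(dataset_type, set())
    )
    # A block contributes fields if either condition holds (logical OR).
    if not (enabled_in_collection or linked_to_type):
        return []
    return [f.name for f in block.fields if f.display_on_create]


# Hypothetical block contents, for illustration only.
codemeta = Block("codeMeta20", [
    Field("codeVersion", display_on_create=True),
    Field("developmentStatus"),
])
links = {"software": {"codeMeta20"}}

# The block is not enabled on the collection, so it only contributes
# fields when the linked dataset type is passed in.
print(fields_for_create(codemeta, {"citation"}, links))
print(fields_for_create(codemeta, {"citation"}, links, "software"))
```

In this model, passing the dataset type never hides fields; it can only add fields from linked blocks to those the collection already exposes.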
63 changes: 61 additions & 2 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -540,6 +540,8 @@ The fully expanded example above (without environment variables) looks like this
curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "https://demo.dataverse.org/api/dataverses/root/assignments/6"
.. _list-metadata-blocks-for-a-collection:

List Metadata Blocks Defined on a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -567,6 +569,7 @@ This endpoint supports the following optional query parameters:

- ``returnDatasetFieldTypes``: Whether or not to return the dataset field types present in each metadata block. If not set, the default value is false.
- ``onlyDisplayedOnCreate``: Whether or not to return only the metadata blocks that are displayed on dataset creation. If ``returnDatasetFieldTypes`` is true, only the dataset field types shown on dataset creation will be returned within each metadata block. If not set, the default value is false.
- ``datasetType``: Optionally return additional fields from metadata blocks that are linked with a particular dataset type (see :ref:`dataset-types` in the User Guide). Pass a single dataset type as a string. For a list of dataset types you can pass, see :ref:`api-list-dataset-types`.

An example using the optional query parameters is presented below:

@@ -575,14 +578,17 @@ An example using the optional query parameters is presented below:
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export ID=root
export DATASET_TYPE=software
curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/metadatablocks?returnDatasetFieldTypes=true&onlyDisplayedOnCreate=true"
curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/metadatablocks?returnDatasetFieldTypes=true&onlyDisplayedOnCreate=true&datasetType=$DATASET_TYPE"
The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash
curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/dataverses/root/metadatablocks?returnDatasetFieldTypes=true&onlyDisplayedOnCreate=true"
curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/dataverses/root/metadatablocks?returnDatasetFieldTypes=true&onlyDisplayedOnCreate=true&datasetType=software"
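
For clients that build this request programmatically, the query string can be assembled as in the following sketch. It is a minimal Python illustration using only the standard library; the helper name `metadata_blocks_url` is hypothetical, while the query parameter names are the ones documented above.

```python
from urllib.parse import quote, urlencode


def metadata_blocks_url(server_url, collection, return_field_types=False,
                        only_displayed_on_create=False, dataset_type=None):
    """Build the URL for listing the metadata blocks of a collection."""
    params = {
        "returnDatasetFieldTypes": str(return_field_types).lower(),
        "onlyDisplayedOnCreate": str(only_displayed_on_create).lower(),
    }
    # datasetType is optional; leave it out entirely when not requested.
    if dataset_type is not None:
        params["datasetType"] = dataset_type
    base = (f"{server_url.rstrip('/')}/api/dataverses/"
            f"{quote(collection, safe='')}/metadatablocks")
    return f"{base}?{urlencode(params)}"


print(metadata_blocks_url("https://demo.dataverse.org", "root",
                          return_field_types=True,
                          only_displayed_on_create=True,
                          dataset_type="software"))
```

Omitting `dataset_type` yields the request as it looked before this parameter existed, so existing callers are unaffected.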
.. _define-metadata-blocks-for-a-dataverse-collection:

Define Metadata Blocks for a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -609,6 +615,8 @@ The fully expanded example above (without environment variables) looks like this
curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST -H "Content-type:application/json" --upload-file define-metadatablocks.json "https://demo.dataverse.org/api/dataverses/root/metadatablocks"
An alternative to defining metadata blocks at a collection level is to create and use a dataset type. See :ref:`api-link-dataset-type`.

Determine if a Dataverse Collection Inherits Its Metadata Blocks from Its Parent
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -3473,6 +3481,36 @@ The fully expanded example above (without environment variables) looks like this
curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "https://demo.dataverse.org/api/datasets/datasetTypes/3"
.. _api-link-dataset-type:

Link Dataset Type with Metadata Blocks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Linking a dataset type with one or more metadata blocks results in additional fields from those blocks appearing in the output from the :ref:`list-metadata-blocks-for-a-collection` API endpoint. The new frontend for Dataverse (https://github.com/IQSS/dataverse-frontend) uses the JSON output from this API endpoint to construct the page that users see when creating or editing a dataset. Once the frontend has been updated to pass in the dataset type (https://github.com/IQSS/dataverse-client-javascript/issues/210), specifying a dataset type in this way can serve as an alternative to the traditional method of displaying additional metadata fields, which is to enable a metadata block at the collection level (see :ref:`define-metadata-blocks-for-a-dataverse-collection`).

For example, a superuser could create a type called "software" and link it to the "CodeMeta" metadata block (as in the example below). Then, once the new frontend allows it, a user can specify that they want to create a dataset of type "software" and see the additional metadata fields from the CodeMeta block when creating or editing their dataset.

This API endpoint is for superusers only.

.. code-block:: bash

  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  export SERVER_URL=https://demo.dataverse.org
  export TYPE=software
  export JSON='["codeMeta20"]'

  curl -H "X-Dataverse-key:$API_TOKEN" -H "Content-Type: application/json" "$SERVER_URL/api/datasets/datasetTypes/$TYPE" -X PUT -d "$JSON"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -H "Content-Type: application/json" "https://demo.dataverse.org/api/datasets/datasetTypes/software" -X PUT -d '["codeMeta20"]'

To update the blocks that are linked, send an array with the blocks that should be linked.

To remove all links to blocks, send an empty array.
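
The same request can be prepared from Python. The sketch below only constructs the request object and does not send it; `link_request` is a hypothetical helper, and the server URL, token, and block name are the placeholders used in the curl example. Passing an empty list produces the body that removes all links.

```python
import json
from urllib.request import Request


def link_request(server_url, api_token, dataset_type, block_names):
    """Prepare a PUT request linking a dataset type to metadata blocks."""
    body = json.dumps(list(block_names)).encode("utf-8")
    return Request(
        url=f"{server_url}/api/datasets/datasetTypes/{dataset_type}",
        data=body,
        method="PUT",
        headers={
            "X-Dataverse-key": api_token,
            "Content-Type": "application/json",
        },
    )


req = link_request("https://demo.dataverse.org",
                   "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
                   "software", ["codeMeta20"])
print(req.get_method(), req.full_url, req.data.decode())
# To actually send it (superuser token required):
#   urllib.request.urlopen(req)
```

Serializing the body with `json.dumps` avoids the shell quoting issues that can arise when the JSON array is passed through an environment variable.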
Files
-----
@@ -5256,6 +5294,27 @@ The fully expanded example above (without environment variables) looks like this
curl "https://demo.dataverse.org/api/datasetfields/facetables"
.. _setDisplayOnCreate:

Set displayOnCreate for a Dataset Field
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Set displayOnCreate for a dataset field. See also :doc:`/admin/metadatacustomization` in the Admin Guide.

.. code-block:: bash

  export SERVER_URL=http://localhost:8080
  export FIELD=subtitle
  export BOOLEAN=true

  curl -X POST "$SERVER_URL/api/admin/datasetfield/setDisplayOnCreate?datasetFieldType=$FIELD&setDisplayOnCreate=$BOOLEAN"

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

  curl -X POST "http://localhost:8080/api/admin/datasetfield/setDisplayOnCreate?datasetFieldType=subtitle&setDisplayOnCreate=true"
.. _Notifications:
Notifications
6 changes: 4 additions & 2 deletions doc/sphinx-guides/source/user/dataset-management.rst
@@ -801,21 +801,23 @@ If you deaccession the most recently published version of the dataset but not al
Dataset Types
=============

.. note:: Development of the dataset types feature is ongoing. Please see https://github.com/IQSS/dataverse-pm/issues/307 for details.

Out of the box, all datasets have a dataset type of "dataset". Superusers can add additional types such as "software" or "workflow" using the :ref:`api-add-dataset-type` API endpoint.

Once more than one type appears in search results, a facet called "Dataset Type" will appear allowing you to filter down to a certain type.

If your installation is configured to use DataCite as a persistent ID (PID) provider, the appropriate type ("Dataset", "Software", "Workflow") will be sent to DataCite when the dataset is published for those three types.

Currently, the dataset type can only be specified via API and only when the dataset is created. For details, see the following sections of the API guide:
Currently, specifying a type for a dataset can only be done via API and only when the dataset is created. The type can't currently be changed afterward. For details, see the following sections of the API guide:

- :ref:`api-create-dataset-with-type` (Native API)
- :ref:`api-semantic-create-dataset-with-type` (Semantic API)
- :ref:`import-dataset-with-type`

Dataset types can be listed, added, or deleted via API. See :ref:`api-dataset-types` in the API Guide for more.

Development of the dataset types feature is ongoing. Please see https://github.com/IQSS/dataverse/issues/10489 for details.
Dataset types can be linked with metadata blocks to make fields from those blocks available when datasets of that type are created or edited. See :ref:`api-link-dataset-type` and :ref:`list-metadata-blocks-for-a-collection` for details.

.. |image1| image:: ./img/DatasetDiagram.png
:class: img-responsive
2 changes: 2 additions & 0 deletions docker-compose-dev.yml
@@ -90,6 +90,8 @@ services:
- dev
networks:
- dataverse
volumes:
- ./docker-dev-volumes/solr/data:/var/solr

dev_dv_initializer:
container_name: "dev_dv_initializer"
81 changes: 67 additions & 14 deletions src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java
@@ -1,5 +1,6 @@
package edu.harvard.iq.dataverse;

import edu.harvard.iq.dataverse.dataset.DatasetType;
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
@@ -871,15 +872,15 @@ public List<DatasetFieldType> findAllDisplayedOnCreateInMetadataBlock(MetadataBl
Root<MetadataBlock> metadataBlockRoot = criteriaQuery.from(MetadataBlock.class);
Root<DatasetFieldType> datasetFieldTypeRoot = criteriaQuery.from(DatasetFieldType.class);

Predicate requiredInDataversePredicate = buildRequiredInDataversePredicate(criteriaBuilder, datasetFieldTypeRoot);
Predicate fieldRequiredInTheInstallation = buildFieldRequiredInTheInstallationPredicate(criteriaBuilder, datasetFieldTypeRoot);

criteriaQuery.where(
criteriaBuilder.and(
criteriaBuilder.equal(metadataBlockRoot.get("id"), metadataBlock.getId()),
datasetFieldTypeRoot.in(metadataBlockRoot.get("datasetFieldTypes")),
criteriaBuilder.or(
criteriaBuilder.isTrue(datasetFieldTypeRoot.get("displayOnCreate")),
requiredInDataversePredicate
fieldRequiredInTheInstallation
)
)
);
@@ -890,16 +891,39 @@ public List<DatasetFieldType> findAllDisplayedOnCreateInMetadataBlock(MetadataBl
return typedQuery.getResultList();
}

public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock metadataBlock, Dataverse dataverse, boolean onlyDisplayedOnCreate) {
public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock metadataBlock, Dataverse dataverse, boolean onlyDisplayedOnCreate, DatasetType datasetType) {
if (!dataverse.isMetadataBlockRoot() && dataverse.getOwner() != null) {
return findAllInMetadataBlockAndDataverse(metadataBlock, dataverse.getOwner(), onlyDisplayedOnCreate);
return findAllInMetadataBlockAndDataverse(metadataBlock, dataverse.getOwner(), onlyDisplayedOnCreate, datasetType);
}

CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
CriteriaQuery<DatasetFieldType> criteriaQuery = criteriaBuilder.createQuery(DatasetFieldType.class);

Root<MetadataBlock> metadataBlockRoot = criteriaQuery.from(MetadataBlock.class);
Root<DatasetFieldType> datasetFieldTypeRoot = criteriaQuery.from(DatasetFieldType.class);

// Build the main predicate to include fields that belong to the specified dataverse and metadataBlock and match the onlyDisplayedOnCreate value.
Predicate fieldPresentInDataverse = buildFieldPresentInDataversePredicate(dataverse, onlyDisplayedOnCreate, criteriaQuery, criteriaBuilder, datasetFieldTypeRoot, metadataBlockRoot);

// Build an additional predicate to include fields from the datasetType, if the datasetType is specified and contains the given metadataBlock.
Predicate fieldPresentInDatasetType = buildFieldPresentInDatasetTypePredicate(datasetType, criteriaQuery, criteriaBuilder, datasetFieldTypeRoot, metadataBlockRoot, onlyDisplayedOnCreate);

// Build the final WHERE clause by combining all the predicates.
criteriaQuery.where(
criteriaBuilder.equal(metadataBlockRoot.get("id"), metadataBlock.getId()), // Match the MetadataBlock ID.
datasetFieldTypeRoot.in(metadataBlockRoot.get("datasetFieldTypes")), // Ensure the DatasetFieldType is part of the MetadataBlock.
criteriaBuilder.or(
fieldPresentInDataverse,
fieldPresentInDatasetType
)
);

criteriaQuery.select(datasetFieldTypeRoot);

return em.createQuery(criteriaQuery).getResultList();
}

private Predicate buildFieldPresentInDataversePredicate(Dataverse dataverse, boolean onlyDisplayedOnCreate, CriteriaQuery<DatasetFieldType> criteriaQuery, CriteriaBuilder criteriaBuilder, Root<DatasetFieldType> datasetFieldTypeRoot, Root<MetadataBlock> metadataBlockRoot) {
Root<Dataverse> dataverseRoot = criteriaQuery.from(Dataverse.class);

// Join Dataverse with DataverseFieldTypeInputLevel on the "dataverseFieldTypeInputLevels" attribute, using a LEFT JOIN.
@@ -930,7 +954,7 @@ public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock m
Predicate hasNoInputLevelPredicate = criteriaBuilder.not(criteriaBuilder.exists(subquery));

// Define a predicate to include the required fields in Dataverse.
Predicate requiredInDataversePredicate = buildRequiredInDataversePredicate(criteriaBuilder, datasetFieldTypeRoot);
Predicate fieldRequiredInTheInstallation = buildFieldRequiredInTheInstallationPredicate(criteriaBuilder, datasetFieldTypeRoot);

// Define a predicate for displaying DatasetFieldTypes on create.
// If onlyDisplayedOnCreate is true, include fields that:
@@ -941,28 +965,57 @@ public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock m
? criteriaBuilder.or(
criteriaBuilder.or(
criteriaBuilder.isTrue(datasetFieldTypeRoot.get("displayOnCreate")),
requiredInDataversePredicate
fieldRequiredInTheInstallation
),
requiredAsInputLevelPredicate
)
: criteriaBuilder.conjunction();

// Build the final WHERE clause by combining all the predicates.
criteriaQuery.where(
// Combine all the predicates.
return criteriaBuilder.and(
criteriaBuilder.equal(dataverseRoot.get("id"), dataverse.getId()), // Match the Dataverse ID.
criteriaBuilder.equal(metadataBlockRoot.get("id"), metadataBlock.getId()), // Match the MetadataBlock ID.
metadataBlockRoot.in(dataverseRoot.get("metadataBlocks")), // Ensure the MetadataBlock is part of the Dataverse.
datasetFieldTypeRoot.in(metadataBlockRoot.get("datasetFieldTypes")), // Ensure the DatasetFieldType is part of the MetadataBlock.
criteriaBuilder.or(includedAsInputLevelPredicate, hasNoInputLevelPredicate), // Include DatasetFieldTypes based on the input level predicates.
displayedOnCreatePredicate // Apply the display-on-create filter if necessary.
);
}

criteriaQuery.select(datasetFieldTypeRoot).distinct(true);

return em.createQuery(criteriaQuery).getResultList();
private Predicate buildFieldPresentInDatasetTypePredicate(DatasetType datasetType,
CriteriaQuery<DatasetFieldType> criteriaQuery,
CriteriaBuilder criteriaBuilder,
Root<DatasetFieldType> datasetFieldTypeRoot,
Root<MetadataBlock> metadataBlockRoot,
boolean onlyDisplayedOnCreate) {
Predicate datasetTypePredicate = criteriaBuilder.isFalse(criteriaBuilder.literal(true)); // Initialize datasetTypePredicate to always false by default
if (datasetType != null) {
// Create a subquery to check for the presence of the specified metadataBlock within the datasetType
Subquery<Long> datasetTypeSubquery = criteriaQuery.subquery(Long.class);
Root<DatasetType> datasetTypeRoot = criteriaQuery.from(DatasetType.class);

// Define a predicate for displaying DatasetFieldTypes on create.
// If onlyDisplayedOnCreate is true, include fields that are either marked as displayed on create OR marked as required.
// Otherwise, use an always-true predicate (conjunction).
Predicate displayedOnCreatePredicate = onlyDisplayedOnCreate ?
criteriaBuilder.or(
criteriaBuilder.isTrue(datasetFieldTypeRoot.get("displayOnCreate")),
buildFieldRequiredInTheInstallationPredicate(criteriaBuilder, datasetFieldTypeRoot)
)
: criteriaBuilder.conjunction();

datasetTypeSubquery.select(criteriaBuilder.literal(1L))
.where(
criteriaBuilder.equal(datasetTypeRoot.get("id"), datasetType.getId()), // Match the DatasetType ID.
metadataBlockRoot.in(datasetTypeRoot.get("metadataBlocks")), // Ensure the metadataBlock is included in the datasetType's list of metadata blocks.
displayedOnCreatePredicate
);

// Now set the datasetTypePredicate to true if the subquery finds a matching metadataBlock
datasetTypePredicate = criteriaBuilder.exists(datasetTypeSubquery);
}
return datasetTypePredicate;
}

private Predicate buildRequiredInDataversePredicate(CriteriaBuilder criteriaBuilder, Root<DatasetFieldType> datasetFieldTypeRoot) {
private Predicate buildFieldRequiredInTheInstallationPredicate(CriteriaBuilder criteriaBuilder, Root<DatasetFieldType> datasetFieldTypeRoot) {
// Predicate to check if the current DatasetFieldType is required.
Predicate isRequired = criteriaBuilder.isTrue(datasetFieldTypeRoot.get("required"));
