The DataCite metadata standard is able to record the experimental technique used to establish the dataset. However, SciCat does not do this, so the DataCite metadata lacks this information.
Note that, although SciCat can store the experimental technique information as dataset metadata, this information is not propagated to publishedDataset.
Steps to Reproduce
Create a dataset, including the experimental technique.
Trigger publishing of the dataset.
Observe the DOI metadata, e.g., via the DataCite API.
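For the last step, a minimal sketch of how the subjects could be read from a DataCite REST API response. It assumes the public endpoint `https://api.datacite.org/dois/<doi>` and the JSON:API layout where subjects sit under `data.attributes.subjects`; the helper names `doiRecordUrl` and `extractSubjects` are hypothetical and not part of SciCat.

```typescript
// One entry of the `subjects` array in a DataCite REST API record.
interface DataCiteSubject {
  subject: string;
  subjectScheme?: string;
  schemeUri?: string;
  valueUri?: string;
}

// Build the record URL for a DOI (assumes the public DataCite REST API).
function doiRecordUrl(doi: string): string {
  return "https://api.datacite.org/dois/" + doi;
}

// Pull the subjects out of a parsed response body.
// Returns an empty array when the record has no subjects at all,
// which is the behaviour currently observed for SciCat DOIs.
function extractSubjects(body: any): DataCiteSubject[] {
  const subjects = body?.data?.attributes?.subjects;
  return Array.isArray(subjects) ? subjects : [];
}

// Usage (Node 18+ or a browser, where `fetch` is available):
//   const res = await fetch(doiRecordUrl("10.1234/example"));
//   console.log(extractSubjects(await res.json()));
```

An empty result for a DOI whose dataset has techniques recorded in SciCat would reproduce the problem described here.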
Current Behaviour
The DataCite metadata contains no subject elements.
Expected Behaviour
The DataCite metadata should contain subject element(s) that describe the techniques.
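As a rough illustration of the expected output, the sketch below maps a SciCat technique onto a DataCite subject entry. It assumes a technique is stored as a `{ pid, name }` pair and that `pid` holds the term's URI; the `subjectScheme` value and the function name `techniqueToSubject` are placeholders. The authoritative field values are whatever ETN-1 specifies, which may differ from this.

```typescript
// Assumed shape of a technique attached to a SciCat dataset.
interface Technique {
  pid: string;  // assumed to be the URI of the technique term
  name: string; // human-readable label
}

// Subset of the DataCite subject properties used here.
interface DataCiteSubject {
  subject: string;
  subjectScheme?: string;
  valueUri?: string;
}

// Hypothetical mapping; the scheme label is an assumption, see ETN-1 for the real one.
function techniqueToSubject(technique: Technique): DataCiteSubject {
  return {
    subject: technique.name,
    subjectScheme: "PaNET",
    valueUri: technique.pid,
  };
}

// Example with a made-up technique identifier.
const exampleSubject = techniqueToSubject({
  pid: "https://example.org/techniques/sans",
  name: "small-angle neutron scattering",
});
console.log(JSON.stringify(exampleSubject));
```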
Arguably, there should be a single place (within SciCat code) that provides DataCite metadata (as described in #1192). While removing this duplicate code (i.e., closing #1192) would benefit this issue, I don't consider #1192 to block this issue.
@paulmillar thanks for opening the issue.
Given that PublishedData can contain one or more datasets, what would you do if multiple datasets with different techniques are present?
Would you add a list of techniques to publishedData and then propagate all of them to DataCite?
Yes, this is certainly a valid question. I've spent a little time thinking about this, but haven't come to a strong opinion.
One could argue that each technique (of those describing the publishedData) indicates that at least some data within the publishedData was taken with that technique. Under that interpretation, the publishedData techniques would be the union of all techniques in its member datasets.
Alternatively, one could argue the publishedData techniques should describe all the datasets being published, since the publishedData is describing all those datasets. With this interpretation, the publishedData techniques are the intersection of all techniques in the member datasets.
A third option is that the selection is context-driven. Why is a DOI being generated? The answer might suggest that some techniques (from the union) be included and others be ignored. This would be a more nuanced approach, something that would likely require human input.
In practical terms, I would suggest taking the first option (use the union of techniques from member datasets) as an initial version.
A subsequent update could be to present the list of techniques in the web UI, to allow the user to choose/veto techniques, as appropriate.
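To make the first two options concrete, here is a minimal sketch of both. It assumes each dataset carries an optional `techniques` array of `{ pid, name }` entries and treats two techniques as the same when their `pid` matches; the function names are hypothetical.

```typescript
// Assumed shapes; only the fields needed for this sketch.
interface Technique {
  pid: string;
  name: string;
}
interface Dataset {
  techniques?: Technique[];
}

// Option 1: every technique used by at least one member dataset.
function unionOfTechniques(datasets: Dataset[]): Technique[] {
  const seen: Record<string, boolean> = {};
  const result: Technique[] = [];
  for (const dataset of datasets) {
    for (const technique of dataset.techniques ?? []) {
      if (!seen[technique.pid]) {
        seen[technique.pid] = true;
        result.push(technique); // keep first-seen order
      }
    }
  }
  return result;
}

// Option 2: only techniques shared by every member dataset.
function intersectionOfTechniques(datasets: Dataset[]): Technique[] {
  return unionOfTechniques(datasets).filter((technique) =>
    datasets.every((dataset) =>
      (dataset.techniques ?? []).some((other) => other.pid === technique.pid)
    )
  );
}
```

Note that with the intersection, a single member dataset without any technique empties the whole list, which is one practical reason to start with the union.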
Details
The document ETN-1: Embedding PaNET in DataCite metadata describes how to include PaNET terms within the metadata associated with a DOI.
The document ETN-2: Working with PaNET terms in SciCat describes how to format PaNET terms within SciCat.
Note that (as described in #1192) the DataCite metadata is calculated in two places:
scicat-backend-next's published-data.controller.ts
oai-provider-service's openaire-mapper.ts