Required and recommended fields in DataCite XML #1

Open
borsna opened this issue Jan 11, 2024 · 11 comments
Labels
datacite DataCite

Comments

@borsna
Contributor

borsna commented Jan 11, 2024

Agree on list of required & recommended fields in DataCite XML.
Some examples could be useful.

(M): Mandatory
(R): Recommended
(O): Optional

  • Identifier (M)
  • Creator (M)
  • Title (M)
  • Publisher (M)
  • PublicationYear (M)
  • Subject (M)
  • Contributor (O)
  • Date (M)
  • Language (R)
  • ResourceType (M)
  • AlternateIdentifier (O, changed from M)
  • RelatedIdentifier (O, changed from M)
  • Format (O)
  • Version (O)
  • Rights (R)
  • Description (M)
  • GeoLocation (O)
  • FundingReference (O)
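
As one possible example, the mandatory fields in the list could be checked with a short script. This is only a sketch: it assumes records use the DataCite kernel-4 namespace and the standard DataCite element names, and the function name `missing_mandatory` and the sample record are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: records use the DataCite Metadata Schema 4.x namespace.
NS = {"d": "http://datacite.org/schema/kernel-4"}

# Fields marked (M) in the list above, mapped to their DataCite XML paths.
MANDATORY = {
    "Identifier": "d:identifier",
    "Creator": "d:creators/d:creator",
    "Title": "d:titles/d:title",
    "Publisher": "d:publisher",
    "PublicationYear": "d:publicationYear",
    "Subject": "d:subjects/d:subject",
    "Date": "d:dates/d:date",
    "ResourceType": "d:resourceType",
    "Description": "d:descriptions/d:description",
}


def missing_mandatory(xml_text):
    """Return the names of mandatory fields that are absent from a record."""
    root = ET.fromstring(xml_text)
    return [name for name, path in MANDATORY.items()
            if root.find(path, NS) is None]


# Hypothetical, deliberately incomplete record.
EXAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.1234/example</identifier>
  <creators><creator><creatorName>Example, Eva</creatorName></creator></creators>
  <titles><title>Example dataset</title></titles>
  <publisher>Example Repository</publisher>
  <publicationYear>2024</publicationYear>
</resource>"""

print(missing_mandatory(EXAMPLE))
```

The check only tests for presence of each element; it does not validate values or attributes.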
borsna added the DataCite label Jan 12, 2024
@DeboraArlt

AlternateIdentifier and RelatedIdentifier are M above. What's the reason for that? Those are O and R at DataCite. Will often be missing.

@DeboraArlt

One thing I am missing: since the idea is to gather metadata that can also be for datasets published elsewhere (not SND), there should be an element that allows pointing to where the metadata come from.

@borsna
Contributor Author

borsna commented Feb 9, 2024

@DeboraArlt good catch! Those should be O or R in our case. Changed both to O in the list, and I will add to the agenda for the next online meeting whether we should have them as R or O.

The Identifier field should be the way to point to the source and the harvester will add harvesting metadata about the source repository and promote the DOI from Identifier as the source for the metadata.
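
A rough sketch of how the harvester step could look, purely for illustration. The function name `source_record`, the output keys and the repository URL are assumptions; the actual harvester may work differently.

```python
import xml.etree.ElementTree as ET

# Assumption: records use the DataCite Metadata Schema 4.x namespace.
NS = {"d": "http://datacite.org/schema/kernel-4"}


def source_record(xml_text, repository_url):
    """Build hypothetical harvesting metadata for one DataCite record.

    The DOI in Identifier is promoted as the pointer to the metadata source;
    the repository URL is supplied by the harvester, not by the record.
    """
    root = ET.fromstring(xml_text)
    ident = root.find("d:identifier", NS)
    doi = None
    if ident is not None and ident.get("identifierType") == "DOI":
        doi = (ident.text or "").strip() or None
    return {"source_doi": doi, "source_repository": repository_url}


# Hypothetical record and repository URL.
RECORD = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.1234/example</identifier>
</resource>"""

print(source_record(RECORD, "https://repository.example.org"))
```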

@DeboraArlt

Creator: "The main researchers involved in producing the resource, in priority order (occurrences: 1-n)." - it may not always be researchers that produce the resource. I see that DataCite uses this definition. EML says "The people or organizations who created this resource." To me it seems strange to limit the definition to researchers; I would suggest changing it to "The person(-s) involved in producing [creating?] the resource, in priority order (occurrences: 1-n)."

And is it correct that we only allow person here? or also organisation? Seems that DataCite allows both: May be a corporate/institutional or personal name.

@DeboraArlt

Publication year: is now "The year when the data was or will be made publicly available. (occurrences: 1)" - go with other definitions and change to "The year when the resource was ..."?

@borsna
Contributor Author

borsna commented Mar 5, 2024

> Creator: "The main researchers involved in producing the resource, in priority order (occurrences: 1-n)." - it may not always be researchers that produce the resource. I see that DataCite uses this definition. EML says "The people or organizations who created this resource." To me it seems strange to limit the definition to researchers; I would suggest changing it to "The person(-s) involved in producing [creating?] the resource, in priority order (occurrences: 1-n)."
>
> And is it correct that we only allow person here? or also organisation? Seems that DataCite allows both: May be a corporate/institutional or personal name.

True, we have an example of a creator organization, so I think we should be clearer in the text. Would this be a good change?

The person(-s) and/or orginzation(-s) involved in creating the resource, in priority order (occurrences: 1-n).

> Publication year: is now "The year when the data was or will be made publicly available. (occurrences: 1)" - go with other definitions and change to "The year when the resource was ..."?

Agreed, committing the change.

@DeboraArlt

good change, if "orginzation(-s)" is "organization(-s)" ;-)

@DeboraArlt

RelatedIdentifier: remove the "Definition:" at the start of the definition text?

@DeboraArlt

Description: I am confused by the definition "All additional information that does not fit in any of the other categories. May be used for technical information (occurrences: 0-n).", although it is used like this by DataCite. Description is M, which I think is a good idea, but then just calling it "additional info that does not fit in any other category" is pretty vague. I would prefer being clearer about what we ask for here. I would always want a short abstract for a resource. Other descriptions (e.g. Methods or TechnicalInfo) can be provided too but don't really describe the resource, and a title may not be very descriptive either.

GBIF uses abstract (for methods and other there are other field/properties): A brief overview of the resource that is being documented.
Dublin Core also uses abstract: A summary of the resource.

What about changing the definition to something like "A brief summary of the resource. May also be used for technical information. (occurrences: 0-n)"?

@DeboraArlt

GeoLocation: semantics but perhaps better as (inserting resource) "Spatial region or named place where the data of the resource was gathered or about which the data is focused", or "Spatial region or named place where the data contained in the resource was gathered or about which the data is focused"?

borsna added a commit that referenced this issue Mar 8, 2024
* fix spelling
* remove redundant label
* redefine info of description field
* update definition of geolocation

#1
@borsna
Contributor Author

borsna commented Apr 5, 2024

Added required property contributorType for Contributor in the example
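
For illustration, a minimal sketch of a check that every Contributor carries the contributorType attribute. The function name `contributors_without_type` and the sample record are invented for this example; `DataCollector` is one of the contributorType values in the DataCite controlled list.

```python
import xml.etree.ElementTree as ET

# Assumption: records use the DataCite Metadata Schema 4.x namespace.
NS = {"d": "http://datacite.org/schema/kernel-4"}


def contributors_without_type(xml_text):
    """Return the names of contributors that lack a contributorType attribute."""
    root = ET.fromstring(xml_text)
    return [
        c.findtext("d:contributorName", default="", namespaces=NS).strip()
        for c in root.findall("d:contributors/d:contributor", NS)
        if not c.get("contributorType")
    ]


# Hypothetical record: the second contributor is missing contributorType.
RECORD = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <contributors>
    <contributor contributorType="DataCollector">
      <contributorName>Example, Eva</contributorName>
    </contributor>
    <contributor>
      <contributorName>Example, Erik</contributorName>
    </contributor>
  </contributors>
</resource>"""

print(contributors_without_type(RECORD))
```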
