Support for Schema 4.5 #166

Merged
merged 17 commits into from
Dec 4, 2023
Changes from 9 commits
Commits
513cdfd
Adds Schema 4.5 resources
codycooperross Sep 26, 2023
cb58532
Preliminary support for publisher as a hash with additional subproper…
codycooperross Sep 28, 2023
05a6921
Adds support for reading/writing DC XML publisher xml:lang attribute
codycooperross Sep 29, 2023
51731cb
Modify insert_publisher for publisher_obj.
svogt0511 Oct 8, 2023
4e55843
Modify insert_publisher - remove reference to publisher_obj.
svogt0511 Oct 13, 2023
1039b0c
Read codemeta publisher as hash
codycooperross Oct 19, 2023
5ab5042
Adds ROR normalization when reading publisherIdentifier in line with …
codycooperross Oct 19, 2023
7b602b5
Merge remote-tracking branch 'origin/master' into schema-4.5
svogt0511 Oct 23, 2023
ca90d2c
Merge remote-tracking branch 'origin/master' into schema-4.5
svogt0511 Oct 24, 2023
62a2bde
Consistently read publisher as a hash from DataCite JSON and Cros…
codycooperross Nov 6, 2023
714d71a
Use normalize_publisher method in Metadata class
codycooperross Nov 6, 2023
98870f1
Corrections and refactoring
codycooperross Nov 27, 2023
3749914
Makes turtle publisher tests agnostic to external metadata changes
codycooperross Nov 28, 2023
a44b282
Issue #172 - Temporary fix to DC JSON publisher - return only string,…
svogt0511 Nov 30, 2023
23a3e94
Merge remote-tracking branch 'origin/master' into schema-4.5
svogt0511 Nov 30, 2023
aa464c2
Change DC JSON writing behavior for publisher value before official S…
codycooperross Dec 4, 2023
da9cecd
Comment clarity
codycooperross Dec 4, 2023
26 changes: 17 additions & 9 deletions lib/bolognese/datacite_utils.rb
@@ -106,9 +106,19 @@ def insert_titles(xml)
end
end
end

def insert_publisher(xml)
xml.publisher(publisher || container && container["title"])
if publisher.is_a?(Hash)
attributes = {
'publisherIdentifier' => publisher["publisherIdentifier"] || nil,
'publisherIdentifierScheme' => publisher["publisherIdentifierScheme"] || nil,
'schemeURI' => publisher["schemeUri"] || nil,
"xml:lang" => publisher["lang"] || nil
}.compact
xml.publisher(publisher["name"] || container && container["title"], attributes)
else
xml.publisher(publisher || container && container["title"])
end
end

def insert_publication_year(xml)
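The `insert_publisher` change above builds an attribute hash and relies on `.compact` to drop attributes that are absent. A minimal sketch of that step in plain Ruby follows; `publisher_attributes` is a hypothetical helper name, not something defined in this PR, and the sample values are illustrative only.

```ruby
# Hypothetical helper (not part of the PR): collect the optional XML
# attributes for <publisher> and drop the ones that are missing.
def publisher_attributes(publisher)
  return {} unless publisher.is_a?(Hash)

  {
    "publisherIdentifier" => publisher["publisherIdentifier"],
    "publisherIdentifierScheme" => publisher["publisherIdentifierScheme"],
    "schemeURI" => publisher["schemeUri"], # internal key is "schemeUri", XML attribute is "schemeURI"
    "xml:lang" => publisher["lang"]
  }.compact # Hash#compact removes every pair whose value is nil
end

# Illustrative input only; the identifier is an example value.
full = {
  "name" => "Example Repository",
  "publisherIdentifier" => "https://ror.org/04wxnsj81",
  "publisherIdentifierScheme" => "ROR",
  "schemeUri" => "https://ror.org/",
  "lang" => "en"
}

puts publisher_attributes(full).keys.join(", ")
puts publisher_attributes({ "name" => "Example Repository" }).empty?
```

Because `Hash#[]` already returns `nil` for a missing key, the `|| nil` in the diff only matters when a stored value is `false`; the sketch leaves it out on that assumption.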
@@ -375,13 +385,11 @@ def insert_geo_locations(xml)
end
end
if geo_location["geoLocationPolygon"]
geo_location["geoLocationPolygon"].each do |geo_location_polygon|
xml.geoLocationPolygon do
geo_location_polygon.each do |polygon_point|
xml.polygonPoint do
xml.pointLatitude(polygon_point.dig("polygonPoint", "pointLatitude"))
xml.pointLongitude(polygon_point.dig("polygonPoint", "pointLongitude"))
end
xml.geoLocationPolygon do
Array.wrap(geo_location["geoLocationPolygon"]).each do |polygon_point|
xml.polygonPoint do
xml.pointLatitude(polygon_point.dig("polygonPoint", "pointLatitude"))
xml.pointLongitude(polygon_point.dig("polygonPoint", "pointLongitude"))
end
end
end
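The polygon hunk above now wraps `geo_location["geoLocationPolygon"]` once and treats each element as a single polygon point. The sketch below shows the data shape this appears to assume, without ActiveSupport. `wrap` is a simplified stand-in for `Array.wrap`, and `polygon_coordinates` is a hypothetical helper; neither is part of the PR.

```ruby
# Simplified stand-in for ActiveSupport's Array.wrap (assumption: only
# nil, Array, and single-object inputs matter here).
def wrap(value)
  return [] if value.nil?

  value.is_a?(Array) ? value : [value]
end

# Hypothetical helper: list [latitude, longitude] pairs for one polygon.
def polygon_coordinates(geo_location)
  wrap(geo_location["geoLocationPolygon"]).map do |polygon_point|
    [
      polygon_point.dig("polygonPoint", "pointLatitude"),
      polygon_point.dig("polygonPoint", "pointLongitude")
    ]
  end
end

geo_location = {
  "geoLocationPolygon" => [
    { "polygonPoint" => { "pointLatitude" => "41.99", "pointLongitude" => "-71.03" } },
    { "polygonPoint" => { "pointLatitude" => "42.89", "pointLongitude" => "-69.62" } }
  ]
}

polygon_coordinates(geo_location).each { |lat, long| puts "#{lat}, #{long}" }
```

Wrapping matters because a lone hash passed to `Kernel#Array` would be split into key/value pairs, whereas a wrap-style helper keeps it as one element.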
7 changes: 6 additions & 1 deletion lib/bolognese/metadata.rb
@@ -217,7 +217,12 @@ def dates
end

def publisher
@publisher ||= meta.fetch("publisher", nil)
@publisher ||=
if meta.fetch("publisher", nil).is_a?(Hash)
meta.fetch("publisher", nil)
elsif meta.fetch("publisher", nil).is_a?(String)
{ "name" => meta.fetch("publisher", nil) }.compact
end
end

def identifiers
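The `publisher` accessor above accepts either a hash or a legacy string and returns the hash shape in both cases. A minimal sketch of that coercion follows; `publisher_as_hash` is a hypothetical name chosen for illustration and is not a method from this PR.

```ruby
# Hypothetical helper (not part of the PR): coerce a publisher value into
# the { "name" => ... } hash shape introduced by this change.
def publisher_as_hash(value)
  case value
  when Hash
    value                # already in the new shape, pass through
  when String
    { "name" => value }  # legacy string becomes the "name" subproperty
  end                    # anything else (including nil) yields nil
end

p publisher_as_hash("Example Repository")
p publisher_as_hash({ "name" => "Example Repository", "lang" => "en" })
p publisher_as_hash(nil)
```

As in the diff, a value that is neither a hash nor a string produces `nil`, so callers still need to handle a missing publisher.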
2 changes: 1 addition & 1 deletion lib/bolognese/metadata_utils.rb
@@ -161,7 +161,7 @@ def citeproc_hsh
"volume" => container.to_h["volume"],
"issue" => container.to_h["issue"],
"page" => page,
"publisher" => publisher,
"publisher" => publisher["name"],
"title" => parse_attributes(titles, content: "title", first: true),
"URL" => url,
"copyright" => Array.wrap(rights_list).map { |l| l["rights"] }.first,
2 changes: 1 addition & 1 deletion lib/bolognese/readers/bibtex_reader.rb
@@ -86,7 +86,7 @@ def read_bibtex(string: nil, **options)
"titles" => meta.try(:title).present? ? [{ "title" => meta.try(:title).to_s }] : [],
"creators" => creators,
"container" => container,
"publisher" => meta.try(:publisher).to_s.presence,
"publisher" => meta.try(:publisher).present? ? { "name" => meta.try(:publisher).to_s }.compact : nil,
"related_identifiers" => related_identifiers,
"dates" => dates,
"publication_year" => publication_year,
2 changes: 1 addition & 1 deletion lib/bolognese/readers/codemeta_reader.rb
@@ -43,7 +43,7 @@ def read_codemeta(string: nil, **options)
dates << { "date" => meta.fetch("dateCreated"), "dateType" => "Created" } if meta.fetch("dateCreated", nil).present?
dates << { "date" => meta.fetch("dateModified"), "dateType" => "Updated" } if meta.fetch("dateModified", nil).present?
publication_year = meta.fetch("datePublished")[0..3] if meta.fetch("datePublished", nil).present?
publisher = meta.fetch("publisher", nil)
publisher = { "name" => meta.fetch("publisher", nil) } if meta.fetch("publisher", nil).present?
state = meta.present? || read_options.present? ? "findable" : "not_found"
schema_org = meta.fetch("@type", nil)
types = {
3 changes: 1 addition & 2 deletions lib/bolognese/readers/crossref_reader.rb
@@ -40,8 +40,7 @@ def read_crossref(string: nil, **options)
journal_metadata = nil
journal_issue = {}
journal_metadata = nil
publisher = query.dig("crm_item", 0)
publisher = nil unless publisher.is_a?(String)
publisher = query.dig("crm_item", 0).is_a?(String) ? { "name" => query.dig("crm_item", 0) } : nil

case model
when "book"
18 changes: 17 additions & 1 deletion lib/bolognese/readers/datacite_reader.rb
@@ -94,6 +94,22 @@ def read_datacite(string: nil, **options)

titles = get_titles(meta)

publisher = Array.wrap(meta.dig("publisher")).map do |r|
if r.blank?
nil
elsif r.is_a?(String)
{ "name" => r.strip }
elsif r.is_a?(Hash)
{
"name" => r["__content__"].strip,
"publisherIdentifier" => r["publisherIdentifierScheme"] == "ROR" ? normalize_ror(r["publisherIdentifier"]) : r["publisherIdentifier"],
"publisherIdentifierScheme" => r["publisherIdentifierScheme"],
"schemeUri" => r["schemeURI"],
"lang" => r["lang"],
}.compact
end
end.compact.first

descriptions = Array.wrap(meta.dig("descriptions", "description")).map do |r|
if r.blank?
nil
@@ -287,7 +303,7 @@ def read_datacite(string: nil, **options)
"creators" => get_authors(Array.wrap(meta.dig("creators", "creator"))),
"contributors" => get_authors(Array.wrap(meta.dig("contributors", "contributor"))),
"container" => set_container(meta),
"publisher" => parse_attributes(meta.fetch("publisher", nil), first: true).to_s.strip.presence,
"publisher" => publisher,
"agency" => "datacite",
"funding_references" => funding_references,
"dates" => dates,
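The DataCite reader change above maps a parsed `<publisher>` element, which may be a bare string or a hash with `__content__` plus attributes, into the internal hash. The sketch below illustrates that mapping in plain Ruby. `read_publisher` and `normalize_ror_sketch` are hypothetical names, and the ROR handling is a simplified assumption rather than the behaviour of the library's own `normalize_ror`.

```ruby
# Simplified stand-in for ROR normalization (assumption: strip an optional
# scheme and host, then rebuild a canonical https URL).
def normalize_ror_sketch(id)
  return nil if id.nil?

  "https://ror.org/" + id.sub(%r{\A(https?://)?ror\.org/}, "")
end

# Hypothetical helper: turn one parsed <publisher> value into the internal hash.
def read_publisher(raw)
  return nil if raw.nil? || (raw.respond_to?(:empty?) && raw.empty?)

  case raw
  when String
    { "name" => raw.strip }
  when Hash
    scheme = raw["publisherIdentifierScheme"]
    identifier = raw["publisherIdentifier"]
    {
      # .to_s guards against an element that has attributes but no text;
      # the diff calls .strip directly on r["__content__"].
      "name" => raw["__content__"].to_s.strip,
      "publisherIdentifier" => scheme == "ROR" ? normalize_ror_sketch(identifier) : identifier,
      "publisherIdentifierScheme" => scheme,
      "schemeUri" => raw["schemeURI"],
      "lang" => raw["lang"]
    }.compact
  end
end

p read_publisher(" Example Repository ")
p read_publisher(
  "__content__" => "Example Repository",
  "publisherIdentifier" => "ror.org/04wxnsj81",
  "publisherIdentifierScheme" => "ROR"
)
```

Identifiers under any other scheme are passed through unchanged, matching the conditional in the diff.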
2 changes: 1 addition & 1 deletion lib/bolognese/readers/ris_reader.rb
@@ -89,7 +89,7 @@ def read_ris(string: nil, **options)
"url" => meta.fetch("UR", nil),
"titles" => meta.fetch("T1", nil).present? ? [{ "title" => meta.fetch("T1", nil) }] : nil,
"creators" => get_authors(author),
"publisher" => meta.fetch("PB", "(:unav)"),
"publisher" => { "name" => meta.fetch("PB", "(:unav)") },
"container" => container,
"related_identifiers" => related_identifiers,
"dates" => dates,
7 changes: 6 additions & 1 deletion lib/bolognese/readers/schema_org_reader.rb
@@ -74,7 +74,12 @@ def read_schema_org(string: nil, **options)
creators = get_authors(from_schema_org_creators(Array.wrap(authors)))
end
contributors = get_authors(from_schema_org_contributors(Array.wrap(meta.fetch("editor", nil))))
publisher = parse_attributes(meta.fetch("publisher", nil), content: "name", first: true)
publisher =
if parse_attributes(meta.fetch("publisher", nil), content: "name", first: true)
{
"name" => parse_attributes(meta.fetch("publisher", nil), content: "name", first: true),
}.compact
end

ct = (schema_org == "Dataset") ? "includedInDataCatalog" : "Periodical"
container = if meta.fetch(ct, nil).present?
2 changes: 1 addition & 1 deletion lib/bolognese/writers/bibtex_writer.rb
@@ -21,7 +21,7 @@ def bibtex
volume: container.to_h["volume"],
issue: container.to_h["issue"],
pages: pages,
publisher: publisher,
publisher: publisher["name"],
year: publication_year,
copyright: Array.wrap(rights_list).map { |l| l["rights"] }.first,
}.compact
2 changes: 1 addition & 1 deletion lib/bolognese/writers/codemeta_writer.rb
@@ -19,7 +19,7 @@ def codemeta
"tags" => subjects.present? ? Array.wrap(subjects).map { |k| parse_attributes(k, content: "subject", first: true) } : nil,
"datePublished" => get_date(dates, "Issued") || publication_year,
"dateModified" => get_date(dates, "Updated"),
"publisher" => publisher,
"publisher" => publisher["name"],
"license" => Array.wrap(rights_list).map { |l| l["rightsUri"] }.compact.unwrap,
}.compact
JSON.pretty_generate hsh.presence
2 changes: 1 addition & 1 deletion lib/bolognese/writers/csv_writer.rb
@@ -15,7 +15,7 @@ def csv
resource_type: types["resourceType"],
title: parse_attributes(titles, content: "title", first: true),
author: authors_as_string(creators),
publisher: publisher,
publisher: publisher["name"],
publication_year: publication_year
}.values

6 changes: 3 additions & 3 deletions lib/bolognese/writers/jats_writer.rb
@@ -77,16 +77,16 @@ def insert_citation_title(xml)

def insert_source(xml)
if is_chapter?
xml.source(publisher)
xml.source(publisher["name"])
elsif is_article? || is_data?
xml.source(container && container["title"] || publisher)
xml.source(container && container["title"] || publisher["name"])
else
xml.source(parse_attributes(titles, content: "title", first: true))
end
end

def insert_publisher_name(xml)
xml.send("publisher-name", publisher)
xml.send("publisher-name", publisher["name"])
end

def insert_publication_date(xml)
2 changes: 1 addition & 1 deletion lib/bolognese/writers/ris_writer.rb
@@ -14,7 +14,7 @@ def ris
"AB" => parse_attributes(abstract_description, content: "description", first: true),
"KW" => Array.wrap(subjects).map { |k| parse_attributes(k, content: "subject", first: true) }.presence,
"PY" => publication_year,
"PB" => publisher,
"PB" => publisher["name"],
"LA" => language,
"VL" => container.to_h["volume"],
"IS" => container.to_h["issue"],
2 changes: 1 addition & 1 deletion lib/bolognese/writers/schema_org_writer.rb
@@ -37,7 +37,7 @@ def schema_hsh
"schemaVersion" => schema_version,
"periodical" => types.present? ? ((types["schemaOrg"] != "Dataset") && container.present? ? to_schema_org(container) : nil) : nil,
"includedInDataCatalog" => types.present? ? ((types["schemaOrg"] == "Dataset") && container.present? ? to_schema_org_container(container, type: "Dataset") : nil) : nil,
"publisher" => publisher.present? ? { "@type" => "Organization", "name" => publisher } : nil,
"publisher" => publisher.present? ? { "@type" => "Organization", "name" => publisher["name"] } : nil,
"funder" => to_schema_org_funder(funding_references),
"provider" => agency.present? ? { "@type" => "Organization", "name" => agency } : nil
}.compact.presence
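The writer changes above index `publisher["name"]` directly. As far as this partial diff shows, the `publisher` accessor in `metadata.rb` can still return `nil`, in which case that lookup would raise. The sketch below shows one nil-tolerant alternative, borrowing the `container.to_h["volume"]` idiom the writers already use. `publisher_name` is a hypothetical helper, and it assumes the value is always a hash or `nil`.

```ruby
# Hypothetical helper (not part of the PR): read the publisher name without
# raising when no publisher was parsed. NilClass#to_h returns {}.
def publisher_name(publisher)
  publisher.to_h["name"]
end

puts publisher_name({ "name" => "Example Repository" })
puts publisher_name(nil).inspect

# For comparison, direct indexing on nil raises NoMethodError.
begin
  nil["name"]
rescue NoMethodError => e
  puts "raised #{e.class}"
end
```

Safe navigation (`publisher&.dig("name")`) would be an equivalent choice under the same assumption.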
35 changes: 35 additions & 0 deletions resources/kernel-4.5/include/datacite-contributorType-v4.xsd
@@ -0,0 +1,35 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 1.0 - Created 2011-01-13 - FZ, TIB, Germany
2013-05 v3.0: Addition of ID to simpleType element, added values "ResearchGroup" & "Other"
2014-08-20 v3.1: Addition of value "DataCurator"
2015-05-14 v4.0 dropped value "Funder", use new "funderReference" -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="contributorType" id="contributorType">
<xs:annotation>
<xs:documentation>The type of contributor of the resource.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="ContactPerson" />
<xs:enumeration value="DataCollector" />
<xs:enumeration value="DataCurator" />
<xs:enumeration value="DataManager" />
<xs:enumeration value="Distributor" />
<xs:enumeration value="Editor" />
<xs:enumeration value="HostingInstitution" />
<xs:enumeration value="Other" />
<xs:enumeration value="Producer" />
<xs:enumeration value="ProjectLeader" />
<xs:enumeration value="ProjectManager" />
<xs:enumeration value="ProjectMember" />
<xs:enumeration value="RegistrationAgency" />
<xs:enumeration value="RegistrationAuthority" />
<xs:enumeration value="RelatedPerson" />
<xs:enumeration value="ResearchGroup" />
<xs:enumeration value="RightsHolder" />
<xs:enumeration value="Researcher" />
<xs:enumeration value="Sponsor" />
<xs:enumeration value="Supervisor" />
<xs:enumeration value="WorkPackageLeader" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
25 changes: 25 additions & 0 deletions resources/kernel-4.5/include/datacite-dateType-v4.xsd
@@ -0,0 +1,25 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 1.0 - Created 2011-01-13 - FZ, TIB, Germany
2013-05 v3.0: Addition of ID to simpleType element; addition of value "Collected"; deleted "StartDate" & "EndDate"
2017-10-23 v4.1: Addition of value "Other"
2019-02-14 v4.2: Addition of value "Withdrawn"-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="dateType" id="dateType">
<xs:annotation>
<xs:documentation>The type of date. Use RKMS‐ISO8601 standard for depicting date ranges. To indicate the end of an embargo period, use Available. To indicate the start of an embargo period, use Submitted or Accepted, as appropriate.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="Accepted" />
<xs:enumeration value="Available" />
<xs:enumeration value="Collected" />
<xs:enumeration value="Copyrighted" />
<xs:enumeration value="Created" />
<xs:enumeration value="Issued" />
<xs:enumeration value="Other" />
<xs:enumeration value="Submitted" />
<xs:enumeration value="Updated" />
<xs:enumeration value="Valid" />
<xs:enumeration value="Withdrawn" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
19 changes: 19 additions & 0 deletions resources/kernel-4.5/include/datacite-descriptionType-v4.xsd
@@ -0,0 +1,19 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 1.0 - Created 2011-01-13 - FZ, TIB, Germany
2013-05 v3.0: Addition of ID to simpleType element, addition of value "Methods"
2015-02-12 v4.0: Addition of value "TechnicalInfo"-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="descriptionType" id="descriptionType">
<xs:annotation>
<xs:documentation>The type of the description.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="Abstract" />
<xs:enumeration value="Methods" />
<xs:enumeration value="SeriesInformation" />
<xs:enumeration value="TableOfContents" />
<xs:enumeration value="TechnicalInfo" />
<xs:enumeration value="Other" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
16 changes: 16 additions & 0 deletions resources/kernel-4.5/include/datacite-funderIdentifierType-v4.xsd
@@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 1.0 - Created 2016-05-14 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="funderIdentifierType" id="funderIdentifierType">
<xs:annotation>
<xs:documentation>The type of the funderIdentifier.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="ISNI" />
<xs:enumeration value="GRID" />
<xs:enumeration value="ROR" />
<xs:enumeration value="Crossref Funder ID" />
<xs:enumeration value="Other" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
10 changes: 10 additions & 0 deletions resources/kernel-4.5/include/datacite-nameType-v4.xsd
@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 4.1 - Created 2017-10-23 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="nameType" id="nameType">
<xs:restriction base="xs:string">
<xs:enumeration value="Organizational" />
<xs:enumeration value="Personal" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
12 changes: 12 additions & 0 deletions resources/kernel-4.5/include/datacite-numberType-v4.xsd
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 4.4 - Created 2021-03-05 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="numberType" id="numberType">
<xs:restriction base="xs:string">
<xs:enumeration value="Article" />
<xs:enumeration value="Chapter" />
<xs:enumeration value="Report" />
<xs:enumeration value="Other" />
</xs:restriction>
</xs:simpleType>
</xs:schema>
34 changes: 34 additions & 0 deletions resources/kernel-4.5/include/datacite-relatedIdentifierType-v4.xsd
@@ -0,0 +1,34 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Version 1.0 - Created 2011-01-13 - FZ, TIB, Germany
2013-05 v3.0: Addition of ID to simpleType element; addition of value "PMID"
2014-08-20 v3.1: Addition of values "arxiv" and "bibcode"
2015-02-12 v4.0 Addition of value "IGSN"
2019-02-14 v4.2 Addition of value "w3id" -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://datacite.org/schema/kernel-4" targetNamespace="http://datacite.org/schema/kernel-4" elementFormDefault="qualified">
<xs:simpleType name="relatedIdentifierType" id="relatedIdentifierType">
<xs:annotation>
<xs:documentation>The type of the RelatedIdentifier.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:enumeration value="ARK" />
<xs:enumeration value="arXiv" />
<xs:enumeration value="bibcode" />
<xs:enumeration value="DOI" />
<xs:enumeration value="EAN13" />
<xs:enumeration value="EISSN" />
<xs:enumeration value="Handle" />
<xs:enumeration value="IGSN" />
<xs:enumeration value="ISBN" />
<xs:enumeration value="ISSN" />
<xs:enumeration value="ISTC" />
<xs:enumeration value="LISSN" />
<xs:enumeration value="LSID" />
<xs:enumeration value="PMID" />
<xs:enumeration value="PURL" />
<xs:enumeration value="UPC" />
<xs:enumeration value="URL" />
<xs:enumeration value="URN" />
<xs:enumeration value="w3id" />
</xs:restriction>
</xs:simpleType>
</xs:schema>