What is the API for the metadata assessment reports? #18
-
How can I programmatically access the assessment report that displays in MetacatUI for each dataset?
Replies: 1 comment
-
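For access from Python rather than the shell, the curl request shown further down in this reply can be sketched roughly as follows. This is a minimal sketch, not an official client: the helper names `report_url`, `fetch_report`, and `summarize` are hypothetical, the URL pattern is copied from the curl example, and the field names (`result`, `check`, `id`, `status`) are taken from the report output in this thread. The identifier is left unencoded, exactly as in the curl example; whether the API also needs or accepts a percent-encoded identifier is not stated here.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Base URL copied from the curl example in this thread.
BASE_URL = "https://api.dataone.org/quality/runs"


def report_url(suite_id, pid):
    """Build the run URL in the same shape as the curl example (pid left unencoded)."""
    return "{}/{}/{}".format(BASE_URL, suite_id, pid)


def fetch_report(suite_id, pid):
    """Download and parse one assessment report (performs a network request)."""
    with urlopen(report_url(suite_id, pid)) as response:
        return json.load(response)


def summarize(report):
    """Count check statuses and list the ids of checks that did not succeed."""
    results = report.get("result", [])
    counts = Counter(r["status"] for r in results)
    failed = [r["check"]["id"] for r in results if r["status"] != "SUCCESS"]
    return counts, failed
```

Calling `summarize(fetch_report("FAIR-suite-0.3.1", "doi:10.18739/A24T6F461"))` would then return a per-status count together with the ids of the checks that did not report `SUCCESS`, which is often easier to scan than the full JSON.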
The MetaDIG API provides access to the assessment reports published on DataONE. The MetaDIG engine runs a configurable set of "assessment suites" on each dataset in DataONE, and all datasets that use compatible metadata standards are checked with the DataONE FAIR Suite of assessment checks. The results of one of these runs can be retrieved from the API given the identifier of the suite and the identifier of the metadata record of interest. For example, for the assessment suite FAIR-suite-0.3.1 and the metadata record doi:10.18739/A24T6F461:

$ curl https://api.dataone.org/quality/runs/FAIR-suite-0.3.1/doi:10.18739/A24T6F461 | jq .

Note that the pipe to jq only pretty-prints the MetaDIG report, which is returned in JSON format. This produces the following report output:

{
"id": "3b484839-1fc8-45a5-833c-ddd04dbe9e97",
"timestamp": "Apr 9, 2022 1:43:13 PM",
"objectIdentifier": "doi:10.18739/A24T6F461",
"result": [
{
"check": {
"id": "resource.abstractLength.sufficient.1",
"name": "Resource Abstract Length Sufficient",
"description": "Check that an abstract exists and is of sufficient length.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n # check: datasetAbstractLength\n import re\n import metadig.variable as mvar\n \n global abstract\n\n global output\n global status\n minLength = 100\n \n if 'abstract' not in globals() or abstract is None:\n output = \"An abstract was not found.\"\n status = \"FAILURE\"\n return False\n\n if(mvar.isBlank(abstract)):\n output = \"The abstract is blank\"\n status = \"FAILURE\"\n return False\n \n # Convert to unicode so that non-ascii characters don't cause decoding errors\n abstract = mvar.toUnicode(abstract)\n \n # The abstract can be a textType element, so it may contain multiple subelements, i.e. <para>, etc \n # Since the metadig-engine is stuck at XPath 1.0, we cannot use the xpath to gather these into \n # a single string.\n if(isinstance(abstract, list)):\n abstract = ' '.join(abstract)\n \n numWords = len(re.split('\\s+', abstract.strip()))\n if (numWords < minLength):\n output = \"The abstract word count of '{}' is less that the recommended minimum of '{}'\".format(numWords, minLength)\n status = \"FAILURE\"\n return False\n else:\n output = \"The abstract is valid, with a word count of {}\".format(numWords)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "abstract",
"xpath": "/resource/descriptions/description[@descriptionType='Abstract'] |\n /*/description//text()[normalize-space()] |\n /eml/*/abstract//text()[normalize-space()] |\n /*/identificationInfo/*/abstract//text()[normalize-space()] |\n /*/identificationInfo/*/abstrac//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:08 PM",
"output": [
{
"value": "The abstract is valid, with a word count of 250",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.keywords.controlled.1",
"name": "Resource Keywords Controlled",
"description": "Check if a keyword thesaurus is present, indicating that the keywords are controlled.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n\n keywordGroupCount = 0\n naturalKeywordGroupCount = 0\n controlledKeywordGroupCount = 0\n \n # For each resource, check that the attributes defined for it have unique names\n if(len(keywordGroups) == 0):\n output = \"No keywords found, unable to check for controlled keywords.\"\n status = \"FAILURE\"\n return False\n \n for i in range(0, len(keywordGroups)):\n keywordGroupCount += 1\n # Should be only one thesaurus per keyword group. If there is ANY thesaurus\n # then it applies to the entire group, and so this group of keywords is\n # controlled and not 'natural'\n thesaurusPresent = keywordGroups.get(i)\n \n if(thesaurusPresent):\n controlledKeywordGroupCount += 1\n else:\n naturalKeywordGroupCount += 1\n \n if(controlledKeywordGroupCount > 0):\n if(controlledKeywordGroupCount < keywordGroupCount):\n output = \"{} groups of keywords not controlled (from a vocabulary) (out of {} keyword groups.)\".format(keywordGroupCount - controlledKeywordGroupCount, keywordGroupCount)\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} keyword groups are controlled (from a vocabulary).)\".format(keywordGroupCount)\n status = \"SUCCESS\"\n return True \n else:\n output = \"No controlled keyword (from a vocabulary) groups found (out of {} keyword groups.\".format(keywordGroupCount)\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "keywordGroups",
"xpath": "\n /*/identificationInfo/*/descriptiveKeywords |\n /eml/dataset/keywordSet\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "thesaurusPresent",
"xpath": "boolean( \n (./MD_Keywords/thesaurusName and not (./MD_keywords/thesaurusName[@nilReason=\"unknown\"])) or\n (./keywordThesaurus))\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:08 PM",
"output": [
{
"value": "No controlled keyword (from a vocabulary) groups found (out of 1 keyword groups.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.keywords.present.1",
"name": "Resource Keywords Present",
"description": "Check if keywords are present.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n import re\n import metadig.variable as mvar\n\n global keywords\n\n # Fail if no keywords present\n if (not 'keywords' in globals() or keywords is None):\n output = \"No keywords were found.\"\n status = \"FAILURE\"\n return False\n\n # Convert all values to unicode\n keywords = mvar.toUnicode(keywords)\n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(keywords, unicode)):\n keywords = [keywords]\n\n # Passed all tests!\n output = \"{} keywords are present\".format(len(keywords))\n # Passed all tests!\n status = \"SUCCESS\"\n return True \n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "keywords",
"xpath": "\n /*/identificationInfo/*/descriptiveKeywords/*/keyword |\n /eml/dataset/keywordSet/keyword\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:08 PM",
"output": [
{
"value": "25 keywords are present",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.keywordType.present.1",
"name": "Resource keyword Type Present",
"description": "Check if each keyword has a type specified..",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global keywords\n global keywordNames\n import metadig.variable as mvar\n\n maxPrint = 10\n keywordTypesMissing = []\n NoneType = type(None)\n \n # For each keyword, check if a type is specified\n if(isinstance(keywords, NoneType) or len(keywords) == 0):\n output = \"No keywords found, unable to check for keyword types.\"\n status = \"FAILURE\"\n return False\n \n # Make sure values are converted to type unicode\n keywords = mvar.toUnicode(keywords)\n keywordNames = mvar.toUnicode(keywordNames)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(keywords, unicode)):\n keywords = [keywords]\n \n if(isinstance(keywordNames, unicode)):\n keywordNames = [keywordNames]\n \n # For each keyword, check if a keyword type is specified.\n for i in range(0, len(keywords)):\n keywordType = keywords[i]\n keywordName = keywordNames[i]\n \n # Check if keywordName is blank\n if(isinstance(keywordName, NoneType) or mvar.isBlank(keywordName)):\n keywordName = \"name N/A\"\n \n # No keyword type for this keyword was specified\n if(isinstance(keywordType, NoneType) or mvar.isBlank(keywordType)):\n keywordTypesMissing.append(keywordName)\n\n # Add the list of resources that don't have keywords to the output\n if(len(keywordTypesMissing) == 1):\n output = u\"This keyword (of {} total) does not have a type specified: '{}'\".format(len(keywords), ', '.join(keywordTypesMissing[0:maxPrint]))\n elif (len(keywordTypesMissing) > 1):\n output = u\"These {} keywords (of {} total) do not have a type specified: '{}'\".format(len(keywordTypesMissing), len(keywords), ', '.join(keywordTypesMissing[0:maxPrint]))\n if(len(keywordTypesMissing) > maxPrint):\n output += \", ...\"\n \n status = \"FAILURE\"\n return False\n else:\n output = \"All {} keywords have a type specified\".format(len(keywords))\n status = \"SUCCESS\"\n return True\n\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "keywords",
"xpath": "\n /eml/dataset/keywordSet/keyword |\n /*/identificationInfo/MD_DataIdentification/descriptiveKeywords/MD_Keywords/keyword\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "type",
"xpath": "\n ./@keywordType |\n ../type/MD_KeywordTypeCode\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "keywordNames",
"xpath": "\n /eml/dataset/keywordSet/keyword |\n /*/identificationInfo/MD_DataIdentification/descriptiveKeywords/MD_Keywords/keyword\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "text()[normalize-space()] | */text()[normalize-space()]",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:08 PM",
"output": [
{
"value": "These 3 keywords (of 25 total) do not have a type specified: 'inlandWaters, environment, climatologyMeteorologyAtmosphere'",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.publicationDate.timeframe.1",
"name": "Publication date is not in the future",
"description": "Publication date should not be in the future",
"type": "Findable",
"level": "OPTIONAL",
"environment": "rscript",
"code": "\n library(metadig)\n library(stringr)\n \n if (pubDatePresent) {\n if(is.na(datasetPubDate) || length(datasetPubDate) == 0) {\n failure(\"The publication date is blank.\")\n } else {\n # A publication date is present, so now check if it is in\n # the future\n # The R as.Date function assumes that you are specifying a date as number of\n # days since epoch, if a string is entered that looks like a number, i.e. \"2017\"\n # so we have to check to see if R got confused.\n datasetPubDate <- str_trim(datasetPubDate, side=c(\"both\"))\n pubDate <- str_trim(toString(datasetPubDate), side = c(\"both\"))\n # If ISO 8806 formatted date, remove the time portion in case this was formatted in a way that\n # R can't handle.\n if (grepl(pubDate, \"T\")) {\n pubDate <- str_replace(pubDate, \"T.*\", \"\")\n }\n \n # First try to parse date string with default date format (should be ISO 8601)\n # Note: the 'as.Date' function is supposed to not throw an ERROR if 'optional=TRUE', but \n # this is not happending on the version of R used by the quality engine. Because of this\n # we have to do this extra error checking.\n metadataDate <- tryCatch({\n md <- NA\n md <- as.Date(pubDate, optional = TRUE)\n return (md)\n }, warning = function(w) {\n return(md)\n }, error = function(e) {\n return(md)\n })\n \n # If the date wasn't parsable then try parsing again with specific formats\n if(is.na(metadataDate)) {\n metadataDate <- tryCatch({\n md <- NA\n md <- as.Date(pubDate, tryFormats = c(\"%d/%m/%Y\",\"%Y-%m-%d\",\"%d-%m-%Y\", \"%Y%m%d\",\"%B %d %Y\",\"%Y\"), optional = TRUE)\n return (md)\n }, warning = function(w) {\n return(md)\n }, error = function(e) {\n return(md)\n })\n }\n \n if(! is.na(metadataDate)) {\n currentDate <- Sys.Date()\n if(metadataDate > currentDate) {\n failure(sprintf(\"The publication date '%s' is in the future.\", datasetPubDate))\n } else {\n success(sprintf(\"The publication date '%s' was found and is not in the future.\", datasetPubDate))\n }\n } else {\n failure(sprintf(\"Unable to parse publication date '%s'.\", datasetPubDate))\n }\n }\n } else {\n failure(\"A publication date is not present.\") \n }\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "pubDatePresent",
"xpath": "boolean(\n /resource/publicationYear or\n /*/available or\n /eml/dataset/pubDate or\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/Date or\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/DateTime or\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[dateType/CI_DateTypeCode='publication']/date/DateTime)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "datasetPubDate",
"xpath": "/resource/publicationYear |\n /*/available |\n /eml/dataset/pubDate |\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/Date//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/DateTime//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[dateType/CI_DateTypeCode='publication']/date/DateTime//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "The publication date '2013-11-09' was found and is not in the future.",
"type": "text"
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "metadata.identifier.present.1",
"name": "Metadata Identifier Present",
"description": "Check that a metadata identifier exists.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n import metadig.variable as mvar\n\n # These global variables are read by the quality engine when\n # the check returns\n global output\n global status\n \n global metadataIdentifier\n\n # check if metadata identifier is present\n if 'metadataIdentifier' not in globals() or metadataIdentifier is None:\n output = \"A metadata identifier was not found.\"\n status = \"FAILURE\"\n return False\n \n metadataIdentifier = mvar.toUnicode(metadataIdentifier)\n \n # This should only be a single value, but if not (a list is returned) just get the first \n # one\n if(isinstance(metadataIdentifier, list)):\n metadataIdentifier = metadataIdentifier[0]\n \n if (mvar.isBlank(metadataIdentifier)):\n output = \"The metadata identifier is blank.\"\n status = \"FAILURE\"\n return False\n else:\n output = u\"The metadata identifier '{}' was found.\".format(metadataIdentifier.strip())\n status = \"SUCCESS\"\n return True \n \n # Check if the identifier is an authority id (DOI, ARK, CURIE, RRID)\n # doi:, https://doi.org/, urn, http:, https:, ark:, RRID:\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "metadataIdentifier",
"xpath": "/*/metadataIdentifier/MD_Identifier |\n /*/fileIdentifier |\n /eml/@packageId\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "The metadata identifier 'doi:10.18739/A24X54G2H' was found.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.creator.present.1",
"name": "Resource creator Present",
"description": "Check that a resource creator exists.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n\n if ('creator' in globals() and creator is not None and creator):\n output = \"A resource author/originator is present\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A dataset author/originator is not present\"\n status = \"FAILURE\"\n return False\n\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "creator",
"xpath": "boolean(\n /resource/creators/creator/* or\n /*/creator or\n /eml/dataset/creator or\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='author']/individualName or\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='author']/party/*/name or\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='originator']/party/*/name)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "A resource author/originator is present",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.creatorIdentifier.present.1",
"name": "Resource creator Identifier Present",
"description": "Check that a resource creator identifier exists.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global creator\n \n import metadig.variable as mvar\n\n # A resource creator is not present\n if 'creator' not in globals() or creator is None:\n output = \"A resource creator identifier was not found.\"\n status = \"FAILURE\"\n return False\n \n # Convert all values to unicode\n creator = mvar.toUnicode(creator)\n \n # Convert all values to unicode\n if(isinstance(creator, unicode)):\n creator = [creator]\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if (mvar.isBlank(creator)):\n output = \"The resource creator identifier is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if resource creator identifier is a single string or arrayList\n if(isinstance(creator, unicode)):\n output = u\"The resource creator identifier '{}' was found.\".format(creator)\n elif (isinstance(creator, list)):\n output = u\"The resource creator identifier '{}' was found (first of {} creator identifiers).\".format(creator[0].strip(), len(creator))\n \n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "creator",
"xpath": "\n /resource/contributors/contributor/nameIdentifier/text()[normalize-space()] |\n /eml/*/creator/userId/text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='author']/party/*/partyIdentifier/MD_Identifier/code//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='creator']/party/*/partyIdentifier/MD_Identifier/code//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='originator']/party/*/partyIdentifier/MD_Identifier/code//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "A resource creator identifier was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.revisionDate.present.1",
"name": "Resource Revision Date Present",
"description": "Check that a revision or creation date exists.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n if ('datasetRevisionDatePresent' in globals() and datasetRevisionDatePresent is not None and datasetRevisionDatePresent):\n output = \"A resource creation or revision date is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A resource creation or revision date is not present.\"\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "datasetRevisionDatePresent",
"xpath": "boolean((/resource/dates/date)\n or (/*/dateSubmitted)\n or (/eml/*/maintenance/changeHistory/changeDate)\n or (/eml/*/pubDate)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='creation']/date/Date)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='creation']/date/DateTime)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='revision']/date/Date)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='revision']/date/DateTime)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/Date)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/DateTime)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='creation']/date/DateTime)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='revision']/date/DateTime)\n or (/*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/DateTime))\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "A resource creation or revision date is present.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.identifier.present.1",
"name": "Entity Identifier Present",
"description": "Check that each entity has an identifier.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global entityIdentifier\n \n import metadig.variable as mvar\n \n maxPrint = 3\n\n # An entity identifier is not present\n if 'entityIdentifier' not in globals() or entityIdentifier is None:\n output = \"An entity identifier was not found.\"\n status = \"FAILURE\"\n return False\n \n entityIdentifier = mvar.toUnicode(entityIdentifier)\n # If a single value, convert to a list for easier processing (i.e. don't have\n # to keep checking if it's a single value or list)\n if(isinstance(entityIdentifier, unicode)):\n entityIdentifier = [entityIdentifier]\n \n # check if the identifier is blank\n # TODO: check the identifier namespace\n if (mvar.isBlank(entityIdentifier)):\n output = \"The entity identifier is blank.\"\n status = \"FAILURE\"\n return False\n else:\n if(len(entityIdentifier) > 1):\n output = u\"These {} entity identifiers were found: {}\".format(len(entityIdentifier), entityIdentifier[0:maxPrint])\n if(len(entityIdentifier) > maxPrint):\n output += u\", ...\"\n else:\n output = u\"The entity identifier '{}' was found\".format(entityIdentifier[0])\n \n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityIdentifier",
"xpath": "\n /*/identificationInfo/*/citation/CI_Citation/identifier/MD_Identifier/code//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/identifier/RS_Identifier/code//text()[normalize-space()] |\n /*/identifier |\n /eml/dataset/*/alternateIdentifier |\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity]/@id |\n /eml/dataset/*/physical/distribution/online/url |\n /resource/identifier[identifierType='DOI'] |\n /resource/alternateIdentifiers/alternateIdentifier\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "These 2 entity identifiers were found: [u'urn:uuid:9b6b919a-0c64-4bdd-ac5a-ddefa3b67ca9', u'https://cn.dataone.org/cn/v2/resolve/urn:uuid:9b6b919a-0c64-4bdd-ac5a-ddefa3b67ca9']",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.identifierType.present.1",
"name": "Entity Identifier Type Present",
"description": "Check that a entity identifier type exists.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global entityIdentifierType\n \n import metadig.variable as mvar\n\n # An entity identifier type is not present\n if 'entityIdentifierType' not in globals() or entityIdentifierType is None:\n output = \"A entity identifier type was not found.\"\n status = \"FAILURE\"\n return False\n \n entityIdentifierType = mvar.toUnicode(entityIdentifierType)\n\n if (mvar.isBlank(entityIdentifierType)):\n output = \"The entity identifier type is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if entity identifier type is a single string or list\n if(isinstance(entityIdentifierType, unicode)):\n output = u\"The entity identifier type '{}' was found.\".format(entityIdentifierType)\n else:\n output = u\"The entity identifier type '{}' was found.\".format(entityIdentifierType[0])\n \n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityIdentifierType",
"xpath": "/resource/identifier/@identifierType |\n /*/identificationInfo/*/citation/CI_Citation/identifier/MD_Identifier/codeSpace//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/identifier/MD_Identifier/authority//text()[normalize-space()] |\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity]/@system |\n /eml/dataset/*/alternateIdentifier/@system \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "A entity identifier type was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.publicationDate.present.1",
"name": "Resource Publication Date Present",
"description": "Check that a publication date exists.",
"type": "Findable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global datasetPubDate\n \n import metadig.variable as mvar\n\n if ('datasetPubDate' in globals() and datasetPubDate is not None and datasetPubDate):\n status = \"SUCCESS\"\n else:\n output = \"A resource publication date is not present\"\n status = \"FAILURE\"\n return False\n \n datasetPubDate = mvar.toUnicode(datasetPubDate)\n \n if (mvar.isBlank(datasetPubDate)):\n output = \"The resource publication date is blank\"\n status = \"FAILURE\"\n return False\n else:\n output = u\"The resource publication date '{}' was found\".format(datasetPubDate)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "datasetPubDate",
"xpath": "boolean(\n /resource/publicationYear or\n /*/available or\n /eml/dataset/pubDate or\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/Date or\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/DateTime or\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[dateType/CI_DateTypeCode='publication']/date/DateTime)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "datasetPubDate",
"xpath": "/resource/publicationYear |\n /*/available |\n /eml/dataset/pubDate |\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/Date |\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[normalize-space(dateType/CI_DateTypeCode)='publication']/date/DateTime |\n /*/identificationInfo/*/citation/CI_Citation/date/CI_Date[dateType/CI_DateTypeCode='publication']/date/DateTime\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "The resource publication date '2013-11-09' was found",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.titleLength.sufficient.1",
"name": "Resource title length is sufficient",
"description": "Check that the resource title is greater than 7 words and less than 20.",
"type": "Findable",
"level": "REQUIRED",
"environment": "rscript",
"code": "\n check <- function() {\n library(base)\n\n if(!titlePresent) {\n return(list(status = \"FAILURE\", output = sprintf(\"The resource title is not present, so the check is unable to determine title word length.\")))\n }\n\n # Required minimum word count for title\n strictMinCount <- 5\n # Recommended minimum word count\n minWordCount <- 7\n # Recommended max word count\n maxWordCount <- 20\n\n wordCount <- length(unlist(strsplit(as.character(datasetTitle), \"\\\\s+\", perl=T)))\n if (wordCount < strictMinCount) {\n return(list(status = \"FAILURE\", output = sprintf(\"The number of words in the resource's title is %d. The minimum required word count is %s.\", wordCount, minWordCount))) \n } else if (wordCount < minWordCount) {\n return(list(status = \"FAILURE\", output = sprintf(\"The number of words in the resource's title is %d. The minimum recommended word count is %s.\", wordCount, minWordCount))) \n } else if (wordCount > maxWordCount) {\n return(list(status = \"FAILURE\", output = sprintf(\"The number of words in the resource's title is %d. The maximum recommended word count is %s.\", wordCount, maxWordCount))) \n } else {\n return(list(status = \"SUCCESS\", output = sprintf(\"The number of words in the resource's title is sufficient because it is between %d and %d words long.\", minWordCount, maxWordCount))) \n }\n }\n result <- check()\n mdq_result <- list(status=result$status, output=list(list(value=result$output)))\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "titlePresent",
"xpath": "boolean(\n\t/eml/dataset/title or \n\t/*/description or\n \t/*/identificationInfo/*/citation/CI_Citation/title)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "datasetTitle",
"xpath": "/eml/dataset/title |\n /*/description |\n /*/identificationInfo/*/citation/CI_Citation/title\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "eml",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "The number of words in the resource's title is sufficient because it is between 7 and 20 words long.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.spatialExtent.present.1",
"name": "Resource Spatial Extent Present",
"description": "Check that at least one spatial extent exists.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n\n if ('spatialExtent' in globals() and spatialExtent is not None and spatialExtent):\n output = \"A spatial extent is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A spatial extent is not present.\"\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "spatialExtent",
"xpath": "boolean(\n /*/identificationInfo/MD_DataIdentification/extent/EX_Extent/geographicElement or\n \t /*/identificationInfo/*/extent/EX_Extent/geographicElement/EX_GeographicBoundingBox//* or\n /*/identificationInfo/SV_ServiceIdentification/extent/EX_Extent/geographicElement/EX_GeographicBoundingBox//* or\n /*/spatial or\n /*/identificationInfo/*/extent/EX_Extent/geographicElement or\n /eml/*/coverage/geographicCoverage/boundingCoordinates or\n /resource/geoLocations/geoLocation)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:09 PM",
"output": [
{
"value": "A spatial extent is present.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "geographic.description.present.2",
"name": "Geographic coverage description",
"description": "Geographic coverage description should be present at the dataset level.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "rscript",
"code": "\n library(metadig)\n \n if (descriptionPresent) {\n success(\"A textual description of the geographic coverage of this dataset is present.\")\n } else {\n failure(\"A textual description of the geographic coverage of this dataset is not present.\")\n }\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "descriptionPresent",
"xpath": "boolean(\n /eml/dataset/coverage/geographicCoverage/geographicDescription or\n /*/identificationInfo/*/extent/EX_Extent/description or\n /*/spatial)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "eml",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "A textual description of the geographic coverage of this dataset is present.",
"type": "text"
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.taxonomicExtent.present.1",
"name": "Resource Taxonomic Extent Present",
"description": "Check that a taxonomic extent exists.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n if ('taxonomicExtent' in globals() and taxonomicExtent is not None and taxonomicExtent):\n output = \"A taxonomic extent is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A taxonomic extent is not present.\"\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "taxonomicExtent",
"xpath": "boolean(\n /*/scientificName or\n /eml/*/coverage/taxonomicCoverage)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "A taxonomic extent is not present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.temporalExtent.present.1",
"name": "Resource Temporal Extent Present",
"description": "Check that a temporal extent exists.",
"type": "Findable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n if ('temporalExtent' in globals() and temporalExtent is not None and temporalExtent):\n output = \"A temporal extent is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A temporal extent is not present.\"\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "temporalExtent",
"xpath": "boolean(\n /*/identificationInfo/*/extent/EX_Extent/temporalElement/EX_TemporalExtent or\n /*/temporal or\n /eml/*/coverage/temporalCoverage)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "A temporal extent is not present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.accessControlRules.present.1",
"name": "Access Control Rules",
"description": "Check that access control rules exists.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n \n import re\n import metadig.variable as mvar\n import xml.etree.ElementTree as ET\n global systemMetadata\n\n global output\n global status\n \n # For the metadata object, check if access control rules are specified for a group of users, if public \n # access is not granted. However, if public access is granted, then this check always passes.\n # Note that this check does not inspect the metadata, but instead evaluates the DataONE system \n # metadata\n \n if ('systemMetadata' not in globals() or systemMetadata is None):\n status = \"FAILURE\"\n output = \"DataONE system metadata not available, unable to confirm access control rules\"\n return False\n \n # Turns out that the string that contains the sysmeta xml can have non-ascii\n # characters in the <fileName> element (maybe others), so convert it to unicode.\n # Python 2.x xml.etree can't handle non-ascii characters, so remove them from the system metadata\n # as they will only be in the uncontrolled elements like fileName, which we are not using for this check\n systemMetadata = systemMetadata.encode('ascii',errors='ignore')\n root = ET.fromstring(systemMetadata)\n \n # Check an access rule that allows 'read' permission to the 'public' subject.\n # for example:\n # <accessPolicy>\n # <allow>\n # <subject>public</subject>\n # <permission>read</permission>\n # </allow>\n # <allow>\n # <subject>http://orcid.org/0000-0002-9185-0144</subject>\n # <permission>read</permission>\n # <permission>write</permission>\n # <permission>changePermission</permission>\n # </allow>\n # ...\n # </accessPolicy>\n \n rules = root.findall(\"./accessPolicy/allow\")\n # No access rules exist\n if(len(rules) == 0):\n status = \"FAILURE\"\n output = \"No access rules in system metadata, only rights holder can access this object\"\n return False\n\n # Loop through the access rules, checking for public access rule and any other rules\n publicRead = False\n otherUserRead = False\n \n for rule in rules:\n subject = rule.find(\"subject\").text\n perms = rule.findall(\"permission\")\n read = False\n changePermission = False\n for perm in perms:\n if perm.text == \"read\": \n read = True\n if perm.text == \"changePermission\": \n changePermission = True\n \n if (subject == \"public\"):\n if read or changePermission:\n publicRead = True\n else:\n if read or changePermission:\n otherUserRead = True\n \n if publicRead:\n status = \"SUCCESS\"\n output = \"Metadata are publicly available as public read is set in DataONE system metadata.\"\n return True\n else:\n if otherUserRead:\n status = \"SUCCESS\"\n output = \"Metadata are not publicly available but have rules to allow access for specific users.\"\n return True\n else:\n status = \"FAILURE\"\n output = \"Metadata are not publicly available but do not have rules to allow access for specific users.\"\n return True\n \n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "accessControlRules",
"xpath": "\n /eml/dataset/intellectualRights/text()[normalize-space()] |\n /*/identificationInfo/*/resourceConstraints/MD_LegalConstraints/accessConstraints/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "Metadata are publicly available as public read is set in DataONE system metadata.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.landingPage.present.1",
"name": "Resource Landing Page Present",
"description": "Check that a resource landing page exists and is resolvable.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global datasetLandingPage\n \n import metadig.variable as mvar\n import metadig.checks as checks\n\n displayLength = 20\n # A resource landing page is not present\n if 'datasetLandingPage' not in globals() or datasetLandingPage is None:\n output = \"A resource landing page url was not found.\"\n status = \"FAILURE\"\n return False\n \n datasetLandingPage = mvar.toUnicode(datasetLandingPage)\n\n if (mvar.isBlank(datasetLandingPage)):\n output = \"The resource langing page url is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if variable type is a single string or arrayList\n if(isinstance(datasetLandingPage, unicode)):\n output = u\"The resource landing page url '{}' was found\".format(datasetLandingPage)\n url = datasetLandingPage.strip()\n else: \n output = u\"The resource landing page url '{}' was found (first of {} urls)\".format(datasetLandingPage[0].strip(), len(datasetLandingPage))\n url = datasetLandingPage[0].strip()\n\n \n resolvable, msg = checks.isResolvable(url)\n if (resolvable):\n output = u'{} and is resolvable.'.format(output)\n status = \"SUCCESS\"\n return True\n else:\n output = u\"{}, but is not resolvable: {}\".format(output, msg)\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "datasetLandingPage",
"xpath": "\n /eml/dataset/distribution/online/url[@function=\"information\"] |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorTransferOptions/MD_DigitalTransferOptions/onLine/CI_OnlineResource[function/CI_OnLineFunctionCode/@codeListValue=\"information\"]/linkage/URL/text()[normalize-space()] |\n /*/distributionInfo/MD_Distribution/transferOptions/MD_DigitalTransferOptions/onLine/CI_OnlineResource[function/CI_OnLineFunctionCode='information']/linkage/URL/text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/onlineResource/CI_OnlineResource[function/CI_OnLineFunctionCode='information']/linkage/URL/text()[normalize-space() ]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "A resource landing page url was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.distributionContact.present.1",
"name": "Resource Distribution Contact Present",
"description": "Check that a distribution contact exists.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n import metadig.variable as mvar\n\n # A distribution contact is not present\n if 'distributionContact' not in globals() or distributionContact is None:\n output = \"A distribution contact was not found.\"\n status = \"FAILURE\"\n return False\n \n if (mvar.isBlank(distributionContact)):\n output = \"The distribution contact is blank.\" \n status = \"FAILURE\"\n return False\n else:\n contact = mvar.toUnicode(distributionContact)\n # Check if resource identifier type is a single string or arrayList\n if(isinstance(contact, list)):\n output = u\"The distribution contact '{}' was found (first of {} contacts).\".format(contact[0], len(contact))\n else:\n output = u\"The distribution contact '{}' was found.\".format(contact)\n \n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "distributionContact",
"xpath": "\n /eml/*/associatedParty/role[RoleType='distributor']//text()[normalize-space()] |\n /eml/*/contact/individualName/surName/text()[normalize-space()] |\n /eml/*/contact/organizationName/text()[normalize-space()] |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty/organisationName/*/text()[normalize-space()] | \n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty/individualName/*/text()[normalize-space()] |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty/positionName/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "The distribution contact 'Roof' was found (first of 2 contacts).",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.distributionContactIdentifier.present.1",
"name": "Resource Distribution Contact Identifier",
"description": "Check that a distribution contact identifier exists.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global distributionContactIdentifier\n \n import metadig.variable as mvar\n\n # A distribution contact identifier is not present\n if 'distributionContactIdentifier' not in globals() or distributionContactIdentifier is None:\n output = \"A distribution contact identifier was not found.\"\n status = \"FAILURE\"\n return False\n \n if (mvar.isBlank(distributionContactIdentifier)):\n output = \"The distribution contact identifier is blank.\" \n status = \"FAILURE\"\n return False\n else:\n distributionContactIdentifier = mvar.toUnicode(distributionContactIdentifier)\n # Check if distribution contact identifier type is a single string or list\n if(isinstance(distributionContactIdentifier, unicode)):\n output = u\"The distribution contact identifier '{}' was found\".format(distributionContactIdentifier)\n else:\n output = u\"The distribution contact identifier '{}' was found (first of {} identifiers)\".format(distributionContactIdentifier[0].strip(), len(distributionContactIdentifier))\n \n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "distributionContactIdentifier",
"xpath": "\n /eml/*/associatedParty/role[RoleType='distributor']/text()[normalize-space()] |\n /eml/*/contact/userId/text()[normalize-space()] |\n //*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='distributor']/party/*/partyIdentifier/MD_Identifier/code |\n //*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='pointOfContact']/party/*/partyIdentifier/MD_Identifier/code\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "The distribution contact identifier 'https://orcid.org/0000-0001-8729-0036' was found",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "metadata.identifier.resolvable.1",
"name": "Metadata Identifier Resolvable",
"description": "Check that the metadata identifier exists and is resolvable.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n\n import metadig.variable as mvar\n import metadig.checks as checks\n import urllib\n import re\n global metadataIdentifier\n\n d1_resolve_service=\"https://cn.dataone.org/cn/v2/resolve/\"\n\n # check if a metadata identifier is present\n if 'metadataIdentifier' not in globals() or metadataIdentifier is None:\n output = \"A metadata identifier was not found.\"\n status = \"FAILURE\"\n return False\n \n metadataIdentifier = mvar.toUnicode(metadataIdentifier)\n \n # This should only be a single value, but if not (a list is returned) just get the first \n # one\n if(isinstance(metadataIdentifier, list)):\n metadataIdentifier = metadataIdentifier[0]\n\n if (mvar.isBlank(metadataIdentifier)):\n output = \"The metadata identifier is blank.\"\n status = \"FAILURE\"\n return False\n else:\n output = u\"The metadata identifier '{}' was found)\".format(metadataIdentifier)\n id = metadataIdentifier\n \n # Now check if the metadata identifier is a resolvable url. If it doesn't look like a URL, then \n # see if DataONE knows about it.\n usedD1 = False\n isDOI = False\n if(re.match(\"^\\s*http.*:\\/\", id)):\n resolvable, msg = checks.isResolvable(id)\n elif(re.match('doi:', id)):\n isDOI = True\n # If the identifier is a 'bare' DOI (e.g. \"doi:10.18739/A2027H\"), then prepend with a DOI resolver link\n # i.e. https://dx.doi.org\n resolvable, msg = checks.isResolvable(\"https://dx.doi.org/{}\".format(id.strip()))\n else:\n usedD1 = True\n url = \"{}{}\".format(d1_resolve_service,urllib.quote(id))\n resolvable, msg = checks.isResolvable(url)\n \n if (resolvable):\n if(usedD1):\n output = u'{} and is resolvable using the DataONE resolve service.'.format(output)\n elif(isDOI):\n output = u'{} and is resolvable using a DOI resolver.'.format(output) \n else:\n output = u'{} and is resolvable.'.format(output) \n \n status = \"SUCCESS\"\n return True\n else:\n output = u\"{}, but is not resolvable.\".format(output) \n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "metadataIdentifier",
"xpath": "\n /resource/identifier |\n /*/fileIdentifier/*/text()[normalize-space()] |\n /eml/@packageId\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 4",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "The metadata identifier 'doi:10.18739/A24X54G2H' was found) and is resolvable using a DOI resolver.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.publisher.present.1",
"name": "Resource Publisher Present",
"description": "Check that a publisher exists.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n import metadig.variable as mvar\n \n if ('datasetPublisher' in globals() and datasetPublisher is not None):\n if (mvar.isBlank(datasetPublisher)):\n output = \"The resource publisher is blank.\"\n status = \"FAILURE\"\n return False\n else:\n publisher = mvar.toUnicode(datasetPublisher)\n # A single value or list could be returned by the engine, so convert a single\n # value to a list for easier processing\n if(isinstance(publisher, unicode)):\n publisher = [publisher]\n output = u\"The resource publisher '{}' was found.\".format(publisher[0].strip())\n status = \"SUCCESS\"\n return True \n else:\n output = \"A resource publisher is not present.\"\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "datasetPublisherPresent",
"xpath": "boolean(\n /resource/publisher or\n /eml/dataset/publisher or\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='publisher']/organisationName or\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='publisher']/party/CI_Organisation/name)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "datasetPublisher",
"xpath": "/resource/publisher/text()[normalize-space()] |\n /eml/dataset/publisher/organizationName/text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='publisher']/organisationName//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='publisher']/party/CI_Organisation/name//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "A resource publisher is not present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.publisherIdentifier.present.1",
"name": "Resource Publisher Identifier Present",
"description": "Check that a resource publisher identifier exists.",
"type": "Accessible",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global datasetPublisherIdentifier\n \n import metadig.variable as mvar\n\n # A resource publisher identifier is not present\n if 'datasetPublisherIdentifier' not in globals() or datasetPublisherIdentifier is None:\n output = \"A resource publisher identifier was not found.\"\n status = \"FAILURE\"\n return False\n \n # Convert all values to unicode\n datasetPublisherIdentifier = mvar.toUnicode(datasetPublisherIdentifier)\n \n if (mvar.isBlank(datasetPublisherIdentifier)):\n output = \"The resource publisher identifier is blank.\"\n status = \"FAILURE\"\n return False\n else:\n # Check if resource publisher identifier is a single string or arrayList\n if(isinstance(datasetPublisherIdentifier, unicode)):\n output = u\"The resource publisher identifier '{}' was found.\".format(datasetPublisherIdentifier.strip())\n elif (isinstance(datasetPublisherIdentifier, list)):\n output = u\"The resource publisher identifier '{}' was found (first of {} publisher identifiers).\".format(datasetPublisherIdentifier[0].strip(), len(datasetPublisherIdentifier))\n\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "datasetPublisherIdentifier",
"xpath": "\n /resource/publisher |\n /eml/dataset/publisher/userId/text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='publisher']/party/*/partyIdentifier/MD_Identifier/code//text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/citedResponsibleParty/CI_Responsibility[normalize-space(role/CI_RoleCode)='publisher']/party/*/partyIdentifier/MD_Identifier/code//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:10 PM",
"output": [
{
"value": "A resource publisher identifier was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.serviceLocation.present.1",
"name": "Resource Service Location Present",
"description": "Check that a service location exists.",
"type": "Accessible",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n\n if ('serviceLocation' in globals() and serviceLocation is not None and serviceLocation):\n output = \"A service location is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A service location is not present.\"\n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "serviceLocation",
"xpath": "boolean(/*/identificationInfo/SV_ServiceIdentification/containsOperations/SV_OperationMetadata/connectPoint/CI_OnlineResource/linkage/URL)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "A service location is not present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.serviceProvider.present.1",
"name": "Resource Service Provider Present",
"description": "Check that a service provider is present.",
"type": "Accessible",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global serviceProvider\n \n import metadig.variable as mvar\n\n # A data service provider is not present\n if 'serviceProvider' not in globals() or serviceProvider is None:\n output = \"A service provider was not found.\"\n status = \"FAILURE\"\n return False\n \n serviceProvider = mvar.toUnicode(serviceProvider)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(serviceProvider, unicode)):\n serviceProvider = [serviceProvider]\n\n if (mvar.isBlank(serviceProvider)):\n output = \"The service provider is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if resource identifier type is a single string or arrayList\n if(len(serviceProvider) == 1):\n output = u\"The service provider '{}' was found\".format(serviceProvider)\n else: \n output = u\"The service provider '{}' was found (first of {} providers)\".format(serviceProvider[0].strip(), len(serviceProvider))\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "serviceProvider",
"xpath": "\n /*/identificationInfo/SV_ServiceIdentification/pointOfContact/CI_ResponsibleParty/individualName/*/text()[normalize-space()] |\n /*/identificationInfo/SV_ServiceIdentification/pointOfContact/CI_ResponsibleParty/organisationName/*/text()[normalize-space()] |\n /*/identificationInfo/SV_ServiceIdentification/pointOfContact/CI_ResponsibleParty/positionNam/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "A service provider was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.distributionURL.resolvable.1",
"name": "Entity Distribution URL Resolvable",
"description": "Check that the entity distribution URL is resolvable.",
"type": "Accessible",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global distributionUrl\n \n import metadig.variable as mvar\n import metadig.checks as checks\n import urllib\n import re\n import numbers\n \n global distributionUrl\n \n # TODO: This check is only inspecting the first entity distribution URL.\n # Update it to check the first N URLs (where N is some\n # reasonable number to check. Note that there may be hundreds or\n # thousands of entities, and we probably don't want to check them all.\n \n # An entity distribution URL is not present\n if 'distributionUrl' not in globals() or distributionUrl is None:\n output = \"An entity distribution URL was not found.\"\n status = \"FAILURE\"\n return False\n \n # Convert to unicode to prevent decoding errors\n distributionUrl = mvar.toUnicode(distributionUrl)\n # If a single value, convert to a list for easier processing (i.e. don't \n # have to keep checking if it's a single value or list)\n if(isinstance(distributionUrl, unicode)):\n distributionUrl = [distributionUrl]\n\n if (mvar.isBlank(distributionUrl)):\n output = \"The entity distribution URL is blank.\"\n status = \"FAILURE\"\n return False\n else: \n url = distributionUrl[0].strip()\n if(len(distributionUrl) > 1):\n output = u\"The entity distribution URL '{}' was found (first of {} URLs)\".format(url, len(distributionUrl))\n else:\n output = u\"The entity distribution URL '{}' was found\".format(url)\n \n # Now check if the entity identifier is a resolvable url as is. Do not attempt to use any resolver service,\n # just use the bare URL.\n resolvable, msg = checks.isResolvable(url)\n \n if (resolvable):\n output = u'{} and is resolvable.'.format(output) \n status = \"SUCCESS\"\n return True\n else:\n output = u\"{}, but is not resolvable.\".format(output) \n status = \"FAILURE\"\n return False\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "distributionUrl",
"xpath": "/eml/dataset/*/physical/distribution/online/url[@function=\"download\"]/text()[normalize-space()] |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorTransferOptions/MD_DigitalTransferOptions/onLine/CI_OnlineResource[function/CI_OnLineFunctionCode='download']/linkage/URL/text()[normalize-space()] |\n /*/distributionInfo/MD_Distribution/transferOptions/MD_DigitalTransferOptions/onLine/CI_OnlineResource[function/CI_OnLineFunctionCode='download']/linkage/URL/text()[normalize-space()] |\n /*/identificationInfo/*/citation/CI_Citation/onlineResource/CI_OnlineResource[function/CI_OnLineFunctionCode='download']/linkage/URL/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "The entity distribution URL 'https://cn.dataone.org/cn/v2/resolve/urn:uuid:9b6b919a-0c64-4bdd-ac5a-ddefa3b67ca9' was found and is resolvable.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.attributeName.differs.1",
"name": "Entity Attribute Names Differ from Definitions",
"description": "Check that attribute definitions are not simply the attribute name.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeDefinition\n global attributeName\n \n import metadig.variable as mvar\n \n definitionIsSame = []\n maxPrint = 5\n \n # For each entity, check that the attributes defined for it have unique names\n if(not attributeDefinitionPresent):\n output = \"No attribute definitions found, unable to check if attribute names differ from definitions.\"\n status = \"FAILURE\"\n return False\n \n # Check each definition - see if it is blank or missing, then check if it is the same as the \n # name\n attributeDefinition = mvar.toUnicode(attributeDefinition)\n attributeName = mvar.toUnicode(attributeName)\n \n # Convert to a list if only one element, if needed\n if(isinstance(attributeDefinition, unicode)):\n attributeDefinition = [attributeDefinition]\n \n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n for i in range(0, len(attributeDefinition)):\n definition = attributeDefinition[i]\n \n # No definition for this attribute, don't fail for this as there is \n # another test for this.\n if(definition == None):\n continue\n else:\n name = attributeName[i]\n # If the name is blank, report it using the attribute number in the file\n if(name is None or mvar.isBlank(name)):\n definitionIsSame.append(\"attr #{}\".format(i))\n # Is the definition name the same as the definition\n elif(name == definition):\n definitionIsSame.append(name)\n \n if(len(definitionIsSame) == 1):\n output = u\"This {} attribute (out of {} total) has a name that is different than the definition (or is blank): {}\".format(len(definitionIsSame), len(attributeDefinition), definitionIsSame[0])\n status = \"FAILURE\"\n return False\n elif(len(definitionIsSame) > 1):\n # Only print the max entries allowed - otherwise the report could get unweildy \n output = u\"These {} attributes (out of {} total) have names that are different than the definition (or are blank): {}\".format(len(definitionIsSame), len(attributeDefinition), ', '.join(definitionIsSame[0:maxPrint]))\n if(len(definitionIsSame) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = u\"All {} attributes have definitions that differ from the name\".format(len(attributeDefinition))\n status = \"SUCCESS\"\n return True \n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributeDefinitionPresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute/attributeDefinition or\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute/*/description or\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute/*/description or\n /*/contentInfo/*/dimension/MI_Band/descriptor or\n /*/contentInfo/*/dimension/MD_Band/descriptor)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeDefinition",
"xpath": "/eml/dataset/*/attributeList/attribute |\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "definition",
"xpath": " ./attributeDefinition/text()[normalize-space()] |\n ./*/description/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute |\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n (./attributeName/text()[normalize-space()] | \n ./*/name/code/*/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()] |\n ./sequenceIdentifier/MemberName/aName/*/text()[normalize-space()])[1]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "No attribute definitions found, unable to check if attribute names differ from definitions.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeNames.unique.1",
"name": "Entity Attribute Names Are Unique",
"description": "Check that attribute names are unique.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global entities\n \n import metadig.variable as mvar\n \n maxPrint = 5\n \n # Determine if elements are repeated in a list\n def duplicates(x):\n \n repeated = []\n nElements = len(x)\n \n # Ensure that the list is unicode\n x = mvar.toUnicode(x)\n for i in range(nElements):\n k = i + 1\n for j in range(k, nElements):\n if x[i] == x[j] and x[i] not in repeated:\n if(x[i].strip() == \"\"): \n continue\n repeated.append(x[i])\n return repeated\n \n allDuplicates = []\n entityCount = 0\n attributeCount = 0\n \n # The check fails if no attributes are found.\n if(len(entities) == 0):\n output = \"No attributes found, unable to check for duplicate attribute names.\"\n status = \"FAILURE\"\n return False\n \n # Convert all values to unicode\n entities = mvar.toUnicode(entities)\n # If a single string, convert to list for easier processing\n if(isinstance(entities, unicode)):\n entities = [entities]\n \n # For each entity, check that the attributes defined for it have unique names\n for i in range(0, len(entities)):\n entityCount += 1\n attrList = mvar.toUnicode(entities[i])\n # No attributes for this entity\n if(attrList == None):\n continue\n else:\n # If only one attribute, the quality engine will return it as a string. Don't\n # check for dupes (characters) in a single string, only check a list of strings\n if(isinstance(attrList, str) or isinstance(attrList, unicode)):\n attributeCount += 1\n continue\n else:\n attributeCount += len(attrList)\n allDuplicates.extend(duplicates(attrList))\n \n if(len(allDuplicates) > 0):\n output = u\"These {} attributes (of {} total) had duplicate names: {}\".format(len(allDuplicates), attributeCount, ', '.join(allDuplicates[0:maxPrint]))\n if(len(allDuplicates) > maxPrint):\n output += u\", ...\"\n else:\n output = \"No duplicate attribute names were found.\"\n status = \"SUCCESS\"\n return True \n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entities",
"xpath": "\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity] |\n /*/contentInfo/MD_CoverageDescription |\n /*/contentInfo/MI_CoverageDescription \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "names",
"xpath": "\n ./attributeList/attribute/attributeName/text()[normalize-space()] |\n ./attributeDescription/RecordType/text()[normalize-space()] |\n ./dimension/MI_Band/sequenceIdentifier/MemberName/aName//text()[normalize-space()] |\n ./dimension/MD_Band/sequenceIdentifier/MemberName/aName//text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "No duplicate attribute names were found.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.attributeDefinition.present.1",
"name": "Entity Attribute Definition Present",
"description": "Check that an attribute definition exists.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeDefinition\n global attributeName\n \n import metadig.variable as mvar\n \n NoneType = type(None)\n \n # Check if definitions are present for each attribute.\n if not attributeDefinitionPresent:\n output = \"No attributes definitions are present.\"\n status = \"FAILURE\"\n return False\n \n attributeCount = 0\n missing = []\n maxPrint = 5\n \n # Convert all values to unicode \n attributeDefinition = mvar.toUnicode(attributeDefinition)\n # If a single value, convert to a list for easier processing (i.e. don't have\n # to keep checking if it's a single value or list)\n if(isinstance(attributeDefinition, unicode)):\n attributeDefinition = [attributeDefinition]\n \n attributeName = mvar.toUnicode(attributeName)\n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n for i in range(0, len(attributeDefinition)):\n attributeCount += 1\n definition = attributeDefinition[i]\n name = attributeName[i]\n # Check if the name is missing or blank\n if(isinstance(name, NoneType) or mvar.isBlank(name)):\n name = \"attr #{}\".format(i)\n \n # No definition for this attribute was specified, or it is blank\n if(isinstance(definition, NoneType) or mvar.isBlank(definition)): \n missing.append(name)\n \n if(len(missing) == 1):\n output = u\"This {} attribute (of {} total) does not have a definition: {}\".format(len(missing), attributeCount, missing[0])\n status = \"FAILURE\"\n return False\n elif(len(missing) > 1):\n output = u\"These {} attributes (of {} total) do not have a definition: {}\".format(len(missing), attributeCount, ', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} attributes have a definition\".format(attributeCount)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributeDefinitionPresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute/attributeDefinition or\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute/*/description or\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute/*/description or\n /*/contentInfo/*/dimension/MI_Band/descriptor or\n /*/contentInfo/*/dimension/MD_Band/descriptor)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeDefinition",
"xpath": "/eml/dataset/*/attributeList/attribute |\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "definition",
"xpath": "\n ./attributeDefinition/text()[normalize-space()] |\n ./*/description/*/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute |\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n (./attributeName/text()[normalize-space()] | \n ./sequenceIdentifier/MemberName/aName/*/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()])[1]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "No attributes definitions are present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeDefinition.sufficient.1",
"name": "Entity Attribute Definition Sufficient",
"description": "Check that each attribute definition is sufficient (e.g. word count).",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeDefinition\n global attributeName \n \n import metadig.variable as mvar\n import re\n \n NoneType = type(None)\n minWordCount = 3\n maxPrint = 5\n \n # Check if definitions are present for each attribute.\n if not attributeDefinitionPresent:\n output = \"No attribute definitions are present, so unable to check if definitions are sufficient.\"\n status = \"FAILURE\"\n return False\n \n attributeCount = 0\n insufficient = []\n \n attributeDefinition = mvar.toUnicode(attributeDefinition)\n attributeName = mvar.toUnicode(attributeName)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(attributeDefinition, unicode)):\n attributeDefinition = [attributeDefinition]\n \n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n\n # Check each attribute definition to see if it is missing or blank.\n for i in range(0, len(attributeDefinition)):\n attributeCount += 1\n definition = attributeDefinition[i]\n name = attributeName[i]\n # Check if the name is missing or blank\n if(isinstance(name, NoneType) or mvar.isBlank(name)):\n name = \"attr #{}\".format(i)\n\n # No definition for this attribute was specified\n if(isinstance(definition, NoneType) or mvar.isBlank(definition)):\n insufficient.append(name)\n continue\n # If for some reason there are multiple definitions for this entity, just use the first one\n elif(isinstance(definition, list)):\n definition = mvar.toUnicode(definition[0])\n \n # Now check the word count of the definition\n words = re.split('\\s+', definition.strip())\n if(len(words) < minWordCount):\n insufficient.append(name)\n \n # Let the user know if definitions are missing, but don't fail the check based\n # on this (unless all are missing), as that is part of a different check\n if(len(insufficient) == 1):\n output = u\"This attribute (of {} total) has an insufficient definition of less than {} words: '{}'\".format(attributeCount, minWordCount, insufficient[0])\n elif(len(insufficient) > 1):\n # Only print the max entries allowed - otherwise the report could get unweildy \n output = u\"These {} attributes (of {} total) have insufficient definitions of less than {} words: '{}'\".format(len(insufficient),attributeCount, minWordCount, \", \".join(insufficient[0:maxPrint]))\n if(len(insufficient) > maxPrint):\n output += \", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} attribute have sufficient definitions of greater than {} words\".format(attributeCount, minWordCount)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributeDefinitionPresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute/attributeDefinition or\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute/*/description or\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute/*/description or\n /*/contentInfo/*/dimension/MI_Band/descriptor or\n /*/contentInfo/*/dimension/MD_Band/descriptor)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeDefinition",
"xpath": "/eml/dataset/*/attributeList/attribute |\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "definition",
"xpath": "\n ./attributeDefinition |\n ./*/description |\n ./descriptor/*/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute |\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n (./attributeName/text()[normalize-space()] | \n ./sequenceIdentifier/MemberName/aName/*/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()])[1]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "No attribute definitions are present, so unable to check if definitions are sufficient.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeStorageType.present.1",
"name": "Entity Attribute Storage Type Present",
"description": "Check that a storage type exits for each attribute.",
"type": "Interoperable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeStorageType\n global attributeName\n \n import metadig.variable as mvar\n maxPrint = 5\n NoneType = type(None)\n \n # Check if an attribute storage type is present for each attribute.\n \n if not attributesPresent:\n output = \"No attributes are present, so unable to check for attribute storage type.\"\n status = \"FAILURE\"\n return False\n \n # If attributes are present, they can have a storage type (in EML), so fail if none are found.\n if not attributeStorageTypePresent:\n output = \"No attribute storage types are present.\"\n status = \"FAILURE\"\n return False\n\n attributeStorageType = mvar.toUnicode(attributeStorageType)\n attributeName = mvar.toUnicode(attributeName)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(attributeStorageType, unicode)):\n attributeStorageType = [attributeStorageType]\n \n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n attributeCount = 0\n missing = []\n \n for i in range(0, len(attributeStorageType)):\n attributeCount += 1\n storageType = attributeStorageType[i]\n name = attributeName[i]\n if(isinstance(name, NoneType) or mvar.isBlank(name)):\n name = \"attr #{}\".format(i)\n # No storageType for this attribute was specified\n if(storageType == None or mvar.isBlank(storageType)):\n missing.append(name)\n \n if(len(missing) > 0):\n output = u\"These attributes do not have a storage type: '{}'\".format(', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} attributes have a storage type\".format(attributeCount)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributesPresent",
"xpath": "boolean(/eml/dataset/*/attributeList/attribute or\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute or\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute or\n /*/contentInfo/*/dimension/MI_Band or\n /*/contentInfo/*/dimension/MD_Band)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeStorageTypePresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute/storageType)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeStorageType",
"xpath": "/eml/dataset/*/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "storageType",
"xpath": "./storageType",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "./attributeName",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "No attributes are present, so unable to check for attribute storage type.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.checksum.present.1",
"name": "Entity checksum and algorithm are present.",
"description": "Check that an entity checksum and algorithm are present.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global checksums\n global checksumAlgorithms\n global entityNames\n \n import metadig.variable as mvar\n maxPrint = 5\n \n # Check if measurement units are present for each attribute.\n if not entitiesPresent:\n output = \"No data entities are present, so cannot check for entity checksums.\"\n status = \"FAILURE\"\n return False\n \n entityCount = 0\n missing = []\n \n checksums = mvar.toUnicode(checksums)\n checksumAlgorithms = mvar.toUnicode(checksumAlgorithms)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(checksums, unicode)):\n checksums = [checksums]\n \n if(isinstance(checksumAlgorithms, unicode)):\n checksumAlgorithms = [checksumAlgorithms]\n \n if(isinstance(entityNames, unicode)):\n entityNames = [entityNames]\n \n for i in range(0, len(checksums)):\n entityCount += 1\n sum = checksums[i]\n algorithm = checksumAlgorithms[i]\n name = entityNames[i]\n if(name is None):\n name = \"#{}\".format(i)\n\n # No units for this attribute was specified\n if(sum is None or mvar.isBlank(sum)):\n missing.append(mvar.toUnicode(name))\n elif(algorithm is None or mvar.isBlank(algorithm)):\n missing.append(mvar.toUnicode(name))\n \n if(len(missing) == 1):\n output = u\"This 1 entity (of {} total) is missing a checksum value or algorithm: '{}'\".format(entityCount, missing[0])\n status = \"FAILURE\"\n return False\n elif(len(missing) > 1):\n output = u\"These {} entities (of {} total) are missing a checksum value or algorithm: '{}'\".format(len(missing), entityCount, ', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All entities have checksums and checksum algorithms specified.\"\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entitiesPresent",
"xpath": "boolean(/eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity])\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "checksums",
"xpath": "\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "...",
"xpath": "./physical/authentication",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "checksumAlgorithms",
"xpath": "\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "...",
"xpath": "./physical/authentication[@method]",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "entityNames",
"xpath": " /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "./entityName",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "All entities have checksums and checksum algorithms specified.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "coverage.contentType.present.1",
"name": "Coverage Content Type",
"description": "Check that a coverage content type exists.",
"type": "Interoperable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global coverageContentType\n \n import metadig.variable as mvar\n maxPrint = 3\n\n # A coverage content type is not present\n if 'coverageContentType' not in globals() or coverageContentType is None:\n output = \"A coverage content type was not found.\"\n status = \"FAILURE\"\n return False\n \n # Convert all values to unicode\n coverageContentType = mvar.toUnicode(coverageContentType)\n \n if (mvar.isBlank(coverageContentType)):\n output = \"The coverage content type is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if coverage content type is a single string or list\n if(isinstance(coverageContentType, list)):\n output = u\"The coverage content types '{}\".format(', '.join(coverageContentType[0:maxPrint]))\n if(len(coverageContentType) > maxPrint):\n output += u\", ...\"\n output = u\"{}' were found ({} total).\".format(output, len(coverageContentType))\n else: \n output = u\"The coverage content type '{}' was found\".format(coverageContentType)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "coverageContentType",
"xpath": "\n /*/contentInfo/MD_CoverageDescription/attributeGroup/MD_AttributeGroup/contentType/MD_CoverageContentTypeCode/text()[normalize-space()] |\n /*/contentInfo/MD_CoverageDescription/contentType/MD_CoverageContentTypeCode/text()[normalize-space()] \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:11 PM",
"output": [
{
"value": "A coverage content type was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeEnumeratedDomains.present.1",
"name": "Entity Attribute Enumerated Domains Present",
"description": "Check that enumerated domains are defined.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "rscript",
"code": "\nlibrary(metadig)\n\n# Find the <attribute> elements that have enumerated domains in them\nidxs <- which(!is.na(enumerated_domains))\n\nfor (i in idxs) {\n # Check the <code> element\n for (code in codes[i]) {\n if (is.na(code) || is.null(code) || nchar(code) <= 0) {\n failure(paste0(\"A code for the attribute \", names[i], \" (id: \", ids[i], \") was undefined. This is either because the <code> was missing or did not contain any content.\"))\n }\n }\n\n # Check the <definition> element\n for (definition in definitions[i]) {\n if (is.na(definition) || is.null(definition) || nchar(definition) <= 0) {\n failure(paste0(\"A definition for the attribute \", names[i], \" (id: \", ids[i], \") was undefined. This is either because the <definition> was missing or did not contain any content.\"))\n }\n }\n}\n\nsuccess(\"All enumerated domain descriptions found had codes and definitions.\")\n",
"library": null,
"inheritState": false,
"selector": [
{
"name": "ids",
"xpath": "/eml/dataset/dataTable/attributeList/attribute/@id",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "names",
"xpath": "/eml/dataset/dataTable/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "...",
"xpath": "./attributeName",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "enumerated_domains",
"xpath": "/eml/dataset/dataTable/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "...",
"xpath": "./measurementScale/nominal/nonNumericDomain/enumeratedDomain",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "codes",
"xpath": "/eml/dataset/dataTable/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "...",
"xpath": "./measurementScale/nominal/nonNumericDomain/enumeratedDomain/codeDefinition/code",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "definitions",
"xpath": "/eml/dataset/dataTable/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "...",
"xpath": "./measurementScale/nominal/nonNumericDomain/enumeratedDomain/codeDefinition/definition",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": null
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "All enumerated domain descriptions found had codes and definitions.",
"type": "text"
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.format.present.1",
"name": "Entity Format Present",
"description": "Check that each entity has a format specified.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global entityFormat\n \n import metadig.variable as mvar\n\n # An entity format is not present\n if 'entityFormat' not in globals() or entityFormat is None:\n output = \"An entity format was not found.\"\n status = \"FAILURE\"\n return False\n \n entityFormat = mvar.toUnicode(entityFormat)\n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(entityFormat, unicode)):\n entityFormat = [entityFormat]\n\n # Check if the entity format is a single string \n if(len(entityFormat) == 1):\n output = \"No entity formats were found.\"\n else: \n output = \"{} entity formats were found\".format(len(entityFormat))\n \n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityFormat",
"xpath": "\n /*/distributionInfo/MD_Distribution/distributionFormat/MD_Format |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorFormat/MD_Format |\n /*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format |\n /DryadDataFile/format |\n //resourceFormat/MD_Format | \n /eml/dataset/*/physical/dataFormat |\n /resource/formats/format\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "No entity formats were found.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.name.present.1",
"name": "Entity Names Present",
"description": "Check that a name is specified for every entity.",
"type": "Interoperable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global entities\n \n import metadig.variable as mvar\n \n maxPrint = 5\n # Check if a name is present for each entity.\n if not entityPresent:\n output = \"No entities are present so cannot check their name.\"\n status = \"FAILURE\"\n return False\n \n missing = []\n \n # Convert all values, single or list, to unicode\n entities = mvar.toUnicode(entities)\n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(entities, unicode)):\n entities = [entities]\n \n for i in range(0, len(entities)):\n name = entities[i]\n # No name for this entity was specified\n if(name == None or mvar.isBlank(name)):\n missing.append(\"entity #{}\".format(i))\n \n if(len(missing) == 1):\n output = u\"The following entity is missing a name: {}\".format(missing[0])\n status = \"FAILURE\"\n return False\n elif (len(missing) > 1):\n output = u\"The following {} entities are missing a name: '{}'\".format(len(missing), ', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} entities have names.\".format(len(entities))\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityPresent",
"xpath": "boolean(\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity] or\n /*/contentInfo/MD_CoverageDescription or\n /*/contentInfo/MI_CoverageDescription)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "entities",
"xpath": "\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity] |\n /*/contentInfo/MD_CoverageDescription |\n /*/contentInfo/MI_CoverageDescription\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n ./entityName |\n ./attributeDescription/RecordType\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "All 1 entities have names.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "entity.type.present.1",
"name": "Entity Type Present",
"description": "Check that an entity type exists.",
"type": "Interoperable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n # Check if an entity type is present\n # Note: for EML, this checks looks for the presence of one of the\n # entity type elements, not an element value that specifies the entity type\n # as a string. When the quality engine is upgraded to an XPath implementation \n # supporting XPath 2.0 'name()' function, then the entity name can be returned.\n # For now, just the presence or absence is checked for with a boolean operator.\n if ('entityTypePresent' in globals() and entityTypePresent is not None and entityTypePresent):\n status = \"SUCCESS\"\n output = \"An entity type was found.\"\n return True \n else:\n output = \"An entity type was not found.\"\n status = \"FAILURE\"\n return False\n \n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityTypePresent",
"xpath": "boolean(\n /*/metadataScope/MD_MetadataScope/resourceScope/MD_ScopeCode or\n /*/type or\n /*/hierarchyLevel/MD_ScopeCode or\n /eml/dataset/dataTable or\n /eml/dataset/otherEntity or\n /eml/dataset/spatialVector or\n /eml/dataset/spatialRaster or\n /eml/dataset/view or\n /resource/resourceType/@resourceTypeGeneral)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "An entity type was found.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.serviceType.present.1",
"name": "Resource Service Type Present",
"description": "Check that a service type exists.",
"type": "Interoperable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n\n if ('serviceType' in globals() and serviceType is not None and serviceType):\n output = \"A service type is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A service type is not present.\"\n status = \"FAILURE\"\n return False\n\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "serviceType",
"xpath": "boolean((/*/identificationInfo/SV_ServiceIdentification/serviceType/LocalName))",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "A service type is not present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.format.nonproprietary.1",
"name": "Non proprietary entity format",
"description": "Check that all entities use non-propietary formats.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global mdq_params\n global entityFormats\n global entityFormatNames\n \n # Check the data formats for all data entities.\n # The check fails if the specified data format matches a format marked as proprietary.\n # This check uses a reformatted copy of the DataONE format list, that is usually kept in the file \n # /opt/local/metadig/DataONEformats.csv. This file is manually edited to mark specific formats as proprietary. This file is obtained using the DataONE 'formats'\n # service, i.e 'https://cn.dataone.org/cn/v2/formats'.\n \n import metadig.variable as mvar\n import csv\n maxPrint = 5\n \n def isProprietary(formats, thisFormat):\n for row in formats:\n if (row[4].lower().strip() in (\"yes\", \"y\", \"true\", \"t\", \"1\")):\n if(row[2].lower().strip() == thisFormat.lower().strip()):\n return True\n if(row[3].lower().strip() == thisFormat.lower().strip()):\n return True\n \n return False\n \n # Are any entity formats present at all, even ones that don't specify \n # a format name. This is the list of all formats present.\n if ('entityFormats' not in globals() or entityFormats is None):\n output = \"No entity formats are present, so cannot check for proprietary formats.\"\n status = \"FAILURE\"\n return False\n \n entityFormats = mvar.toUnicode(entityFormats)\n if(isinstance(entityFormats, unicode)):\n entityFormats = [entityFormats]\n \n # Entity formats are present, but non that could explicitly define a format type\n # so give them credit for having any formats defined, i.e. 
no formats that \n # require explicitly defined format names are present.\n if ('entityFormatNames' not in globals() or entityFormatNames is None):\n output = \"No proprietary data entity formats found (out of {} total formats.\".format(len(entityFormats))\n status = \"SUCCESS\"\n return True\n \n dataFilename = \"DataONEformats.csv\"\n formatsFile = \"\"\n # The checks data directory is passed via the 'mdq_params' hash\n # The filename is known only to this check.\n if('mdq_params' not in globals()):\n output = \"Internal error running check, mdq_params not available to check.\"\n status = \"ERROR\"\n return False\n else:\n formatsFile = \"{}/{}\".format(mdq_params['metadigDataDir'], dataFilename)\n \n # Create list with the DataONE formats\n formats = []\n with open(formatsFile, 'rb') as csvfile:\n fmtreader = csv.reader(csvfile, delimiter=',', quotechar='\"')\n for row in fmtreader:\n formats.append(row)\n\n entityFormatNames = mvar.toUnicode(entityFormatNames)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(entityFormatNames, unicode)):\n entityFormatNames = [entityFormatNames]\n \n proprietaryFound = []\n \n # Check each entity format and see if it is in the 'proprietary' list, which\n # is based on all formats from DataONE that have been manually determined t o be\n # proprietary\n for i in range(0, len(entityFormatNames)):\n # Check if the entity format is a single string or arrayList\n thisFormat = entityFormatNames[i]\n if(isProprietary(formats, thisFormat)):\n proprietaryFound.append(thisFormat)\n \n if(len(proprietaryFound) > 0):\n fmts = list(set([f.encode('UTF8') for f in proprietaryFound]))\n output = u\"These {} proprietary data entity formats (out of {} total formats) were found: {}\".format(len(fmts), len(entityFormats), ', '.join(fmts[0:maxPrint]))\n if(len(fmts) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"No 
proprietary data entity formats found (out of {} total formats.\".format(len(entityFormats))\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityFormats",
"xpath": "\n /*/distributionInfo/MD_Distribution/distributionFormat/MD_Format |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorFormat/MD_Format |\n /*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format |\n /DryadDataFile/format |\n //resourceFormat/MD_Format |\n /eml/dataset/*/physical/dataFormat |\n /resource/formats/format\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "entityFormatNames",
"xpath": "\n /*/distributionInfo/MD_Distribution/distributionFormat/MD_Format/name/*/text()[normalize-space()] |\n /*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorFormat/MD_Format/name/*/text()[normalize-space()] |\n /*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format/formatSpecificationCitation/CI_Citation/identifier/MD_Identifier/code |\n /*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format/formatSpecificationCitation/CI_Citation/title | \n /DryadDataFile/format/CharacterString |\n //resourceFormat/MD_Format/name | \n /eml/dataset/*/physical/dataFormat/externallyDefinedFormat/formatName/text()[normalize-space()] |\n /resource/formats/format\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "These 1 proprietary data entity formats (out of 1 total formats) were found: application/octet-stream",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeDomain.present.1",
"name": "Entity Attribute Domain Present",
"description": "Check that an attribute domain is defined for each relevant attribute.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeDomain\n global attributeName\n \n import metadig.variable as mvar\n \n NoneType = type(None)\n # Check if a measurementScale is present for each attribute.\n if not attributesPresent:\n output = \"No attributes are present so attribute measurement domain cannot be checked.\"\n status = \"FAILURE\"\n return False\n \n attributeCount = 0\n missing = []\n maxPrint = 5\n \n attributeDomain = mvar.toUnicode(attributeDomain)\n attributeName = mvar.toUnicode(attributeName)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(attributeDomain, unicode)):\n attributeDomain = [attributeDomain]\n \n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n for i in range(0, len(attributeDomain)):\n attributeCount += 1\n domain = attributeDomain[i]\n name = attributeName[i]\n if(isinstance(name, NoneType) or mvar.isBlank(name)):\n name = \"attr #{}\".format(i)\n\n # No domain for this attribute was specified\n if(isinstance(domain, NoneType) or mvar.isBlank(domain)):\n missing.append(name)\n\n if(len(missing) == 1):\n output = u\"This {} attribute (of {} total) does not have a measurement domain defined: {}.\".format(len(missing), attributeCount, missing[0])\n status = \"FAILURE\"\n return False\n elif(len(missing) > 1):\n output = u\"The following {} attributes (of {} total) do not have a measurement domain specified: '{}'.\".format(len(missing), attributeCount, ', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} attributes that require a measurement domain have one defined.\".format(attributeCount)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributesPresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute or\n /*/contentInfo/*/dimension/MI_Band or\n /*/contentInfo/*/dimension/MD_Band)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeDomain",
"xpath": "/eml/dataset/*/attributeList/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "domain",
"xpath": "\n ./measurementScale/nominal/nonNumericDomain |\n ./measurementScale/ordinal/nonNumericDomain |\n ./measurementScale/interval/numericDomain |\n ./measurementScale/ratio/numericDomain |\n ./measurementScale/dateTime |\n ./minValue |\n ./maxValue\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n (./attributeName/text()[normalize-space()] | \n ./sequenceIdentifier/MemberName/aName/*/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()])[1]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "No attributes are present so attribute measurement domain cannot be checked.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeUnits.present.1",
"name": "Entity Attribute Units Defined",
"description": "Check that units are defined for each attribute that should have them.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeUnits\n global attributeName\n \n import metadig.variable as mvar\n maxPrint = 5\n \n # Check if measurement units are present for each attribute.\n \n if not attributesPresent:\n output = \"No attributes present.\"\n status = \"FAILURE\"\n return False\n \n attributeCount = 0\n missing = []\n \n attributeUnits = mvar.toUnicode(attributeUnits)\n attributeName = mvar.toUnicode(attributeName)\n \n # If only a single value is returned (vs type \"list\"), then convert to a list\n # for easier processing\n if(isinstance(attributeUnits, unicode)):\n attributeUnits = [attributeUnits]\n \n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n for i in range(0, len(attributeUnits)):\n attributeCount += 1\n units = attributeUnits[i]\n name = attributeName[i]\n if(name is None):\n name = \"#{}\".format(i)\n\n # No units for this attribute was specified\n if(units == None or mvar.isBlank(units)):\n missing.append(name)\n \n if(len(missing) > 0):\n output = u\"The following attributes are missing a units value: '{}'\".format(', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n elif(len(missing) == 0 and len(attributeUnits) == 0):\n output = \"Attributes exist, but none of them require unit defintions.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"All {} attributes that require units have them defined.\".format(attributeCount)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributesPresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute or\n /eml/dataset/*/attributeList/attribute or\n /*/contentInfo/*/dimension/MD_Band or\n /*/contentInfo/*/dimension/MI_Band\n )\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeUnits",
"xpath": "\n /eml/dataset/dataTable/attributeList/attribute/measurementScale/interval |\n /eml/dataset/dataTable/attributeList/attribute/measurementScale/ratio |\n /*/contentInfo/*/dimension/MD_Band |\n /*/contentInfo/*/dimension/MI_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "units",
"xpath": "\n ./unit/*/text()[normalize-space()] |\n ./units//name/text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute/measurementScale/interval |\n /eml/dataset/*/attributeList/attribute/measurementScale/ratio |\n /*/contentInfo/*/dimension/MD_Band |\n /*/contentInfo/*/dimension/MI_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n (../../attributeName/text()[normalize-space()] | \n ./sequenceIdentifier/MemberName/aName/*/text()[normalize-space()] |\n ./descriptor/*/text()[normalize-space()])[1]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "No attributes present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributeMeasurementScale.present.1",
"name": "Entity Attribute Measurement Scales Present",
"description": "Check that an attribute measurement scale exists.",
"type": "Reusable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global attributeMeasurementScale\n global attributeName\n \n import metadig.variable as mvar\n \n maxPrint = 5\n\n # Check if a measurementScale is present for each attribute.\n if not attributesPresent:\n output = \"No attributes present, unable to check if attributes have a measurement scale defined.\"\n status = \"FAILURE\"\n return False\n \n if not attributeMeasurementScalePresent:\n output = \"No attribute measurement scale entries are present.\"\n status = \"FAILURE\"\n return False\n \n attributeCount = 0\n missing = []\n \n attributeMeasurementScale = mvar.toUnicode(attributeMeasurementScale)\n if(isinstance(attributeMeasurementScale, unicode)):\n attributeMeasurementScale = [attributeMeasurementScale]\n\n attributeName = mvar.toUnicode(attributeName)\n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n for i in range(0, len(attributeMeasurementScale)):\n measurementScale = attributeMeasurementScale[i]\n \n # No measurement scale for this attribute\n if(measurementScale == None or mvar.isBlank(measurementScale)):\n name = attributeName[i]\n missing.append(name)\n \n if(len(missing) > 0):\n output = u\"The following {} attributes do not have a measurement scale: '{}'.\".format(len(missing), ', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} attributes have a measurement scale.\".format(len(attributeMeasurementScale))\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributesPresent",
"xpath": "boolean(/eml/dataset/*/attributeList/attribute or\n /*/contentInfo/MD_CoverageDescription/attributeGroup/*/attribute or\n /*/contentInfo/MI_CoverageDescription/attributeGroup/*/attribute or\n /*/contentInfo/*/dimension/MI_Band or\n /*/contentInfo/*/dimension/MD_Band)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeMeasurementScalePresent",
"xpath": "boolean(/eml/dataset/*/attributeList/attribute/measurementScale)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributeMeasurementScale",
"xpath": "/eml/dataset/*/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "measurementScale",
"xpath": "./measurementScale",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "/eml/dataset/*/attributeList/attribute",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "./attributeName",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "No attributes present, unable to check if attributes have a measurement scale defined.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.attributePrecision.present.1",
"name": "Entity Attribute Precision Defined",
"description": "Check that attributes have a measurement precision defined.",
"type": "Reusable",
"level": "OPTIONAL",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n import metadig.variable as mvar\n \n global attributePrecision\n global attributeName\n \n maxPrint = 5\n \n # Check if a measurement precision is present for each attribute.\n if not attributesPresent:\n output = \"No attributes present, unable to check if attribute measurement precision is defined.\"\n status = \"FAILURE\"\n return False\n \n missing = []\n \n attributePrecision = mvar.toUnicode(attributePrecision)\n attributeName = mvar.toUnicode(attributeName)\n \n if(isinstance(attributePrecision, unicode)):\n attributePrecision = [attributePrecision]\n \n if(isinstance(attributeName, unicode)):\n attributeName = [attributeName]\n \n # The metadata has attributes, but just none that require a precision, so give it a pass.\n if(len(attributePrecision) == 0):\n output = \"Attributes are present, but none require a measurement precision.\"\n status = \"SUCCESS\"\n return True\n\n for i in range(0, len(attributePrecision)):\n precision = attributePrecision[i]\n name = attributeName[i]\n if(name is None or mvar.isBlank(name)):\n name = \"#{}\".format(i)\n \n # No precision for this attribute was specified\n if(precision is None or mvar.isBlank(precision)):\n missing.append(name)\n continue\n \n if(len(missing) > 0):\n output = u\"These {} attributes (of {} total) are missing a precision value: '{}'\".format(len(missing), len(attributePrecision), ', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} attributes that require a measurement precision have one defined\".format(len(attributePrecision))\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "attributesPresent",
"xpath": "boolean(\n /eml/dataset/*/attributeList/attribute or\n /*/contentInfo/*/dimension/MI_Band or\n /*/contentInfo/*/dimension/MD_Band)",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "attributePrecision",
"xpath": "\n /eml/dataset/*/attributeList/attribute/measurementScale/interval |\n /eml/dataset/*/attributeList/attribute/measurementScale/ratio |\n /eml/dataset/*/attributeList/attribute/measurementScale/dateTime |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band \n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "domain",
"xpath": "\n ./precision |\n ./dateTimePrecision | \n ./bitsPerValue\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "attributeName",
"xpath": "\n /eml/dataset/*/attributeList/attribute/measurementScale/interval |\n /eml/dataset/*/attributeList/attribute/measurementScale/ratio |\n /eml/dataset/*/attributeList/attribute/measurementScale/dateTime |\n /*/contentInfo/*/dimension/MI_Band |\n /*/contentInfo/*/dimension/MD_Band\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "name",
"xpath": "\n (../../attributeName | \n ./sequenceIdentifier/MemberName/aName[1]/*/text()[normalize-space()] |\n ./descriptor[1]/*/text()[normalize-space()])[1]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "No attributes present, unable to check if attribute measurement precision is defined.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.description.present.1",
"name": "Entity Description Present",
"description": "Check that a description is available for every entity.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global entities\n global entityNames\n \n import metadig.variable as mvar\n \n NoneType = type(None)\n maxPrint = 5\n \n # Check if a entity description is present for each attribute.\n if not entityPresent:\n output = \"No entities are present so cannot check their descriptions.\"\n status = \"FAILURE\"\n return False\n \n entityCount = 0\n missing = []\n \n entities = mvar.toUnicode(entities)\n entityNames = mvar.toUnicode(entityNames)\n\n if(isinstance(entities, unicode)):\n entities = [entitities]\n\n if(isinstance(entityNames, unicode)):\n entityNames = [entityNames]\n \n # Loop through the entities, checking if each entity has a description.\n # If the entity does not have a description, then add the name to a list\n # of delinquent entities. If no name is present, then use the entity \n # sequence number.\n for i in range(0, len(entities)):\n entityCount += 1\n description = entities[i]\n name = entityNames[i]\n # Check if the name is missing or blank\n if(isinstance(name, NoneType) or mvar.isBlank(name)):\n name = \"attr #{}\".format(i)\n \n # No description for this entity was specified\n if(description == None):\n missing.append(name)\n \n # Add the list of deliquent entities, if any, to the output\n if(len(missing) > 0):\n output = u\"These entities are missing a description: {}\".format(', '.join(missing[0:maxPrint]))\n if(len(missing) > maxPrint):\n output += u\", ...\"\n status = \"FAILURE\"\n return False\n else:\n output = \"All {} entities have descriptions.\".format(entityCount)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "entityPresent",
"xpath": "boolean(\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity] or\n /*/contentInfo/MD_CoverageDescription or\n /*/contentInfo/MI_CoverageDescription)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
},
{
"name": "entities",
"xpath": "\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity] |\n /*/contentInfo/MD_CoverageDescription |\n /*/contentInfo/MI_CoverageDescription\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "description",
"xpath": "\n ./entityDescription |\n ./attributeDescription/RecordType\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
},
{
"name": "entityNames",
"xpath": "\n /eml/dataset/*[self::dataTable|self::spatialRaster|self::spatialVector|self::storedProcedure|self::view|self::otherEntity] |\n /*/contentInfo/MD_CoverageDescription |\n /*/contentInfo/MI_CoverageDescription\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": {
"name": "names",
"xpath": "\n ./entityName | \n ./attributeDescription/RecordType\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "These entities are missing a description: Brian Yellen 2006 Summer time suspended sediment transport in Lake Linne Svalbard.doc",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "entity.qualityDescription.present.1",
"name": "Entity Data Quality Description Present",
"description": "Check that a description of data quality practices and protocols used is present.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n import metadig.variable as mvar\n\n # A data quality description is not present\n if 'dataQualityDescription' not in globals() or not dataQualityDescription:\n output = \"A data quality description was not found.\"\n status = \"FAILURE\"\n return False\n else:\n output = \"A data quality description was found.\"\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "dataQualityDescription",
"xpath": "boolean(/*/dataQualityInfo/DQ_DataQuality/report or\n /eml/dataset/methods/methodStep/qualityControl/description or\n /eml/dataset/methods/methodStep/qualityControl/description or\n /eml/dataset/*/methods/methodStep/qualityControl/description or\n /eml/dataset/attributeList/attribute/methods/methodStep/qualityControl/description)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "A data quality description was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "resource.methods.present.1",
"name": "Resource Methods Present",
"description": "Check that a detailed methods section is present.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global detailedMethodsText\n \n import metadig.variable as mvar\n\n # A detailed methods section is not present\n if 'detailedMethodsText' not in globals() or detailedMethodsText is None:\n output = \"A detailed methods section was not found.\"\n status = \"FAILURE\"\n return False\n \n detailedMethodsText = mvar.toUnicode(detailedMethodsText) \n if (mvar.isBlank(detailedMethodsText)):\n output = \"The detailed methods section is blank.\"\n status = \"FAILURE\"\n return False\n else:\n output = \"A detailed methods section was found.\"\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "detailedMethodsText",
"xpath": "/*/resourceLineage/LI_Lineage/statement |\n /*/dataQualityInfo/DQ_DataQuality/lineage/LI_Lineage/statement//text()[normalize-space()] |\n /*/dataQualityInfo/DQ_DataQuality/lineage/LI_Lineage/processStep/LE_ProcessStep/description//text()[normalize-space()] |\n /*/provenance |\n /eml/dataset/methods/methodStep/description//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "A detailed methods section was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "provenance.processStepCode.present.1",
"name": "Provenance Process Step Code Present",
"description": "Check that provenance process step software is specified.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global provenanceProcessStepCode\n \n import metadig.variable as mvar\n\n # Provenace step source code is not present\n if 'provenanceProcessStepCode' not in globals() or provenanceProcessStepCode is None:\n output = \"Provenance process step source code (software) was not found.\"\n status = \"FAILURE\"\n return False\n \n # Convert all values to unicode\n provenanceProcessStepCode = mvar.toUnicode(provenanceProcessStepCode)\n\n if (mvar.isBlank(provenanceProcessStepCode)):\n output = \"The provenance process step code is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if provenance process step code is a single string or arrayList\n if(isinstance(provenanceProcessStepCode, unicode)):\n output = u\"Provenance process step code '{}' was found\".format(provenanceProcessStepCode)\n elif (isinstance(provenanceProcessStepCode, list)):\n output = u\"A provenance process step code (software) '{}' was found (first of {})\".format(provenanceProcessStepCode[0].strip(), len(provenanceProcessStepCode))\n else:\n output = u\"A provenance process step code (software)'{}' was found)\".format(provenanceProcessStepCode.strip())\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "provenanceProcessStepCode",
"xpath": "/eml/*/methods/methodStep/software//text()[normalize-space()]",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3 ",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "Provenance process step source code (software) was not found.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "provenance.sourceEntity.present.1",
"name": "Provenance Source Entity Present",
"description": "Check if a lineage source entity is specified.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n if ('Source' in globals() and Source is not None and Source):\n output = \"A lineage source entity is present.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"A lineage source entity is not present.\"\n status = \"FAILURE\"\n return False\n\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "Source",
"xpath": "boolean(/*/resourceLineage/LI_Lineage/source/LI_Source\n or /*/resourceLineage/LI_Lineage/source/LE_Source\n or /*/dataQualityInfo/DQ_DataQuality/lineage/LI_Lineage/source/LI_Source\n or /*/dataQualityInfo/DQ_DataQuality/lineage/LI_Lineage/source/LE_Source\n or /*/dataQualityInfo/DQ_DataQuality/lineage/LI_Lineage/processStep/LI_ProcessStep/source/LI_Source\n or /*/dataQualityInfo/DQ_DataQuality/lineage/LI_Lineage/processStep/LE_ProcessStep/source/LE_Source)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "A lineage source entity is not present.",
"type": null
}
],
"status": "FAILURE"
},
{
"check": {
"id": "provenance.trace.present.1",
"name": "Provenance Trace Present",
"description": "Check that provenance information is present.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n \n import metadig.variable as mvar\n\n # A provenance trace is present\n if ('provenanceTrace' in globals() and provenanceTrace is not None and provenanceTrace):\n output = \"Provenance trace information was found.\"\n status = \"SUCCESS\"\n return True\n else:\n output = \"Provenance trace information was not found.\"\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "provenanceTrace",
"xpath": "boolean(\n /*/resourceLineage/LI_Lineage/processStep/LI_ProcessStep or\n /*/dataQualityInfo/DQ_DataQuality/lineage/*/processStep or\n /*/dataQualityInfo/DQ_DataQuality/lineage/*/source/*/sourceStep/LI_ProcessStep or\n /eml/*/methods/methodStep/dataSource or\n /eml/*/methods/methodStep/subStep/dataSource)\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "Provenance trace information was not found.",
"type": null
}
],
"status": "SUCCESS"
},
{
"check": {
"id": "resource.license.present.1",
"name": "Resource License Present",
"description": "Check that a resource license exists.",
"type": "Reusable",
"level": "REQUIRED",
"environment": "python",
"code": "\ndef call():\n global output\n global status\n global datasetLicense\n \n import metadig.variable as mvar\n \n displayNchars = 40\n # A dataset license is not present\n if 'datasetLicense' not in globals() or datasetLicense is None:\n output = \"A resource license was not found.\"\n status = \"FAILURE\"\n return False\n\n datasetLicense = mvar.toUnicode(datasetLicense)\n if(isinstance(datasetLicense, unicode)):\n datasetLicense = [datasetLicense]\n \n if (mvar.isBlank(datasetLicense)):\n output = \"The resource license is blank.\" \n status = \"FAILURE\"\n return False\n else:\n # Check if dataset license is a single string or arrayList\n if(isinstance(datasetLicense, unicode)):\n outstr = datasetLicense.strip()\n outstr.replace('\\n', ' ').replace('\\r', '')\n if(len(outstr) > displayNchars):\n outstr = u\"{}...\".format(outstr[:displayNchars])\n output = u\"The resource license '{}' was found\".format(outstr)\n elif (isinstance(datasetLicense, list)):\n outstr = datasetLicense[0].strip()\n outstr.replace('\\n', ' ').replace('\\r', '')\n if(len(outstr) > displayNchars):\n outstr = u\"{}...\".format(outstr[:displayNchars])\n if(len(datasetLicense) == 1):\n output = u\"The resource license '{}' was found.\".format(outstr)\n else:\n output = u\"The resource license '{}' was found. \".format(outstr)\n else:\n output = u\"The resource license '{}' was found\".format(datasetLicense)\n status = \"SUCCESS\"\n return True\n ",
"library": null,
"inheritState": false,
"selector": [
{
"name": "datasetLicense",
"xpath": "\n /eml/dataset/intellectualRights//text()[normalize-space()] |\n /*/rights |\n /resource/rightsList/rights |\n /*/identificationInfo/MD_DataIdentification/resourceConstraints/MD_LegalConstraints/accessConstraints/MD_RestrictionCode[@codeListValue=\"license\"] |\n /*/identificationInfo/MD_DataIdentification/resourceConstraints/MD_LegalConstraints/useConstraints/MD_RestrictionCode[@codeListValue=\"license\"] |\n /*/identificationInfo/MD_DataIdentification/resourceConstraints/MD_LegalConstraints/otherConstraints/MD_RestrictionCode[@codeListValue=\"license\"] |\n /eml/dataset/licensed/licenseName//text()[normalize-space()]\n ",
"namespaceAware": null,
"namespace": null,
"subSelector": null
}
],
"dialect": [
{
"name": "DataCite 3.1",
"xpath": "boolean(/*[local-name() = 'resource'])"
},
{
"name": "Dryad Data Package and Data File Modules",
"xpath": "boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])"
},
{
"name": "Ecological Metadata Language 2.1, 2.2.0",
"xpath": "boolean(/*[local-name() = 'eml'])"
},
{
"name": "ISO 19115 and ISO 19115-2 / ISO 19139 and ISO 19139-2",
"xpath": "boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])"
},
{
"name": "ISO 19115-1 / ISO 19115-3",
"xpath": "boolean(/*[local-name() = 'MD_Metadata'])"
}
]
},
"timestamp": "Apr 9, 2022 1:43:12 PM",
"output": [
{
"value": "The resource license 'This work is licensed under the Creative...' was found.",
"type": null
}
],
"status": "SUCCESS"
}
],
"suiteId": "FAIR-suite-0.3.1",
"status": null,
"runStatus": "success",
"errorDescription": "",
"sysmeta": {
"originMemberNode": "urn:node:ARCTIC",
"rightsHolder": "http://orcid.org/0000-0001-8729-0036",
"groups": [],
"dateUploaded": "Apr 8, 2022 11:48:47 PM",
"formatId": "eml://ecoinformatics.org/eml-2.1.1",
"obsoletes": "doi:10.18739/A24X54G2H",
"obsoletedBy": null,
"seriesId": null
},
"sequenceId": "urn:uuid:0f71a671-a3dd-4cd6-a9fd-53590f358c50",
"modified": false,
"isLatest": false
}
Note that the MetaDIG API is currently unofficial: it has not been finalized for use in DataONE, its implementation is incomplete, and it is likely to change as we refine the service. We also welcome contributions to the preliminary version of the MetaDIG API documentation.
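For anyone who wants to consume the report from a script rather than read the raw JSON, here is a minimal Python sketch. It assumes the endpoint and the field names (`result`, `check`, `status`, `output`) stay as they appear in the report above, which is not guaranteed for an unofficial API. The helper names `fetch_report` and `summarize_report` are hypothetical and are not part of MetaDIG.

```python
import json
import urllib.request
from collections import Counter

# Base URL copied from the curl example in this thread; the API is
# unofficial, so this path is an assumption that may stop working.
BASE_URL = "https://api.dataone.org/quality/runs"


def fetch_report(suite_id, pid):
    """Download one assessment report and parse it as JSON."""
    # The identifier is passed through unencoded, mirroring the curl command.
    url = "{}/{}/{}".format(BASE_URL, suite_id, pid)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def summarize_report(report):
    """Count results per (check type, status) and collect failure messages."""
    counts = Counter()
    failures = []
    for result in report.get("result", []):
        check = result.get("check", {})
        status = result.get("status")
        counts[(check.get("type"), status)] += 1
        if status == "FAILURE":
            # "output" is a list of {"value": ..., "type": ...} objects.
            messages = [o.get("value") for o in result.get("output") or []]
            failures.append((check.get("id"), check.get("level"), messages))
    return counts, failures
```

Calling `summarize_report(fetch_report("FAIR-suite-0.3.1", "doi:10.18739/A24T6F461"))` should return the per-type tallies and a list of failed checks with their messages. Keeping the network call separate from the summary makes it possible to run the summary on a report saved to disk.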
The MetaDIG API is the way to access the assessment reports published on DataONE. The MetaDIG engine runs a configurable set of "assessment suites" on each dataset in DataONE. All datasets that use compatible metadata standards are checked with the DataONE FAIR Suite of assessment checks. The results of a run can be retrieved from the API given the identifier of the suite and the identifier of the metadata record of interest. For example, for the assessment suite FAIR-suite-0.3.1, one can retrieve the assessment report for a specific dataset (doi:10.18739/A24T6F461) using a curl command: