From cb9653285392c2e5e34e6b8f16fca4672c479571 Mon Sep 17 00:00:00 2001
From: Lee Belbin
Date: Tue, 29 Oct 2024 12:38:14 +1100
Subject: [PATCH] Update supplement-header.md

Correct usage of Test Types VALIDATION etc to Validation
---
 .../build/templates/supplement/supplement-header.md | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tg2/_review/build/templates/supplement/supplement-header.md b/tg2/_review/build/templates/supplement/supplement-header.md
index 05cbebfa..dfe721dc 100644
--- a/tg2/_review/build/templates/supplement/supplement-header.md
+++ b/tg2/_review/build/templates/supplement/supplement-header.md
@@ -121,7 +121,7 @@ The following issues describing potential tests were tagged as [DO NOT IMPLEMENT
 | [53](https://github.com/tdwg/bdq/issues/53) | AMENDMENT_CLASS_STANDARDIZED | Can the value of dwc:class be standardized using the Source Authority? |
 | [44](https://github.com/tdwg/bdq/issues/44) | AMENDMENT_MINELEVATIONMAXELEVATION_TRANSPOSED | If \"dwc:minimumElevationInMeters is greater than dwc:maximumElevationInMeters, can they be meaningfully swapped? |
 | [37](https://github.com/tdwg/bdq/issues/37) | ISSUE_DAYMONTH_SWAPPED | Is it likely that the day and month have been swapped? |
-| [35](https://github.com/tdwg/bdq/issues/35) | MEASURE_VALIDATIONTESTS_RUN | Total number of tests of output type VALIDATION that have been attempted to have been run against the record |
+| [35](https://github.com/tdwg/bdq/issues/35) | MEASURE_VALIDATIONTESTS_RUN | Total number of tests of output type Validation that have been attempted to have been run against the record |
 | [34](https://github.com/tdwg/bdq/issues/34) | AMENDMENT_DAYMONTH_TRANSPOSED | Swap dwc:month and dwc:day if dwc:month is greater than 12 and dwc:day is less than 12. |
 | [27](https://github.com/tdwg/bdq/issues/27) | AMENDMENT_FAMILY_STANDARDIZED | Can the value of dwc:family be standardized using the Source Authority? |
 | [25](https://github.com/tdwg/bdq/issues/25) | AMENDMENT_ORDER_STANDARDIZED | Can the value of dwc:order be standardized using the Source Authority? |
@@ -364,7 +364,7 @@ Issue Tests are a form of warning flag where the Test is drawing attention to po
 Issues can result in a Response.status="RUN_HAS_RESULT" accompanied by a Response.result="POTENTIAL_ISSUE" or "NOT_ISSUE".
-An ISSUE is the equivalent to a Response.status="NOT_COMPLIANT" from a Validation Test. A Response.result="NOT_ISSUE" is similar to a Response.result="COMPLIANT" from a Validation Test, but with slightly different semantics, "COMPLIANT" means that the data is fit for some use. A Response.result="NOT_ISSUE" means that there was no reason for the data not to be fit for use. A Response.result="POTENTIAL_ISSUE" is the reason we incorporated Issue type Tests into BDQ Core. "POTENTIAL_ISSUE" means that the Issue found a concern in the data that might make it unfit for some use, but that human evaluation of the details of the data and the use are needed. Data flagged with potential issues require a human review. For example, [ISSUE_DATAGENERALIZATIONS_NOTEMPTY](https://rs.tdwg.org/bdqcore/terms/14da5b87-8304-4b2b-911d-117e3c29e890) will return a Response.result="POTENTIAL_ISSUE" if dwc:dataGeneralizations contains a value. Any value in dwc:dataGeneralizations asserts changes have been made to generalize other [Darwin Core Terms](https://dwc.tdwg.org/list/) (Darwin Core Maintenance Group 2021) and requires a human review to determine whether the data are fit for purpose.
+An Issue is the equivalent to a Response.status="NOT_COMPLIANT" from a Validation Test. A Response.result="NOT_ISSUE" is similar to a Response.result="COMPLIANT" from a Validation Test, but with slightly different semantics, "COMPLIANT" means that the data is fit for some use. A Response.result="NOT_ISSUE" means that there was no reason for the data not to be fit for use. A Response.result="POTENTIAL_ISSUE" is the reason we incorporated Issue type Tests into BDQ Core. "POTENTIAL_ISSUE" means that the Issue found a concern in the data that might make it unfit for some use, but that human evaluation of the details of the data and the use are needed. Data flagged with potential issues require a human review. For example, [ISSUE_DATAGENERALIZATIONS_NOTEMPTY](https://rs.tdwg.org/bdqcore/terms/14da5b87-8304-4b2b-911d-117e3c29e890) will return a Response.result="POTENTIAL_ISSUE" if dwc:dataGeneralizations contains a value. Any value in dwc:dataGeneralizations asserts changes have been made to generalize other [Darwin Core Terms](https://dwc.tdwg.org/list/) (Darwin Core Maintenance Group 2021) and requires a human review to determine whether the data are fit for purpose.
 #### 3.1.3 Amendments
@@ -482,7 +482,7 @@ BDQ Core Amendment Tests are paired with a corresponding Validation Test that as
 As noted above, one early conclusion to this project was the need for controlled vocabularies and led to an early spin-off of the Data Qality Task Group 4: Best Practice for Development of Vocabularies of Value (https://github.com/tdwg/bdq/tree/master/tg4). Testing the 'quality' or 'fitness for use' of Darwin Core encoded data is made more difficult due to the lack of a comprehensive suite of controlled vocabularies.
-Testing Darwin Core values against a known Source Authority using a Validation Type Test is straight forward: A Test is either COMPLIANT or NOT COMPLIANT. The BDQ Core standard also includes Tests of type Amendment, and the mapping of input Darwin Core values to known Vocabulary values is poorly developed. If a Validation Test returns COMPLIANT, no amendment is necessary. For example, if the input value to a Test evaluating sex is dwc:sex="Female", then no amendment is required. If however, the input value is dwc:sex="f.", this can likely be interpreted as "Female"? The same is not true for dwc:sex="M" This value could be interpreted as "Male" or "Mixed" according to https://api.gbif.org/v1/vocabularies/Sex/concepts. GBIF currently treats this as "Male" but without a comprehensive synonymy within the vocabularies, one cannot be certain that this is the case. A key phrase within this standard that particularly relates to many of the Expected Responses of Tests is "dwc:term can be unambiguously interpreted as ...". In the case of dwc:sex="M", the determination is that the value is ambiguous and no AMENDMENT can be made.
+Testing Darwin Core values against a known Source Authority using a Validation Type Test is straight forward: A Test is either COMPLIANT or NOT COMPLIANT. The BDQ Core standard also includes Tests of type Amendment, and the mapping of input Darwin Core values to known Vocabulary values is poorly developed. If a Validation Test returns COMPLIANT, no amendment is necessary. For example, if the input value to a Test evaluating sex is dwc:sex="Female", then no amendment is required. If however, the input value is dwc:sex="f.", this can likely be interpreted as "Female"? The same is not true for dwc:sex="M" This value could be interpreted as "Male" or "Mixed" according to https://api.gbif.org/v1/vocabularies/Sex/concepts. GBIF currently treats this as "Male" but without a comprehensive synonymy within the vocabularies, one cannot be certain that this is the case. A key phrase within this standard that particularly relates to many of the Expected Responses of Tests is "dwc:term can be unambiguously interpreted as ...". In the case of dwc:sex="M", the determination is that the value is ambiguous and no amendment can be made.
 **We see an urgent need for a comprehensive, internationally agreed list of [Darwin Core Term](https://dwc.tdwg.org/list/) (Darwin Core Maintenance Group 2021) values that are mapped to standard values**. GBIF has implemented some of the unique values for some Darwin Core terms, for example https://api.gbif.org/v1/vocabularies/Sex/concepts/Female/hiddenLabels, but such lists are not currently comprehensive, and we see this is a severe limitation to the evaluation of 'data quality'/'fitness for use'. While there has been a survey of Darwin Core 'distinct values' for GBIF, ALA, iDigBio and VertNet, these are dated, and have not been mapped to standard values, if they exist.
@@ -633,11 +633,9 @@ The tag NEEDS WORK was repeatedly added and removed to issues and was a valuable
 ![Diagram of the 'NAME'-oriented tests and InformationElementsActedUpon.](TestsName.png "NAME by InformationElements")
 Diagram of the 'NAME'-oriented tests and InformationElementsActedUpon
-
 ![Diagram of the 'SPACE'-oriented tests and InformationElementsActedUpon.](TestsSpace.png "SPACE by InformationElements")
 Diagram of the 'SPACE'-oriented tests and InformationElementsActedUpon
-
 ![Diagram of the 'TIME'-oriented tests and InformationElementsActedUpon.](TestsTime.png "TIME by InformationElements")
 Diagram of the 'TIME'-oriented tests and InformationElementsActedUpon.
@@ -656,7 +654,7 @@ Each test issue in GitHub begins with a table in markdown format describing the
 **Description** [non-normative]: A brief English language description of what the Test does, for example "Does the value of dwc:basisOfRecord occur in bdq:sourceAuthority?"
-**Test Type** [non-normative]: Tests have been classified into four Fitness for Use Framework classes. VALIDATION validates values in one or more [Darwin Core Terms](https://dwc.tdwg.org/list/) (Darwin Core Maintenance Group 2021), for example [VALIDATION_SCIENTIFICNAME_FOUND](https://rs.tdwg.org/bdqcore/terms/3f335517-f442-4b98-b149-1e87ff16de45). ISSUE flags a potential problem that requires human interpretation, for example [ISSUE_DATAGENERALIZATION_NOTEMPTY](https://rs.tdwg.org/bdqcore/terms/13d5a10e-188e-40fd-a22c-dbaa87b91df2). AMENDMENT suggests an improvement that will result in a change or addition to at least one Darwin Core term, for example [AMENDMENT_COORDINATES_FROM_VERBATIM](https://rs.tdwg.org/bdqcore/terms/3c2590c7-af8a-4eb4-af57-5f73ba9d1f8e). MEASURE returns a numeric value, for example [MEASURE_VALIDATIONSTESTS_COMPLIANT](https://rs.tdwg.org/bdqcore/terms/45fb49eb-4a1b-4b49-876f-15d5034dfc73).
+**Test Type** [non-normative]: Tests have been classified into four Fitness for Use Framework classes. Validation Tests validate values in one or more [Darwin Core Terms](https://dwc.tdwg.org/list/) (Darwin Core Maintenance Group 2021), for example [VALIDATION_SCIENTIFICNAME_FOUND](https://rs.tdwg.org/bdqcore/terms/3f335517-f442-4b98-b149-1e87ff16de45). Issue flags a potential problem that requires human interpretation, for example [ISSUE_DATAGENERALIZATION_NOTEMPTY](https://rs.tdwg.org/bdqcore/terms/13d5a10e-188e-40fd-a22c-dbaa87b91df2). Amendment suggests an improvement that will result in a change or addition to at least one Darwin Core term, for example [AMENDMENT_COORDINATES_FROM_VERBATIM](https://rs.tdwg.org/bdqcore/terms/3c2590c7-af8a-4eb4-af57-5f73ba9d1f8e). Measure returns a numeric value, for example [MEASURE_VALIDATIONSTESTS_COMPLIANT](https://rs.tdwg.org/bdqcore/terms/45fb49eb-4a1b-4b49-876f-15d5034dfc73).
 **Darwin Core Class** [non-normative]: The Darwin Core class that contains the Information Elements, for example: dwc:Taxon.