2.2 Collection description schemes
When describing large collections, it is anticipated that the same collection can be described using different schemes for different purposes. For instance, a museum collection may be described based on “famous named” collections or collectors (e.g., Darwin, Spruce) if an aggregator needs to “find” lost specimens from previously formed collections. The same collection may be described in whole or in part based on taxonomic or geographic properties for the purpose of environmental or taxonomic research or funding.
The CollectionDescriptionScheme class, and the supporting SchemeTerm and SchemeMeasurementOrFact classes, are intended to provide some parameters around the purpose and expectations of the descriptions and to indicate if objects within the descriptions are assigned attributes that will cause errors in metrics if not explicitly noted.
Using these three classes enables you to build a 'profile' for your LtC implementation, so that you may:
- describe the purpose of your collection description scheme (using the CollectionDescriptionScheme.basisOfScheme property)
- define whether the ObjectGroups within the scheme overlap (i.e., a single object might be represented in more than one ObjectGroup) or are distinct (using the CollectionDescriptionScheme.distinctObjects property)
- apply restrictions on which terms within the overall LtC standard can be included, and which are mandatory (using the SchemeTerm class)
- define the metrics that you want to be included in the scheme via the MeasurementOrFact class (using the SchemeMeasurementOrFact class)
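As a rough illustration, the three classes can be pictured as plain data structures. The Python sketch below uses the LtC property names that appear on this page, with example values taken from the NHM London example used on this page. The dataclasses themselves, the `terms` and `metrics` attribute names, and the use of booleans are assumptions made for illustration and are not part of the standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemeTerm:
    # Name of the LtC class or property being constrained,
    # e.g. "ObjectGroup.collectionName"
    termName: str
    mandatoryTerm: bool
    repeatableTerm: bool


@dataclass
class SchemeMeasurementOrFact:
    # The measurementType expected on the linked MeasurementOrFact records
    schemeMeasurementType: str
    mandatoryMetric: bool
    repeatableMetric: bool


@dataclass
class CollectionDescriptionScheme:
    schemeName: str
    basisOfScheme: str
    distinctObjects: bool
    # "terms" and "metrics" are hypothetical attribute names for this sketch
    terms: List[SchemeTerm] = field(default_factory=list)
    metrics: List[SchemeMeasurementOrFact] = field(default_factory=list)


scheme = CollectionDescriptionScheme(
    schemeName="NHM London departmental collections",
    basisOfScheme="Collections inventory",
    distinctObjects=True,
)
scheme.terms.append(
    SchemeTerm("ObjectGroup.collectionName", mandatoryTerm=True, repeatableTerm=False)
)
scheme.metrics.append(
    SchemeMeasurementOrFact("Object count", mandatoryMetric=True, repeatableMetric=False)
)
```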
Essentially, LtC is a fairly broad and flexible standard which can be applied in multiple ways. While this allows it to support a broad range of collection description use cases, it also presents a risk: if its use isn't constrained appropriately to fit the use case, data coherency and usability may be compromised. In particular, defining (1) common metrics and (2) controlled vocabularies for appropriate terms are vital steps for making sure that the data are consistent and interoperable. The collection description scheme concept and related LtC terms are intended to help support this process.
Below is an example of steps that you can take to begin defining a new collection description scheme in Latimer Core, using the CollectionDescriptionScheme, SchemeTerm and SchemeMeasurementOrFact classes and properties.
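Step 1: create a CollectionDescriptionScheme to define the name and purpose of the scheme, and whether its objects are distinct:
{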
"@context": {
"ltc": "http://rs.tdwg.org/ltc/terms/"
},
"@type": "ltc:CollectionDescriptionScheme",
"schemeName": "NHM London departmental collections",
"basisOfScheme": "Collections inventory",
"distinctObjects": "True"
}
This provides a name for the scheme, allowing it to be distinguished from other schemes that might be in the same dataset, and states what the scheme is intended to be for. It also dictates that no single object is expected to be represented in more than one ObjectGroup within the scheme, so it should be safe to aggregate metrics within the scheme without the risk of, for example, counting the same object multiple times.
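To make the aggregation point concrete, here is a minimal Python sketch of a consumer that only sums counts when the scheme declares its objects to be distinct. The function name, the `objectCount` key and the group names and numbers are all hypothetical, invented for this illustration. Only `distinctObjects` and the scheme values come from the example above.

```python
def total_object_count(scheme, object_groups):
    """Sum object counts across ObjectGroups, but only if the scheme says they are distinct."""
    # The examples on this page store distinctObjects as the string "True"/"False".
    if str(scheme.get("distinctObjects", "")).lower() != "true":
        raise ValueError(
            "distinctObjects is not True: ObjectGroups may overlap, "
            "so a simple sum could count the same object more than once"
        )
    return sum(group["objectCount"] for group in object_groups)


scheme = {
    "schemeName": "NHM London departmental collections",
    "basisOfScheme": "Collections inventory",
    "distinctObjects": "True",
}

# Hypothetical ObjectGroups and counts, invented purely for this sketch.
object_groups = [
    {"collectionName": "Department A, Building 1", "objectCount": 1200},
    {"collectionName": "Department B, Building 1", "objectCount": 300},
]

total = total_object_count(scheme, object_groups)
print(total)
```

If `distinctObjects` were "False", the same call would raise an error instead of returning a possibly inflated total.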
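Step 2: add SchemeTerm to define which LtC classes and properties are expected in the dataset, and which of them are mandatory or repeatable:
{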
"@context": {
"ltc": "http://rs.tdwg.org/ltc/terms/"
},
"@type": "ltc:CollectionDescriptionScheme",
"schemeName": "NHM London departmental collections",
"basisOfScheme": "Collections inventory",
"distinctObjects": "True",
"ltc:SchemeTerm": [
{
"@type": "ltc:SchemeTerm",
"termName": "ObjectGroup.collectionName",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "ObjectGroup.Identifier",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "ObjectGroup.Identifier.identifier",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "OrganisationalUnit",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "OrganisationalUnit.organisationalUnitName",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "OrganisationalUnit.organisationalUnitType",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "StorageLocation",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "StorageLocation.locationName",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "StorageLocation.locationType",
"mandatoryTerm": "True",
"repeatableTerm": "False"
},{
"@type": "ltc:SchemeTerm",
"termName": "ObjectGroup.preservationMethod",
"mandatoryTerm": "False",
"repeatableTerm": "True"
},{
"@type": "ltc:SchemeTerm",
"termName": "Taxon",
"mandatoryTerm": "False",
"repeatableTerm": "True"
}
]
}
In human-readable terms, this says:
We want the collection to be broken down by department and building, so each subcollection must have one, and only one, department (OrganisationalUnit) and building (StorageLocation) attached. We need to have the type and name for each of those things so that we know what they are. Each subcollection should also have a single short name (ObjectGroup.collectionName) and an identifier (ObjectGroup.Identifier) so that humans and machines can tell them apart.
It may be useful, but not critical, to get an idea of the taxa represented and preservation methods used in each of those subcollections. However, not everyone will have the time to add those data, so we'll make that option available but not force it at this stage.
This has implications for the structure of the data, as the mandatoryTerm = "True" and repeatableTerm = "False" values for OrganisationalUnit and StorageLocation dictate that there should be one ObjectGroup created for every combination of (in this example) department and building. More information on LtC modelling approaches can be found in the ObjectGroups and relationships section.
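The mandatory and repeatable flags described above can be checked mechanically. The following Python sketch is one possible way to do so, not a prescribed LtC validator: it assumes a hypothetical flat representation in which each ObjectGroup record maps a term name to the list of values supplied for it, and the `check_terms` function and the sample values are invented for illustration.

```python
def check_terms(scheme_terms, record):
    """Return a list of problems found in one ObjectGroup record.

    `record` maps a term name to the list of values supplied for it
    (a hypothetical representation chosen for this sketch).
    """
    problems = []
    for term in scheme_terms:
        name = term["termName"]
        values = record.get(name, [])
        if term["mandatoryTerm"] == "True" and not values:
            problems.append(f"missing mandatory term: {name}")
        if term["repeatableTerm"] == "False" and len(values) > 1:
            problems.append(f"term is not repeatable: {name}")
    return problems


# A subset of the SchemeTerm entries from the example above.
scheme_terms = [
    {"termName": "ObjectGroup.collectionName", "mandatoryTerm": "True", "repeatableTerm": "False"},
    {"termName": "OrganisationalUnit.organisationalUnitName", "mandatoryTerm": "True", "repeatableTerm": "False"},
    {"termName": "StorageLocation.locationName", "mandatoryTerm": "True", "repeatableTerm": "False"},
    {"termName": "ObjectGroup.preservationMethod", "mandatoryTerm": "False", "repeatableTerm": "True"},
]

# Hypothetical record: two departments and no building name.
record = {
    "ObjectGroup.collectionName": ["Example subcollection"],
    "OrganisationalUnit.organisationalUnitName": ["Department A", "Department B"],
    "ObjectGroup.preservationMethod": ["Pinned", "Fluid preserved"],
}

problems = check_terms(scheme_terms, record)
for problem in problems:
    print(problem)
```

Under this scheme, the record above would need to be split into one ObjectGroup per department, each with its own building, before it passes the check.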
Step 3: add SchemeMeasurementOrFact to define the quantitative and qualitative measures that we want to include in the dataset:
{
"@context": {
"ltc": "http://rs.tdwg.org/ltc/terms/"
},
"@type": "ltc:CollectionDescriptionScheme",
"schemeName": "NHM London departmental collections",
"basisOfScheme": "Collections inventory",
"distinctObjects": "True",
"ltc:SchemeTerm": [...],
"ltc:SchemeMeasurementOrFact": [
{
"@type": "ltc:SchemeMeasurementOrFact",
"schemeMeasurementType": "Object count",
"mandatoryMetric": "True",
"repeatableMetric": "False"
},{
"@type": "ltc:SchemeMeasurementOrFact",
"schemeMeasurementType": "Percentage barcoded",
"mandatoryMetric": "True",
"repeatableMetric": "False"
},{
"@type": "ltc:SchemeMeasurementOrFact",
"schemeMeasurementType": "Historical narrative",
"mandatoryMetric": "False",
"repeatableMetric": "False"
}
]
}
In human-readable terms, this says:
For every subcollection, we expect people to provide one, and only one, estimate or count of the number of objects in that subcollection, and the same for the percentage of those objects that have been barcoded. If people have the time, we'll also provide the facility to add a historical narrative to describe the subcollection, but make that optional.
The implication of this is that we would expect to see two instances of the MeasurementOrFact class, one with measurementType of "Object count" and one of "Percentage barcoded", for every ObjectGroup linked to the CollectionDescriptionScheme, and can validate against that expectation.
This principle is not dissimilar to (although simpler than) constructs such as JSON Schema and RDF Schema, which could also be applied to LtC data if stored in that form to achieve similar ends. However, including it in the standard makes the approach applicable regardless of the data serialisation.
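The expectation described above (one MeasurementOrFact per mandatory, non-repeatable metric for each ObjectGroup) can be checked in a serialisation-independent way. The Python sketch below is a minimal illustration under stated assumptions: the `check_metrics` function is hypothetical, and the MeasurementOrFact records are reduced to dictionaries carrying only measurementType.

```python
from collections import Counter


def check_metrics(scheme_metrics, measurements):
    """Compare one ObjectGroup's MeasurementOrFact records with the scheme's expectations."""
    # Count how many times each measurementType occurs for this ObjectGroup.
    counts = Counter(m["measurementType"] for m in measurements)
    problems = []
    for metric in scheme_metrics:
        mtype = metric["schemeMeasurementType"]
        seen = counts[mtype]
        if metric["mandatoryMetric"] == "True" and seen == 0:
            problems.append(f"missing mandatory metric: {mtype}")
        if metric["repeatableMetric"] == "False" and seen > 1:
            problems.append(f"metric is not repeatable: {mtype}")
    return problems


# The SchemeMeasurementOrFact entries from Step 3.
scheme_metrics = [
    {"schemeMeasurementType": "Object count", "mandatoryMetric": "True", "repeatableMetric": "False"},
    {"schemeMeasurementType": "Percentage barcoded", "mandatoryMetric": "True", "repeatableMetric": "False"},
    {"schemeMeasurementType": "Historical narrative", "mandatoryMetric": "False", "repeatableMetric": "False"},
]

# Hypothetical records for a single ObjectGroup: two counts, no barcoding percentage.
measurements = [
    {"measurementType": "Object count"},
    {"measurementType": "Object count"},
]

problems = check_metrics(scheme_metrics, measurements)
for problem in problems:
    print(problem)
```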
In the example below, an LtC description record for the Insects and Invertebrate Zoology collections at the Field Museum is created and its three-term CollectionDescriptionScheme is included.
Figure 1: An example record structure that might be a useful scheme for GRSciColl records.
Figure 2: Another example record structure of a way to describe all of the "famous" collections within a larger collection.
In both of the above examples, the distinctObjects term is 'True', because no metric is associated with a description that could cause objects to be counted twice. However, if the two examples (GRSciColl and Famous collections) are nested in a single record, the distinctObjects term needs to be 'False'.
Figure 3: An example of a record-structure that combines ObjectGroups from the above examples, and has overlapping "Specimen Count" measurements.
The distinctObjects term becomes 'False' because the count metric for the OrganisationalUnit contains within it the objects and metrics associated with the ObjectGroup (i.e. specimen count).
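A small worked example may help show why overlapping groups force distinctObjects to 'False'. In the Python sketch below, the object identifiers and group contents are invented for illustration: a "famous" collection sits entirely inside an organisational unit, so adding the two specimen counts together overstates the number of physical objects.

```python
# Hypothetical object identifiers, invented for this illustration.
organisational_unit = {"obj-1", "obj-2", "obj-3", "obj-4"}
famous_collection = {"obj-3", "obj-4"}  # nested inside the organisational unit

# Adding the two specimen counts treats the shared objects as separate objects.
naive_total = len(organisational_unit) + len(famous_collection)

# The union counts each physical object exactly once.
actual_total = len(organisational_unit | famous_collection)

# The groups are only distinct if the two totals agree.
distinct_objects = naive_total == actual_total

print(naive_total, actual_total, distinct_objects)
```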
Latimer Core is intended to be flexible enough to accommodate as many future use cases as possible, and CollectionDescriptionScheme serves as the “definition” of any scheme that is developed.
Version | Date | Contributors | Status |
---|---|---|---|
1.x | TBD | Matt Woodburn, Jutta Buschbom, Sarah Vincent, Kate Webbink, Maarten Trekels, Janeen Jones, Sharon Grant | Expert Review - In progress |
1.0 | 2022-06-10 | Matt Woodburn, Jutta Buschbom, Sarah Vincent, Kate Webbink, Maarten Trekels, Janeen Jones, Sharon Grant | v1 - Archived |
0.1 | 2022-02-10 | Matt Woodburn, Jutta Buschbom, Sarah Vincent, Kate Webbink, Maarten Trekels, Janeen Jones, Sharon Grant | Draft |