Home
The geometa R package offers tools for managing ISO/OGC geographic metadata, including ISO 19115, 19110, and 19119, through the ISO 19139 XML format. It also supports the Geography Markup Language (GML, ISO 19136) used for describing geographic data. The main features of geometa include:
- An R object-oriented model for OGC/ISO geographic metadata classes covering the ISO 19115, 19110, 19119 and 19136 standards
- Capacity to write ISO 19139 geographic metadata (XML format encoding)
- Capacity to read ISO 19139 geographic metadata (XML format decoding)
- Capacity to validate ISO 19139 geographic metadata documents produced with geometa, including an ISO/OGC validator (based on ISO 19139 schemas) and an INSPIRE metadata validator (interacting with the INSPIRE online metadata validator)
- Capacity to convert metadata objects from/to other metadata languages of interest such as NetCDF-CF (with the ncdf4 package) and EML (with the EML/emld packages)
Using geometa in combination with publication tools such as ows4R and geosapi makes it easier to manage and publish metadata documents and related datasets in web catalogues with R. This in turn makes it possible to implement spatial data management plans in R based on the FAIR (Findable, Accessible, Interoperable and Reusable) principles. Orchestration tools for such spatial data management plans are under active development in the geoflow project.
If you wish to sponsor geometa, do not hesitate to contact me.
Many thanks to the following organizations that have provided funding to strengthen the geometa package:
Table of contents
1. Overview
2. Package status
3. Credits
4. User guide
4.1 Installation
4.2 Write metadata (encoding)
4.2.1 List of metadata classes
4.2.2 Write ISO 19139 XML from geometa
├ Methodology to write metadata with geometa
└ Write multi-lingual metadata
4.2.3 Validate ISO 19139 XML from geometa
├ ISO 19139 compliance
└ INSPIRE compliance
4.3 Read metadata (decoding)
4.4 Convert metadata (mapping)
5. Issue reporting
Until now, equivalent tools existed for other programming languages (e.g. Java, Python) but not for R. However, R is still one of the preferred languages of the data management community. The increasing amount of data and metadata to produce requires tools that make it easy to manage metadata descriptions, from production to publication in catalogues.
geometa intends to provide facilities to write and read OGC/ISO geographic metadata in R, covering various ISO metadata standards and their XML format (ISO 19139): ISO 19115 (dataset metadata), ISO 19119 (service metadata), ISO 19110 (Feature Catalogue), and also ISO 19136 (GML, Geography Markup Language).
Metadata produced with geometa can then be published in catalogues such as GeoNetwork, e.g. through its API with the geonapi package or through the standard OGC Catalogue Service for the Web (CSW) protocol using the ows4R package.
The package is under active development on GitHub, and regular releases are made on CRAN.
The present focus of geometa is on the following activities:
- support of multi-lingual metadata
- provide facilities to validate metadata sheets according to INSPIRE directive & ISO profile schemas
- completeness of ISO 19115:2003, 19115-2 and 19139:2015, and related standards (ISO 19119, ISO 19110)
- increase R native support of ISO 19136 (GML) in geometa
- provide adapters with other metadata formats such as EML and NetCDF-CF
(c) 2017, Emmanuel Blondel
Package distributed under MIT license.
If you use geometa, I would be very grateful if you could add a citation in your published work. By citing geometa, beyond acknowledging the work, you help make it more visible and support its growth and sustainability. geometa can be cited with its DOI: https://doi.org/10.5281/zenodo.1184892
To install geometa from CRAN:
install.packages("geometa")
To install geometa from GitHub, first install the devtools package:
install.packages("devtools")
Once the devtools package is loaded, you can use the install_github function to install geometa. By default, the package is installed from the master branch, which is the current development version (likely to be unstable).
require("devtools")
install_github("eblondel/geometa")
geometa suggests a simple object-oriented model where the R user can create objects from the ISO/OGC metadata standards.
The geometa project references all ISO/OGC metadata classes in its inventory. You can check whether a particular class is implemented in geometa by looking it up in this inventory (columns geometa_class / in_geometa).
A summary of the coverage of ISO/OGC standard classes is computed and updated each time new classes are added to geometa. The goal is naturally to reach 100% coverage of the standards implemented by geometa. The following table presents the current metadata standards coverage of geometa:
All ISO element classes provided by geometa extend an abstract class named ISOAbstractObject that comes with a simple encode() method to convert the object into its XML form (specified by the ISO 19139 standard).
For example, a simple ISOGeographicBoundingBox element will be converted as below:
bbox <- ISOGeographicBoundingBox$new(minx = -180, miny = -90, maxx = 180, maxy = 90)
bbox_xml <- bbox$encode()
Hence, with the encode() method, you can check that each metadata element you want to add is correctly encoded as XML before adding the final metadata XML to your favorite metadata catalogue.
The example below creates a complete ISOMetadata object and encodes it as XML.
#Create an ISO metadata and encode it as XML
md = ISOMetadata$new()
md$setFileIdentifier("my-metadata-identifier")
md$setParentIdentifier("my-parent-metadata-identifier")
md$setCharacterSet("utf8")
md$setLanguage("eng")
md$setDateStamp(ISOdate(2015, 1, 1, 1))
md$setMetadataStandardName("ISO 19115:2003/19139")
md$setMetadataStandardVersion("1.0")
md$setDataSetURI("my-dataset-identifier")
#add 3 contacts
for(i in 1:3){
  rp <- ISOResponsibleParty$new()
  rp$setIndividualName(paste0("someone",i))
  rp$setOrganisationName("somewhere")
  rp$setPositionName(paste0("someposition",i))
  rp$setRole("pointOfContact")
  contact <- ISOContact$new()
  phone <- ISOTelephone$new()
  phone$setVoice(paste0("myphonenumber",i))
  phone$setFacsimile(paste0("myfacsimile",i))
  contact$setPhone(phone)
  address <- ISOAddress$new()
  address$setDeliveryPoint("theaddress")
  address$setCity("thecity")
  address$setPostalCode("111")
  address$setCountry("France")
  address$setEmail("someone@theorg.org")
  contact$setAddress(address)
  res <- ISOOnlineResource$new()
  res$setLinkage("http://somelink")
  res$setName("someresourcename")
  contact$setOnlineResource(res)
  rp$setContactInfo(contact)
  md$addContact(rp)
}
#VectorSpatialRepresentation
vsr <- ISOVectorSpatialRepresentation$new()
vsr$setTopologyLevel("geometryOnly")
geomObject <- ISOGeometricObjects$new()
geomObject$setGeometricObjectType("surface")
geomObject$setGeometricObjectCount(5L)
vsr$setGeometricObjects(geomObject)
md$addSpatialRepresentationInfo(vsr)
#ReferenceSystem
rs <- ISOReferenceSystem$new()
rsId <- ISOReferenceIdentifier$new(code = "4326", codeSpace = "EPSG")
rs$setReferenceSystemIdentifier(rsId)
md$setReferenceSystemInfo(rs)
#data identification
ident <- ISODataIdentification$new()
ident$setAbstract("abstract")
ident$setPurpose("purpose")
ident$addCredit("credit1")
ident$addCredit("credit2")
ident$addCredit("credit3")
ident$addStatus("completed")
ident$setLanguage("eng")
ident$setCharacterSet("utf8")
ident$addTopicCategory("biota")
ident$addTopicCategory("oceans")
#adding a point of contact
rp <- ISOResponsibleParty$new()
rp$setIndividualName("someone")
rp$setOrganisationName("somewhere")
rp$setPositionName("someposition")
rp$setRole("pointOfContact")
contact <- ISOContact$new()
phone <- ISOTelephone$new()
phone$setVoice("myphonenumber")
phone$setFacsimile("myfacsimile")
contact$setPhone(phone)
address <- ISOAddress$new()
address$setDeliveryPoint("theaddress")
address$setCity("thecity")
address$setPostalCode("111")
address$setCountry("France")
address$setEmail("someone@theorg.org")
contact$setAddress(address)
res <- ISOOnlineResource$new()
res$setLinkage("http://somelink")
res$setName("somename")
contact$setOnlineResource(res)
rp$setContactInfo(contact)
ident$addPointOfContact(rp)
#citation
ct <- ISOCitation$new()
ct$setTitle("sometitle")
d <- ISODate$new()
d$setDate(ISOdate(2015, 1, 1, 1))
d$setDateType("publication")
ct$addDate(d)
ct$setEdition("1.0")
ct$setEditionDate(as.Date(ISOdate(2015, 1, 1, 1)))
ct$setIdentifier(ISOMetaIdentifier$new(code = "identifier"))
ct$setPresentationForm("mapDigital")
ct$setCitedResponsibleParty(rp)
ident$setCitation(ct)
#graphic overview
go1 <- ISOBrowseGraphic$new(
fileName = "http://www.somefile.org/png1",
fileDescription = "Map Overview 1",
fileType = "image/png"
)
go2 <- ISOBrowseGraphic$new(
fileName = "http://www.somefile.org/png2",
fileDescription = "Map Overview 2",
fileType = "image/png"
)
ident$addGraphicOverview(go1)
ident$addGraphicOverview(go2)
#maintenance information
mi <- ISOMaintenanceInformation$new()
mi$setMaintenanceFrequency("daily")
ident$setResourceMaintenance(mi)
#adding legal constraints
lc <- ISOLegalConstraints$new()
lc$addUseLimitation("limitation1")
lc$addUseLimitation("limitation2")
lc$addUseLimitation("limitation3")
lc$addAccessConstraint("copyright")
lc$addAccessConstraint("license")
lc$addUseConstraint("copyright")
lc$addUseConstraint("license")
ident$addResourceConstraints(lc)
#adding security constraints
sc <- ISOSecurityConstraints$new()
sc$setClassification("secret")
sc$setUserNote("ultra secret")
sc$setClassificationSystem("no classification in particular")
sc$setHandlingDescription("description")
ident$addResourceConstraints(sc)
#adding extent
extent <- ISOExtent$new()
bbox <- ISOGeographicBoundingBox$new(minx = -180, miny = -90, maxx = 180, maxy = 90)
extent$setGeographicElement(bbox)
ident$setExtent(extent)
#add keywords
kwds <- ISOKeywords$new()
kwds$addKeyword("keyword1")
kwds$addKeyword("keyword2")
kwds$setKeywordType("theme")
th <- ISOCitation$new()
th$setTitle("General")
th$addDate(d)
kwds$setThesaurusName(th)
ident$addKeywords(kwds)
#supplementalInformation
ident$setSupplementalInformation("some additional information")
#spatial representation type
ident$addSpatialRepresentationType("vector")
md$setIdentificationInfo(ident)
#Distribution
distrib <- ISODistribution$new()
dto <- ISODigitalTransferOptions$new()
for(i in 1:3){
  or <- ISOOnlineResource$new()
  or$setLinkage(paste0("http://somelink",i))
  or$setName(paste0("name",i))
  or$setDescription(paste0("description",i))
  or$setProtocol("WWW:LINK-1.0-http--link")
  dto$addOnlineResource(or)
}
distrib$setDigitalTransferOptions(dto)
md$setDistributionInfo(distrib)
#create dataQuality object with a 'dataset' scope
dq <- ISODataQuality$new()
scope <- ISOScope$new()
scope$setLevel("dataset")
dq$setScope(scope)
#add data quality reports...
#add a report to the data quality
dc <- ISODomainConsistency$new()
result <- ISOConformanceResult$new()
spec <- ISOCitation$new()
spec$setTitle("Data Quality check")
spec$setAlternateTitle("This is some data quality check report")
d <- ISODate$new()
d$setDate(ISOdate(2015, 1, 1, 1))
d$setDateType("publication")
spec$addDate(d)
result$setSpecification(spec)
result$setExplanation("some explanation about the conformance")
result$setPass(TRUE)
dc$addResult(result)
dq$addReport(dc)
#add INSPIRE reports?
#INSPIRE - interoperability of spatial data sets and services
dc_inspire1 <- ISODomainConsistency$new()
cr_inspire1 <- ISOConformanceResult$new()
cr_inspire_spec1 <- ISOCitation$new()
cr_inspire_spec1$setTitle("Commission Regulation (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards interoperability of spatial data sets and services")
cr_inspire1$setExplanation("See the referenced specification")
cr_inspire_date1 <- ISODate$new()
cr_inspire_date1$setDate(ISOdate(2010,12,8))
cr_inspire_date1$setDateType("publication")
cr_inspire_spec1$addDate(cr_inspire_date1)
cr_inspire1$setSpecification(cr_inspire_spec1)
cr_inspire1$setPass(TRUE)
dc_inspire1$addResult(cr_inspire1)
dq$addReport(dc_inspire1)
#INSPIRE - metadata
dc_inspire2 <- ISODomainConsistency$new()
cr_inspire2 <- ISOConformanceResult$new()
cr_inspire_spec2 <- ISOCitation$new()
cr_inspire_spec2$setTitle("COMMISSION REGULATION (EC) No 1205/2008 of 3 December 2008 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards metadata")
cr_inspire2$setExplanation("See the referenced specification")
cr_inspire_date2 <- ISODate$new()
cr_inspire_date2$setDate(ISOdate(2008,12,4))
cr_inspire_date2$setDateType("publication")
cr_inspire_spec2$addDate(cr_inspire_date2)
cr_inspire2$setSpecification(cr_inspire_spec2)
cr_inspire2$setPass(TRUE)
dc_inspire2$addResult(cr_inspire2)
dq$addReport(dc_inspire2)
#add lineage (more example of lineages in ISOLineage documentation)
lineage <- ISOLineage$new()
lineage$setStatement("statement")
dq$setLineage(lineage)
md$setDataQualityInfo(dq)
#XML representation of the ISOMetadata
xml <- md$encode()
For further examples, please use the standard R documentation available within the package, e.g. ?ISOMetadata
To write a multi-lingual metadata document, you first need to define the locales that the document will handle, along with its default locale. Let's assume that we want a metadata document written in all six UN official languages (English, French, Spanish, Chinese, Russian, Arabic). The main metadata object will be written as follows:
md <- ISOMetadata$new()
eng <- ISOLocale$new()
eng$setId("EN")
eng$setLanguage("EN")
eng$setCharacterSet("utf8")
md$addLocale(eng)
fr <- ISOLocale$new()
fr$setId("FR")
fr$setLanguage("FR")
fr$setCharacterSet("utf8")
md$addLocale(fr)
esp <- ISOLocale$new()
esp$setId("ES")
esp$setLanguage("ES")
esp$setCharacterSet("utf8")
md$addLocale(esp)
chi <- ISOLocale$new()
chi$setId("ZH")
chi$setLanguage("ZH")
chi$setCharacterSet("utf8")
md$addLocale(chi)
ru <- ISOLocale$new()
ru$setId("RU")
ru$setLanguage("RU")
ru$setCharacterSet("utf8")
md$addLocale(ru)
ar <- ISOLocale$new()
ar$setId("AR")
ar$setLanguage("AR")
ar$setCharacterSet("utf8")
md$addLocale(ar)
Then, to add multi-language properties, each R method that sets a metadata property subject to multi-language accepts a locales parameter that defines the translations for this property. To know whether a property of a given class is subject to multi-language, check the R documentation of this class. The locales value should be a list named with the language codes specified as locales in the metadata document (here EN, FR, ES, AR, ZH, and RU). Below is an example that specifies locales for the abstract of a data identification block:
ident <- ISODataIdentification$new()
ident$setAbstract(
"abstract",
locales = list(
EN = "abstract",
FR = "résumé",
ES = "resumen",
AR = "ملخص",
RU = "резюме",
ZH = "摘要"
))
By default, the $encode() method available in all geometa classes validates the output against the ISO 19139 XML schemas (parameter validate set to TRUE by default). This validation is non-blocking if the metadata produced is not valid. To make invalidity blocking, set the parameter strict to TRUE.
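As a minimal sketch of these modes, the snippet below encodes a deliberately incomplete metadata object three ways. It relies only on the validate and strict parameters described above. How exactly a strict failure is signalled is an assumption here (an R error is expected), so that call is wrapped in tryCatch().

```r
library(geometa)

# A deliberately incomplete metadata object, not expected to be schema-valid
md <- ISOMetadata$new()
md$setFileIdentifier("my-metadata-identifier")

# Default: validation runs but does not block, so the XML is still produced
xml_lenient <- md$encode()

# Skip validation entirely
xml_unchecked <- md$encode(validate = FALSE)

# strict = TRUE: invalidity is blocking (assumed to surface as an R error)
xml_strict <- tryCatch(
  md$encode(strict = TRUE),
  error = function(e) {
    message("Strict validation failed: ", conditionMessage(e))
    NULL
  }
)
```

Strict mode is mostly useful in automated pipelines, where an invalid document should stop processing rather than reach a catalogue.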
The validity status will be reported as footer (as XML comments) of the XML Metadata document produced:
<gmd:MD_Metadata xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gfc="http://www.isotc211.org/2005/gfc" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gmi="http://www.isotc211.org/2005/gmi" xmlns:gmx="http://www.isotc211.org/2005/gmx" xmlns:gts="http://www.isotc211.org/2005/gts" xmlns:srv="http://www.isotc211.org/2005/srv" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:gmlcov="http://www.opengis.net/gmlcov/1.0" xmlns:gmlrgrid="http://www.opengis.net/gml/3.3/rgrid" xmlns:xlink="http://www.w3.org/1999/xlink">
<gmd:characterSet>
<gmd:MD_CharacterSetCode codeList="http://www.isotc211.org/2005/resources/Codelist/ML_gmxCodelists.xml#MD_CharacterSetCode" codeListValue="utf8">utf8</gmd:MD_CharacterSetCode>
</gmd:characterSet>
<gmd:hierarchyLevel>
<gmd:MD_ScopeCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#MX_ScopeCode" codeListValue="dataset" codeSpace="ISOTC211/19115">dataset</gmd:MD_ScopeCode>
</gmd:hierarchyLevel>
<!--Metadata Creation date/time: 2019-01-17T19:02:42-->
<!--ISO 19139 XML generated by geometa R package - Version 0.5-->
<!--ISO 19139 XML compliance: NO-->
<!--geometa R package information: Contact: Emmanuel Blondel emmanuel.blondel1@gmail.com URL: https://github.com/eblondel/geometa/wiki BugReports: https://github.com/eblondel/geometa/issues-->
</gmd:MD_Metadata>
The INSPIRE metadata validator can be used in several ways:
- directly using the validator to get the validation report
inspireValidator = INSPIREMetadataValidator$new()
inspireValidator$getValidationReport(md) #with md being the metadata object
- using shortcuts, with the parameter inspire (TRUE/FALSE) available through the methods $encode() and $save()
md$encode(inspire = TRUE)
md$save("mymetadata.xml", inspire = TRUE)
These two methods add further XML comments about the INSPIRE validation to the footer of the ISO 19139 XML. An example on an empty ISOMetadata$new() object would look like this:
<gmd:MD_Metadata xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gfc="http://www.isotc211.org/2005/gfc" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gmi="http://www.isotc211.org/2005/gmi" xmlns:gmx="http://www.isotc211.org/2005/gmx" xmlns:gts="http://www.isotc211.org/2005/gts" xmlns:srv="http://www.isotc211.org/2005/srv" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:gmlcov="http://www.opengis.net/gmlcov/1.0" xmlns:gmlrgrid="http://www.opengis.net/gml/3.3/rgrid" xmlns:xlink="http://www.w3.org/1999/xlink">
<gmd:characterSet>
<gmd:MD_CharacterSetCode codeList="http://www.isotc211.org/2005/resources/Codelist/ML_gmxCodelists.xml#MD_CharacterSetCode" codeListValue="utf8">utf8</gmd:MD_CharacterSetCode>
</gmd:characterSet>
<gmd:hierarchyLevel>
<gmd:MD_ScopeCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#MX_ScopeCode" codeListValue="dataset" codeSpace="ISOTC211/19115">dataset</gmd:MD_ScopeCode>
</gmd:hierarchyLevel>
<!--Metadata Creation date/time: 2019-01-17T19:02:42-->
<!--ISO 19139 XML generated by geometa R package - Version 0.5-->
<!--ISO 19139 XML compliance: NO-->
<!--INSPIRE compliance: NO-->
<!--INSPIRE completeness: 5.56%-->
<!--INSPIRE Report: http://inspire-geoportal.ec.europa.eu/resources/sandbox/INSPIRE-14472bd2-1a82-11e9-9b61-52540023a883_20190117-190239/datasets/1/resourceReport-->
<!--geometa R package information: Contact: Emmanuel Blondel emmanuel.blondel1@gmail.com URL: https://github.com/eblondel/geometa/wiki BugReports: https://github.com/eblondel/geometa/issues-->
</gmd:MD_Metadata>
Note: Although the INSPIRE report URL is useful for checking the INSPIRE compliance details after producing the metadata document, please note that the INSPIRE geoportal administrators regularly purge the repository (which is a "sandbox") where your metadata has been tested. With time, the report URL will lead to a "Not found" page with the message: The page you were looking for appears to have been moved, deleted or does not exist in the INSPIRE Geoportal.
With geometa, it is very easy to read a metadata XML document into an ISOMetadata object, as illustrated in the example below:
require(XML)
xmlfile <- system.file("extdata", "metadata.xml", package = "geometa")
#read XML file
xml <- xmlParse(xmlfile)
#read XML as ISOMetadata object!
md <- ISOMetadata$new(xml = xml)
#... and then modify whatever you want in md with geometa ISO API!
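To make the "read, then modify" step concrete, here is a hedged sketch of a round trip: it reads the sample file, changes one property, and writes the result back to disk. It assumes the sample file path used above is still valid in your installed version of geometa. The identifier value and the temporary output file are placeholders chosen for illustration.

```r
library(geometa)
library(XML)

# Parse the sample file shipped with geometa and load it as an ISOMetadata object
xmlfile <- system.file("extdata", "metadata.xml", package = "geometa")
md <- ISOMetadata$new(xml = xmlParse(xmlfile))

# Modify a property with the same setter used when writing metadata
md$setFileIdentifier("my-updated-metadata-identifier")

# Write the modified document to a temporary file (placeholder location)
out_file <- tempfile(fileext = ".xml")
md$save(out_file)
```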
geometa offers the capacity to convert objects from/to other metadata languages. The objective is to provide a generic, interoperable mechanism to convert R metadata objects from one metadata standard to another.
For now, the focus is on mapping the core business metadata elements of ISO/OGC metadata (modeled in R by geometa) to two widely used metadata formats:
- NetCDF-CF Conventions - Climate and Forecast conventions - (modeled in R with ncdf4)
- EML (Ecological Metadata Language) (modeled in R with EML and emld)
The formats for which conversion is enabled can be listed with the R function getMappingFormats(pretty = TRUE). The column from indicates whether conversion from the specified format is supported (reading), and the column to indicates whether conversion to the specified format is supported (writing).
The conversion from one metadata object to another is done with the generic method as, in this way:
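For instance, the supported formats can be inspected before attempting a conversion. This is only a sketch: it assumes the result is a data.frame-like table, and only the from and to columns mentioned above are relied upon.

```r
library(geometa)

# List the metadata formats known to the geometa mapping mechanism
formats <- getMappingFormats(pretty = TRUE)
print(formats)

# The 'from' and 'to' columns flag reading and writing support respectively
print(colnames(formats))
```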
target_object <- as(source_object, "format")
For example, to convert an object md of class ISOMetadata from geometa to an object of class emld (using EML and emld), we run the R code below:
my_emld <- as(md, "emld")
Conversely, we can create an ISOMetadata object from an emld object:
my_geometa_object <- as(my_emld, "ISOMetadata")
For a NetCDF-CF object in package ncdf4, we can create an object of class ISOMetadata:
nc <- ncdf4::nc_open("http://gsics.eumetsat.int/thredds/dodsC/DemoLevel1B25Km/W_XX-EUMETSAT-Darmstadt,SURFACE+SATELLITE,METOPA+ASCAT_C_EUMP_20131231231800_37368_eps_o_125_l1.nc")
test_ogc_cf <- as(nc, "ISOMetadata")
Issues can be reported at https://github.com/eblondel/geometa/issues