Gen version 15.2
frmichel committed Jul 29, 2022
1 parent b937095 commit b2f1f50
Showing 8 changed files with 149 additions and 109 deletions.
11 changes: 10 additions & 1 deletion CHANGELOG.md
@@ -1,7 +1,16 @@
# TAXREF-LD changelog


-## [15.1] 2022-01-19 - TAXREF v15
+## [15.2] 2022-07-29
+
+### Changed
+- Fixed issues in the ontology (owl:AnnotationProperties instead of owl:ObjectProperties)
+- Update of dataset metadata (fixed issues, added dcat:Distribution, better named graphs description...)
+- Regenerate taxonomy (taxa and names) to fix an issue with missing taxa (detected with Agroportal)
+- Reorganize data dump as a zip file with subfolders rather than a tar of zip files
+
+
+## [15.1] 2022-01-19

### Added
- Development stages and sex as part of the species interactions, e.g.: species A in larva stage feeds on B.
11 changes: 6 additions & 5 deletions README.md
@@ -49,6 +49,7 @@ Previous versions are still available on this GitHub repo:

| Version | Download link |
| ---- | ---- |
+| 15.2 | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6940891.svg)](https://doi.org/10.5281/zenodo.6940891) |
| 15.1 | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5876775.svg)](https://doi.org/10.5281/zenodo.5876775) |
| 13.0 | https://github.com/frmichel/taxref-ld/tree/13.0/dataset |
| 12.0 | https://github.com/frmichel/taxref-ld/tree/12.0/dataset |
@@ -69,20 +70,20 @@ The following **named graphs** can be queried from our SPARQL endpoint:

| Named graph | Description | No. RDF triples |
| ------------- | ---- | ----: |
-| `http://taxref.mnhn.fr/lod/graph/metadata` | DCAT, VOID and SPARQL SD dataset descriptions + definition of various classes, concepts, properties (content of files `dataset/Taxrefld_static*.ttl`) |1,469|
+| `http://taxref.mnhn.fr/lod/graph/metadata` | DCAT, VOID and SPARQL SD dataset descriptions + definition of various classes, concepts, properties (content of files `dataset/Taxrefld_static*.ttl`) |1,740|
| `http://taxref.mnhn.fr/lod/graph/biblio` | bibliographic resources |408,737|
-| `http://taxref.mnhn.fr/lod/graph/locations` | regions, departements, territories etc. |320,599|
+| `http://taxref.mnhn.fr/lod/graph/locations` | regions, departements, territories etc. |393,496|
| `http://taxref.mnhn.fr/lod/graph/media` | media (photos) linked to taxa |690,508|
| `http://taxref.mnhn.fr/lod/graph/statuscodes` | description of the status values of types international convention, european directive, protection and regulation. These are represented as instances of the class bibo:DocumentPart (e.g. http://taxref.mnhn.fr/lod/status/BONN/IBOAC) and related to the bibliographic source describing the document with property dct:isPartOf (content of files `statusCodes.ttl` and `statusBiblio.ttl`) |1,804|
-| `http://taxref.mnhn.fr/lod/graph/classes/{TAXREF version}` | description of taxa as OWL classes |4,300,619|
-| `http://taxref.mnhn.fr/lod/graph/concepts/{TAXREF version}` | description of scientific names as SKOS concepts |7,739,313|
+| `http://taxref.mnhn.fr/lod/graph/classes/{TAXREF version}` | description of taxa as OWL classes |4,374,167|
+| `http://taxref.mnhn.fr/lod/graph/concepts/{TAXREF version}` | description of scientific names as SKOS concepts |7,799,394|
| `http://taxref.mnhn.fr/lod/graph/interactions/{TAXREF version}` | species interactions |303,025|
| `http://taxref.mnhn.fr/lod/graph/statuses/{TAXREF version}` | all taxa statuses (legal, biogeographical, red list) |7,846,358|
| `http://taxref.mnhn.fr/lod/graph/vernacular/{TAXREF version}` | taxa vernacular names (direct and as SKOS-XL labels) |518,708|
| `http://taxref.mnhn.fr/lod/graph/dbxref/{TAXREF version}` | cross-references to identifiers of third-party data sources such as GBIF, WoRMS, the Plant List etc. |10,330,904|
| `http://taxref.mnhn.fr/lod/graph/webpages/{TAXREF version}` | `foaf:page` links to webpages |2,567,841|
| `http://taxref.mnhn.fr/lod/graph/links-*/{TAXREF version}` | interlinking to equivalent URIs from NCBI, Agrovoc, WoRMS |250,249|
-| Total | | 35,280,107 |
+| Total | | 35,486,931 |
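
The triple counts in this table can in principle be re-checked against the endpoint, one named graph at a time. The following is a minimal sketch only: it assumes the endpoint accepts a standard `query` GET parameter (the base URL is the one used by `src/add_dwc_ranks.py` in this commit), and the helper name `count_query_url` is hypothetical. It only builds the request URL and does not contact the server.

```python
import urllib.parse

# Endpoint base URL as used in src/add_dwc_ranks.py (assumed to accept GET ?query=...)
ENDPOINT = "https://taxref.mnhn.fr/sparql"


def count_query_url(graph_uri):
    """Build a GET URL asking how many triples a named graph holds (hypothetical helper)."""
    query = "SELECT (COUNT(*) AS ?n) WHERE { GRAPH <%s> { ?s ?p ?o } }" % graph_uri
    return ENDPOINT + "?" + urllib.parse.urlencode({"query": query})


# Example: the classes graph, with the version suffix used elsewhere in this commit
url = count_query_url("http://taxref.mnhn.fr/lod/graph/classes/15.0")
print(url)
```

Fetching that URL (for instance with `requests.get`) should return a single binding `?n`, which can then be compared with the "No. RDF triples" column.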

## License

4 changes: 2 additions & 2 deletions dataset/examples/Taxrefld_example_interactions.ttl
@@ -31,11 +31,11 @@

# ================================== Interaction between two species =====================================

-# Direct link to the vernacular name (here: pollinates)
+# Direct link (here: pollinates)
<http://taxref.mnhn.fr/lod/taxon/838621/13.0>
ro:RO_0002455 <http://taxref.mnhn.fr/lod/taxon/630004/13.0> .

-# Reified link to the vernacular name (adds location and bibliographic reference)
+# Reified link (adds location and bibliographic reference)
<http://taxref.mnhn.fr/lod/interact/838621-630004-178398>
a <http://purl.org/biotop/biotop.owl#OrganismInteraction> , rdf:Statement ;
rdfs:label "pollinates (statement)" ;
123 changes: 123 additions & 0 deletions src/add_dwc_ranks.py
@@ -0,0 +1,123 @@
#!/bin/python3

import requests
import urllib.parse


endpoint = "https://taxref.mnhn.fr/sparql?query="
headers = { 'accept' : 'text/turtle' }

def run_query(query, outputfile):
    output = requests.get(endpoint + urllib.parse.quote(query), headers = headers)
    with open(outputfile, "w") as f:
        f.write(output.text)
    return

prefixes = '''
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix dwc: <http://rs.tdwg.org/dwc/terms/>
prefix taxref: <http://taxref.mnhn.fr/lod/>
prefix taxrefprop: <http://taxref.mnhn.fr/lod/property/>
prefix taxrefrk: <http://taxref.mnhn.fr/lod/taxrank/>
'''


# Add dwc:subgenus
query = prefixes + '''
construct { ?s dwc:subgenus ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:SubGenus;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}
'''
run_query(query, "dwc_subgenus.ttl")


# Add dwc:genus
query = prefixes + '''
construct { ?s dwc:genus ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:Genus;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}'''
run_query(query, "dwc_genus.ttl")


# Add dwc:subfamily
query = prefixes + '''
construct { ?s dwc:subfamily ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:SubFamily;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}'''
run_query(query, "dwc_subfamily.ttl")


# Add dwc:family
query = prefixes + '''
construct { ?s dwc:family ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:Family;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}'''
run_query(query, "dwc_family.ttl")


# Add dwc:order
query = prefixes + '''
construct { ?s dwc:order ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:Order;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}'''
run_query(query, "dwc_order.ttl")


# Add dwc:phylum
query = prefixes + '''
construct { ?s dwc:phylum ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:Phylum;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}'''
run_query(query, "dwc_phylum.ttl")


# Add dwc:kingdom
query = prefixes + '''
construct { ?s dwc:kingdom ?tLabel. }
where {
graph <http://taxref.mnhn.fr/lod/graph/classes/15.0> {
?t a owl:Class;
taxrefprop:hasRank taxrefrk:Kingdom;
rdfs:label ?tLabel.
?s rdfs:subClassOf+ ?t.
}
}'''
run_query(query, "dwc_kingdom.ttl")
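
The seven queries above differ only in the Darwin Core term and the TAXREF-LD rank, so they could also be generated from a small table. This is a sketch of that alternative, not the code committed here: `RANKS`, `GRAPH`, `build_query` and `jobs` are hypothetical names, and the prefixes, rank names and graph URI are copied from the script above.

```python
PREFIXES = '''
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix dwc: <http://rs.tdwg.org/dwc/terms/>
prefix taxref: <http://taxref.mnhn.fr/lod/>
prefix taxrefprop: <http://taxref.mnhn.fr/lod/property/>
prefix taxrefrk: <http://taxref.mnhn.fr/lod/taxrank/>
'''

# (Darwin Core term, TAXREF-LD rank) pairs, in the order used by the script above
RANKS = [
    ("subgenus", "SubGenus"),
    ("genus", "Genus"),
    ("subfamily", "SubFamily"),
    ("family", "Family"),
    ("order", "Order"),
    ("phylum", "Phylum"),
    ("kingdom", "Kingdom"),
]

GRAPH = "http://taxref.mnhn.fr/lod/graph/classes/15.0"


def build_query(dwc_term, rank, graph=GRAPH):
    """Return the CONSTRUCT query that attaches dwc:<dwc_term> to every descendant taxon."""
    return PREFIXES + '''
construct { ?s dwc:%s ?tLabel. }
where {
  graph <%s> {
    ?t a owl:Class;
       taxrefprop:hasRank taxrefrk:%s;
       rdfs:label ?tLabel.
    ?s rdfs:subClassOf+ ?t.
  }
}
''' % (dwc_term, graph, rank)


# One (output file, query) pair per rank; nothing is sent to the endpoint here
jobs = [("dwc_%s.ttl" % term, build_query(term, rank)) for term, rank in RANKS]
```

Each pair in `jobs` could then be passed to the script's own function as `run_query(query, outputfile)`. Keeping the graph URI in one place would also make it easier to follow `DATASET_VERSION` if that value changes.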

84 changes: 0 additions & 84 deletions src/add_dwc_ranks.sparql

This file was deleted.

2 changes: 1 addition & 1 deletion src/env.sh
@@ -3,7 +3,7 @@
# Licensed under the Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

export DATASET_VERSION=15.0
-export DATASET_DATE=2022-01-13
+export DATASET_DATE=2022-07-29

# MongoDB database
export DB=taxrefv15
9 changes: 5 additions & 4 deletions src/virtuoso/import-taxrefld.sh
@@ -33,10 +33,6 @@ graph="http://taxref.mnhn.fr/lod/graph/concepts"
graph="http://taxref.mnhn.fr/lod/graph/classes/$DATASET_VERSION"
./virtuoso-import.sh --cleargraph --path $DATA_DIR --graph $graph taxonomy_classes.ttl

-# After the previous one has been imported, use add_dwc_ranks.sparql to generate files dwc_*.ttl
-graph="http://taxref.mnhn.fr/lod/graph/classes/$DATASET_VERSION"
-./virtuoso-import.sh --path $DATA_DIR --graph $graph dwc_%.ttl
-
graph="http://taxref.mnhn.fr/lod/graph/classes/$DATASET_VERSION"
./virtuoso-import.sh --path $DATA_DIR --graph $graph dwc_%.ttl

@@ -62,6 +58,11 @@ graph="http://taxref.mnhn.fr/lod/graph/links-worms"
./virtuoso-import.sh --cleargraph --path $DATA_DIR --graph $graph externalIds_worms.ttl


+# After the taxonomy is loaded, use add_dwc_ranks.py to generate files dwc_*.ttl
+graph="http://taxref.mnhn.fr/lod/graph/classes/$DATASET_VERSION"
+./virtuoso-import.sh --path .. --graph $graph dwc_%.ttl
+
+
# Calculated links
graph="http://taxref.mnhn.fr/lod/graph/links-agrovoc"
./virtuoso-import.sh --cleargraph --path $DATA_DIR --graph $graph links-agrovoc.nt
14 changes: 2 additions & 12 deletions src/xr2rml/xr2rml_externalIds_dbxref_tpl.ttl
@@ -55,12 +55,7 @@
rr:predicateObjectMap [
rr:predicate {{WDTPROP}};
rr:objectMap [ xrr:reference "$.externalId"; xsd:datatype xsd:string ];
-].
-
-<#TM_Taxon_XRef>
-a rr:TriplesMap;
-xrr:logicalSource [ xrr:query """db.externalIds.find({externalDbName: "{{EXTDBNAME}}"}, $where: 'this.taxrefId == this.taxonReferenceId'})""" ];
-rr:subjectMap <#SM_Taxon>;
+];
rr:predicateObjectMap [
rr:predicate schema:identifier;
rr:objectMap [ rr:template "{$.taxrefId}{$.externalDbName}{$.externalId}"; rr:termType rr:BlankNode ];
@@ -76,12 +71,7 @@
rr:predicateObjectMap [
rr:predicate {{WDTPROP}};
rr:objectMap [ xrr:reference "$.externalId"; xsd:datatype xsd:string ];
-].
-
-<#TM_Name_XRef>
-a rr:TriplesMap;
-xrr:logicalSource [ xrr:query """db.externalIds.find({externalDbName: "{{EXTDBNAME}}"})""" ];
-rr:subjectMap <#SM_Name>;
+];
rr:predicateObjectMap [
rr:predicate schema:identifier;
rr:objectMap [ rr:template "{$.taxrefId}{$.externalDbName}{$.externalId}"; rr:termType rr:BlankNode ];
