Repository for subproject P4 "Endpoint Drafting and Testing", part of project "Increasing interoperability of bioimage dataset resources" at the 2024 de.NBI BioHackathon

NFDI4BIOIMAGE/omero-kg-benchmark

BioHackathon2024-P4

Repository for project P4 “Endpoint Drafting and Testing” at the 2024 de.NBI BioHackathon

Benchmark environment

Triplestores and SPARQL endpoints

All endpoints run on the 128.176.233.7 server.

| Name | Query form (VPN IP) | Endpoint, HTTP API (VPN IP) | Query form (public IP) | Endpoint, HTTP API (public IP) | Comments |
|---|---|---|---|---|---|
| Ontop | http://10.14.28.137:8080 | http://10.14.28.137:8080/sparql | http://128.176.233.7:8080 | http://128.176.233.7:8080/sparql | |
| Fuseki | http://10.14.28.137:3030/#/dataset/OME/query | http://10.14.28.137:3030/OME/sparql | http://128.176.233.7:3030/#/dataset/OME/query | http://128.176.233.7:3030/OME/sparql | |
| Virtuoso | http://10.14.28.137:8890/sparql | http://10.14.28.137:8890/sparql | http://128.176.233.7:8890/sparql | http://128.176.233.7:8890/sparql | |

SPARQL client

Apache-Jena

Queries

The queries/ directory contains a number of SPARQL query files. They can be run against any of the endpoints listed above (`rsparql` must be in `$PATH`):
rsparql --service http://128.176.233.7:8080/sparql --query 01-list_of_attributes.rq
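
Where `rsparql` is not installed, the same query can in principle be sent directly over HTTP using the standard SPARQL 1.1 Protocol. The following is a minimal sketch, not part of this repository: the helper names `build_request` and `run_query` are hypothetical, and it assumes the endpoint accepts GET requests and can return SELECT results as `application/sparql-results+json`.

```python
import json
import urllib.parse
import urllib.request


def build_request(endpoint_url, query_text):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query_text})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint_url, query_path, timeout=60):
    """Read a query file, send it to the endpoint, and return the parsed JSON."""
    with open(query_path, encoding="utf-8") as fh:
        query_text = fh.read()
    request = build_request(endpoint_url, query_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Under these assumptions, `run_query("http://128.176.233.7:8080/sparql", "01-list_of_attributes.rq")["results"]["bindings"]` would hold the rows of a SELECT query, one dictionary per row.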

Timing

The script queries/timer.sh runs a given query N times, measures wall-clock, user, and system time, and reports the resulting statistics. Timings are saved to disk. Usage (must be run from the queries/ directory):
./timer.sh QUERY ENDPOINTURL ENDPOINTNAME NQUERIES
Example:
cd queries
./timer.sh 01-list_of_attributes.rq http://128.176.233.7:8080/sparql ontop 100

This would run the query 01-list_of_attributes.rq on the ontop endpoint and write results to 01-list_of_attributes.ontop.timings.csv.
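
To illustrate what such a measurement involves, here is a rough Python sketch of timing a command repeatedly. It is not the implementation of timer.sh: the function names and the CSV header (`wall_s`, `user_s`, `system_s`) are assumptions made for this example, and the user and system times are those of the client process on a Unix-like system, not of the triplestore server.

```python
import csv
import os
import subprocess
import time


def time_command(command, n_runs):
    """Run `command` n_runs times; return one (wall, user, system) tuple per run."""
    rows = []
    for _ in range(n_runs):
        before = os.times()
        start = time.perf_counter()
        # Discard the query output; only the timing is of interest here.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        wall = time.perf_counter() - start
        after = os.times()
        # CPU time of finished child processes accumulates in children_*.
        rows.append((
            wall,
            after.children_user - before.children_user,
            after.children_system - before.children_system,
        ))
    return rows


def write_timings(rows, csv_path):
    """Write the timing rows to a CSV file (header names are assumed)."""
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["wall_s", "user_s", "system_s"])
        writer.writerows(rows)
```

A call such as `time_command(["rsparql", "--service", "http://128.176.233.7:8080/sparql", "--query", "01-list_of_attributes.rq"], 100)` would mirror the example above. Each run starts a new client process, so its startup cost is part of every measurement.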

Timing analysis

The notebook queries/analyze_timings.ipynb loads all timing CSV data, computes descriptive statistics, and renders a number of plots. Adjust to your liking. Required Python packages: pandas, seaborn, matplotlib.
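
For a quick look at a single timing file without opening the notebook, a dependency-free sketch along the following lines may help. It is not taken from the notebook: the function names are hypothetical, the file name pattern `<query>.<endpoint>.timings.csv` follows the timer.sh example, and the column name `wall_s` is an assumption that must be adapted to the actual CSV header.

```python
import csv
import statistics
from pathlib import Path


def parse_timing_name(path):
    """Split '<query>.<endpoint>.timings.csv' into (query, endpoint)."""
    name = Path(path).name
    suffix = ".timings.csv"
    if not name.endswith(suffix):
        raise ValueError(f"not a timings file: {name}")
    query, _, endpoint = name[: -len(suffix)].rpartition(".")
    return query, endpoint


def summarize(path, column="wall_s"):
    """Return count, mean, median, and sample standard deviation of one column."""
    with open(path, newline="") as fh:
        values = [float(row[column]) for row in csv.DictReader(fh)]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # The sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
```

Parsing the query and endpoint names from the file name makes it possible to group summaries by endpoint without any extra metadata.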

Mappings

Mappings are defined in the omero-ontop-mappings repo. The repo is cloned on the Münster server. Ontop runs inside a screen session and normally restarts after a change to the mappings, ontology, or config. It may become necessary to restart Ontop manually if any of these files contains a syntax error. To (re-)launch the Ontop server:
screen -dr ontop
cd /home/ubuntu/repos/omero-ontop-mappings/hack24
./run-ontop.sh

To leave the screen session, press and hold CTRL, press and release a, press and release d, then release CTRL. This brings you back to your login shell. You can then log out of the server; the screen session will continue in the background.

Further reading

https://medium.com/wallscope/comparing-linked-data-triplestores-ebfac8c3ad4f

Logbook

A log of all benchmark runs executed during the 2024 de.NBI BioHackathon is in the logbook document.
