On today's Web, Linked Data is published in different ways, which include data dumps, subject pages, and results of SPARQL queries. We call each such part a Linked Data Fragment.
The issue with the current Linked Data Fragments is that they are either so powerful that their servers suffer from low availability rates (as is the case with SPARQL), or they don't allow efficient querying.
Instead, this server offers Triple Pattern Fragments. Each Triple Pattern Fragment offers:
- data that corresponds to a triple pattern
- metadata that consists of the (approximate) total triple count
- controls that lead to all other fragments of the same dataset
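To make the first item more concrete, the sketch below builds the URL of a fragment for a given triple pattern. It assumes that a fragment is selected through subject, predicate and object query parameters on the data source URL. Clients should normally discover this through the controls inside each fragment, so the parameter names, the fragmentUrl helper and the local data source URL are assumptions made for illustration.

```javascript
// Hypothetical helper: builds a fragment URL for a triple pattern.
// Assumption: the data source accepts subject/predicate/object query parameters.
function fragmentUrl(datasourceUrl, pattern = {}) {
  const url = new URL(datasourceUrl);
  for (const term of ['subject', 'predicate', 'object']) {
    // Components that are left out act as wildcards, so they are not sent.
    if (pattern[term]) url.searchParams.set(term, pattern[term]);
  }
  return url.toString();
}

// Example: all triples with rdf:type as predicate in an assumed local "dbpedia" data source.
const exampleUrl = fragmentUrl('http://localhost:5000/dbpedia', {
  predicate: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
});
console.log(exampleUrl);
```

The resulting URL can be passed to any HTTP client; the response should contain the matching triples together with the metadata and controls listed above.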
An example server is available at data.linkeddatafragments.org.
This server requires Node.js 4.0 or higher and is tested on OS X and Linux. To install, execute:
$ [sudo] npm install -g ldf-server
First, create a configuration file config.json similar to config/config-example.json, in which you detail your data sources.
For example, this configuration uses an HDT file
and a SPARQL endpoint as sources:
{
  "title": "My Linked Data Fragments server",
  "datasources": {
    "dbpedia": {
      "title": "DBpedia 2014",
      "type": "HdtDatasource",
      "description": "DBpedia 2014 with an HDT back-end",
      "settings": { "file": "data/dbpedia2014.hdt" }
    },
    "dbpedia-sparql": {
      "title": "DBpedia 3.9 (Virtuoso)",
      "type": "SparqlDatasource",
      "description": "DBpedia 3.9 with a Virtuoso back-end",
      "settings": { "endpoint": "http://dbpedia.restdesc.org/", "defaultGraph": "http://dbpedia.org" }
    }
  }
}
The following sources are supported out of the box:
- HDT files (HdtDatasource with file setting)
- N-Triples documents (TurtleDatasource with url setting)
- Turtle documents (TurtleDatasource with url setting)
- JSON-LD documents (JsonLdDatasource with url setting)
- SPARQL endpoints (SparqlDatasource with endpoint and optionally defaultGraph settings)
Support for new sources is possible by implementing the Datasource interface.
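Because every built-in data source type expects specific settings, it can help to check a configuration before starting the server. The following is a minimal sketch of such a check, based only on the list above. The checkDatasources helper is hypothetical and not part of ldf-server, and it reports custom data source types as "not built in" rather than as errors in the strict sense.

```javascript
// Settings required per built-in data source type, taken from the list above.
const REQUIRED_SETTINGS = {
  HdtDatasource: ['file'],
  TurtleDatasource: ['url'],
  JsonLdDatasource: ['url'],
  SparqlDatasource: ['endpoint'], // defaultGraph is optional
};

// Hypothetical helper (not part of ldf-server):
// returns a list of human-readable problems found in a parsed config object.
function checkDatasources(config) {
  const problems = [];
  for (const [name, source] of Object.entries(config.datasources || {})) {
    const required = REQUIRED_SETTINGS[source.type];
    if (!required) {
      problems.push(`${name}: type ${source.type} is not built in`);
      continue;
    }
    for (const setting of required) {
      if (!(source.settings && setting in source.settings))
        problems.push(`${name}: missing setting ${setting}`);
    }
  }
  return problems;
}

// Example with one valid and one incomplete data source.
const sampleConfig = {
  datasources: {
    dbpedia: { type: 'HdtDatasource', settings: { file: 'data/dbpedia2014.hdt' } },
    broken: { type: 'SparqlDatasource', settings: {} },
  },
};
console.log(checkDatasources(sampleConfig));
```

In practice, the config object would come from parsing your config.json with JSON.parse.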
After creating a configuration file, execute:
$ ldf-server config.json 5000 4
Here, 5000 is the HTTP port on which the server will listen, and 4 the number of worker processes.
Now visit http://localhost:5000/ in your browser.
You can reload the server without any downtime in order to load a new configuration or version.
To do this, you need the process ID of the server master process.
One way to obtain it is from the server logs:
$ bin/ldf-server config.json
Master 28106 running.
Worker 28107 running on http://localhost:3000/.
If you send the server a SIGHUP signal:
$ kill -s SIGHUP 28106
it will reload by replacing its workers.
Note that crashed or killed workers are always replaced automatically.
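If you want to script the reload, the two steps above (finding the master process ID and sending SIGHUP) can be sketched in Node.js as follows. The helper names are hypothetical, and the parser assumes the log line has exactly the format shown above.

```javascript
// Extracts the master process ID from log output such as "Master 28106 running.".
// Assumption: the log line has exactly the format shown in the example above.
function parseMasterPid(logText) {
  const match = /^Master (\d+) running\.$/m.exec(logText);
  return match ? Number(match[1]) : null;
}

// Sends SIGHUP to the given process, like "kill -s SIGHUP <pid>" does.
function reloadServer(pid) {
  process.kill(pid, 'SIGHUP');
}

const log = 'Master 28106 running.\nWorker 28107 running on http://localhost:3000/.';
console.log(parseMasterPid(log)); // 28106
// reloadServer(parseMasterPid(log)); // uncomment to reload a running server
```

The call to reloadServer is left commented out, since it throws if no process with that ID exists.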
A typical Linked Data Fragments server will be exposed
on a public domain or subdomain along with other applications.
Therefore, you need to configure the server to run behind an HTTP reverse proxy.
To set this up, configure the server's public URL in your server's config.json:
{
  "title": "My Linked Data Fragments server",
  "baseURL": "http://data.example.org/",
  "datasources": { … }
}
Then configure your reverse proxy to pass requests to your server. Here's an example for nginx:
server {
  server_name data.example.org;
  location / {
    proxy_pass http://127.0.0.1:3000$request_uri;
    proxy_set_header Host $http_host;
    proxy_pass_header Server;
  }
}
Change the value 3000 into the port on which your Linked Data Fragments server runs.
If you would like to proxy the data in a subfolder such as http://example.org/my/data, modify the baseURL in your config.json to "http://example.org/my/data" and change location from / to /my/data (excluding a trailing slash).
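To check which public URLs your clients will see for a given baseURL, a small sketch like the following can help. It assumes that each data source is exposed under its configured name directly below the baseURL; the publicDatasourceUrl helper is hypothetical and only illustrates how the slash at the end of the baseURL is handled.

```javascript
// Hypothetical helper: joins the configured baseURL and a data source name
// without producing a double or missing slash.
// Assumption: data sources are exposed as <baseURL>/<name>.
function publicDatasourceUrl(baseURL, datasourceName) {
  return baseURL.replace(/\/+$/, '') + '/' + encodeURIComponent(datasourceName);
}

console.log(publicDatasourceUrl('http://example.org/my/data', 'dbpedia'));
// http://example.org/my/data/dbpedia
console.log(publicDatasourceUrl('http://data.example.org/', 'dbpedia'));
// http://data.example.org/dbpedia
```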
HTTPS can be enabled in two ways: natively by the server, or through a proxy (explained above).
With native HTTPS, the server will establish the SSL layer. Set the following values in your config file to enable this:
{
  "protocol": "https",
  "ssl": {
    "keys" : {
      "key": "./private-key-server.key.pem",
      "ca": ["./root-ca.crt.pem"],
      "cert": "./server-certificate.crt.pem"
    }
  }
}
If protocol is not specified, the server derives the protocol from the baseURL. Hence, HTTPS can also be enabled as follows:
{
  "baseURL": "https://data.example.org/",
  "ssl": {
    "keys" : {
      "key": "./private-key-server.key.pem",
      "ca": ["./root-ca.crt.pem"],
      "cert": "./server-certificate.crt.pem"
    }
  }
}
If you decide to let a proxy handle HTTPS, use this configuration to run the server as http, but construct links as https (so clients don't break):
{
  "protocol": "http",
  "baseURL": "https://data.example.org/"
}
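The interplay between protocol and baseURL in the configurations above can be summarized in a short sketch. This is an illustration of the described rules, not the server's actual code; the function names are hypothetical, and the fallback to plain HTTP when neither setting is present is an assumption.

```javascript
// Protocol the server itself listens on:
// an explicit "protocol" wins, otherwise it is derived from "baseURL".
function listeningProtocol(config) {
  if (config.protocol) return config.protocol;
  if (config.baseURL) return new URL(config.baseURL).protocol.replace(/:$/, '');
  return 'http'; // assumption: plain HTTP when nothing is configured
}

// Protocol used in generated links: taken from "baseURL" when present.
function linkProtocol(config) {
  if (config.baseURL) return new URL(config.baseURL).protocol.replace(/:$/, '');
  return listeningProtocol(config);
}

// HTTPS handled by a proxy: the server listens on http, links use https.
const proxied = { protocol: 'http', baseURL: 'https://data.example.org/' };
console.log(listeningProtocol(proxied), linkProtocol(proxied)); // http https

// Native HTTPS derived from the baseURL alone.
const native = { baseURL: 'https://data.example.org/' };
console.log(listeningProtocol(native), linkProtocol(native)); // https https
```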
You can export the environment variable ORIGINAL_PATH_HEADER before starting the server.
Its value is the name of a request header: the server looks up that header in incoming requests and uses the header's value, instead of the request path, to generate URLs.
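One way to picture this lookup is the sketch below. It is an interpretation of the description above rather than the server's actual code: the effectivePath helper and the X-Original-Path header name are hypothetical, and falling back to the request path when the header is absent is an assumption.

```javascript
// Hypothetical sketch: picks the path used for URL generation.
// "request" is shaped like a Node.js http.IncomingMessage (url and headers).
function effectivePath(request, env = process.env) {
  const headerName = env.ORIGINAL_PATH_HEADER;
  if (headerName) {
    // Node.js exposes incoming header names in lowercase.
    const original = request.headers[headerName.toLowerCase()];
    if (original) return original;
  }
  // Assumption: without the variable or the header, the request path is used.
  return request.url;
}

const request = {
  url: '/dbpedia',
  headers: { 'x-original-path': '/my/data/dbpedia' },
};
console.log(effectivePath(request, { ORIGINAL_PATH_HEADER: 'X-Original-Path' })); // /my/data/dbpedia
console.log(effectivePath(request, {})); // /dbpedia
```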
If you want to rapidly deploy the server as a microservice, you can build a Docker container as follows:
$ docker build -t ldf-server .
After that, you can run your newly created container:
$ docker run -p 3000:3000 -t -i --rm -v $(pwd)/config.json:/tmp/config.json ldf-server /tmp/config.json
You can enable the Memento protocol to offer different versions of an evolving dataset.
The Linked Data Fragments server is written by Ruben Verborgh.
The Linked Data Fragments client is written by Ruben Verborgh and colleagues.
This code is copyrighted by Ghent University – imec and released under the MIT license.