The following instructions are for reproducing the results in the paper:
Modelling Life Cycle Sustainability Assessment on the Semantic Web
The overall workflow has several steps, which need to be executed in the correct order. Below is the list of steps, followed by individual instructions:
- Requirements
- Clone Repositories
- Getting Data
- Run Data Conversion
- Collect triple files
- Setup virtuoso
- Load triples
- Setup Yasgui
- Validate database
To run all scripts, you need a machine running Linux. The machine should have at least the following specifications:
- Linux Ubuntu 18.04 Distribution (with git, bash, unzip, wget and docker)
- Python 3.6+, pip and virtualenv
- Disk space: ~500 GB free
- CPU: 8 cores
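The requirements above can be checked before starting. The following is a minimal sketch, not part of the original repositories; the helper names `missing_tools` and `free_gb` are hypothetical, and the tool list simply mirrors the requirements.

```shell
#!/usr/bin/env bash
# Preflight sketch: report which required tools are missing from PATH.
# The helper names below are hypothetical, not part of the BONSAI repositories.

missing_tools() {
    # Print each tool from the argument list that is not found on PATH.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

free_gb() {
    # Free space, in whole GB, on the filesystem holding the given directory.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

missing=$(missing_tools git bash unzip wget docker python3 pip virtualenv)
[ -z "$missing" ] || echo "Missing tools:" $missing
echo "Free disk space: $(free_gb .) GB (about 500 GB recommended)"
echo "CPU cores: $(nproc) (8 recommended)"
```

Run it from the directory where the repositories will be cloned, so that the disk-space figure refers to the right filesystem.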
All code for reproducibility is publicly available from git repositories.
You can clone all repositories needed for reproducibility with the following commands:
git clone
git clone
git clone
git clone
We make use of the exiobase and YSTAFDB datasets.
Access to the exiobase dataset requires a free user account.
For convenience, we provide a copy of the dataset freely available at the following URL:
You can download, unpack, and move exiobase files to their required positions with the following commands:
wget '' -O
rm -rf
mv EXIOBASE_3.3.17_hsut_2011/MR_HSUT_2011_v3_3_17_extensions.xlsb arborist/arborist/data
mv EXIOBASE_3.3.17_hsut_2011/MR_HUSE_2011_v3_3_17.xlsb EXIOBASE-conversion-software/EXIOBASE_conversion_software/data/
mv EXIOBASE_3.3.17_hsut_2011/MR_HSUP_2011_v3_3_17.xlsb EXIOBASE-conversion-software/EXIOBASE_conversion_software/data/
Otherwise, the exiobase dataset EXIOBASE 3.3.17 hsut 2011 can be downloaded from this URL
It can be found under the tab DATA DOWNLOAD/EXIOBASE3 - hybrid
Extract the downloaded zip file and move files according to the above commands.
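Whichever download route is used, it can help to confirm that the three workbooks ended up where the conversion steps expect them. A minimal sketch, assuming it is run from the directory that holds the cloned repositories; `check_exiobase_files` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the three EXIOBASE workbooks are in place.
# check_exiobase_files is a hypothetical helper, not part of the repositories.

check_exiobase_files() {
    # $1: directory containing the arborist and EXIOBASE-conversion-software clones
    root=$1
    status=0
    for path in \
        arborist/arborist/data/MR_HSUT_2011_v3_3_17_extensions.xlsb \
        EXIOBASE-conversion-software/EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.xlsb \
        EXIOBASE-conversion-software/EXIOBASE_conversion_software/data/MR_HSUP_2011_v3_3_17.xlsb
    do
        if [ -f "$root/$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

check_exiobase_files . || echo "Some EXIOBASE files are not in place yet."
```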
You can download and unpack the YSTAFDB files with the following commands:
wget '' -O
unzip -d ystafdb-input
rm -rf
mv ystafdb-input ystafdb/
In this step we run the EXIOBASE-conversion-software, as well as the ystafdb software, to convert Excel files to RDF data.
The EXIOBASE-conversion-software first converts the exiobase 3.3.17 xlsb dataset to CSV files and then extracts the final triple graph from them.
To install the software, and convert the xlsb files to CSV files, the following commands can be used:
cd ~/EXIOBASE-conversion-software
mkdir output
pipenv --python python3 install
pipenv shell
python install
excel2csv-cli -i EXIOBASE_conversion_software/data/MR_HSUP_2011_v3_3_17.xlsb -o EXIOBASE_conversion_software/data/
excel2csv-cli -i EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.xlsb -o EXIOBASE_conversion_software/data/
Be aware that the following scripts can take several hours to run; in some environments they should be run inside a terminal session manager such as screen. While still in the pipenv shell, convert the hsup data file to an RDF graph with the following commands:
csv2rdf-cli -i EXIOBASE_conversion_software/data/MR_HSUP_2011_v3_3_17.csv -o EXIOBASE_conversion_software/data/ -c HSUP --flowtype output --multifile 100000 --merge True
mv EXIOBASE_conversion_software/data/flows_merged.nt output/exiobase_hsup.nt
gzip output/exiobase_hsup.nt
rm -rf EXIOBASE_conversion_software/data/MR_HSUP_2011_v3_3_17*
We now do the same for the huse data file with the following commands:
csv2rdf-cli -i EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17.csv -o EXIOBASE_conversion_software/data/ -c HUSE --flowtype input --multifile 100000 --merge True
mv EXIOBASE_conversion_software/data/flows_merged.nt output/exiobase_huse.nt
gzip output/exiobase_huse.nt
rm -rf EXIOBASE_conversion_software/data/MR_HUSE_2011_v3_3_17*
The two output RDF graphs for the hsup and huse data can now be found in the output folder as exiobase_huse.nt.gz and exiobase_hsup.nt.gz.
The arborist software extracts metadata from the exiobase dataset, which serves as the foundation for integration with other datasets. To install and use arborist, run the following commands:
cd ~/arborist
rm -rf config.json
git clone
mv bd4ab2d5ed82a8523a162e76d968971a/config.json .
rm -rf bd4ab2d5ed82a8523a162e76d968971a
pipenv --python python3 install
pipenv shell
python install
arborist-cli regenerate output -i arborist/data/
The ystafdb software extracts triple graphs from the ystafdb dataset. To install and use it, run the following commands:
cd ~/ystafdb
pipenv --python python3 install
pipenv shell
python install
ystafdb-cli -i ystafdb-input
cd ../
The ystafdb triple files can now be found in the output folder.
We now move all output graphs to a single folder, for import into virtuoso. This is done with the following commands:
git clone
mv cf16f495291d6f47fbd659367c2863ea/file_mover.bash .
rm -rf cf16f495291d6f47fbd659367c2863ea
mkdir -p import
bash file_mover.bash
wget -O import/ontology_v0.2.ttl
To set up virtuoso with docker, use the following commands:
docker pull openlink/virtuoso-opensource-7:latest
mkdir -p database
wget -O virtuoso.ini
mv virtuoso.ini database/
docker run --name vos -d -t -i -v "$(pwd)/database:/database" -v "$(pwd)/import:/import" -p 1111:1111 -p 8890:8890 openlink/virtuoso-opensource-7:latest
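The container may need some time before it accepts connections. A minimal sketch for waiting until the HTTP port answers, assuming the 8890 port mapping from the docker run command above and that the server responds at /sparql; `wait_for_http` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: poll an HTTP URL until it responds or a timeout expires.
# wait_for_http is a hypothetical helper, not part of the repositories.

wait_for_http() {
    # $1: URL to probe, $2: maximum number of seconds to wait
    url=$1
    deadline=$(( $(date +%s) + $2 ))
    while [ "$(date +%s)" -lt "$deadline" ]; do
        if wget -q --spider --timeout=5 --tries=1 "$url" 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Example (uncomment once the container has been started):
# wait_for_http http://localhost:8890/sparql 120 && echo "Virtuoso is up"
```

If the wait times out, `docker logs vos` shows the server output.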
We now load all triples into virtuoso, using a script that can be executed through isql. To download the script and import all graphs, use the following commands:
git clone
mv c8069487db59827cd62ab3d7ebb132a5/import.isql import/
rm -rf c8069487db59827cd62ab3d7ebb132a5
docker exec -it vos isql 1111 exec="LOAD /import/import.isql"
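Once the import has finished, the SPARQL endpoint can be asked how many triples each named graph holds. This is a sketch, assuming curl is installed (it is not in the requirements list) and that the endpoint is reachable at http://localhost:8890/sparql; `graph_counts` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list the triple count of each named graph as CSV.
# graph_counts is a hypothetical helper, not part of the repositories.

graph_counts() {
    # $1: SPARQL endpoint URL; prints CSV rows of graph IRI and triple count
    curl --silent --fail --max-time 600 \
        --header 'Accept: text/csv' \
        --data-urlencode 'query=SELECT ?g (COUNT(*) AS ?triples) WHERE { GRAPH ?g { ?s ?p ?o } } GROUP BY ?g ORDER BY DESC(?triples)' \
        "$1"
}

# Example (uncomment once the import has finished):
# graph_counts http://localhost:8890/sparql
```

The output also lists the graphs that virtuoso creates for its own use, so the imported graphs appear alongside them.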
Yasgui requires no installation.
We simply add the yasgui index file to the correct folder inside the virtuoso container:
docker cp yasgui-query-interface/index.html vos:/opt/virtuoso-opensource/vsp/index.html
We have created several query tests that demonstrate the consistency and correctness of the database. We provide the query results from the original bonsai database, along with their respective queries. The following process runs all queries on the new database and md5 checks the results against the result files from the original database. Use the following commands to test the consistency of the database:
git clone
git clone
mv 6de02ab2abd5bbe351d1c251ed94f525/bonsai_database_test.bash BONSAI-reproducibility/
cd BONSAI-reproducibility
bash bonsai_database_test.bash
All queries in data_queries will now be run against the database, and the output of each is returned as a CSV file. Each file is then md5 checked against the CSV output of running the same query against the original odas server database. OK means the md5 hashes are identical, whereas FALSE means they are not.
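The comparison itself can be sketched as follows. This illustrates the md5 check only and is not the bonsai_database_test.bash script; `compare_md5` is a hypothetical helper name, and the file names in the example are placeholders.

```shell
#!/usr/bin/env bash
# Sketch of the md5 comparison step only.
# compare_md5 is a hypothetical helper, not taken from bonsai_database_test.bash.

compare_md5() {
    # $1: CSV produced by the new database, $2: reference CSV from the original
    new_hash=$(md5sum < "$1" | cut -d ' ' -f 1)
    ref_hash=$(md5sum < "$2" | cut -d ' ' -f 1)
    if [ "$new_hash" = "$ref_hash" ]; then
        echo "OK"
    else
        echo "FALSE"
    fi
}

# Example with placeholder file names:
# compare_md5 new_result.csv reference_result.csv
```

Because md5 compares files byte for byte, a result with the same rows in a different order or with different line endings is reported as FALSE.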