-
Notifications
You must be signed in to change notification settings - Fork 0
ISWC2013Paper1BenchmarkDetails
This page presents the test configurations and the detailed results mentioned in the paper for ISWC 2013 entitled: "Efficient SPARQL-to-SQL translation using R2RML Mappings" (PDF) (Tech Report)
There are 5 systems used by the test and reported in the paper:
- onTop (with MySQL and BD2 as the backend)
- Stardog 1.2
- OWLIM-SE 5.3.5849
- Virtuoso OpenSource RDF 6.1.6
- Virtuoso OpenSource RDF Views 6.1.6
The tests were run on the same machine with the following specification:
- Hardware
* Processors: 24 units Intel Xeon 6 cores @3.47GHz * Memory: 106GB DDR3 * Hard Disks: 1TB (15,000 rpm) SCSI and 200GB (15,000 rpm) SCSI
- Software
* Operating System: Ubuntu 12.04 64-bit, Kernel 3.2.0 * Filesystem: ext3 * Separate partitions for application (on 200GB HDD) and databases (on 1TB HDD) * Java Version and JVM: Version 1.6.0_27, OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
The test uses three different dataset sizes: 25 million, 100 million and 200 million triples. The datasets were generated using the BSBM data generator.
Details about the benchmark datasets are summarised in the following table:
BSBM 25 | BSBM 100 | BSBM 200 | |
---|---|---|---|
Scale Factor | 71431 | 288115 | 576495 |
Exact number | 25,000,172 | 100,000,165 | 200,000,153 |
File Size Turtle (.ttl) | 695MB | 2.8GB | 5.6GB |
File Size SQL* (.csv) | 1.3GB | 5.2GB | 11GB |
- Note: We modified the generated SQL files into comma separated values (CSV) files.
- MySQL: Version 5.6
- Configuration file changes:
# CACHES AND LIMITS # tmp_table_size = 32M max_heap_table_size = 32M query_cache_type = 0 query_cache_size = 0 max_connections = 500 thread_cache_size = 50 open_files_limit = 65535 table_definition_cache = 1024 table_open_cache = 2048 # INNODB # innodb_flush_method = O_DIRECT innodb_log_file_size = 512M innodb_flush_log_at_trx_commit = 0 innodb_buffer_pool_size = 80G sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES innodb_thread_concurrency = 0 innodb_read_io_threads = 32
- Indexes:
* Indexes on primary key * Indexes on columns given by bsbmtools data generator
- Other notes:
* After loading the dataset, we performed index generation and did "analyse table" on all tables in the database. * MySQL Connector Version 5.0.8 was used for JDBC access.
The table below summarises the load time of the CSV dump files (in hh:mm:ss). It includes the indexing time.
BSBM 25 | BSBM 100 | BSBM 200 | |
---|---|---|---|
Total time | 00:01:57 | 00:11:33 | 00:25:13 |
The summary of the benchmark can be found in the Google Doc public spreadsheet.
Number of clients | ||||||
---|---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples | |
BSBM 25 | csv | csv | csv | csv | csv | csv |
BSBM 100 | csv | csv | csv | csv | csv | csv |
BSBM 200 | csv | csv | csv | csv | csv | csv |
- MySQL: Version 5.6
- Indexes:
* Indexes on primary key * Indexes on columns given by bsbmtools data generator
- Other notes:
* Set buffer pool page size 32K * Set automatic memory, i.e., DB2 will use any available system memory * After loading the dataset, we performed index generation and did "RUNSTATS" on all tables in the database.
The table below summarises the load time of the CSV dump files (in hh:mm:ss). It includes the indexing time.
BSBM 25 | BSBM 100 | BSBM 200 |
---|---|---|
00:02:20 | 00:15:20 | 00:24:38 |
The summary of the benchmark can be found in the Google Doc public spreadsheet.
Number of clients | ||||||
---|---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples | |
BSBM 25 | csv | csv | csv | csv | csv | csv |
BSBM 100 | csv | csv | csv | csv | csv | csv |
BSBM 200 | csv | csv | csv | csv | csv | csv |
- Stardog Version 1.2
- Server configuration in stardog.properties:
strict.parsing=false
- Indexing:
Create less index by adding "--index-triples-only" in the loading command.
- Other notes:
* Use multiple input files (bulk load) * Each input file was compressed using gzip.
BSBM 25 | BSBM 100 | BSBM 200 |
---|---|---|
00:02:54 | 00:11:47 | 00:22:59 |
The summary of the benchmark can be found in the Google Doc public spreadsheet.
Number of clients | ||||||
---|---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples | |
BSBM 25 | csv | csv | csv | csv | csv | csv |
BSBM 100 | csv | csv | csv | csv | csv | csv |
BSBM 200 | csv | csv | csv | csv | csv | csv |
- OWLIM-SE Version 5.3.5849
- Repository configuration template for BSBM 25:
# # Sesame configuration template for a owlim repository # @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix rep: <http://www.openrdf.org/config/repository#>. @prefix sr: <http://www.openrdf.org/config/repository/sail#>. @prefix sail: <http://www.openrdf.org/config/sail#>. @prefix owlim: <http://www.ontotext.com/trree/owlim#>. [] a rep:Repository ; rep:repositoryID "BSBM_25" ; rdfs:label "BSBM 25 Million Triples" ; rep:repositoryImpl [ rep:repositoryType "openrdf:SailRepository" ; sr:sailImpl [ sail:sailType "owlim:Sail" ; owlim:base-URL "http://www4.wiwiss.fu-berlin.de/ibzer/bsbm/v01/instances/" ; owlim:entity-index-size "50000000" ; owlim:repository-type "file-repository" ; owlim:ruleset "empty" ; # OWLIM-SE parameters owlim:cache-memory "15G" ; # Other OWLIM-SE parameters owlim:transaction-mode "fast" ; ] ].
BSBM 25 | BSBM 100 | BSBM 200 |
---|
owlim-bsbm25.ttl | owlim-bsbm100.ttl | owlim-bsbm200.ttl |
---|
The table below summarises the load time of using a single Turtle TTL file (in hh:mm:ss).
BSBM 25 | BSBM 100 | BSBM 200 |
---|---|---|
00:12:21 | 00:49:35 | 01:43:10 |
The summary of the benchmark can be found in the Google Doc public spreadsheet.
Number of clients | ||||||
---|---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples | |
BSBM 25 | csv | csv | csv | csv | csv | csv |
BSBM 100 | csv | csv | csv | csv | csv | csv |
BSBM 200 | csv | csv | csv | csv | csv | csv |
- Virtuoso OpenSource Version 6.1
- Changes in the configuration file virtuoso.ini
BSBM 25 | BSBM 100 | BSBM 200 |
---|---|---|
00:16:02 | 01:13:04 | 02:21:46 |
The summary of the benchmark can be found in the Google Doc public spreadsheet.
Number of clients | ||||||
---|---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples | |
BSBM 25 | csv | csv | csv | csv | csv | csv |
BSBM 100 | csv | csv | csv | csv | csv | csv |
BSBM 200 | csv | csv | csv | csv | csv | csv |
- Virtuoso OpenSource Version 6.1
- Changes in the configuration file virtuoso.ini
Due to limited time, the loading time was not measured
BSBM 25 | BSBM 100 | BSBM 200 |
---|---|---|
N/A | N/A | N/A |
The summary of the benchmark can be found in the Google Doc public spreadsheet.
Number of clients | ||||||
---|---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples | |
BSBM 25 | csv | csv | csv | csv | csv | csv |
BSBM 100 | csv | csv | csv | csv | csv | csv |
BSBM 200 | csv | csv | csv | csv | csv | csv |
- MySQL: Version 5.6
- Configuration file changes:
# CACHES AND LIMITS # tmp_table_size = 32M max_heap_table_size = 32M query_cache_type = 0 query_cache_size = 0 max_connections = 500 thread_cache_size = 50 open_files_limit = 65535 table_definition_cache = 1024 table_open_cache = 2048 # INNODB # innodb_flush_method = O_DIRECT innodb_log_file_size = 512M innodb_flush_log_at_trx_commit = 0 innodb_buffer_pool_size = 80G sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES innodb_thread_concurrency = 0 innodb_read_io_threads = 32
- Indexes:
* Indexes on primary key * comnames (SpecCode) * comnames (Script) * comnames (ComNamesRefNo) * Species (FBname) * Species (FamCode) * Species (Subfamily) * Species (DemersPelag) * Species (Aquarium) * countref (PAESE) * countref (Note) * refrens (Author) * refrens (Year) * refrens (FirstAuthor) * refrens (Keywords) * morphdat (autoctr) * morphdat (Speccode) * picturesmain (SpecCode) * picturesmain (C_Code) * families (Order) * families (Ordnum) * country (autoctr) * country (SpecCode)
- Other notes:
* After loading the dataset, we performed index generation and did "analyse table" on all tables in the database. * MySQL Connector Version 5.0.8 was used for JDBC access.
N/A
Number of clients | |||||
---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples |
csv | csv | csv | csv | csv | csv |
- Stardog Version 1.2
- Server configuration in stardog.properties:
strict.parsing=false
- Indexing:
Create less index by adding "--index-triples-only" in the loading command.
- Other notes:
The load time of a single N-TRIPLES dump files (in hh:mm:ss): 00:02:34
Number of clients | |||||
---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples |
csv | csv | csv | csv | csv | csv |
- OWLIM-SE Version 5.3.5849
- Repository configuration template (i.e., owlim-fish.ttl):
# # Sesame configuration template for a owlim repository # @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix rep: <http://www.openrdf.org/config/repository#>. @prefix sr: <http://www.openrdf.org/config/repository/sail#>. @prefix sail: <http://www.openrdf.org/config/sail#>. @prefix owlim: <http://www.ontotext.com/trree/owlim#>. [] a rep:Repository ; rep:repositoryID "fish" ; rdfs:label "Fishmark" ; rep:repositoryImpl [ rep:repositoryType "openrdf:SailRepository" ; sr:sailImpl [ sail:sailType "owlim:Sail" ; owlim:base-URL "http://fishdelish.cs.man.ac.uk/rdf/vocab/resource/" ; # owlim:defaultNS "hhttp://www4.wiwiss.fu-berlin.de/ibzer/bsbm/v01/instances/" ; owlim:entity-index-size "50000000" ; owlim:repository-type "file-repository" ; owlim:ruleset "empty" ; # owlim:storage-folder "storage" ; # OWLIM-SE parameters owlim:cache-memory "15G" ; # Other OWLIM-SE parameters owlim:transaction-mode "fast" ; ] ].
The load time of a single N-TRIPLES dump files (in hh:mm:ss): 00:06:32
Number of clients | |||||
---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples |
csv | csv | csv | csv | csv | csv |
Note: OWLIM produced more results for Query 2 compare to the other systems.
- Virtuoso OpenSource Version 6.1
- Changes in the configuration file virtuoso.ini
The load time of a single N-TRIPLES dump files (in hh:mm:ss): 00:12:21
Number of clients | |||||
---|---|---|---|---|---|
1 | 4 | 8 | 16 | 25 | Return Triples |
csv | csv | csv | csv | csv | csv |
- Quick Start Guide
- Easy-Tutorials
- More Tutorials
- Examples
- FAQ
- Using Ontop
- Learning more
- Troubleshooting
- Developer Guides
- Links