ProvBench Traces

Below are provenance traces collected from three applications (Apache, BLAST, PostMark), each run on three operating systems (Linux, Mac OS X, and Windows). Each trace was collected using SPADE with the reporter for the corresponding operating system (Linux Audit, OpenBSM, or Process Monitor) and H2 SQL storage.

For more information about the workloads, please see:

Hasnain Lakhani, Rashid Tahir, Azeem Aqil, Fareed Zaffar, Dawood Tariq, and Ashish Gehani, Optimized Rollback and Re-computation, 46th IEEE Hawaii International Conference on System Sciences (HICSS), IEEE Computer Society, 2013. [PDF]

Each provenance trace is provided as a compressed SQL script, which can be imported into most SQL databases, and as an H2 database.

	Linux (Linux Audit)	Mac OS X (OpenBSM)	Windows (Process Monitor)
Apache	SQL script, H2 database	SQL script, H2 database	SQL script, H2 database
BLAST	SQL script, H2 database	SQL script, H2 database	SQL script, H2 database
PostMark	SQL script, H2 database	SQL script, H2 database	SQL script, H2 database

Importing into MySQL

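To import a provenance trace into MySQL, use the following command on Linux or Mac OS X, replacing {user} and {password} with your MySQL credentials. There must be no space between -p and the password; with a space, mysql prompts for the password and treats the next argument as a database name:

gzip -dc < script.tar.gz | mysql -u {user} -p{password}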

Viewing the H2 SQL database

To browse the H2 SQL database without importing the SQL script into an existing database, follow the instructions here.

Generating the provenance traces

For Apache, the following commands were used on Linux and Mac OS X:

./configure
make

The following command was used on Windows:

nmake /f Makefile.win _apached

For PostMark, the following configuration settings were used for the benchmark:

Base number of files: 500
Transactions: 500
Files range between 500 bytes and 9.77 kilobytes in size
Block sizes: read=512 bytes, write=512 bytes
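
The settings above can be sketched as a PostMark command file. This is an illustration under stated assumptions, not the authors' recorded invocation: it assumes the PostMark 1.5 command syntax (set number, set transactions, set size, set read, set write), the file name postmark.cfg is hypothetical, and the upper size bound of 10000 bytes is inferred from the reported 9.77 kilobytes (10000 / 1024 ≈ 9.77).

```shell
#!/bin/sh
# Write a PostMark command file that mirrors the settings listed above.
# The file name is hypothetical; 10000 bytes is inferred from "9.77 kilobytes".
cat > postmark.cfg <<'EOF'
set number 500
set transactions 500
set size 500 10000
set read 512
set write 512
run
quit
EOF

# Check the size conversion: 10000 bytes divided by 1024, to two decimals.
upper_kib=$(awk 'BEGIN { printf "%.2f", 10000 / 1024 }')
echo "upper file size: ${upper_kib} kilobytes"

# To run the benchmark (assumes a postmark binary is on PATH):
# postmark postmark.cfg
```

Keeping the settings in a file rather than typing them at the PostMark prompt makes it easier to repeat the same run on each operating system.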

For BLAST, the input data set was downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/INFLUENZA/influenza.faa and the following command was used:

makeblastdb -in influenza.faa -parse_seqids -hash_index -out outputdb

This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

