Example illustrating provenance querying

Ashish Gehani edited this page Jun 29, 2015 · 16 revisions

The example below illustrates the use of the query client.


After the SPADE server has been started, add the Neo4j storage and DSL reporter using the SPADE controller (invoked with spade control):

-> add storage Neo4j /tmp/spade_database
Adding storage Neo4j... done

-> add reporter DSL /tmp/spade_pipe
Adding reporter DSL... done

To create the provenance metadata needed for the example, run the following commands in a shell to add the vertices and edges:

echo type:Process id:1 name:root\\ process pid:10 >> /tmp/spade_pipe 
echo type:Process id:2 name:child\\ process pid:32 >> /tmp/spade_pipe 
echo type:WasTriggeredBy from:2 to:1 time:5\\:56\\ PM >> /tmp/spade_pipe 
echo type:Artifact id:3 filename:output.tmp >> /tmp/spade_pipe 
echo type:Artifact id:4 filename:output.o >> /tmp/spade_pipe 
echo type:Used from:2 to:3 iotime:12\\ ms >> /tmp/spade_pipe 
echo type:WasGeneratedBy from:4 to:2 iotime:11\\ ms >> /tmp/spade_pipe 
echo type:WasDerivedFrom from:4 to:3 >> /tmp/spade_pipe 
echo type:Agent id:user uid:10 gid:10 name:john >> /tmp/spade_pipe 
echo type:WasControlledBy from:1 to:user >> /tmp/spade_pipe 
echo type:WasControlledBy from:2 to:user >> /tmp/spade_pipe 

(Note that the Agent vertex has a non-numeric id. This illustrates that a vertex's id key can have any value as long as it is unique.)
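The commands above put a backslash in front of every space and colon inside a value (for example `root\\ process` and `5\\:56\\ PM`, which the shell turns into `root\ process` and `5\:56\ PM` before they reach the pipe). A minimal sketch of a helper that applies the same escaping is shown below. The function name `dsl_escape` is hypothetical and not part of SPADE, and the rule it implements is inferred only from the examples on this page.

```shell
#!/bin/sh
# dsl_escape VALUE
# Hypothetical helper (not part of SPADE): prints VALUE with a backslash
# inserted before each space and colon, mirroring the escaping used in
# the echo commands above.
dsl_escape() {
    printf '%s' "$1" | sed 's/[ :]/\\&/g'
}
```

With this helper, the first vertex could be written as `echo "type:Process id:1 name:$(dsl_escape 'root process') pid:10" >> /tmp/spade_pipe`, which avoids escaping the value by hand.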


If the Graphviz storage had been added, the resulting graph would be:

Note that the Open Provenance Model (OPM) convention of using octagons for Agent vertices, rectangles for Process vertices, and ellipses for Artifact vertices is followed. In addition, the graph elements are colored with the following semantics:

Color        OPM meaning
pink         Agent
light blue   Process
yellow       Artifact
purple       WasControlledBy
green        Used
red          WasGeneratedBy
dark blue    WasTriggeredBy
orange       WasDerivedFrom

Using the SPADE query client (invoked with spade dig), the graph of the provenance of the root process (which has a vertexId of 1) can be retrieved:

-> graph1 = getLineage(1, 10, ancestors)
Time taken for query: 389 ms
-> export graph1 /tmp/ancestors.dot
Exported graph1 to /tmp/ancestors.dot

The file /tmp/ancestors.dot is in Graphviz format. When rendered it looks like this:
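One way to render the exported file is with the `dot` tool from Graphviz, assuming it is installed. The sketch below wraps the call in a small function. The function name `render_dot` and the choice of PNG output are assumptions made for illustration, not something the SPADE documentation prescribes.

```shell
#!/bin/sh
# render_dot DOT_FILE PNG_FILE
# Hypothetical wrapper around Graphviz: renders DOT_FILE to a PNG image.
# Fails with a message if dot is not installed or DOT_FILE is unreadable.
render_dot() {
    if ! command -v dot >/dev/null 2>&1; then
        echo "Graphviz 'dot' not found on PATH" >&2
        return 127
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    dot -Tpng "$1" -o "$2"
}
```

For the export above, this would be invoked as `render_dot /tmp/ancestors.dot /tmp/ancestors.png`.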


Similarly, the graph of the vertices whose provenance includes the same vertex (that is, its descendants) can be retrieved with:

-> graph2 = getLineage(1, 10, descendants)
Time taken for query: 560 ms
-> export graph2 /tmp/descendants.dot
Exported graph2 to /tmp/descendants.dot

When rendered, it looks like this:
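To make the two directions concrete, the sketch below walks the example's edges outside of SPADE. It rests on stated assumptions: that edges point from effect to cause (as in `from:2 to:1` for WasTriggeredBy), that the second argument of `getLineage` is a maximum depth, and that the edges are stored one `from to` pair per line in a file. The function name `lineage` and the file format are hypothetical, and this is not how SPADE implements the query.

```shell
#!/bin/sh
# lineage EDGE_FILE START DEPTH DIRECTION
# Hypothetical sketch: breadth-first walk over "from to" edge pairs.
# DIRECTION "ancestors" follows edges from -> to; any other value
# follows them to -> from (descendants). Prints reached vertex ids, sorted.
lineage() {
    awk -v start="$2" -v depth="$3" -v dir="$4" '
        { from[NR] = $1; to[NR] = $2 }
        END {
            seen[start] = 1
            frontier[start] = 1
            for (d = 0; d < depth; d++) {
                split("", next_frontier)      # clear the next level
                found = 0
                for (i = 1; i <= NR; i++) {
                    src = (dir == "ancestors") ? from[i] : to[i]
                    dst = (dir == "ancestors") ? to[i] : from[i]
                    if ((src in frontier) && !(dst in seen)) {
                        seen[dst] = 1
                        next_frontier[dst] = 1
                        found = 1
                    }
                }
                if (!found) break             # nothing new at this depth
                split("", frontier)
                for (v in next_frontier) frontier[v] = 1
            }
            for (v in seen) print v
        }
    ' "$1" | sort
}
```

Given a file containing the six edges from the example (`2 1`, `2 3`, `4 2`, `4 3`, `1 user`, `2 user`), `lineage edges.txt 1 10 ancestors` prints `1` and `user`, while `lineage edges.txt 1 10 descendants` prints `1`, `2`, and `4`. These are the results of the sketch under the assumptions above, not output captured from SPADE.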