Example illustrating provenance querying

Raza Ahmad edited this page Aug 25, 2017 · 16 revisions

The example below illustrates the use of the query client.


After the SPADE server has been started, add the Neo4j storage and DSL reporter using the SPADE controller (invoked with spade control):

-> add storage Neo4j /tmp/spade_database
Adding storage Neo4j... done

-> add reporter DSL /tmp/spade_pipe
Adding reporter DSL... done

To create the provenance metadata needed for the example, run the following commands in a terminal to add the vertices and edges:

echo type:Process id:1 name:root\\ process pid:10 >> /tmp/spade_pipe 
echo type:Process id:2 name:child\\ process pid:32 >> /tmp/spade_pipe 
echo type:WasTriggeredBy from:2 to:1 time:5\\:56\\ PM >> /tmp/spade_pipe 
echo type:Artifact id:3 filename:output.tmp >> /tmp/spade_pipe 
echo type:Artifact id:4 filename:output.o >> /tmp/spade_pipe 
echo type:Used from:2 to:3 iotime:12\\ ms >> /tmp/spade_pipe 
echo type:WasGeneratedBy from:4 to:2 iotime:11\\ ms >> /tmp/spade_pipe 
echo type:WasDerivedFrom from:4 to:3 >> /tmp/spade_pipe 
echo type:Agent id:user uid:10 gid:10 name:john >> /tmp/spade_pipe 
echo type:WasControlledBy from:1 to:user >> /tmp/spade_pipe 
echo type:WasControlledBy from:2 to:user >> /tmp/spade_pipe 

(Note that the Agent vertex has a non-numeric id. This id differs from the unique identifier of each vertex, which is a content-based hash.)
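The escaping used in the commands above (a backslash before each space or colon inside a value) is easy to get wrong by hand. The following is a minimal sketch of a helper that applies it automatically. The function names dsl_escape and dsl_vertex are hypothetical and not part of SPADE, and the set of escaped characters is an assumption inferred only from the commands shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPADE): backslash-escape the characters
# that the example commands escape inside values. Assumed set: backslash,
# space, and colon.
dsl_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/ /\\ /g' -e 's/:/\\:/g'
}

# Hypothetical helper: print one vertex line on stdout.
# Usage: dsl_vertex TYPE ID key value [key value ...]
dsl_vertex() {
  line="type:$1 id:$2"
  shift 2
  while [ "$#" -ge 2 ]; do
    line="$line $1:$(dsl_escape "$2")"
    shift 2
  done
  printf '%s\n' "$line"
}

# Prints: type:Process id:1 name:root\ process pid:10
dsl_vertex Process 1 name "root process" pid 10
```

Appending the output to the pipe (for example, dsl_vertex Process 1 name "root process" pid 10 >> /tmp/spade_pipe) should then be equivalent to the first echo command above.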


If the Graphviz storage had been added, the resulting graph would be:

Note that the Open Provenance Model (OPM) convention of using octagons for Agent vertices, rectangles for Process vertices, and ellipses for Artifact vertices is followed. In addition, the graph elements are colored with the following semantics:

Color        OPM meaning
pink         Agent
light blue   Process
yellow       Artifact
purple       WasControlledBy
green        Used
red          WasGeneratedBy
dark blue    WasTriggeredBy
orange       WasDerivedFrom

Using the SPADE query client (invoked with spade query), the graph of the provenance of the root process (which has a vertexId of 1) can be retrieved:

-> c1 = vertexId = 1
-> export > /tmp/ancestors.dot
-> GetLineage(c1, 100, ancestors, 10)
Time taken for query: 389 ms
Output exported to file /tmp/ancestors.dot

The file /tmp/ancestors.dot is in Graphviz DOT format. When rendered, it looks like this:
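One way to render the exported file is with the Graphviz dot program. The sketch below assumes Graphviz is installed and on the PATH. The function name render_dot and the choice of SVG output are illustrative assumptions, not something SPADE provides.

```shell
#!/bin/sh
# Hypothetical helper: render a DOT file to an SVG file next to the input
# and print the output path. Assumes the Graphviz `dot` program is on PATH.
render_dot() {
  in="$1"
  out="${in%.dot}.svg"
  if ! command -v dot >/dev/null 2>&1; then
    echo "dot (Graphviz) not found" >&2
    return 1
  fi
  dot -Tsvg "$in" -o "$out" && printf '%s\n' "$out"
}

# Example call, guarded so that nothing happens if the export is absent.
if [ -f /tmp/ancestors.dot ]; then
  render_dot /tmp/ancestors.dot
fi
```

Other output formats, such as PNG, can be selected by changing the -T option.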


Similarly, the descendants of the same vertex, that is, the part of the graph whose provenance includes it, can be retrieved with:

-> export > /tmp/descendants.dot
-> GetLineage(c1, 100, descendants, 10)
Time taken for query: 560 ms
Output exported to file /tmp/descendants.dot

When rendered, it looks like this:
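To make the difference between the two directions concrete, the traversal can be sketched outside SPADE on the edge list of this example. This is not SPADE's implementation. It assumes that ancestors are found by following edges from their "from" vertex to their "to" vertex, that descendants follow them in reverse, and that the traversal is bounded by a maximum depth. The function name lineage and the EDGES variable are hypothetical.

```shell
#!/bin/sh
# Edges of the example graph as "from to" pairs (edge types omitted).
EDGES='2 1
2 3
4 2
4 3
1 user
2 user'

# Hypothetical sketch: lineage START DIRECTION MAX_DEPTH
# Prints START and every vertex reachable from it within MAX_DEPTH hops,
# one per line, sorted. DIRECTION is "ancestors" or "descendants".
lineage() {
  printf '%s\n' "$EDGES" | awk -v start="$1" -v dir="$2" -v max="$3" '
    { src[NR] = $1; dst[NR] = $2; n = NR }
    END {
      seen[start] = 1
      frontier[start] = 1
      for (depth = 0; depth < max; depth++) {
        grew = 0
        split("", pending)              # clear the next frontier
        for (i = 1; i <= n; i++) {
          # Assumption: ancestors follow from -> to, descendants to -> from.
          a = (dir == "ancestors") ? src[i] : dst[i]
          b = (dir == "ancestors") ? dst[i] : src[i]
          if ((a in frontier) && !(b in seen)) {
            seen[b] = 1; pending[b] = 1; grew = 1
          }
        }
        if (!grew) break
        split("", frontier)
        for (v in pending) frontier[v] = 1
      }
      for (v in seen) print v
    }' | sort
}

lineage 1 ancestors 100     # prints 1 and user
lineage 1 descendants 100   # prints 1, 2 and 4
```

Under these assumptions, the ancestors of the root process are the process itself and the agent that controls it, while its descendants are the child process (vertex 2) and the artifact output.o (vertex 4) that the child generated.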
