Example illustrating provenance querying
The example below illustrates the use of the query client.
After the SPADE server has been started, add the Neo4j storage and DSL reporter using the SPADE controller (invoked with `spade control`):
-> add storage Neo4j /tmp/spade_database
Adding storage Neo4j... done
-> add reporter DSL /tmp/spade_pipe
Adding reporter DSL... done
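Before writing records, it can help to confirm that the reporter's pipe is actually in place. The sketch below assumes the DSL reporter exposes `/tmp/spade_pipe` as a named pipe (FIFO); the helper name `pipe_ready` is hypothetical and not part of SPADE.

```shell
# Hypothetical helper (not part of SPADE): succeed only if the given path
# exists and is a named pipe, which is what this sketch assumes the DSL
# reporter creates.
pipe_ready() {
    if [ -p "$1" ]; then
        return 0
    fi
    echo "$1 is not a named pipe; has the DSL reporter been added?" >&2
    return 1
}

# Example use before sending any records:
#   pipe_ready /tmp/spade_pipe && echo type:Artifact id:3 filename:output.tmp >> /tmp/spade_pipe
```

Writing to a path that is not a pipe would silently create a regular file instead, so a check like this avoids records that never reach SPADE.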
To create the provenance metadata needed for the example, run the following commands in a shell to add the vertices and edges:
echo type:Process id:1 name:root\\ process pid:10 >> /tmp/spade_pipe
echo type:Process id:2 name:child\\ process pid:32 >> /tmp/spade_pipe
echo type:WasTriggeredBy from:2 to:1 time:5\\:56\\ PM >> /tmp/spade_pipe
echo type:Artifact id:3 filename:output.tmp >> /tmp/spade_pipe
echo type:Artifact id:4 filename:output.o >> /tmp/spade_pipe
echo type:Used from:2 to:3 iotime:12\\ ms >> /tmp/spade_pipe
echo type:WasGeneratedBy from:4 to:2 iotime:11\\ ms >> /tmp/spade_pipe
echo type:WasDerivedFrom from:4 to:3 >> /tmp/spade_pipe
echo type:Agent id:user uid:10 gid:10 name:john >> /tmp/spade_pipe
echo type:WasControlledBy from:1 to:user >> /tmp/spade_pipe
echo type:WasControlledBy from:2 to:user >> /tmp/spade_pipe
(Note that the Agent vertex has a non-numeric `id`. This is different from the unique id for each vertex, which is a content-based hash.)
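In the commands above, the shell reduces each `\\` to a single backslash, so the text that reaches the pipe contains `\ ` and `\:`. The sketch below assumes that this backslash is how a space or colon inside a value is escaped in the DSL, as the commands suggest. The helpers `dsl_escape` and `dsl_field` are hypothetical names introduced here, not part of SPADE.

```shell
# Hypothetical helper: put a backslash before every space and colon in a
# value, matching the escaping used in the echo commands above.
dsl_escape() {
    printf '%s' "$1" | sed 's/[ :]/\\&/g'
}

# Hypothetical helper: build one key:value field with an escaped value.
dsl_field() {
    printf '%s:%s' "$1" "$(dsl_escape "$2")"
}

# Print the record for the root process to stdout; append
# ">> /tmp/spade_pipe" to send it to the reporter instead.
printf 'type:Process id:1 %s pid:10\n' "$(dsl_field name 'root process')"
# prints: type:Process id:1 name:root\ process pid:10
```

Using `printf` rather than `echo` keeps the backslashes intact regardless of how a particular shell's `echo` treats escape sequences.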
If the Graphviz storage had been added, the resulting graph would be:
Note that the Open Provenance Model (OPM) convention of using octagons for Agent vertices, rectangles for Process vertices, and ellipses for Artifact vertices is followed. In addition, the graph elements are colored with the following semantics:
Color | OPM meaning |
---|---|
pink | Agent |
light blue | Process |
yellow | Artifact |
purple | WasControlledBy |
green | Used |
red | WasGeneratedBy |
dark blue | WasTriggeredBy |
orange | WasDerivedFrom |
Using the SPADE query client (invoked with `spade query`), the graph of the provenance of the root process (which has a `vertexId` of `1`) can be retrieved:
-> c1 = vertexId = 1
-> export > /tmp/ancestors.dot
-> GetLineage(c1, 100, ancestors, 10)
Time taken for query: 389 ms
Output exported to file /tmp/ancestors.dot
The file `/tmp/ancestors.dot` is in Graphviz format. When rendered it looks like this:
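One way to render an exported file locally is with the Graphviz `dot` command. The sketch below assumes Graphviz is installed; the helper name `render_dot` and the choice of SVG output are illustrative, not something SPADE prescribes.

```shell
# Hypothetical helper: render a .dot file to SVG next to the source file
# and print the path of the image that was written.
render_dot() {
    src=$1
    out="${src%.dot}.svg"
    if ! command -v dot >/dev/null 2>&1; then
        echo "Graphviz 'dot' not found; install Graphviz to render $src" >&2
        return 1
    fi
    dot -Tsvg "$src" -o "$out" && printf '%s\n' "$out"
}

# Example use after the export above:
#   render_dot /tmp/ancestors.dot
```

Other output formats supported by `dot`, such as PNG or PDF, can be selected by changing the `-T` option.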
Similarly, the graph of the elements whose provenance includes the same vertex (its descendants) can be retrieved with:
-> export > /tmp/descendants.dot
-> GetLineage(c1, 100, descendants, 10)
Time taken for query: 560 ms
Output exported to file /tmp/descendants.dot
When rendered, it looks like this:
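To make the two directions concrete, the sketch below walks the edges of the example graph with plain shell and awk. It is an illustration of the idea only, not SPADE's implementation of `GetLineage`: it ignores the numeric arguments of the query and reports only which vertices are reachable. The function name `lineage` is hypothetical. Edges are written as `from to` pairs taken from the commands used to build the example, and ancestors are found by following edges from `from` to `to`.

```shell
# Hypothetical helper: read "from to" edge pairs on stdin and print every
# vertex reachable from the start vertex in the given direction.
#   $1 = start vertex, $2 = ancestors | descendants
lineage() {
    awk -v start="$1" -v dir="$2" '
        { from[NR] = $1; to[NR] = $2 }
        END {
            seen[start] = 1
            changed = 1
            # Repeat until a full pass over the edges adds no new vertex.
            while (changed) {
                changed = 0
                for (i = 1; i <= NR; i++) {
                    src = (dir == "ancestors") ? from[i] : to[i]
                    dst = (dir == "ancestors") ? to[i] : from[i]
                    if ((src in seen) && !(dst in seen)) {
                        seen[dst] = 1
                        changed = 1
                    }
                }
            }
            for (v in seen) if (v != start) print v
        }' | sort
}

# Edges of the example graph, one "from to" pair per line.
edges='2 1
2 3
4 2
4 3
1 user
2 user'

printf '%s\n' "$edges" | lineage 1 ancestors     # prints: user
printf '%s\n' "$edges" | lineage 1 descendants   # prints: 2 and 4, one per line
```

For vertex 1, the only outgoing edge leads to the Agent `user`, while the child process (id 2) and the artifact it generated (id 4) both trace back to it.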
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.