Available filters
SPADE includes a set of filters to manipulate provenance metadata before it is committed to storage. They are described below.
The AddAnnotation filter adds annotation(s) to every vertex and edge that passes through it. If an annotation with the same key already exists, its value is replaced.
The filter can be added using SPADE's controller:
-> add filter AddAnnotation position=1 host=spade-host
Adding filter AddAnnotation... done
The command above adds the filter, which in turn adds an annotation with key host and value spade-host to vertices and edges.
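The replace-on-conflict behavior described above can be illustrated with a minimal Python sketch. This is not SPADE's code: the function name add_annotations and the representation of a vertex or edge as a dictionary of annotations are assumptions made only for illustration.

```python
def add_annotations(element, extra):
    """Return a copy of `element` with every key in `extra` set.

    `element` is a hypothetical dict of annotation key -> value.
    Existing keys are overwritten, mirroring the replace behavior
    described for the filter.
    """
    updated = dict(element)
    updated.update(extra)
    return updated


# A vertex that already carries a `host` annotation gets the new value.
vertex = {"type": "Process", "pid": "7575", "host": "old-host"}
print(add_annotations(vertex, {"host": "spade-host"}))
```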
The Blacklist filter excludes files (based on their names) from being committed to persistent storage. The regular expression for matching filenames is specified in cfg/spade.filter.Blacklist.config.
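As a rough illustration of name-based exclusion, the following Python sketch drops any path that matches a regular expression. The pattern shown is hypothetical (the real one is whatever cfg/spade.filter.Blacklist.config contains), and whether SPADE matches the whole name or only part of it is an assumption here.

```python
import re

# Hypothetical pattern: exclude editor swap files and temporary files.
BLACKLIST = re.compile(r".*\.(tmp|swp)$")


def keep(path):
    """Return True if `path` should be committed to storage."""
    # Assumption: the expression must match the entire file name.
    return BLACKLIST.fullmatch(path) is None


paths = ["/etc/passwd", "/home/user/.notes.swp", "/var/cache/build.tmp"]
print([p for p in paths if keep(p)])
```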
The ConvertTime filter converts the date-time in a vertex or edge annotation value to a specified format. The converted date-time is added as a new annotation, which is specified in the configuration file cfg/spade.filter.ConvertTime.config. Please see that file for more details.
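The conversion can be pictured with the Python sketch below. The annotation names (time, time-converted) and both format strings are assumptions chosen for the example; the options the filter actually accepts are documented in cfg/spade.filter.ConvertTime.config.

```python
from datetime import datetime


def convert_time(element, source_key, source_format, target_key, target_format):
    """Parse element[source_key] and store the reformatted value under target_key.

    The original annotation is left untouched; the converted value is
    added as a new annotation, as described for the filter.
    """
    parsed = datetime.strptime(element[source_key], source_format)
    updated = dict(element)
    updated[target_key] = parsed.strftime(target_format)
    return updated


edge = {"operation": "read", "time": "2020-01-02 03:04:05"}
print(convert_time(edge, "time", "%Y-%m-%d %H:%M:%S",
                   "time-converted", "%d/%m/%Y %H:%M"))
```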
The CrossNamespaces filter tracks the ipc, mount, net, pid, and user namespaces of every thread that writes to the filesystem. When a thread performs a read, its tuple of namespaces is compared to that of each thread that previously wrote to the same path. If the tuples differ, the event is logged. Log entries indicate the presence of cross-container or host-container flows.
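The comparison described above can be sketched in Python as follows. The class CrossNamespaceTracker, its method names, and the dictionary representation of a thread are hypothetical; the sketch only mirrors the described logic and ignores the configuration options of the real filter.

```python
from collections import defaultdict

NAMESPACES = ("ipc", "mount", "net", "pid", "user")


class CrossNamespaceTracker:
    def __init__(self):
        # path -> list of distinct namespace tuples of past writers
        self.writers = defaultdict(list)

    @staticmethod
    def _namespaces(thread):
        """Build the namespace tuple of a thread (a dict in this sketch)."""
        return tuple(thread[name] for name in NAMESPACES)

    def on_write(self, thread, path):
        """Remember the namespace tuple of a thread that wrote to `path`."""
        ns = self._namespaces(thread)
        if ns not in self.writers[path]:
            self.writers[path].append(ns)

    def on_read(self, thread, path):
        """Return the writer tuples for `path` that differ from the reader's."""
        ns = self._namespaces(thread)
        return [w for w in self.writers.get(path, []) if w != ns]


tracker = CrossNamespaceTracker()
host = dict(ipc="1", mount="100000002", net="1", pid="1", user="1")
container = dict(ipc="1", mount="100000001", net="1", pid="1", user="1")
tracker.on_write(host, "/etc/passwd")
# A non-empty result means the read would be logged as a cross-namespace flow.
print(tracker.on_read(container, "/etc/passwd"))
```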
The default configuration is specified in cfg/spade.filter.CrossNamespaces.config. It detects flows between processes in different Linux namespaces through all artifact types. The log file containing the cross-namespace events is created at tmp/cross-namespaces.json. Each line in the log file is a cross-namespace event encoded as a JSON object. Each event contains the following information:
- cross-namespace-event-id: The event id generated by the filter.
- artifact: The annotations of the artifact through which the cross-namespace flow occurred. This includes only the matched annotations. Only those annotations that are configured in cfg/spade.filter.CrossNamespaces.config are reported.
- artifacts: A list of artifacts with the matched and extra reportable annotations, as specified by the key artifactAnnotationsToReport in cfg/spade.filter.CrossNamespaces.config. Note: This may include false positives.
- reader: The annotations of the reader.
- read-edge: The annotations of the edge between the reader and the artifact.
- writers: A list of writers. For each writer, only those annotations that are configured in cfg/spade.filter.CrossNamespaces.config using processAnnotationsToMatch and processAnnotationsToReport are reported. Note: This may include false positives.
An example cross-namespace event (truncated and formatted for simplicity):
{
  "cross-namespace-event-id": "0",
  "artifact": {
    "path": "/etc/passwd"
  },
  "artifacts": [
    {
      "path": "/etc/passwd",
      "inode": "880"
    },
    {
      "path": "/etc/passwd",
      "inode": "881"
    }
  ],
  "reader": {
    "pid": "7575",
    "ppid": "7574",
    "name": "docker_container_process",
    "mount namespace": "100000001"
  },
  "read-edge": {
    "operation": "read",
    "size": "5"
  },
  "writers": [
    {
      "mount namespace": "100000002",
      "pid": "25"
    }
  ]
}
Above, the file /etc/passwd was written by a process in mount namespace 100000002 and read by the process docker_container_process. The flow was logged because the reader's mount namespace differs from the writer's.
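Because each line of the log is a standalone JSON object, events like the one above can be processed line by line. The Python sketch below is one possible way to summarize them. It assumes the keys shown in the example event (artifact, reader, writers, path, mount namespace); the annotations actually present depend on the filter's configuration, and the function names are hypothetical.

```python
import json


def summarize(line):
    """Return (path, reader mount namespace, differing writer mount namespaces)."""
    event = json.loads(line)
    reader_ns = event["reader"].get("mount namespace")
    writer_ns = {w.get("mount namespace") for w in event["writers"]}
    # Keep only writer namespaces that are known and differ from the reader's.
    differing = sorted(writer_ns - {reader_ns, None})
    return event["artifact"].get("path"), reader_ns, differing


def read_events(log_path="tmp/cross-namespaces.json"):
    """Yield one summary per non-empty line of the log file."""
    with open(log_path) as log:
        for line in log:
            if line.strip():
                yield summarize(line)
```

Calling list(read_events()) after the filter has run would then give one tuple per logged event.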
The CycleAvoidance filter tracks the ancestors of a file and creates a new version each time a new ancestor is encountered. Please see cfg/spade.filter.CycleAvoidance.config for configuration options.
The Fusion filter can be used to merge vertices from related provenance streams. The configuration for this filter is stored in cfg/fusion.config and has the following format:
-- BEGIN FILE --
<1st reporter>
<2nd reporter>
<1st reporter>.<annotation>=<2nd reporter>.<annotation>
...
-- END FILE --
To merge the two streams, the names of both reporters must be specified on the first two lines of the config file.
Next, the rules on which to merge annotations can be specified, one per line, as <1st reporter>.<annotation>=<2nd reporter>.<annotation>.
The Fusion filter will check to see if the incoming vertices satisfy the merging rules. If vertices are found that match the criteria, they are fused into a single vertex.
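To make the configuration format concrete, the Python sketch below parses a file laid out as described and tests two vertices against the rules. The reporter names ReporterA and ReporterB, the annotation names, and the function names are placeholders. Requiring every rule to hold before fusing is also an assumption, since the text does not say whether all rules or any single rule must match.

```python
def parse_fusion_config(text):
    """Return (first reporter, second reporter, [(annotation1, annotation2), ...])."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    first, second = lines[0], lines[1]
    rules = []
    for rule in lines[2:]:
        left, right = rule.split("=", 1)
        left_reporter, left_key = left.split(".", 1)
        right_reporter, right_key = right.split(".", 1)
        if (left_reporter, right_reporter) != (first, second):
            raise ValueError(f"rule does not use the declared reporters: {rule}")
        rules.append((left_key, right_key))
    return first, second, rules


def should_fuse(vertex1, vertex2, rules):
    """True if the two vertices (dicts of annotations) satisfy every rule."""
    return bool(rules) and all(
        k1 in vertex1 and k2 in vertex2 and vertex1[k1] == vertex2[k2]
        for k1, k2 in rules
    )


config = """ReporterA
ReporterB
ReporterA.pid=ReporterB.process_id
"""
_, _, rules = parse_fusion_config(config)
print(should_fuse({"pid": "7575"}, {"process_id": "7575"}, rules))
```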
The GraphFinesse filter tracks the entire lineage graph of a file and creates a new version if a new edge would otherwise create a cycle. By default, an annotation named GFVersion is added to all vertices. Its value is the version assigned by the GraphFinesse filter. Please see cfg/spade.filter.GraphFinesse.config for configuration options.
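The idea of versioning only when an edge would close a cycle can be sketched in Python as below. This is a simplified illustration, not the filter's algorithm: the class CycleFreeGraph, the choice to create the new version for the edge's source vertex, and the in-memory reachability search are all assumptions. Edges point from a dependent vertex to the vertex it depends on.

```python
from collections import defaultdict


class CycleFreeGraph:
    def __init__(self):
        self.parents = defaultdict(set)  # (name, version) -> vertices it depends on
        self.version = defaultdict(int)  # name -> current version number

    def _reaches(self, start, target):
        """Depth-first search: is `target` reachable from `start`?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.parents.get(node, ()))
        return False

    def add_edge(self, child, parent):
        """Add child -> parent, versioning `child` if the edge would close a cycle."""
        src = (child, self.version[child])
        dst = (parent, self.version[parent])
        if self._reaches(dst, src):
            self.version[child] += 1
            src = (child, self.version[child])
        self.parents[src].add(dst)
        return src, dst


graph = CycleFreeGraph()
print(graph.add_edge("proc", "file"))  # the process reads the file
print(graph.add_edge("file", "proc"))  # the process writes the file back
```

In this sketch the second call returns a new version of the file, because an edge from version 0 of the file to the process would have formed a cycle with the earlier read.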
Reads and writes in an operating system often occur in runs of one type or the other. For example, a single function that reads in a file may result in multiple read system calls. This can result in a high volume of provenance metadata, especially when reading or writing large files. The IORuns filter can be used to fuse consecutive edges of the same type of I/O operation (i.e., either read or write) into a single edge.
By default, only the reads and writes for artifacts with the annotation path are merged. To merge reads and writes for other artifacts, specify the artifact's annotation(s) as key in the arguments to the filter, or update the value in the filter's default config file at cfg/spade.filter.IORuns.config.
The filter can be added using SPADE's controller:
-> add filter IORuns position=1 key="path,permissions"
Adding filter IORuns... done
The command above tells the filter to merge reads and writes for artifacts that have both the path and permissions annotations.
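The effect of fusing runs can be illustrated with a short Python sketch. Edges are modeled as dictionaries with hypothetical process, artifact, and operation keys, and the count field on the fused edge is added only to make the result visible. Grouping by process and artifact in addition to the operation is an assumption of this sketch.

```python
from itertools import groupby


def fuse_runs(edges):
    """Collapse consecutive edges with the same endpoints and operation into one."""
    def run_key(edge):
        return edge["process"], edge["artifact"], edge["operation"]

    fused = []
    # groupby only groups adjacent items, which is exactly what a "run" is.
    for (process, artifact, operation), run in groupby(edges, key=run_key):
        fused.append({
            "process": process,
            "artifact": artifact,
            "operation": operation,
            "count": len(list(run)),
        })
    return fused


operations = ["read", "read", "read", "write", "write", "read"]
edges = [{"process": "p1", "artifact": "/data/big.bin", "operation": op}
         for op in operations]
for edge in fuse_runs(edges):
    print(edge["operation"], edge["count"])
```

Six input edges become three: a run of reads, a run of writes, and a final read.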
This filter translates OPM vertex and edge elements into corresponding W3C PROV ones.
The VersionOnWrite filter versions a vertex each time it is encountered as a child vertex in an edge. Please see cfg/spade.filter.VersionOnWrite.config for configuration options.
The WindowsFeatures filter computes features on provenance collected using the ProcMon reporter. The features are computed according to the approach described in the paper: Mining Data Provenance to Detect Advanced Persistent Threats. This filter can be used as follows:
-> add filter WindowsFeatures position=1 malicious=cmd.exe,Explorer.EXE inceptionTime=10000000 taintedParentWeight=5.0
Adding filter WindowsFeatures... done
The command above specifies the following three arguments:
- malicious: A comma-separated list of process names to mark as malicious
- inceptionTime: Time window in a process's lifetime to consider as its inception window
- taintedParentWeight: The weight to use for parent processes to compute the value of taint on child processes
Additionally, when it is removed, the filter writes the computed features for processes and artifacts to tmp/windows.process.features.csv and tmp/windows.filepath.features.csv, respectively.
All of the above arguments (except position) can also be specified in the configuration file cfg/spade.filter.WindowsFeatures.config.
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.