Ingesting CDM
The DARPA Transparent Computing program defined a Common Data Model (CDM) to represent data provenance and information flow. SPADE's CDM reporter ingests provenance emitted by SPADE's CDM storage, in either Avro binary or JSON format, that conforms to the schema in the cfg/spade.storage.CDM.avsc file.
The CDM reporter requires at least one argument, inputFile, which is the path of the file containing the CDM records. If the file is in JSON format, its name must include the .json extension. If the file is in Avro binary format, its name must include the .bin extension. The reporter must be added in the SPADE controller (after the SPADE server has been started):
-> add reporter CDM inputFile=/tmp/cdm.json
Adding reporter CDM... done
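The naming rule above can be checked before a file is handed to the reporter. The following is a minimal sketch, not SPADE's own logic: the function name cdm_format is hypothetical, and treating "include" as a substring test on the file name is an assumption based only on the wording above.

```python
from pathlib import Path


def cdm_format(input_file: str) -> str:
    """Guess the CDM encoding from the file name (hypothetical helper).

    Assumption: "must include the extension" is read as a substring test,
    so a rotated name such as cdm.json.1 still counts as JSON.
    """
    name = Path(input_file).name
    if ".json" in name:
        return "json"
    if ".bin" in name:
        return "avro"
    raise ValueError(f"{input_file}: name includes neither .json nor .bin")


if __name__ == "__main__":
    for path in ("/tmp/cdm.json", "/tmp/cdm.bin"):
        print(path, "->", cdm_format(path))
```

A name that matches neither extension raises ValueError, so a misnamed file is caught before the add reporter command is issued.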
The waitForLog=false option can be used to ensure that ingestion stops when the reporter is removed. By default, the reporter continues to process all remaining records even after it is removed.
-> add reporter CDM inputFile=/tmp/cdm.json waitForLog=false
Adding reporter CDM... done
If the CDM records are stored in a collection of files, they can be ingested together with the rotate option. If rotate=true is specified, the inputFile is processed first. Next, files with the same name but .1, .2, ... extensions are processed in ascending order. For example, /tmp/cdm.json, /tmp/cdm.json.1, /tmp/cdm.json.2, and /tmp/cdm.json.3 can be ingested with the command:
-> add reporter CDM inputFile=/tmp/cdm.json rotate=true
Adding reporter CDM... done
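To preview which files a rotate=true ingestion would cover, the ordering described above can be sketched in a few lines. This is an illustration only: rotated_inputs is a hypothetical helper, not part of SPADE, and how the reporter treats gaps in the numbering is not stated here, so the sketch simply lists every numbered file it finds.

```python
import re
from pathlib import Path


def rotated_inputs(input_file: str) -> list:
    """Return input_file followed by input_file.1, input_file.2, ...

    Hypothetical helper. Suffixes are compared as integers so that
    .10 sorts after .2, which a plain string sort would get wrong.
    """
    base = Path(input_file)
    suffix = re.compile(re.escape(base.name) + r"\.(\d+)")
    numbered = []
    for candidate in base.parent.iterdir():
        match = suffix.fullmatch(candidate.name)
        if match and candidate.is_file():
            numbered.append((int(match.group(1)), candidate))
    numbered.sort(key=lambda pair: pair[0])
    first = [base] if base.is_file() else []
    return [str(path) for path in first + [c for _, c in numbered]]


if __name__ == "__main__":
    for path in rotated_inputs("/tmp/cdm.json"):
        print(path)
```

Files whose suffix is not purely numeric, such as a hypothetical cdm.json.bak, are ignored by the pattern.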
The reporter can be deactivated using the following command in the SPADE controller:
-> remove reporter CDM
Shutting down reporter CDM... done
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.