Collecting system wide provenance on Linux with Audit
The Audit reporter collects provenance from across the operating system using the Linux kernel's audit event stream of system calls. (Note: Activity of the user that SPADE runs as is excluded.)
This reporter is built automatically when SPADE's top-level make command is issued.
(Note: The included kernel modules are optional. To build them, use the command make KERNEL_MODULES=true.)
Before this reporter can be used, the commands below must be run. They only need to be executed once after SPADE is compiled. (Note: This will allow a normal user to configure and access the audit stream.)
The first three commands allow users to configure the audit rules, packet filtering, and kernel modules needed to generate the provenance graph. The last two commands grant users access to the audit stream:
sudo chmod ug+s `which auditctl`
sudo chmod ug+s `which iptables`
sudo chmod ug+s `which kmod`
sudo chown root bin/spadeAuditBridge
sudo chmod ug+s bin/spadeAuditBridge
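To confirm that the chmod ug+s commands above took effect, the mode bits of each binary can be inspected. The following is a minimal sketch and not part of SPADE; the function names has_setuid_and_setgid and check_binary are hypothetical.

```python
import os
import stat


def has_setuid_and_setgid(mode: int) -> bool:
    """Return True if both the set-user-ID and set-group-ID bits are set."""
    return bool(mode & stat.S_ISUID) and bool(mode & stat.S_ISGID)


def check_binary(path: str) -> bool:
    """Report whether the file at path carries both bits added by chmod ug+s."""
    return has_setuid_and_setgid(os.stat(path).st_mode)
```

For example, check_binary("bin/spadeAuditBridge") should return True once the last two commands have been run from SPADE's top-level directory.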
To let the above utility access the audit stream, edit the file /etc/audisp/plugins.d/af_unix.conf on Ubuntu or /etc/audit/plugins.d/af_unix.conf on Fedora and activate the plugin by changing the line that says
active = no
to
active = yes
Restart auditd to activate the dispatcher (audispd):
sudo service auditd restart
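The edit to af_unix.conf described above can also be scripted. The sketch below only transforms the file's text; reading and writing the file (which needs root privileges) is left to the caller. The function name activate_plugin is hypothetical, and the sample configuration in the usage note is illustrative rather than a copy of the shipped file.

```python
import re

# Matches a whole "active = no" line, tolerating variable spacing around "=".
ACTIVE_NO = re.compile(r"(?m)^([ \t]*active[ \t]*=[ \t]*)no[ \t]*$")


def activate_plugin(config_text: str) -> str:
    """Return config_text with "active = no" changed to "active = yes"."""
    return ACTIVE_NO.sub(r"\1yes", config_text)
```

Applied to the text "active = no" followed by other settings, this changes only that one line and leaves the rest untouched. A file that already says active = yes is returned unchanged.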
The Audit reporter can be started using SPADE's controller:
-> add reporter Audit
Adding reporter Audit... done
The reporter will transform records from the Linux audit dispatcher into an Open Provenance Model representation. The details of the key-value annotations are available here.
Filesystem reads and writes, as well as network connection sends and receives, can generate significant log overhead. In many contexts, knowing that a process opened a file or made a network connection suffices for understanding the provenance of data.
By default, this reporter only tracks when files are opened for reading or writing, and when network connections are made or accepted. To report all filesystem reads and writes, the argument fileIO=true should be provided when starting the reporter with the SPADE controller. Similarly, to report all network sends and receives, the argument netIO=true should be used:
-> add reporter Audit fileIO=true netIO=true
Adding reporter Audit... done
Linux containers are a user-space construct. They are created by virtualizing selected kernel resources using namespaces. The included kernel module can report the specific namespaces of each process. To enable this functionality, SPADE must be built with make KERNEL_MODULES=true. Providing the localEndpoints=true argument will activate the kernel module (and reporting of local IP address / port information that is not present in Audit records).
By default, namespaces are not tracked. Providing the namespaces=true argument activates reporting of values for the user and group identifier (User), process identifier (PID), filesystem mount point (Mount), and network information (Network) namespaces. Providing the IPC=true argument activates reporting of the inter-process message queue (IPC) namespace. By providing the networkAddressTranslation=true argument, both the host-level and intra-network-namespace address and port values will be reported.
The above arguments should be provided when starting the Audit reporter in the SPADE controller:
-> add reporter Audit localEndpoints=true namespaces=true IPC=true networkAddressTranslation=true
Adding reporter Audit... done
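When the reporter is started from scripts, it can help to assemble the controller command from a list of options so that misspelled argument names are caught early. The helper below is a hypothetical sketch, not part of SPADE. It only knows the boolean arguments named on this page so far and emits each requested one as name=true.

```python
# Boolean arguments described above; the helper itself is hypothetical.
BOOLEAN_ARGUMENTS = (
    "fileIO",
    "netIO",
    "localEndpoints",
    "namespaces",
    "IPC",
    "networkAddressTranslation",
)


def audit_reporter_command(enabled):
    """Build an "add reporter Audit ..." line enabling the given arguments."""
    unknown = sorted(set(enabled) - set(BOOLEAN_ARGUMENTS))
    if unknown:
        raise ValueError("unknown Audit reporter argument(s): " + ", ".join(unknown))
    parts = ["add reporter Audit"]
    # Emit arguments in a fixed order so the output is reproducible.
    parts += [name + "=true" for name in BOOLEAN_ARGUMENTS if name in enabled]
    return " ".join(parts)
```

For example, audit_reporter_command(["fileIO", "netIO"]) returns the string add reporter Audit fileIO=true netIO=true, which can then be typed or piped into the SPADE controller.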
For debugging purposes, the Linux Audit records that have been processed can be stored in a file using the outputLog argument. For example, the records can be stored in the file /tmp/audit.log by using this command to start the reporter in the SPADE controller:
-> add reporter Audit outputLog=/tmp/audit.log
Adding reporter Audit... done
Instead of collecting Linux Audit records from the running system, a previously saved log can be used by specifying it with the inputLog argument. The hardware architecture of the machine on which the log was collected must be x86-64.
For example, to read records from the file /tmp/audit.log collected on an x86-64 machine, this command can be used to start the reporter in the SPADE controller:
-> add reporter Audit inputLog=/tmp/audit.log
Adding reporter Audit... done
Logs must be sorted by event identifier. This is done automatically during preprocessing.
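To illustrate what sorting by event identifier means, the sketch below orders records by the number that follows the colon in the msg=audit(<timestamp>:<id>) field of each Linux Audit record. It is not SPADE's preprocessing code, which is not shown here. The function name sort_by_event_id is hypothetical, and dropping lines that carry no identifier is an assumption made for this example.

```python
import re

# The event identifier is the integer after the colon in msg=audit(<time>:<id>).
EVENT_ID = re.compile(r"msg=audit\(\d+\.\d+:(\d+)\)")


def sort_by_event_id(lines):
    """Return the audit records in lines ordered by event identifier.

    Lines without an identifier are dropped. The sort is stable, so the
    records that belong to one event keep their original relative order.
    """
    keyed = []
    for line in lines:
        match = EVENT_ID.search(line)
        if match:
            keyed.append((int(match.group(1)), line))
    keyed.sort(key=lambda pair: pair[0])
    return [line for _, line in keyed]
```

Because several records (for example SYSCALL and PATH) share one event identifier, a stable sort matters: it keeps each event's records together and in the order they were emitted.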
The end of Audit log processing is reported in SPADE's log (that is stored in log/SPADE_<date>-<time>.log).
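Since each run writes a new log file, the first step in checking for the end-of-processing report is finding the most recent one. The sketch below picks the newest file matching SPADE_*.log by modification time. The function name latest_spade_log is hypothetical, and the default directory log assumes the script runs from SPADE's top-level directory.

```python
import glob
import os


def latest_spade_log(log_dir="log"):
    """Return the most recently modified SPADE_*.log in log_dir, or None."""
    candidates = glob.glob(os.path.join(log_dir, "SPADE_*.log"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

The returned path can then be opened and searched for the message that reports the end of Audit log processing.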
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.