Add Metadata to LLVM Bitcode
The steps below describe how to use an LLVM optimizer pass, named AddMetadata, to add metadata at function call-sites to specified parameters of the called function.
The following example C program snippet reads a portion of a file into the process's memory:
read(file_fd, read_buffer, read_buffer_size);
The arguments of the read function call above can be supplemented with metadata by using the following AddMetadata configuration file:
read, 1, source
read, 1, input
read, 2, size
read, 2, length
The configuration file tells the pass to add the descriptors source and input to the second argument (index 1) of the read function call, and the descriptors size and length to the third argument (index 2) of the read function call.
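The grouping of descriptors by function and argument index can be sketched in C++ using only the standard library. This is an illustration, not the code in AddMetadata.cpp: the names trim, DescriptorMap, and parseConfig are hypothetical, and silently skipping malformed lines is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: strips leading and trailing whitespace.
static std::string trim(const std::string &s) {
    const char *blanks = " \t\r\n";
    const std::size_t first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// function name -> argument index -> descriptors, in file order
using DescriptorMap =
    std::map<std::string, std::map<unsigned, std::vector<std::string>>>;

// Hypothetical parser for "function, index, descriptor" lines.
// Blank lines and lines beginning with '#' are skipped.
// Assumption: malformed lines are ignored rather than reported.
DescriptorMap parseConfig(const std::string &text) {
    DescriptorMap result;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        if (trim(line).empty() || line[0] == '#') continue;
        std::istringstream fields(line);
        std::string name, index, descriptor;
        if (!std::getline(fields, name, ',') ||
            !std::getline(fields, index, ',') ||
            !std::getline(fields, descriptor)) {
            continue;  // fewer than three comma-separated values
        }
        name = trim(name);
        index = trim(index);
        descriptor = trim(descriptor);
        // The index must be a short run of decimal digits.
        if (name.empty() || descriptor.empty() || index.empty() ||
            index.size() > 9 ||
            index.find_first_not_of("0123456789") != std::string::npos) {
            continue;
        }
        const unsigned position = static_cast<unsigned>(std::stoul(index));
        result[name][position].push_back(descriptor);
    }
    return result;
}
```

Applied to the four-line configuration above, parseConfig yields one entry for read, with the descriptors source and input under index 1 and size and length under index 2.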
The AddMetadata pass iterates over all instructions in the LLVM bitcode of the program above and adds the above-mentioned descriptors to the bitcode as LLVM Metadata. For each function call-site, the pass attaches metadata named call-site-metadata. The value of the attached metadata is an MDTuple instance. The first element of this tuple is a unique identifier for the call-site. The remaining elements are MDTuple instances, one for each argument for which metadata was added. The first element of an argument tuple is the index of the argument (as specified in the configuration file). The remaining elements of the argument tuple are the descriptors (one or more) specified in the configuration file.
The output LLVM IR snippet of the above example looks as follows:
%14 = call i64 @read(i32 %12, i8* %13, i64 10), !call-site-metadata !2
!2 = !{!"0", !3, !4}
!3 = !{!"1", !"source", !"input"}
!4 = !{!"2", !"size", !"length"}
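To make the nesting of the tuples concrete, the following C++ sketch models one call-site in plain standard-library types and prints nodes in the same textual shape as the snippet above. It does not use the LLVM API and is not how the pass builds metadata; CallSiteMetadata and renderNodes are hypothetical names, and the node numbering is chosen by the caller.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model of one call-site's metadata (not an LLVM type).
struct CallSiteMetadata {
    std::string id;  // unique call-site identifier
    std::map<unsigned, std::vector<std::string>> arguments;  // index -> descriptors
};

// Renders the nested tuples as textual metadata nodes, numbered from firstNode.
// The top-level node comes first, followed by one node per annotated argument.
std::vector<std::string> renderNodes(const CallSiteMetadata &site,
                                     unsigned firstNode) {
    std::string top =
        "!" + std::to_string(firstNode) + " = !{!\"" + site.id + "\"";
    std::vector<std::string> nodes(1);  // slot 0 is reserved for the top-level node
    unsigned next = firstNode + 1;
    for (const auto &argument : site.arguments) {
        // The top-level tuple refers to each argument tuple by node number.
        top += ", !" + std::to_string(next);
        // An argument tuple starts with the argument index, then its descriptors.
        std::string node = "!" + std::to_string(next) + " = !{!\"" +
                           std::to_string(argument.first) + "\"";
        for (const std::string &descriptor : argument.second) {
            node += ", !\"" + descriptor + "\"";
        }
        nodes.push_back(node + "}");
        ++next;
    }
    nodes[0] = top + "}";
    return nodes;
}
```

With the identifier 0, the descriptors source and input for index 1, size and length for index 2, and firstNode set to 2, the three returned strings match the !2, !3, and !4 lines shown above.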
- LLVM - Recommended release 10.0.0
- Clang - Recommended release 10.0.0
- CMake - Recommended release 3.13.4 or higher
On Ubuntu 20.04, the requirements can be installed with:
sudo apt-get install -y llvm-10 clang-10 cmake
- Clone the SPADE repository
- Execute the command: ./bin/build-add-metadata.sh /usr/lib/llvm-10 (replace the argument /usr/lib/llvm-10 with the path to your LLVM installation)
- Upon a successful build, the shared library for the pass is created in lib/libAddMetadata.so
The pass takes three arguments:
- -config: (Mandatory) The path to the input configuration file (format described below)
- -output: (Optional) File location to write the output of the pass to. If the value is stdout, the output is written to standard output
- -debug: (Optional) Print debug information; specifically, after each metadata addition, parse and print the metadata
Following is an example command:
$ opt -load lib/libAddMetadata.so -legacy-add-metadata -config input.config -output stdout bitcode.bc -o bitcode_with_metadata.bc
The command above reads the input configuration file from input.config, writes the output of the pass to standard output, and writes the resulting bitcode to bitcode_with_metadata.bc.
Following is a sample input configuration file specified using -config:
# Each line contains 3 comma-separated values
# 1. The first value is the function name. The metadata will be added for all call-sites of this function
# 2. The second value is the index of the function parameter to which the metadata will be added. Indices start from '0'
# 3. The third value is a descriptor of the metadata to identify the semantics of the metadata
# Comments start with '#' and must be at the beginning of the line
# The following tells the pass to add metadata with descriptor 'source' to all call-sites of the function 'read' for its second parameter
read, 1, source
Following is a sample output of the pass:
10, read, 1, source
The output above indicates that the descriptor source was added to the second parameter of the function read at its call-site, which is identified by the value 10.
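A consumer of this output could split each line back into its four fields. The C++ sketch below assumes every output line has exactly the form shown above (call-site identifier, function name, parameter index, descriptor, separated by a comma and a space); OutputRecord and parseOutputLine are hypothetical names and are not part of SPADE.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record type for one line of the pass's output.
struct OutputRecord {
    std::string callSiteId;
    std::string function;
    unsigned argumentIndex = 0;
    std::string descriptor;
};

// Parses "call-site id, function, index, descriptor".
// Returns false and leaves record untouched when the line is malformed.
bool parseOutputLine(const std::string &line, OutputRecord &record) {
    std::istringstream in(line);
    OutputRecord parsed;
    std::string index;
    // std::ws skips the blank that follows each comma.
    if (!std::getline(in >> std::ws, parsed.callSiteId, ',')) return false;
    if (!std::getline(in >> std::ws, parsed.function, ',')) return false;
    if (!std::getline(in >> std::ws, index, ',')) return false;
    if (!std::getline(in >> std::ws, parsed.descriptor)) return false;
    // The index must be a short run of decimal digits.
    if (index.empty() || index.size() > 9 ||
        index.find_first_not_of("0123456789") != std::string::npos) {
        return false;
    }
    parsed.argumentIndex = static_cast<unsigned>(std::stoul(index));
    if (parsed.callSiteId.empty() || parsed.function.empty() ||
        parsed.descriptor.empty()) {
        return false;
    }
    record = parsed;
    return true;
}
```

The call-site identifier is kept as a string so that the sketch does not depend on how the pass generates identifiers.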
The function parseCallSiteMetadata in AddMetadata.cpp can be used to extract the added metadata using the LLVM API.
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.