Add Metadata to LLVM Bitcode
The steps below describe how to use an LLVM optimizer pass, named AddMetadata, to add metadata at function call-sites to specified parameters of the called function.
The following code snippet from the web server nweb shows request data from the client being read into the argument buffer of the read system call:
...
void web(int fd, int hit)
{
    int j, file_fd, buflen, len;
    long i, ret;
    char * fstr;
    static char buffer[BUFSIZE+1]; /* static so zero filled */

    ret = read(fd,buffer,BUFSIZE); /* read Web request in one go */
...
Metadata can be added to the argument buffer of the read function call (above) using the following AddMetadata configuration file:
read, 2, user_request
The configuration file says to add the metadata user_request to the second argument of the read function call.
To add the metadata, the AddMetadata pass requires the LLVM bitcode of the nweb application. It iterates over all the instructions in the LLVM bitcode and adds the above-mentioned metadata to an instruction as LLVM Metadata. For each function call, a metadata node named call-site-metadata is attached to the call instruction. The value of the attached metadata is an MDTuple instance. The first element of this tuple is a unique identifier for the site of the function call. The remaining elements of the tuple are MDTuple instances, one for each argument for which metadata was added. The first element of an argument tuple is the index of the argument (as specified in the configuration file). The remaining elements of the argument tuple are the metadata descriptors (one or more) specified in the configuration file.
The output LLVM IR snippet of the above example looks as follows:
%14 = call i64 @read(i32 %12, i8* %13, i64 10), !call-site-metadata !2
!2 = !{!"0", !3}
!3 = !{!"2", !"user_request"}
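The nesting described above can be modelled without LLVM. The following is a minimal sketch in plain C++, not the pass's own code: the names ArgumentTuple, CallSiteTuple, and renderNodes are hypothetical, and the node numbering is simply assumed to start at a caller-supplied value. It only illustrates how the call-site tuple refers to one argument tuple per annotated argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one argument tuple: the argument index followed by
// one or more descriptors taken from the configuration file.
struct ArgumentTuple {
    unsigned index;
    std::vector<std::string> descriptors;
};

// Hypothetical model of the tuple attached as !call-site-metadata: the
// call-site identifier followed by one tuple per annotated argument.
struct CallSiteTuple {
    unsigned id;
    std::vector<ArgumentTuple> arguments;
};

// Render the tuples as numbered metadata nodes. The call-site tuple gets the
// number firstNode; the argument tuples get the numbers that follow it.
std::string renderNodes(const CallSiteTuple &site, unsigned firstNode) {
    std::string out = "!" + std::to_string(firstNode) + " = !{!\"" +
                      std::to_string(site.id) + "\"";
    for (std::size_t i = 0; i < site.arguments.size(); ++i)
        out += ", !" + std::to_string(firstNode + 1 + i);
    out += "}\n";

    for (std::size_t i = 0; i < site.arguments.size(); ++i) {
        const ArgumentTuple &arg = site.arguments[i];
        out += "!" + std::to_string(firstNode + 1 + i) + " = !{!\"" +
               std::to_string(arg.index) + "\"";
        for (const std::string &descriptor : arg.descriptors)
            out += ", !\"" + descriptor + "\"";
        out += "}\n";
    }
    return out;
}
```

Calling renderNodes with call-site identifier 0, a single argument tuple (index 2, descriptor user_request), and a first node number of 2 yields the two metadata lines shown in the IR snippet above.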
- LLVM - Recommended release 10.0.0
- Clang - Recommended release 10.0.0
- CMake - Recommended release 3.13.4 or higher
On Ubuntu 20.04, the requirements can be installed with:
sudo apt-get install -y llvm-10 clang-10 cmake
- Clone the SPADE repository
- Execute the command: ./bin/build-add-metadata.sh /usr/lib/llvm-10. Make sure to update the argument /usr/lib/llvm-10 to your LLVM installation
- Upon successful build, the shared library for the pass would be created in lib/libAddMetadata.so
The pass takes three arguments:
- -config: (Mandatory) The path to the input configuration file (format described below)
- -output: (Optional) File location to write the output of the pass to. If the value is stdout then the output is written to standard out
- -debug: (Optional) Print debug information, specifically, after each metadata addition, parse and print the metadata
Following is an example command:
$ opt -load lib/libAddMetadata.so -legacy-add-metadata -config input.config -output stdout bitcode.bc -o bitcode_with_metadata.bc
The command above reads the input configuration file from input.config, and writes the output of the pass to standard out. The bitcode with the added metadata is written to bitcode_with_metadata.bc.
The following is a sample input configuration file, specified using -config:
# Each line contains 3 comma-separated values
# 1. The first value is the function name. The metadata will be added for all call-sites of this function
# 2. The second value is the argument index of the function call to which the metadata would be added
# 3. The third value is a descriptor of the metadata to identify the semantics of the metadata
# Comments start with '#' and must be at the beginning of the line
# The following tells the pass to add metadata with descriptor 'source' to all call-sites of the function 'read' for its second parameter
read, 2, source
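To make the configuration format concrete, here is a minimal sketch of a parser for one line of such a file. It is an illustration under stated assumptions, not the parser used by the pass: the names ConfigEntry, trim, splitFields, and parseConfigLine are hypothetical, and the handling of whitespace around the values is assumed from the sample above.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation of one configuration line.
struct ConfigEntry {
    std::string function;    // name of the called function
    unsigned argumentIndex;  // argument index, as written in the file
    std::string descriptor;  // metadata descriptor
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string &s) {
    const char *ws = " \t\r\n";
    const std::string::size_type begin = s.find_first_not_of(ws);
    if (begin == std::string::npos) return "";
    const std::string::size_type end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

// Split a line on commas and trim each field.
static std::vector<std::string> splitFields(const std::string &line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ',')) fields.push_back(trim(field));
    return fields;
}

// Returns true and fills 'entry' if the line holds three valid values.
// Comment lines (with '#' as the first character) and malformed lines
// return false.
bool parseConfigLine(const std::string &line, ConfigEntry &entry) {
    if (line.empty() || line[0] == '#') return false;
    const std::vector<std::string> fields = splitFields(line);
    if (fields.size() != 3) return false;
    const std::string &index = fields[1];
    if (fields[0].empty() || fields[2].empty() || index.empty() ||
        index.size() > 9 ||  // keeps the value within the range of unsigned
        index.find_first_not_of("0123456789") != std::string::npos)
        return false;
    entry.function = fields[0];
    entry.argumentIndex = static_cast<unsigned>(std::stoul(index));
    entry.descriptor = fields[2];
    return true;
}
```

For the line read, 2, source this fills the entry with the function read, the argument index 2, and the descriptor source. Returning a flag instead of throwing keeps comment lines and malformed lines easy to skip in a read loop.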
Following is a sample output of the pass:
10, read, 2, source
The output above indicates that the descriptor source was added to the second parameter of the function read at its call-site, which is identified by the value 10.
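A downstream tool may want to read these output lines back. The sketch below assumes that every output line has the four comma-separated fields seen in the sample (call-site identifier, function name, argument index, descriptor) and that names contain no whitespace or commas. The names OutputRecord and parseOutputLine are hypothetical and are not part of SPADE.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical representation of one output line of the pass.
struct OutputRecord {
    unsigned callSite;       // identifier of the call-site
    std::string function;    // name of the called function
    unsigned argumentIndex;  // index of the annotated argument
    std::string descriptor;  // descriptor that was added
};

// Returns true and fills 'record' if the line has the assumed four-field form.
bool parseOutputLine(const std::string &line, OutputRecord &record) {
    char function[64];
    char descriptor[64];
    unsigned callSite = 0;
    unsigned argumentIndex = 0;
    // Whitespace in the format matches any amount of whitespace in the input;
    // the field widths keep the reads within the 64-byte buffers.
    const int matched = std::sscanf(line.c_str(), "%u , %63[^, \t] , %u , %63s",
                                    &callSite, function, &argumentIndex, descriptor);
    if (matched != 4) return false;
    record = OutputRecord{callSite, function, argumentIndex, descriptor};
    return true;
}
```

Applied to the sample line 10, read, 2, source, the record holds call-site 10, function read, argument index 2, and descriptor source.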
The following code snippet shows how to extract the added metadata using a callback mechanism:
// The callback function which would be called for each metadata description added for each parameter
static void extraction_metadata_callback(Instruction *instruction, StringRef *functionName, APInt *callSiteNumber, APInt *parameterIndex, StringRef *description){
    // Do your work here
}

// The main function for an LLVM optimizer pass
bool MyOPTPass::runOnModule(Module &module){
    // The required definition of the callback can be seen here
    void (*metadata_callback_func)(Instruction *instruction, StringRef *functionName, APInt *callSiteNumber, APInt *parameterIndex, StringRef *description);
    // Assigning your callback function
    metadata_callback_func = &extraction_metadata_callback;
    // The 'extractAllMetadata' function checks for existing metadata on each instruction. If found, then it calls the callback function.
    extractAllMetadata(module, metadata_callback_func);
    return false;
}
The implementation of extractAllMetadata can be found in AddMetadata.cpp.
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.