CISA is an LLVM-based IR static analysis framework that supports incremental analysis over the git commit history.
The basic philosophy is to do costly static analyses (e.g., indirect call graph analysis) incrementally while scanning through the commit history. Each analysis is computed piecewise and updated only at the parts a commit modifies (hence incremental) and, like LLVM IR passes, can refer to the results of other analyses.
It is still in its infancy and supports only a limited feature set (e.g., analyses can refer only to the call graph analysis, not to other custom ones). If anybody reads this, I welcome any contribution.
As the introduction mentions, CISA aims to analyze only the parts changed by commits. To do so, CISA scans the commit history within a given range in chronological order and, given an entity X changed by the current commit (e.g., a changed function or module), first updates the analysis in the changed part and then aggregates the up-to-date analysis results. For this, CISA requires custom analyzers to implement the following two callbacks: Update(X) and Aggregate(X).
- Update(X): update the analysis for the changed entity X. This only updates the analysis inside X.
- Aggregate(X): aggregate the up-to-date analysis result for the changed entity X. This assembles the analysis done by Update and produces the final analysis result. Aggregate is always called after every possible Update has been called, so it's safe to assume all entities in the source code have up-to-date analysis states.
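The split between the two callbacks can be illustrated with a toy model. This is a minimal sketch in Python, not CISA's actual C++ interface: the class name `CallCountAnalyzer`, the method signatures, and the representation of a function body as a list of instruction strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CallCountAnalyzer:
    """Toy analyzer counting call sites (hypothetical, not CISA's API)."""

    # Per-entity state keyed by function name; only update() writes to it.
    per_function: Dict[str, int] = field(default_factory=dict)

    def update(self, name: str, body: List[str]) -> None:
        # Update(X): look only inside the changed entity X.
        self.per_function[name] = sum(1 for inst in body if inst.startswith("call"))

    def aggregate(self, name: str) -> Tuple[int, int]:
        # Aggregate(X): assemble the per-entity states into a final result.
        # Every changed entity has already been updated at this point.
        total = sum(self.per_function.values())
        return self.per_function[name], total


analyzer = CallCountAnalyzer()

# First commit: both f and g are new, so both are "changed".
analyzer.update("f", ["call g", "ret"])
analyzer.update("g", ["call h", "call h", "ret"])
first = analyzer.aggregate("g")

# Second commit: only g changes; the state for f is reused as-is.
analyzer.update("g", ["ret"])
second = analyzer.aggregate("g")

print(first, second)
```

In the second commit only `g` is re-analyzed, while the stored state for `f` still contributes to the aggregated total. That reuse is the point of the incremental design.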
The following is what developing and using a custom analyzer would look like.
- Write a custom analyzer (in src/analyzer) that implements Update and Aggregate.
- Build again ($ make).
- Run the CISA front-end ($ ./cisa <repo_path> -o <out_path>).
  - For each commit from the beginning to the end, CISA calls Update with all changed entities first and calls Aggregate next.
- Inspect the printed analysis result in <out_path>.
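The per-commit calling order described above can be sketched as a small driver loop. This is an illustrative Python model, not the CISA front-end: the `run` function, the representation of a commit as a dict of changed entities, and the choice to call Aggregate once per changed entity are assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical commit model: changed entity name -> new content.
Commit = Dict[str, str]


def run(
    commits: List[Commit],
    update: Callable[[str, str], None],
    aggregate: Callable[[str], None],
) -> None:
    """Replay commits oldest first: all updates, then all aggregates."""
    for commit in commits:
        # Phase 1: bring every changed entity up to date.
        for entity, content in commit.items():
            update(entity, content)
        # Phase 2: aggregate only after every update of this commit is done.
        for entity in commit:
            aggregate(entity)


# Record the order in which the callbacks fire.
trace: List[Tuple[str, str]] = []
commits = [{"f": "v1", "g": "v1"}, {"g": "v2"}]
run(
    commits,
    update=lambda entity, content: trace.append(("update", entity)),
    aggregate=lambda entity: trace.append(("aggregate", entity)),
)

for step in trace:
    print(step)
```

Because the two phases never interleave within a commit, an Aggregate callback can read the state of any entity without checking whether it is stale.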
- Integrated call graph analysis [MLTA, CCS'19]
- Nice C++ interface for custom function-level analyses
- LLVM 15.0.5
- Python 3.8.0+
- CMake 3.16.3+
- Python packages: gitpython, termcolor, alive_progress
- Install prerequisites. (assuming Ubuntu 20.04+)
  - Make sure that python is python3 and pip is pip3.
$ sudo apt install python3 python3-pip python-is-python3 cmake
$ sudo pip install gitpython termcolor alive_progress
- Decompress the prebuilt LLVM 15 binary to llvm at the root.
  - Or you can create a symlink llvm to the LLVM install directory (if you built LLVM on your own).
$ # Example: assuming Ubuntu 20.04+, run at the root directory.
$ wget https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.5/clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04.tar.xz
$ tar -xvf clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04.tar.xz
$ rm clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04.tar.xz
$ mv clang+llvm-15.0.5-x86_64-linux-gnu-ubuntu-18.04 llvm
- Make.
$ make # at the root directory.
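If the build or the front-end fails, a quick check of the prerequisites listed above may help narrow things down. The script below is a hypothetical helper and not part of CISA. It assumes it is run from the repository root and that the packages are importable under the module names `git` (for gitpython), `termcolor`, and `alive_progress`.

```python
import importlib.util
import shutil
import sys
from pathlib import Path
from typing import Dict


def check_prerequisites(root: Path) -> Dict[str, bool]:
    """Report which of the documented prerequisites appear to be present."""
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "cmake on PATH": shutil.which("cmake") is not None,
        # exists() follows symlinks, so a dangling llvm symlink reports False.
        "llvm directory": (root / "llvm").exists(),
        "gitpython": importlib.util.find_spec("git") is not None,
        "termcolor": importlib.util.find_spec("termcolor") is not None,
        "alive_progress": importlib.util.find_spec("alive_progress") is not None,
    }


status = check_prerequisites(Path("."))
for name, ok in status.items():
    print(f"{'ok' if ok else 'MISSING':8} {name}")
```

The check does not verify the LLVM or CMake version. It only confirms that the expected files, tools, and modules can be found.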
See this page for a dockerized setting.
- script: CISA front-end scripts (Python)
- src: CISA back-end code (C++)
  - analyzer: where custom analyzers reside
  - callgraph: incremental call graph analysis (MLTA)
- extern: external dependencies
- Supporting references to LLVM objects (e.g., Function) in custom analyses
- Supporting custom module-level analyses
- Converting the integrated call graph analysis to a custom module-level analysis
- Supporting custom analysis inter-operability
- Improving initial checkout delay