This project demonstrates the usage of Clash, a novel framework designed to enhance context sensitivity in static data-flow analysis, particularly in the presence of indirect calls. Indirect calls are resolved dynamically at runtime, so a static analysis that over-approximates their targets can let spurious data facts propagate across mismatched call contexts, which hurts both precision and efficiency.
Clash addresses this limitation with two core innovations:
- Mutex identifiers, which statically approximate invocation contexts and are bound to function pointers to detect mismatches during analysis.
- Conflict detectors, including Call-Store Constraint (CSC), Callback-Constraint (CBC), and Call-Return Constraint (CRC), which selectively suppress invalid inter-procedural data-flows at critical edges in the Static Value-Flow Graph (SVFG).
These mechanisms enable Clash to efficiently prune infeasible data-flows while preserving necessary data dependencies, achieving both precision and scalability in large-scale programs.
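
To make the idea of matching call contexts concrete, here is a toy sketch in Java. It is not Clash's algorithm or data structures: the `ReturnEdge` record, the integer identifiers, and `feasibleTargets` are hypothetical names chosen for illustration. It only shows the general principle that a fact entering a callee with one identifier should flow back only along return edges carrying the same identifier.

```java
import java.util.ArrayList;
import java.util.List;

final class ContextMatchSketch {
    /** A return edge back to a caller, tagged with the (hypothetical) identifier of its call site. */
    record ReturnEdge(String target, int siteId) {}

    /** Keeps only the return edges whose identifier matches the one the fact entered with. */
    static List<String> feasibleTargets(int enteredWith, List<ReturnEdge> edges) {
        List<String> targets = new ArrayList<>();
        for (ReturnEdge edge : edges) {
            if (edge.siteId() == enteredWith) {
                targets.add(edge.target());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Two callers reach the same callee through a function pointer.
        List<ReturnEdge> edges = List.of(
            new ReturnEdge("caller_a", 1),
            new ReturnEdge("caller_b", 2));
        // A fact that entered from call site 1 must not leak back into caller_b.
        System.out.println(feasibleTargets(1, edges)); // prints [caller_a]
    }
}
```

Without the identifier check, both callers would receive the fact, which is the kind of spurious propagation described above.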
Before proceeding with the setup, ensure that the following tools are installed and available in your system's PATH environment variable:
- Python 3
- LLVM 15
- Clang 15
- Java Development Kit (JDK) 21
- Maven 3.9
You can verify the installation by running:

- `python3 --version`
- `llvm-config --version`
- `clang --version`
- `java --version`
- `javac --version`
- `mvn -v`

Each command should report the version listed above.
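
The manual check above could also be scripted. The following is a minimal sketch, assuming each tool prints a dotted version number somewhere in its output. The class and method names are hypothetical, and on Windows `mvn` may need to be invoked as `mvn.cmd`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ToolVersionCheck {
    private static final Pattern VERSION = Pattern.compile("\\d+(?:\\.\\d+)*");

    /** Returns the first dotted version number found in the text, or "" if there is none. */
    static String firstVersion(String text) {
        Matcher m = VERSION.matcher(text);
        return m.find() ? m.group() : "";
    }

    /** True when found equals required or extends it with further components (15.0.7 satisfies 15). */
    static boolean matches(String found, String required) {
        return found.equals(required) || found.startsWith(required + ".");
    }

    /** Runs a command and returns its combined output, or "" if it cannot be started. */
    static String run(List<String> command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            p.waitFor();
            return out;
        } catch (IOException e) {
            return "";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "";
        }
    }

    public static void main(String[] args) {
        // Command, flag, and the required version prefix taken from the prerequisite list.
        String[][] checks = {
            {"python3", "--version", "3"},
            {"llvm-config", "--version", "15"},
            {"clang", "--version", "15"},
            {"java", "--version", "21"},
            {"javac", "--version", "21"},
            {"mvn", "-v", "3.9"},
        };
        for (String[] c : checks) {
            String found = firstVersion(run(List.of(c[0], c[1])));
            String status = found.isEmpty() ? "not found" : matches(found, c[2]) ? "ok" : "unexpected";
            System.out.println(c[0] + ": " + found + " (" + status + ")");
        }
    }
}
```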
Run the following command to generate the LLVM bitcode file using the provided setup script:

`python3 setup.py example`

This compiles the example code into LLVM bitcode format, which can be used for further processing with the Clash framework.
Prior to executing the analysis, you may configure various runtime parameters. Please refer to the Configuration Guide for detailed instructions on how to customize these parameters.
Change to the source directory:

`cd src\main\java\cn\njupt\xjy\`

Then execute the main class `Main.java`, which serves as the entry point for the application:

`mvn clean compile exec:java`

Ensure that all required dependencies are properly configured in your environment before execution.
After the analysis is complete, all results and statistical data will be saved in the results directory. This folder contains structured output files that make it easy to locate and interpret the outcomes of the analysis.
`Analyzer.analysis(String filename);`

The `analysis` function analyzes a single sample in the benchmark. The input file must be human-readable (textual) LLVM IR generated by LLVM 15. Before use, ensure that the benchmark path and the type of defect are properly configured.

Statistics are cumulative: each call to `analysis` adds to the statistics from previous calls rather than replacing them.
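
Because the input must be textual LLVM IR, it can help to check that a file is not binary bitcode before passing it to the analyzer. The sketch below relies on the LLVM bitcode magic bytes (`'B'`, `'C'`, `0xC0`, `0xDE`). The class and method names are hypothetical and not part of this project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class IrFormatCheck {
    /** True when the header starts with the LLVM bitcode magic bytes 'B' 'C' 0xC0 0xDE. */
    static boolean looksLikeBitcode(byte[] header) {
        return header.length >= 4
            && header[0] == 'B'
            && header[1] == 'C'
            && (header[2] & 0xFF) == 0xC0
            && (header[3] & 0xFF) == 0xDE;
    }

    /** Reads the first four bytes of a file and checks them against the bitcode magic. */
    static boolean isBitcodeFile(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return looksLikeBitcode(in.readNBytes(4));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: IrFormatCheck <file>");
            return;
        }
        System.out.println(isBitcodeFile(Path.of(args[0]))
            ? "binary bitcode: convert to textual IR first"
            : "no bitcode magic found: may be textual IR");
    }
}
```

If a file turns out to be bitcode, the `llvm-dis` tool from the LLVM toolchain can convert it to textual IR.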
`Analyzer.autoAnalysis();`

The `autoAnalysis` function automatically analyzes all test cases in the benchmark directory. Before use, ensure that the benchmark path and the type of defect are properly configured. The statistical results are automatically saved in the results directory.

`Analyzer.chainAnalysis();`

Like `autoAnalysis`, `chainAnalysis` automatically analyzes all test cases in the benchmark directory. The only difference is that the statistics generated by `autoAnalysis` are merged, while the statistics from `chainAnalysis` are saved separately for each individual test case.
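
The difference between merged and per-case statistics can be pictured with a toy model. This is not the project's statistics code: the class, its methods, the sample names, and the counts are all invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StatsModeSketch {
    private final Map<String, Integer> perSample = new LinkedHashMap<>();

    /** Records the number of reports for one sample; repeated samples accumulate. */
    void record(String sample, int reports) {
        perSample.merge(sample, reports, Integer::sum);
    }

    /** Merged view: a single total over everything recorded so far. */
    int mergedTotal() {
        int total = 0;
        for (int n : perSample.values()) {
            total += n;
        }
        return total;
    }

    /** Separate view: one entry per sample, in the order the samples were recorded. */
    Map<String, Integer> separate() {
        return new LinkedHashMap<>(perSample);
    }

    public static void main(String[] args) {
        StatsModeSketch stats = new StatsModeSketch();
        stats.record("case_a.ll", 2);
        stats.record("case_b.ll", 1);
        System.out.println(stats.mergedTotal()); // prints 3
        System.out.println(stats.separate());    // prints {case_a.ll=2, case_b.ll=1}
    }
}
```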
`setBenchmark(String path, Defect type, Config config);`

The `setBenchmark` function configures the benchmark directory, the type of defect to be analyzed, and the analyzer parameters. Note that `setBenchmark` automatically clears the previous benchmark statistics.

- The `path` parameter specifies the benchmark directory path.
- The `type` parameter defines the type of defect, with the following possible values: `{None, MemoryLeak, DoubleFree, StackFree, Mismatch}`.
- The `config` parameter selects the analyzer configuration, with the following possible values: `{Off, IntraOnly, DCallOnly, ICallType, ICallPointer, ICallCombined, CSC, CRC, CBC, CSCMI, CRCMI, CBCMI, Clash, ClashType, ClashCombined}`.

The meaning of each defect type and analyzer configuration is explained in the two tables below.

| Defect | Description |
|---|---|
| None | No defect checking; data-flow analysis only |
| MemoryLeak | CWE401 Memory Leak |
| DoubleFree | CWE415 Double Free |
| StackFree | CWE590 Free Memory Not on Heap |
| Mismatch | CWE762 Mismatched Memory Management Routines |

| Config | Description |
|---|---|
| Off | Disable all analysis |
| IntraOnly | Only enable Intra-procedural analysis (Andersen-Style) |
| DCallOnly | Only build normal call/return edges |
| ICallType | Use FLTA to build virtual call/return edges |
| ICallPointer | Use SVFG to build virtual call/return edges |
| ICallCombined | Combination of FLTA and SVFG |
| CSC | Only enable Call Store Conflict detector and Mutex tracking |
| CRC | Only enable Call Return Conflict detector and Mutex tracking |
| CBC | Only enable Call Back Conflict detector and Mutex tracking |
| CSCMI | Enable CSC and Mutex Inherit |
| CRCMI | Enable CRC and Mutex Inherit |
| CBCMI | Enable CBC and Mutex Inherit |
| Clash | Enable Clash framework with SVFG |
| ClashType | Enable Clash framework with FLTA |
| ClashCombined | Enable Clash framework with combination of FLTA and SVFG |

`Analyzer.showConfigure();`

The `showConfigure` function prints the current configuration parameters of the analyzer.

`Analyzer.setWindow(Integer size);`

The `setWindow` function controls the maximum display width of the console output, where the `size` parameter is the number of characters.

`Analyzer.resetBenchmark();`

The `resetBenchmark` function clears the current benchmark's statistical data.

`Analyzer.showBenchmark();`

The `showBenchmark` function shows the current benchmark's statistical data.

`Analyzer.showCallGraph();`

The `showCallGraph` function shows the Call Graph of the current sample. If `autoAnalysis` or `chainAnalysis` was called beforehand, it shows the Call Graph of the last sample analyzed.

`Analyzer.showTypeGraph();`

The `showTypeGraph` function shows the Type Graph of the current sample. If `autoAnalysis` or `chainAnalysis` was called beforehand, it shows the Type Graph of the last sample analyzed.

`Analyzer.showMemoryGraph();`

The `showMemoryGraph` function shows the Memory Graph of the current sample. If `autoAnalysis` or `chainAnalysis` was called beforehand, it shows the Memory Graph of the last sample analyzed. (The Memory Graph is a special case of the Static Value-Flow Graph (SVFG) presented in the paper.)

`saveBenchmark(String filename);`

The `saveBenchmark` function saves the current benchmark's statistical data to the file specified by the `filename` parameter, which must have the `.json` extension.
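
Putting the functions together, a typical session might follow the order sketched below. To keep the snippet compilable on its own, it uses a placeholder `AnalyzerStub` that only records which calls were made; in the project you would call the real `Analyzer` instead. The stub, the trimmed `Defect` and `Config` enums, the directory and file names, and the assumption that `setBenchmark` and `saveBenchmark` are static methods like the others are all illustrative, not taken from the project's source.

```java
import java.util.ArrayList;
import java.util.List;

final class WorkflowSketch {
    // Stand-ins for the project's enums; only the values used here are listed.
    enum Defect { MemoryLeak }
    enum Config { Clash }

    /** Placeholder for the project's Analyzer: it only records the order of calls. */
    static final class AnalyzerStub {
        static final List<String> calls = new ArrayList<>();

        static void setBenchmark(String path, Defect type, Config config) { calls.add("setBenchmark"); }
        static void showConfigure() { calls.add("showConfigure"); }
        static void autoAnalysis() { calls.add("autoAnalysis"); }
        static void showBenchmark() { calls.add("showBenchmark"); }

        static void saveBenchmark(String filename) {
            // The documentation requires a .json file name; rejecting others is this stub's choice.
            if (!filename.endsWith(".json")) {
                throw new IllegalArgumentException("filename must end with .json: " + filename);
            }
            calls.add("saveBenchmark");
        }
    }

    /** Configure first, then analyze, then inspect and save the statistics. */
    static List<String> runWorkflow(String benchmarkDir, String outFile) {
        AnalyzerStub.calls.clear();
        AnalyzerStub.setBenchmark(benchmarkDir, Defect.MemoryLeak, Config.Clash);
        AnalyzerStub.showConfigure();
        AnalyzerStub.autoAnalysis();
        AnalyzerStub.showBenchmark();
        AnalyzerStub.saveBenchmark(outFile);
        return List.copyOf(AnalyzerStub.calls);
    }

    public static void main(String[] args) {
        // Both arguments are hypothetical example values.
        System.out.println(runWorkflow("benchmark", "memory_leak_clash.json"));
    }
}
```

Calling `setBenchmark` first matters because, as noted above, it clears any previously collected statistics.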