A concolic testing engine for RISC-V embedded software with support for SystemC peripherals.
SymEx-VP focuses explicitly on testing software for constrained embedded devices (e.g. as used in the Internet of Things). This software often interacts very closely with low-level hardware peripherals and in order to support these interactions, during simulation-based software testing, SymEx-VP supports SystemC peripheral models. SystemC is a C++ class library for modeling hardware peripherals. SystemC is often used to create executable models of an entire hardware platform, so-called Virtual Prototypes (VPs). Using these SystemC peripherals models, SymEx-VP allows injecting test inputs into the software simulation via the MMIO peripheral interface. Thereby allowing tests of embedded software with only minimal software modifications.
Based on injected inputs, SymEx-VP allows symbolic execution (or more specifically concolic testing) of RISC-V RV32IMAC machine code. Branches based on injected symbolic values are tracked and as soon as execution terminates new assignments for symbolic variables are determined by negating encountered branch conditions (Dynamic Symbolic Execution). For each new assignment, the software simulation is restarted from the beginning, thereby (ideally) enabling exploration of all paths through the program based on the introduced symbolic variables.
SymEx-VP is implemented on top of the existing riscv-vp codebase and integrates this existing VP with the clover concolic testing library. More details on SymEx-VP are provided in the SymEx-VP paper.
- Concolic execution of RISC-V RV32IMAC machine code
- Accurate hardware peripheral modeling via the SystemC C++ library
- Support for injecting concolic test inputs via the MMIO peripheral interface
- Support for generating test case files and replaying them (see
SYMEX_TESTCASE
below) - Integrated GDB stub to ease debugging of encountered errors (
--debug-mode
) - Support for many embedded operating systems (e.g. RIOT, Zephyr, …)
This repository makes use of submodules to include vendored dependencies. In order to automatically checkout these submodules clone the repository as follows:
$ git clone --recursive https://github.com/agra-uni-bremen/symex-vp
Alternatively, if you already cloned the repository without passing the
--recursive
option run the following command to checkout all submodules:
$ git submodule update --init
This software can be installed either using Docker (recommended) or manually via GNU make.
To build a Docker image for SymEx-VP run the following command:
$ docker build -t riscv-symex .
Afterwards, create a new Docker container from this image using:
$ docker run --rm -it riscv-symex
The SymEx-VP source directory is then available in
/home/riscv-vp/riscv-vp
within the container. Provided VPs are
automatically added to $PATH
. For this reason, the examples provided
in /home/riscv-vp/riscv-vp/examples
(see below) can easily be executed
within the provided container.
Manual installation requires the following software to be installed:
- A C++ compiler toolchain with C++17 support
- CMake
- A recent version of Z3 (
4.8.X
is known to work) - LLVM
- Boost
After all dependencies have been installed run:
$ make
Executable binaries are then available in ./vp/build/bin
.
In regards to using SymEx-VP for software execution it behaves like a
normal virtual prototype and should be able to execute any RV32IMAC
binaries. In order to utilize the symbolic execution features provided
by SymEx-VP, the following additional aspects have to be consider for
software testing. Communication between software and the virtual
prototype is achieved through memory-mapped IO with a provided
SymbolicCTRL
peripheral.
- Symbolic Inputs: In order to explore different paths through the
program, inputs based on which path exploration should take place
need to be marked as symbolic. Symbolic variables can be introduced
through SystemC peripherals which return symbolic variables through
a TLM 2.0 extension as part of memory-mapped I/O. For example, it is
possible to model an Ethernet peripheral using SystemC which returns
symbolic variables for network packets and thereby allows exploring
paths through the network stack of the executed software.
Alternatively, it is also possible to declare variables as symbolic
manually through the aforementioned
SymbolicCTRL
peripheral. - Termination Points: Since SymEx-VP restarts the entire SystemC
simulation for each new assignment of symbolic input variables,
termination points need to be defined for the simulated software. For
example, when exploring the network stack of the executed software,
it may be sufficient to terminate software simulation as soon as the
symbolic network packet was handled by the network stack. Termination
points must presently be declared by the executed software by writing
to a control register in the memory-mapped
SymbolicCTRL
peripheral. - Path Analyzers: In order to find errors using symbolic execution,
a so-called path analyzer needs to be employed on each executed path.
For each executed path, the employed path analyzer needs to decide if
the path constitutes an error case. For example, errors may include
violation of spatial memory safety,
stack overflows, et cetera. Presently, we mostly rely on
the executed software to signal an error condition to the virtual
prototype using the
SymbolicCTRL
peripheral. For example, this allows signaling errors from software-specific panic handlers.
Usage examples which demonstrate the three aspects mentioned in the
previous section can be found in the ./examples
subdirectory. For
instance, the ./examples/zig-out-of-bounds
demonstrates the discovery
of out-of-bounds array accesses using SymEx-VP. More information on
individual example applications is available in the README.md
file in
the ./examples
subdirectory.
The provided Docker image already contains a correctly configured RISC-V
cross toolchain. For example, to run the assertion-failure
example
simply execute the following command inside the container:
$ make -C examples/assertion-failure/ sim
More complex usage example, e.g. for the integration with embedded operating systems, are provided as part of the evaluation artifacts for the SymEx-VP overview paper. Furthermore, simple test harnesses for using SymEx-VP in conjunction with the RIOT operating systems are provided in the symex-vp-riot repository.
The following virtual prototypes are available:
symex-vp
: A very minimal virtual prototype, based ontiny32-vp
from the original riscv-vp repository. This allows testing very basic bare-metal RV32 software.hifive-vp
: A virtual prototype mostly compatible with the SiFive HiFive1. This allows executing software for embedded operating systems like RIOT or Zephyr symbolically.
Other platforms from riscv-vp can be trivially ported.
The following environment variables can be set:
- SYMEX_ERREXIT: If set, terminate after encountering the first error during symbolic execution of the software.
- SYMEX_TIMEOUT: This can be used to configure the solver timeout of the employed SMT solver, by default no timeout is used. The timeout is given in seconds.
- SYMEX_TIMEBUDGET: Time budget (in seconds) for path exploration, by default the software is explored until all branches have been negated.
- SYMEX_TESTCASE: This environment variable can point to a test case
file for replaying inputs causing an error. This is most useful in
conjunction with
--debug-mode
. - SYMEX_DUMPALL: Create test case files for all generated paths, by default test case files are only generated for paths with an error condition.
The concepts behind SymEx-VP are further described in the following publication:
@article{tempel2022symex,
title = {{SymEx-VP: An open source virtual prototype for OS-agnostic concolic testing of IoT firmware}},
journal = {Journal of Systems Architecture},
pages = {12},
year = {2022},
issn = {1383-7621},
doi = {10.1016/j.sysarc.2022.102456},
author = {Sören Tempel and Vladimir Herdt and Rolf Drechsler},
}
The artifacts for this publication are also available on Code Ocean.
This work was supported in part by the German Federal Ministry of Education and Research (BMBF) within the project Scale4Edge under contract no. 16ME0127 and within the project VerSys under contract no. 01IW19001.
The original riscv-vp code is licensed under MIT (see LICENSE.MIT
).
All modifications made for the integration of symbolic execution with
riscv-vp are licensed under GPLv3+ (see LICENSE.GPL
). Consult the
copyright headers of individual files for more information.