Skip to content
View ConcoLLMic's full-sized avatar

Block or report ConcoLLMic

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don't include any personal information such as legal names or email addresses. Markdown supported. This note will be visible to only you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
ConcoLLMic/README.md

ConcoLLMic: Agentic Concolic Execution

License: GPL v3 License: Commercial

Paper: IEEE S&P 2026


ConcoLLMic is the first language- and theory-agnostic concolic executor powered by LLM agents. Unlike traditional symbolic execution tools that require language-specific implementations and struggle with constraint solving, ConcoLLMic:

  • Works with any programming language and environment interactions — supports C, C++, Python, Java, ..., and even multi-language systems; requires no additional environment modeling and handles them efficiently.
  • Handles diverse constraint theories — including floating-point arithmetic, strings, structured data, and bit-level operations.
  • Achieves superior coverage — in ~4 hours, ConcoLLMic covers 115%-233% more branches than state-of-the-art tools like KLEE and 81% more than AFL++ running for 48 hours.
  • Finds real bugs — has already discovered 10+ previously unknown vulnerabilities in real-world softwares.

How it works: ConcoLLMic uses LLM agents to

  1. instrument code at source level with logging statements like fprintf(stderr, ...);
  2. summarize path constraints flexibly in natural language, code, or SMT formulas;
  3. solve constraints intelligently — directly reasoning constraints over semantics and delegating to grounded solvers when needed.

This eliminates the need for hand-crafted language-specific symbolic interpreters and environment modeling as well as expensive constraint solving.



Table of Contents


Workflow Example

Phase 1: Instrumentation

Instrumentation of ConcoLLMic

Example instrumentation of src/count.c from FP-Bench, by inserting fprintf at key points in the execution. Output in stderr allows us to reconstruct the dynamic execution path.

Phase 2: Concolic Testing

Testing Phase of ConcoLLMic

This shows one round of concolic execution with ConcoLLMic. Compared to conventional tools with verbose, implementation-level formulas:

start = atof(I[0]) ∧ end = atof(I[1]) ∧ start < end ∧ cur[0] = start ∧ 
(∀ 0 ≤ i < cnt, cur[i] ≠ end) ∧ cur[cnt] = end ∧ cnt ≤ 20

ConcoLLMic summarizes the path constraints concisely using high-level natural-language:

The number of representable FP values in the range must be ≤ 20.

Getting Started

Prerequisites

  • Python 3.10 or higher
  • API key for Claude-3.7-Sonnet (from Anthropic)

Installation

  1. Clone the repository
git clone https://github.com/ConcoLLMic/ConcoLLMic.git
cd ConcoLLMic
  1. Install Python dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt
  1. Set up your API key
export ANTHROPIC_API_KEY="your_api_key_here"

By default, ConcoLLMic uses Claude-3.7-Sonnet. You can configure other models in ACE.py (see setup_model()).

Running Your First Example

Let's walk through a complete example using a C program from FP-Bench that counts floating-point values in a range.

Step 1: Instrumentation

ConcoLLMic instruments source code by inserting trace statements. For the provided examples, instrumented code is already available in code_example/instr/.

To instrument from scratch:

rm ./code_example/instr/count.c

python3 ACE.py instrument \
    --src_dir ./code_example/src/ \
    --out_dir ./code_example/instr/ \
    --instr_languages c,cpp,python,java

Note: The --instr_languages flag is only used for filtering files by extension. No additional language-specific support is needed.

Step 2: Compilation

Compile the instrumented code normally:

gcc -o ./code_example/instr/count ./code_example/instr/count.c

Step 3: Concolic Execution

Run the concolic execution agent with a test harness (code_example/harness/count.py):

python3 ACE.py run \
    --project_dir ./code_example/instr/ \
    --execution ./code_example/harness/count.py \
    --out ./out/ \
    --rounds 2 \
    --parallel_num 3

Parameters:

  • --project_dir: Directory containing the instrumented code and the compiled program
  • --execution: Test harness (a Python script that executes the target program and returns stderr + exit code)
  • --out: Output directory for generated test cases and logs
  • --rounds: Number of concolic execution rounds (default: unlimited until coverage plateau)
  • --parallel_num: Maximum number of concurrent test case generations per round

Expected cost: ~$0.40 for 2 rounds with Claude-3.7-Sonnet

Output:

  • ./out/ConcoLLMic_*.log — detailed execution log
  • ./out/queue/id:*.yaml — generated test cases with metadata

Step 4: View Statistics (Optional)

Analyze testing statistics (tokens used, costs):

python3 ACE.py run_data --out_dir ./out/

Step 5: Replay & Coverage (Optional)

To measure code coverage or verify bugs:

  1. Compile the original (non-instrumented) code with coverage:
cd ./code_example/src/
gcc --coverage -o count count.c
lcov -z -d .
cd ../../
  1. Replay generated test cases:
python3 ACE.py replay \
    ./out/ \
    ./code_example/src/ \
    ./coverage.csv \
    --cov_script ./code_example/src/coverage.sh

Output:

  • ./coverage.csv — time-series coverage data for each test case
  • ./code_example/src/count.c.gcov — line-by-line coverage report

The gcov report will show that the bug at line 67 has been triggered.


Command Reference

ConcoLLMic provides a unified CLI interface:

python3 ACE.py <command> [options]

Main Commands

Command Description
instrument Instrument source code for tracing
run Execute concolic testing
replay Replay generated test cases
instrument_data Analyze instrumentation statistics
run_data Display testing statistics and costs

For detailed options, run:

python3 ACE.py <command> --help

Or visit the documentation.


Running Large-Scale Experiments

To reproduce the paper's evaluation results or run ConcoLLMic on real-world benchmarks, see the experiments/README.md.

We provide Docker-based setups for:

  • C/C++ benchmarks
  • Multi-language systems
  • Comparison with other tools — KLEE, KLEE-Pending, AFL++, SymCC, SymSan

License

This project is dual licensed:

  • Non-commercial use: Licensed under GPLv3 for academic and non-commercial purposes
  • Commercial use: Please contact the authors for commercial licensing

For full license details, see LICENSE.

For commercial licensing inquiries, contact:


Citation

@inproceedings{luo2026concollmic,
  title={Agentic Concolic Execution},
  author={Luo, Zhengxiong and Zhao, Huan and Wolff, Dylan and Cadar, Cristian and Roychoudhury, Abhik},
  booktitle={2026 IEEE Symposium on Security and Privacy (S\&P)},
  pages={1--19},
  year={2026},
  organization={IEEE}
}

Pinned Loading

  1. ConcoLLMic ConcoLLMic Public

    ConcoLLMic: the first language- and theory-agonistic concolic execution engine via LLM agents

    Python 77 1