This repository contains the implementation for CoIn. The code is organized by stages as follows:
Scripts for data preprocessing. A subset of data is sampled from HuggingFace datasets.
Constructs training and evaluation data for the two matching heads: Tokens-to-Block (Tokens2Block) and Block-to-Answer (Block2Answer).
Implements hash tree construction, verification, and profiling.
Training and evaluation code for the Block-to-Answer Verification component.
Training and evaluation code for the Tokens-to-Block Verification component.
Contains scripts to train the learning-based verifier used in the CoIn pipeline.
Implements the full CoIn auditing workflow. This is the core component of the project.
Code used in the discussion section of the paper. Requires a local deployment of Qwen2.5-72B-Instruct served via vLLM.
Evaluation datasets used in the CoIn pipeline. Due to file size constraints, only the OpenR1-Math-220k samples with block size 256 are provided here, with 10 samples per category. More comprehensive data will be released after paper acceptance.
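For readers unfamiliar with the hash tree stage listed above, the following is a minimal, generic sketch of hash tree (Merkle tree) construction and proof verification. It is an illustration only and not the repository's implementation: the use of SHA-256, byte-string leaves, and the function names `build_tree`, `merkle_proof`, and `verify` are all assumptions made for this example, and the leaf encoding used by CoIn may differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption made for this sketch)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all levels of the tree, from leaf hashes up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last node with itself.
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hashes needed to recompute the root for one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # node was paired with itself on an odd level
        sibling_is_right = index % 2 == 0
        proof.append((level[sibling], sibling_is_right))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof, then compare."""
    node = _h(leaf)
    for sibling, sibling_is_right in proof:
        node = _h(node + sibling) if sibling_is_right else _h(sibling + node)
    return node == root


if __name__ == "__main__":
    leaves = [b"token-0", b"token-1", b"token-2"]  # hypothetical leaf data
    levels = build_tree(leaves)
    root = levels[-1][0]
    proof = merkle_proof(levels, 2)
    print(verify(leaves[2], proof, root))
    print(verify(b"tampered", proof, root))
```

In this sketch a verifier only needs the root and a logarithmic number of sibling hashes to check that a given leaf was part of the committed sequence, which is the general property a hash tree provides.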
All experiments in this project use a fixed random seed of 42 so that the reported results can be reproduced.
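As a rough illustration of what fixing the seed typically involves, here is a minimal sketch. It assumes the experiments rely on Python's `random` module and, where installed, NumPy and PyTorch; the helper name `set_seed` is hypothetical and is not taken from the repository.

```python
import random


def set_seed(seed: int = 42) -> None:
    """Seed the common random number generators (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np  # optional: only seeded if NumPy is installed
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional: only seeded if PyTorch is installed
        torch.manual_seed(seed)
    except ImportError:
        pass


if __name__ == "__main__":
    set_seed(42)
    first = random.random()
    set_seed(42)
    second = random.random()
    print(first == second)
```

Note that seeding alone does not guarantee bit-identical results across different hardware or library versions, since some GPU operations are nondeterministic by default.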