LLM-Auditing-CoIn

This repository contains the implementation for CoIn. The code is organized by stages as follows:

Directory Overview

`0_preprocess/`

Scripts for data preprocessing. A subset of the data is sampled from Hugging Face datasets.

`1_mk_data/`

Constructs training and evaluation data for the two matching heads: Tokens2Block and Block2Answer.

`2_hash_tree/`

Implements hash tree construction, verification, and profiling.

`3_Block2Answer/`

Training and evaluation code for the Block-to-Answer Verification component.

`3_Tokens2Block/`

Training and evaluation code for the Tokens-to-Block Verification component.

`4_train_verifier/`

Contains scripts to train the learning-based verifier used in the CoIn pipeline.

`5_CoIn_pipeline/`

Implements the full CoIn auditing workflow. This is the core component of the project.

`6_discussion/`

Code used in the discussion section of the paper. Requires a local deployment of Qwen2.5-72B-Instruct served via vLLM.

`7_eval_data/`

Evaluation datasets used in the CoIn pipeline. Due to file size constraints, only samples with block size 256 from OpenR1-Math-220k are provided here, with 10 samples per category. More comprehensive data will be released after paper acceptance.
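
As an illustration of what "hash tree construction and verification" in `2_hash_tree/` can look like, here is a minimal Merkle-tree sketch in Python. It is not taken from the repository: the hash function (SHA-256), the leaf contents (raw bytes standing in for per-block fingerprints), the odd-level padding rule, and all function names (`build_tree`, `prove`, `verify`) are assumptions made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (assumed hash; the repository may use another)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Build a Merkle tree and return its levels, leaf hashes first, root last."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Prefix bytes separate leaf hashes from internal-node hashes.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Pad odd levels by duplicating the last node (an assumed convention).
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Return the sibling path for the leaf at `index`.

    Each entry is (sibling_hash, sibling_is_left).
    """
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the other child of the same parent
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, expected_root):
    """Recompute the root from a leaf and its sibling path, then compare."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == expected_root


if __name__ == "__main__":
    blocks = [b"block-0", b"block-1", b"block-2"]  # placeholder leaf data
    levels = build_tree(blocks)
    root = levels[-1][0]
    print(verify(blocks[1], prove(levels, 1), root))    # genuine leaf
    print(verify(b"tampered", prove(levels, 1), root))  # altered leaf
```

With a tree like this, a verifier that holds only the root can check any single leaf using a proof whose length grows logarithmically with the number of leaves.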

Reproducibility

All experiments in this project use a fixed random seed of 42 so that the reported results can be reproduced.
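
For reference, a fixed seed is commonly applied along the following lines. This is a hedged sketch, not the repository's own code: the helper name `set_seed` is hypothetical, and the NumPy and PyTorch calls run only if those libraries are installed.

```python
import os
import random

SEED = 42  # the seed value stated in this README


def set_seed(seed: int = SEED) -> None:
    """Seed the common sources of randomness (hypothetical helper)."""
    random.seed(seed)
    # Affects hash randomization only in Python processes started afterwards.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)  # seeds the CPU and CUDA generators
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


if __name__ == "__main__":
    set_seed()
    first = [random.random() for _ in range(3)]
    set_seed()
    second = [random.random() for _ in range(3)]
    print(first == second)
```

Seeding alone does not guarantee bit-identical results on GPUs, since some kernels are nondeterministic unless deterministic modes are also enabled.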

CASE-Lab-UMD/LLM-Auditing-CoIn

Folders and files

Latest commit

History

Repository files navigation

LLM-Auditing-CoIn

Directory Overview

0_preprocess/

1_mk_data/

2_hash_tree/

3_Block2Answer/

3_Tokens2Block/

4_train_verifier/

5_CoIn_pipeline/

6_discussion/

7_eval_data/

Reproducibility

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

`0_preprocess/`

`1_mk_data/`

`2_hash_tree/`

`3_Block2Answer/`

`3_Tokens2Block/`

`4_train_verifier/`

`5_CoIn_pipeline/`

`6_discussion/`

`7_eval_data/`

Packages