feat(v0.4-0.7): Multi-assistant provenance + semantic blame + evidence packs #4

Merged
wolfiesch merged 3 commits into master from feat/multi-assistant-v0.7
Jan 11, 2026
Conversation

@wolfiesch
Owner

Summary

Major feature release spanning v0.4 through v0.7, adding:

  • Multi-Assistant Support (v0.7) - Track file changes from OpenAI Codex CLI alongside Claude Code events
  • Semantic Blame (v0.4) - Fingerprint-based code attribution with confidence tiers
  • Intent Extraction (v0.5) - Show "why" code was written from conversation history
  • Reliability (v0.6) - Log rotation, DB maintenance, real-time timeline

Key Features

| Version | Feature | Impact |
|---------|---------|--------|
| v0.7 | Codex CLI integration | Unified provenance across AI assistants |
| v0.6 | `diachron maintenance` | Prevents DB bloat, optimizes queries |
| v0.6 | `--watch` mode | Real-time timeline for live events |
| v0.5 | Intent extraction | Blame shows the user's original request |
| v0.4 | Evidence packs | JSON/Markdown export for PR narratives |
| v0.4 | Hash chain | Tamper detection for audit trails |

Files Changed

  • 44 files, +7866 lines
  • New Rust crate: codex-wrapper/
  • New Python: codex_capture.py, test_codex_capture.py
  • New docs: IPC-API.md, github-action/
  • Core modules: hash_chain.rs, fingerprint.rs, pr_correlation.rs, evidence_pack.rs

Test plan

  • 51 Rust tests passing
  • 12 Python Codex capture tests passing
  • Real Codex execution test (hello.py creation captured)
  • Daemon IPC verified working
  • Manual: `diachron timeline --watch` shows live events
  • Manual: `diachron maintenance` runs without errors

🤖 Generated with Claude Code

…e packs

## v0.7 - Multi-Assistant Support (Codex CLI)
- Add codex_capture.py for parsing Codex JSONL session logs
- Add diachron-codex Rust wrapper binary for standalone usage
- Integrate capture into /handoffcodex and /handoffcodex-full skills
- Support both old (custom_tool_call) and new (exec_command) Codex formats
- 12 Python tests, 3 Rust tests passing

## v0.6 - Reliability & Developer UX
- Add log rotation with tracing-appender (daily rolling)
- Add `diachron maintenance` command (VACUUM, ANALYZE, prune)
- Add `diachron timeline --watch` for real-time events
- Create IPC-API.md for community integrations
- Fix OpenAI→Anthropic references in docs

## v0.5 - Intent Extraction
- Extract user intent from conversation history for blame
- Multi-factor relevance scoring (+3 file, +2 tool, +1 branch)
- 9 new intent extraction tests (51 total)
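
The relevance weights above can be sketched as a small scorer. The function and event shape below are hypothetical illustrations of the scoring scheme, not Diachron's actual API:

```python
def relevance_score(event: dict, changed_file: str, tool: str, branch: str) -> int:
    """Hypothetical multi-factor relevance scorer mirroring the weights above:
    +3 if the conversation event mentions the changed file,
    +2 if it references the same tool, +1 if it names the branch."""
    text = event.get("text", "")
    score = 0
    if changed_file in text:
        score += 3
    if tool in text:
        score += 2
    if branch in text:
        score += 1
    return score

event = {"text": "please edit lib/codex_capture.py using the Edit tool"}
print(relevance_score(event, "lib/codex_capture.py", "Edit", "feat/multi-assistant-v0.7"))  # → 5
```

The highest-scoring conversation turn would then be surfaced as the "why" for a blamed line.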

## v0.4 - Semantic Blame & Evidence Packs
- Add fingerprint-based blame (content_hash, context_hash)
- Add PR correlation (events→commits→PRs)
- Add evidence pack generation with hash chain verification
- Add GitHub Action template for automated PR comments
- Four-tier confidence matching (HIGH/MEDIUM/LOW/INFERRED)
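
A minimal sketch of how such a tamper-evident chain works: each event's hash covers its payload plus the previous hash, seeded by a genesis constant. Field names and the genesis value here are illustrative assumptions, not Diachron's actual schema:

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # illustrative genesis constant

def chain_hash(prev_hash: str, event: dict) -> str:
    """Hash an event's payload together with the previous link's hash."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def verify(events: list) -> bool:
    """Recompute every link; a tampered event breaks its own hash check."""
    prev = GENESIS_HASH
    for e in events:
        body = {k: v for k, v in e.items() if k != "hash"}
        if e["hash"] != chain_hash(prev, body):
            return False
        prev = e["hash"]
    return True

events, prev = [], GENESIS_HASH
for op in ("create", "edit"):
    body = {"op": op}
    h = chain_hash(prev, body)
    events.append({**body, "hash": h})
    prev = h

print(verify(events))        # intact chain → True
events[0]["op"] = "delete"   # tamper with the first event
print(verify(events))        # → False
```

Because each stored hash binds the previous one, rewriting history requires recomputing every downstream link, which checkpointing can detect.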

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 11, 2026 16:14
@gemini-code-assist

Summary of Changes

Hello @wolfiesch, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a suite of major features aimed at significantly enhancing the system's ability to track, attribute, and verify code changes made by AI assistants. It expands compatibility to multiple AI tools, introduces robust mechanisms for code provenance and tamper detection, and provides tools for generating detailed narratives of AI-assisted development. These changes collectively improve the transparency, auditability, and overall reliability of AI-driven coding workflows.

Highlights

  • Multi-Assistant Support: Introduced support for tracking file changes from OpenAI Codex CLI, unifying provenance tracking across different AI assistants alongside Claude Code events.
  • Semantic Blame: Implemented fingerprint-based code attribution with confidence tiers, allowing users to identify which AI session wrote specific lines of code.
  • Evidence Packs: Added functionality to generate JSON and Markdown evidence packs, facilitating the creation of PR narratives that detail AI contributions.
  • Hash Chain Integrity: Integrated a SHA256 hash chain for tamper detection across all events, enhancing the auditability and trustworthiness of the event log.
  • Improved Reliability & Performance: Enhanced system reliability with features like log rotation, database maintenance commands, and a real-time timeline --watch mode for live event monitoring.
  • IPC API Documentation: New comprehensive documentation for the Inter-Process Communication (IPC) API has been added, enabling custom integrations and hooks for other AI tools.
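
The socket path and newline-delimited JSON framing below come from the `nc -U` examples elsewhere in this PR; the helper names and the `Ping` message type are hypothetical, so check docs/IPC-API.md for the actual message vocabulary:

```python
import json
import os
import socket

SOCKET_PATH = os.path.expanduser("~/.diachron/diachron.sock")

def encode_message(msg_type: str, payload: dict) -> bytes:
    """Serialize a daemon message as newline-delimited JSON (an assumption
    based on the `nc -U` examples in this PR)."""
    return (json.dumps({"type": msg_type, "payload": payload}) + "\n").encode()

def send_message(msg_type: str, payload: dict) -> str:
    """Send one message to the daemon over its Unix socket, return one reply line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(SOCKET_PATH)
        s.sendall(encode_message(msg_type, payload))
        return s.makefile().readline()

# The pure encoding step can be checked without a running daemon:
print(encode_message("Ping", {}).decode().strip())  # → {"type": "Ping", "payload": {}}
```

Keeping the encoding pure makes it easy to unit-test integrations without the daemon present.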



Copilot AI left a comment


Pull request overview

This is a major feature release spanning versions 0.4-0.7, introducing multi-assistant provenance tracking, semantic blame, intent extraction, and reliability improvements. The PR adds comprehensive features for tracking AI-generated code changes across multiple assistants (Claude Code, OpenAI Codex CLI), with tamper-detection via hash chains and exportable evidence packs for PR narratives.

Changes:

  • Multi-assistant support with Codex CLI integration via Rust wrapper and Python capture module
  • Semantic blame using content fingerprinting (SHA256 hashes + semantic similarity)
  • Intent extraction from conversation history to show "why" code was written
  • Database maintenance commands (VACUUM, ANALYZE, pruning), log rotation, and real-time timeline watch mode
  • Evidence pack generation for PR correlation with JSON/Markdown export
  • Hash chain integrity verification for tamper detection
  • Search result caching and parallel FTS/vector search
  • GitHub Action for automated PR narrative posting

Reviewed changes

Copilot reviewed 44 out of 44 changed files in this pull request and generated 4 comments.

| File | Description |
|------|-------------|
| rust/tests/integration_tests.rs | Comprehensive integration tests for hash chain, PR correlation, evidence packs, and fingerprinting |
| rust/daemon/src/main.rs | Added search cache, log rotation with daily file appender, test helper methods |
| rust/daemon/src/handlers.rs | New handlers for maintenance, fingerprint blame, PR evidence correlation; parallelized hybrid search with caching |
| rust/daemon/src/db.rs | Hash chain integration in event insertion, maintenance operations, intent extraction queries, read-only connections |
| rust/daemon/src/cache.rs | LRU cache implementation for search results with database version tracking |
| rust/core/src/types.rs | New IPC message types for maintenance, blame, and evidence correlation |
| rust/core/src/schema.rs | Schema v4 migration adding hash chain and fingerprint columns |
| rust/core/src/pr_correlation.rs | PR-to-commit event correlation with confidence levels (HIGH/MEDIUM/LOW) |
| rust/core/src/hash_chain.rs | SHA256 hash chain implementation with GENESIS_HASH and checkpoint support |
| rust/core/src/fingerprint.rs | Content-based fingerprinting for stable blame across refactors |
| rust/core/src/evidence_pack.rs | Evidence pack generation and Markdown rendering for PR narratives |
| rust/codex-wrapper/src/main.rs | Standalone Rust wrapper for OpenAI Codex CLI capturing file operations |
| rust/cli/src/main.rs | New commands: verify, maintenance, blame, export-evidence, pr-comment; timeline watch mode |
| lib/codex_capture.py | Python module for parsing Codex JSONL sessions and sending events to daemon |
| lib/test_codex_capture.py | Comprehensive pytest tests for Codex capture functionality |
| github-action/ | TypeScript GitHub Action for posting evidence to PR comments |
| docs/IPC-API.md | Complete IPC API documentation for daemon integration |



```python
import argparse
import json
import os
```

Copilot AI Jan 11, 2026

Import of `os` is not used.

Suggested change:

```diff
-import os
```

Copilot uses AI. Check for mistakes.
```python
import re
import socket
import sys
from datetime import datetime
```

Copilot AI Jan 11, 2026

Import of `datetime` is not used.

Suggested change:

```diff
-from datetime import datetime
```

```python
            "timestamp": timestamp,
            "raw_input": cmd,
        })
    except json.JSONDecodeError:
```

Copilot AI Jan 11, 2026

`except` clause does nothing but `pass` and there is no explanatory comment.

Suggested change:

```diff
     except json.JSONDecodeError:
+        # Ignore malformed arguments for this event and continue processing other log entries.
```

docs/IPC-API.md Outdated
Comment on lines 539 to 546

```yaml
run: |
  echo '{"type":"CorrelateEvidence","payload":{
    "pr_id": ${{ github.event.pull_request.number }},
    "commits": ${{ toJson(github.event.pull_request.commits) }},
    "branch": "${{ github.head_ref }}",
    "start_time": "2026-01-01T00:00:00Z",
    "end_time": "2026-01-11T23:59:59Z"
  }}' | nc -U ~/.diachron/diachron.sock > evidence.json
```

Copilot AI Jan 11, 2026

The GitHub Actions example constructs a shell command with `github.head_ref` interpolated directly inside a single-quoted `echo` string that is then piped to `nc`. Because branch names on GitHub can contain characters like single quotes and are attacker-controlled for forked PRs, a malicious branch name can break out of the quoted string and inject arbitrary shell commands on the Actions runner. Build the JSON payload without unescaped string interpolation (e.g., using a safer JSON construction mechanism or proper shell escaping) so that `github.head_ref` and other dynamic values cannot alter the shell command structure.

Suggested change:

```yaml
env:
  PR_ID: ${{ github.event.pull_request.number }}
  COMMITS: ${{ toJson(github.event.pull_request.commits) }}
  BRANCH: ${{ github.head_ref }}
run: |
  PAYLOAD=$(python - <<'PY'
  import json, os, sys
  pr_id = int(os.environ["PR_ID"])
  commits = json.loads(os.environ["COMMITS"])
  branch = os.environ["BRANCH"]
  payload = {
      "type": "CorrelateEvidence",
      "payload": {
          "pr_id": pr_id,
          "commits": commits,
          "branch": branch,
          "start_time": "2026-01-01T00:00:00Z",
          "end_time": "2026-01-11T23:59:59Z",
      },
  }
  sys.stdout.write(json.dumps(payload))
  PY
  )
  printf '%s\n' "$PAYLOAD" | nc -U ~/.diachron/diachron.sock > evidence.json
```


@gemini-code-assist bot left a comment


Code Review

This is a massive and impressive pull request that introduces a suite of powerful features for provenance, including multi-assistant support, semantic blame, and evidence packs. The implementation is robust, well-documented, and thoroughly tested. The addition of hash-chain tamper evidence, content fingerprinting, and a detailed IPC API are particularly noteworthy. The new benchmark scripts and GitHub Action are also great additions. My review has identified a few areas for improvement, primarily concerning a potential race condition in the Codex wrapper, some minor issues in the benchmark scripts and documentation, and some code duplication in the markdown rendering logic. Overall, this is an excellent contribution that significantly enhances the capabilities of the project.

```rust
}

/// Find the most recent Codex session JSONL file
fn find_latest_session() -> Option<PathBuf> {
```


**high**

The find_latest_session function finds the most recently modified session file. If multiple codex commands are run concurrently or in quick succession, this could lead to a race condition where the wrapper captures events from the wrong session. To make this more robust, you could have codex output the session file path and pass it to the wrapper, or use a more specific identifier than just 'latest' to associate the execution with its corresponding log file.
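
A sketch of the two strategies in Python (the Rust wrapper's actual logic may differ): fall back to newest-by-mtime, but prefer an explicitly supplied session path, which avoids the race entirely:

```python
import os
import tempfile
from pathlib import Path

def find_session(log_dir: Path, explicit: str = None):
    """Prefer an explicitly supplied session file; otherwise fall back to the
    most recently modified *.jsonl, which is racy under concurrent runs."""
    if explicit:
        return Path(explicit)
    candidates = sorted(log_dir.glob("*.jsonl"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None

with tempfile.TemporaryDirectory() as d:
    log_dir = Path(d)
    (log_dir / "a.jsonl").write_text("{}")
    (log_dir / "b.jsonl").write_text("{}")
    os.utime(log_dir / "a.jsonl", (100, 100))  # older session
    os.utime(log_dir / "b.jsonl", (200, 200))  # newer session
    print(find_session(log_dir).name)                                     # → b.jsonl
    print(find_session(log_dir, explicit=str(log_dir / "a.jsonl")).name)  # → a.jsonl
```

Having `codex` emit its own session path (or a unique session ID) and passing it through makes the explicit branch the normal path.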

Comment on lines 28 to 33
```bash
time_ms() {
    local start=$(python3 -c "import time; print(int(time.time() * 1000))")
    eval "$@" >/dev/null 2>&1
    local end=$(python3 -c "import time; print(int(time.time() * 1000))")
    echo $((end - start))
}
```


**medium**

The time_ms function uses eval, which can be a security risk if the command arguments are not properly sanitized. While it seems safe with the current usage in this script, it's a best practice to avoid eval. Consider rewriting the function to execute the command directly and updating the call sites.

Suggested change:

```diff
 time_ms() {
     local start=$(python3 -c "import time; print(int(time.time() * 1000))")
-    eval "$@" >/dev/null 2>&1
+    "$@" >/dev/null 2>&1
     local end=$(python3 -c "import time; print(int(time.time() * 1000))")
     echo $((end - start))
 }
```


```bash
# Calculate improvements
if [[ "$DIACHRON_COLD_START" =~ ^[0-9]+$ ]] && [[ "$EPISODIC_COLD_START" == "2500-3500" ]]; then
    COLD_IMPROVEMENT=$(echo "scale=0; 3000 / $DIACHRON_COLD_START" | bc)
```


**medium**

The calculation for COLD_IMPROVEMENT could result in a division-by-zero error if $DIACHRON_COLD_START is 0. You've handled this for SEARCH_IMPROVEMENT on line 260. I recommend applying a similar safeguard here to prevent the script from failing.

Suggested change:

```diff
-COLD_IMPROVEMENT=$(echo "scale=0; 3000 / $DIACHRON_COLD_START" | bc)
+COLD_IMPROVEMENT=$(echo "scale=0; 3000 / $DIACHRON_COLD_START" | bc 2>/dev/null || echo "N/A")
```

Comment on lines 39 to 40
```
| Exchanges | 284288
284729 | ~230K |
```


**medium**

There appears to be a formatting error in the 'Index Statistics' table. The value for 'Exchanges' is split across two lines, which breaks the table rendering. This seems to be a copy-paste or generation error.

Comment on lines +82 to +115
```bash
get_percentiles() {
    local path="$1"
    python3 - "$path" <<'PY'
import json
import math
import sys
from pathlib import Path

path = Path(sys.argv[1])
if not path.exists() or path.stat().st_size == 0:
    print("N/A N/A N/A")
    raise SystemExit(0)

data = json.loads(path.read_text())
times = data.get("results", [{}])[0].get("times", [])
if not times:
    print("N/A N/A N/A")
    raise SystemExit(0)

def pct(vals, p):
    vals = sorted(vals)
    k = (len(vals) - 1) * (p / 100)
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        return vals[int(k)]
    return vals[f] + (vals[c] - vals[f]) * (k - f)

p50 = pct(times, 50) * 1000
p95 = pct(times, 95) * 1000
p99 = pct(times, 99) * 1000
print(f"{p50:.1f} {p95:.1f} {p99:.1f}")
PY
}
```


**medium**

The get_percentiles function contains a fairly large embedded Python script. For better readability, maintainability, and to leverage editor features like syntax highlighting, consider moving this script to its own file (e.g., scripts/calculate_percentiles.py) and calling it from the shell script.

### Python Example

```python
import socket
```


**medium**

The Python example for `send_message` uses `os.path.expanduser` but `os` is not imported. Please add `import os` to the example to make it runnable.

Suggested change:

```diff
 import socket
+import json
+import os
```

docs/IPC-API.md Outdated
Comment on lines 541 to 542
```yaml
"pr_id": ${{ github.event.pull_request.number }},
"commits": ${{ toJson(github.event.pull_request.commits) }},
```


**medium**

The example for generating an evidence pack in a CI/CD pipeline incorrectly uses `github.event.pull_request.commits`, which is a number representing the count of commits, not a list of commit SHAs. This will cause the IPC call to fail. The action should instead gather the list of commit SHAs from the PR context, for example via the GitHub API or `git log`.
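
A sketch of the corrected approach in Python; it assumes `gh pr view --json commits` returns commit objects with an `oid` field (per the GitHub CLI's documented JSON fields) — verify against your `gh` version:

```python
import json
import subprocess

def pr_commit_shas(pr_number: int) -> list:
    """Fetch the PR's commit SHAs (not the count) via the GitHub CLI.
    Assumes `gh` is installed and authenticated."""
    out = subprocess.run(
        ["gh", "pr", "view", str(pr_number), "--json", "commits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [c["oid"] for c in json.loads(out)["commits"]]

# The parsing step, demonstrated on a literal sample of the assumed JSON shape:
sample = '{"commits": [{"oid": "a99b840"}, {"oid": "b12c345"}]}'
print([c["oid"] for c in json.loads(sample)["commits"]])  # → ['a99b840', 'b12c345']
```

The resulting list of SHAs is what the `CorrelateEvidence` payload's `commits` field expects.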

Comment on lines +1353 to +1478
```rust
Commands::PrComment { pr, evidence } => {
    println!("Posting PR narrative comment...\n");

    // Read evidence pack
    let evidence_content = std::fs::read_to_string(&evidence)
        .context("Failed to read evidence file")?;

    let pack: serde_json::Value = serde_json::from_str(&evidence_content)
        .context("Failed to parse evidence JSON")?;

    // Build markdown narrative
    let mut md = String::new();

    // Header
    md.push_str(&format!(
        "## PR #{}: AI Provenance Evidence\n\n",
        pack["pr_id"].as_u64().unwrap_or(pr)
    ));

    // Intent section (if available)
    if let Some(intent) = pack["intent"].as_str() {
        if !intent.is_empty() {
            md.push_str("### Intent\n");
            md.push_str(&format!("> {}\n\n", intent));
        }
    }

    // Summary section
    md.push_str("### What Changed\n");
    md.push_str(&format!(
        "- **Files modified**: {}\n",
        pack["summary"]["files_changed"].as_u64().unwrap_or(0)
    ));
    md.push_str(&format!(
        "- **Lines**: +{} / -{}\n",
        pack["summary"]["lines_added"].as_u64().unwrap_or(0),
        pack["summary"]["lines_removed"].as_u64().unwrap_or(0)
    ));
    md.push_str(&format!(
        "- **Tool operations**: {}\n",
        pack["summary"]["tool_operations"].as_u64().unwrap_or(0)
    ));
    md.push_str(&format!(
        "- **Sessions**: {}\n\n",
        pack["summary"]["sessions"].as_u64().unwrap_or(0)
    ));

    // Evidence trail section
    md.push_str("### Evidence Trail\n");
    let coverage = pack["coverage_pct"].as_f64().unwrap_or(0.0);
    let unmatched = pack["unmatched_count"].as_u64().unwrap_or(0);
    md.push_str(&format!("- **Coverage**: {:.1}% of events matched to commits", coverage));
    if unmatched > 0 {
        md.push_str(&format!(" ({} unmatched)", unmatched));
    }
    md.push_str("\n");

    // List commits with their events
    if let Some(commits) = pack["commits"].as_array() {
        for commit in commits {
            let sha = commit["sha"].as_str().unwrap_or("");
            let sha_short = &sha[..7.min(sha.len())];
            let confidence = commit["confidence"].as_str().unwrap_or("LOW");

            md.push_str(&format!("\n**Commit `{}`**", sha_short));
            if let Some(msg) = commit["message"].as_str() {
                let first_line = msg.lines().next().unwrap_or(msg);
                md.push_str(&format!(": {}", first_line));
            }
            md.push_str(&format!(" ({})\n", confidence));

            if let Some(events) = commit["events"].as_array() {
                for event in events.iter().take(5) {
                    let tool = event["tool_name"].as_str().unwrap_or("-");
                    let file = event["file_path"].as_str().unwrap_or("-");
                    let op = event["operation"].as_str().unwrap_or("-");
                    md.push_str(&format!(" - `{}` {} → {}\n", tool, op, file));
                }
                if events.len() > 5 {
                    md.push_str(&format!(" - *...and {} more*\n", events.len() - 5));
                }
            }
        }
    }
    md.push_str("\n");

    // Verification section
    md.push_str("### Verification\n");
    md.push_str(&format!(
        "- [{}] Hash chain integrity\n",
        if pack["verification"]["chain_verified"].as_bool().unwrap_or(false) { "x" } else { " " }
    ));
    md.push_str(&format!(
        "- [{}] Tests executed after changes\n",
        if pack["verification"]["tests_executed"].as_bool().unwrap_or(false) { "x" } else { " " }
    ));
    md.push_str(&format!(
        "- [{}] Build succeeded\n",
        if pack["verification"]["build_succeeded"].as_bool().unwrap_or(false) { "x" } else { " " }
    ));
    md.push_str(&format!(
        "- [{}] Human review\n\n",
        if pack["verification"]["human_reviewed"].as_bool().unwrap_or(false) { "x" } else { " " }
    ));

    // Footer
    md.push_str(&format!(
        "---\n*Generated by [Diachron](https://github.com/wolfiesch/diachron) v{} at {}*\n",
        pack["diachron_version"].as_str().unwrap_or(env!("CARGO_PKG_VERSION")),
        pack["generated_at"].as_str().unwrap_or("unknown")
    ));

    // Post via gh CLI
    let status = std::process::Command::new("gh")
        .args(["pr", "comment", &pr.to_string(), "-b", &md])
        .status()
        .context("Failed to run gh CLI")?;

    if status.success() {
        println!("✅ PR comment posted successfully");
        println!("\nPosted content:\n{}", md);
    } else {
        eprintln!("Failed to post PR comment (gh exit code: {:?})", status.code());
        std::process::exit(1);
    }
}
```


**medium**

The logic to render the evidence pack into a markdown comment is duplicated here, in rust/core/src/evidence_pack.rs, and in the TypeScript code for the GitHub Action (github-action/src/index.ts). This creates a maintenance burden, as any change to the markdown format needs to be updated in multiple places. Consider centralizing this logic. For example, the daemon could have an IPC endpoint that takes an evidence pack and returns the rendered markdown, which both the CLI and the GitHub Action could use.

wolfiesch and others added 2 commits January 11, 2026 08:23
- Fix CI: dtolnay/rust-action → dtolnay/rust-toolchain (missing action)
- Fix security: Shell injection in IPC-API.md CI example (use Python for safe JSON)
- Fix docs: Add missing `import os` to Python example
- Fix docs: Use `gh pr view` to get commit SHAs (not count)
- Fix Python: Remove unused os/datetime imports, add comment to except clause
- Fix benchmarks: Replace eval with direct execution, add div-by-zero guard
- Fix benchmarks: Repair broken table formatting in results markdown
- Add note about race condition limitation in Rust wrapper

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions

⚡ Benchmark Results

| Metric | Value | Threshold | Status |
|--------|-------|-----------|--------|
| CLI cold start | 11.32ms | 50ms | ✅ |
| Daemon IPC | 19.81ms | 20ms | ✅ |
| Hook capture | 5.98ms | 20ms | ✅ |

@wolfiesch wolfiesch merged commit a99b840 into master Jan 11, 2026
1 check passed
@wolfiesch wolfiesch deleted the feat/multi-assistant-v0.7 branch January 11, 2026 16:42