Cross-system process forensics for SAP ERP, Salesforce CRM, and NetSuite
- Multi-system - Analyze SAP, Salesforce, NetSuite, or any combination, correlated together
- Adapter-based - 7 data adapters (SAP RFC, OData, SALT, BPI, CSV, Synthetic, SFDC)
- Pattern detection - Conformance checking, temporal analysis, contradiction detection, cross-system gap analysis
- Evidence-grade - Field-level provenance, SHA-256 replay hashing, self-contained reviewer handoff packets
- Zero risk - Read-only access, no data modification
Full evidence lifecycle from extraction through reviewer handoff, with cryptographic verification at every step.
| Feature | Description |
|---|---|
| Provenance Graph | Field-level DAG tracing every finding to system/table/record/field/value/timestamp |
| Extraction Registry | 19 named, versioned, deterministic extraction paths across SAP, Salesforce, and NetSuite |
| Contradiction Engine | 12-category typed taxonomy with risk scoring and type-specific weights |
| Schema Validator | 19-table IDES reference schema (438 fields) with pre-flight validation and customization detection |
| Reality-Gap Detector | Three-way gap analysis: reference models vs documented business rules vs actual event logs |
| Finding Lifecycle | 8-state machine with SQLite persistence, transition history, and deduplication |
| Reviewer Handoff | Self-contained audit artifacts verifiable without model access |
| 1,639 Tests | 70 test suites, zero regressions |
Finding: AMOUNT_DIVERGENCE on Sales Order 0000045123
Evidence:
Left: SAP.VBAK.0000045123.NETWR = 125,000.00 (extracted 2025-09-15T14:22:00Z)
Right: SFDC.Opportunity.006R00000123.Amount = 118,750.00 (extracted 2025-09-15T14:22:01Z)
Delta: 5.3% ($6,250.00)
Provenance:
Extraction Path: sap-o2c-order-headers v1.0
Replay Hash: sha256:a7f3b2...
State: CONFIRMED → REMEDIATION (transitioned 2025-09-16 by reviewer@corp.com)
Choose your path:

Option 1 - Synthetic data (no SAP connection required):

# Generate synthetic SAP data, run analysis, view results
docker-compose up --build
# Open browser to http://localhost:8080

Option 2 - CSV import (SE16 exports):

# Export from SE16: VBAK, VBAP, LIKP, LIPS, VBRK, VBRP, STXH/STXL
# Place files in ./input-data/
docker-compose run pattern-engine --input-dir /app/input-data --output-dir /app/output

Option 3 - Live RFC connection:

# Copy and edit configuration
cp .env.rfc.example .env.rfc
# Edit .env.rfc with your SAP connection details
# Run with RFC adapter
docker-compose --profile rfc up mcp-server-rfc

See Installation Guide for detailed setup instructions.
# 1. Generate synthetic SFDC data (200 Opportunities, 10 planted anomaly patterns)
cd synthetic-data
python3 src/generate_sfdc.py --count 200 --accounts 50 --output sfdc_output/ --seed 42
# 2. Run the forensic analysis
cd ../pattern-engine
python3 scripts/analyze_sfdc.py
# Or bring your own SFDC export:
# Place Opportunity, Account, StageHistory CSVs in ./data/sfdc/
# python3 scripts/analyze_sfdc.py --data-dir ../data/sfdc

The evidence infrastructure provides a complete chain of custody from raw system data through forensic findings to reviewer-ready audit packets.
Every finding traces back to specific fields in specific records in specific systems through a directed acyclic graph (DAG). Each extraction record captures:
- System - SAP, Salesforce, or NetSuite
- Table - Source table (e.g., VBAK, Opportunity)
- Record ID - Specific document or record
- Field - Individual field name
- Value - Extracted value at time of extraction
- Timestamp - When the extraction occurred
- Replay Hash - SHA-256 hash for independent re-verification
Export formats: DAG JSON (full graph), flat (tabular), Markdown (human-readable).
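As a sketch, one provenance node might be modeled like this (the class and field names are illustrative, not the project's actual types):

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ProvenanceRecord:
    """One field-level extraction: a leaf node in the provenance DAG."""
    system: str      # "SAP", "SFDC", or "NETSUITE"
    table: str       # e.g. "VBAK", "Opportunity"
    record_id: str   # specific document or record key
    field: str       # individual field name
    value: str       # value as extracted at that moment
    timestamp: str   # ISO-8601 extraction time

# The AMOUNT_DIVERGENCE example above, as a left-hand evidence node
rec = ProvenanceRecord("SAP", "VBAK", "0000045123", "NETWR",
                       "125000.00", "2025-09-15T14:22:00Z")
flat = asdict(rec)  # one row of the flat (tabular) export
```

The frozen dataclass mirrors the evidence requirement: once captured, a node is immutable.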
19 named, versioned, deterministic extraction paths ensure reproducible data collection:
| Domain | Path | Description |
|---|---|---|
| SAP O2C | `sap-o2c-order-headers` | Sales order header fields (VBAK) |
| | `sap-o2c-order-items` | Line item details (VBAP) |
| | `sap-o2c-doc-flow` | Document flow chain (VBFA) |
| | `sap-o2c-delivery-timing` | Requested vs actual delivery (LIKP/LIPS) |
| | `sap-o2c-invoice-timing` | Invoice creation and posting (VBRK/VBRP) |
| SAP FI/CO | `sap-fico-journal-entries` | Journal entry headers (BKPF) |
| | `sap-fico-line-items` | Journal line items (BSEG) |
| | `sap-fico-sod-conflicts` | Segregation of duties analysis |
| | `sap-fico-gl-balances` | GL account balances |
| SAP P2P | `sap-p2p-purchase-orders` | Purchase order data (EKKO/EKPO) |
| | `sap-p2p-requisitions` | Purchase requisitions (EBAN) |
| | `sap-p2p-goods-receipts` | Goods receipt documents (MKPF/MSEG) |
| | `sap-p2p-invoice-verification` | Invoice verification (RBKP/RSEG) |
| Salesforce | `sfdc-opportunities` | Opportunity pipeline data |
| | `sfdc-stage-history` | Stage transition history |
| | `sfdc-activities` | Tasks and events on records |
| NetSuite | `netsuite-user-activity` | User activity audit trail |
| | `netsuite-transaction-summary` | Transaction summaries |
| | `netsuite-login-history` | Login and access history |
Each path is versioned and produces deterministic output for the same input, enabling SHA-256 replay verification.
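Replay hashing of this kind is typically computed over a canonical serialization, so the same extracted records always produce the same digest regardless of key order. A minimal sketch of the idea (not the project's exact hashing scheme):

```python
import hashlib
import json

def replay_hash(records: list[dict]) -> str:
    """Canonical JSON (sorted keys, fixed separators) -> stable SHA-256."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = replay_hash([{"field": "NETWR", "value": "125000.00"}])
b = replay_hash([{"value": "125000.00", "field": "NETWR"}])  # key order differs
```

A reviewer re-running the same versioned path against the same input can recompute the hash independently and compare it to the one in the manifest.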
Cross-system contradiction detection with a 12-category typed taxonomy:
| Category | What It Detects |
|---|---|
| `AMOUNT_DIVERGENCE` | Dollar amounts that differ beyond tolerance across systems |
| `DATE_CONFLICT` | Dates that disagree between matched records |
| `STATUS_INCOMPATIBLE` | Status fields that cannot logically coexist |
| `ENTITY_MISMATCH` | Customer/vendor/material IDs that do not match across systems |
| `QUANTITY_DIVERGENCE` | Quantities that differ beyond tolerance |
| `APPROVAL_BYPASS` | Transactions that bypassed required approval steps |
| `TEMPORAL_IMPOSSIBILITY` | Events that occur in an impossible sequence |
| `DUPLICATE_REFERENCE` | Multiple records claiming the same reference number |
| `ORPHAN_RECORD` | Records in one system with no counterpart in the other |
| `RETROACTIVE_CHANGE` | Changes made to records after they were finalized |
| `SOD_VIOLATION` | Same user performing conflicting duties |
| `SCHEMA_GHOST` | Fields or values that reference non-existent schema elements |
Risk scoring uses type-specific weights. Severity levels: CRITICAL, HIGH, MEDIUM, LOW, INFO.
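As an illustration of type-specific weighting, a score can combine a per-category weight with a severity factor; the numeric weights below are invented for the example, not the engine's actual configuration:

```python
# Illustrative weights only -- the real values live in the
# contradiction engine's configuration.
TYPE_WEIGHTS = {"AMOUNT_DIVERGENCE": 0.9, "DATE_CONFLICT": 0.5,
                "SOD_VIOLATION": 1.0, "ORPHAN_RECORD": 0.6}
SEVERITY_FACTOR = {"CRITICAL": 1.0, "HIGH": 0.8, "MEDIUM": 0.5,
                   "LOW": 0.3, "INFO": 0.1}

def risk_score(category: str, severity: str) -> float:
    """Risk in [0.0, 1.0]: category weight scaled by severity factor."""
    return round(TYPE_WEIGHTS.get(category, 0.5) * SEVERITY_FACTOR[severity], 3)

score = risk_score("SOD_VIOLATION", "CRITICAL")
```

Because both factors are bounded by 1.0, the product stays in the 0.0-1.0 range used by the finding lifecycle.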
Pre-flight validation of extraction paths against client schemas before any data is pulled.
- Reference schema: 19 tables, 438 fields from an actual SAP IDES dump
- Path validation: Verifies that every field referenced by an extraction path exists in the client schema
- Customization detection: Identifies Z-tables, Z-fields, and custom namespaces
- Gap reporting: Shows exactly which fields are missing and which paths are affected
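A pre-flight check of this shape can be sketched in a few lines; the schema contents and field names here are hypothetical:

```python
def validate_path(path_fields, client_schema):
    """Pre-flight check: every (table, field) an extraction path references
    must exist in the client's schema. Z-prefixed tables are flagged as
    customizations."""
    missing = [(t, f) for t, f in path_fields
               if f not in client_schema.get(t, set())]
    custom = sorted(t for t in client_schema if t.startswith("Z"))
    return {"ok": not missing, "missing": missing, "customizations": custom}

# Hypothetical client schema: standard VBAK subset plus one Z-table
schema = {"VBAK": {"VBELN", "NETWR", "AUART"}, "ZSALES": {"ZFIELD"}}
report = validate_path([("VBAK", "NETWR"), ("VBAK", "KUNNR")], schema)
```

The report pinpoints exactly which referenced fields are absent before any data is pulled, so a path can be skipped or adapted rather than failing mid-extraction.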
Three-way gap analysis comparing what should happen, what is documented, and what actually happens:
| Gap Type | Comparison | Example |
|---|---|---|
| Design Gap | Reference model vs documented rules | SoD policy exists but no enforcing control configured |
| Compliance Gap | Documented rules vs actual events | Three-way match required but invoices posted without GR |
| Shadow Process | Actual events vs all documented models | Goods receipts posted on weekends with no approval workflow |
Includes a rule parser with standard rulesets for SAP, NetSuite, and Salesforce.
8-state machine tracking every finding from detection through resolution:
DETECTED → TRIAGED → INVESTIGATING → CONFIRMED → REMEDIATION → RESOLVED
              ↓             ↓            ↓
        FALSE_POSITIVE             ACCEPTED_RISK
- SQLite persistence with full transition history (who, when, from-state, to-state)
- Deduplication prevents the same finding from being logged twice
- Four finding sources: contradiction, reality_gap, conformance, fi_co_anomaly
- Risk scores (0.0-1.0) computed from finding type and severity
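The guard logic for such a state machine can be sketched as a transition table; exactly which states may branch to FALSE_POSITIVE or ACCEPTED_RISK is an assumption in this sketch:

```python
# Hypothetical transition table for the 8-state lifecycle.
TRANSITIONS = {
    "DETECTED": {"TRIAGED"},
    "TRIAGED": {"INVESTIGATING", "FALSE_POSITIVE"},
    "INVESTIGATING": {"CONFIRMED", "FALSE_POSITIVE"},
    "CONFIRMED": {"REMEDIATION", "ACCEPTED_RISK"},
    "REMEDIATION": {"RESOLVED"},
    # Terminal states allow no further transitions
    "RESOLVED": set(), "FALSE_POSITIVE": set(), "ACCEPTED_RISK": set(),
}

def transition(state: str, target: str) -> str:
    """Reject any transition the table does not allow."""
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {target}")
    return target

s = transition("CONFIRMED", "REMEDIATION")
```

Persisting each accepted transition (who, when, from-state, to-state) gives the full history the SQLite store records.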
Self-contained audit artifacts that can be verified without model access:
- Executive Summary - Scope, systems analyzed, key metrics, risk distribution
- Rendered Findings - Each finding with severity, evidence tables, and provenance chain
- Extraction Manifest - Every extraction path used, with parameters and SHA-256 replay hashes
- Reproduction README - Step-by-step instructions to re-run the analysis independently
- Reviewer Checklist - 25-item verification checklist covering completeness, accuracy, and methodology
The Salesforce adapter maps Opportunity pipeline data through the same pattern engine used for SAP:
| SFDC Concept | SAP Equivalent | Mapping |
|---|---|---|
| Opportunity.Id | VBELN | Padded to 10 chars |
| RecordType.Name | AUART | New Business→ZNEW, Renewal→ZREN, Upsell→ZUPS |
| Account.Id | KUNNR | Padded to 10 chars |
| Opportunity.Amount | NETWR | Direct |
| Stage transitions | VBFA (doc flow) | Each stage change → flow entry |
| Task/Event | STXH/STXL (texts) | Activity subject + description → doc text |
| Account (safe fields) | KNA1 | Industry, State, Country only (no PII) |
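A minimal sketch of the mapping, assuming right-justified zero padding for SAP CHAR(10) keys (the truncation behavior for longer SFDC IDs is an assumption of this sketch):

```python
RECORD_TYPE_TO_AUART = {"New Business": "ZNEW", "Renewal": "ZREN",
                        "Upsell": "ZUPS"}

def to_sap_key(sfdc_id: str) -> str:
    """SFDC IDs -> 10-char SAP keys (VBELN/KUNNR are CHAR(10))."""
    return sfdc_id[-10:].rjust(10, "0")

def map_opportunity(opp: dict) -> dict:
    """Map one Opportunity row into SAP-shaped fields (illustrative keys)."""
    return {
        "VBELN": to_sap_key(opp["Id"]),
        "AUART": RECORD_TYPE_TO_AUART[opp["RecordType"]],
        "NETWR": opp["Amount"],  # direct mapping
    }

row = map_opportunity({"Id": "006R00000123", "RecordType": "Renewal",
                       "Amount": 118750.0})
```

Reusing SAP-shaped keys is what lets the same pattern engine run unchanged over Salesforce data.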
When both SFDC and SAP data are loaded, the entity resolver matches records using:
- Explicit ID (confidence 0.99): `Opportunity.SAP_Order_Number__c == VBAK.VBELN`
- Proximity (confidence 0.50-0.95): account name similarity + amount tolerance + date proximity
- Temporal sequence (Phase 2): monotonic SFDC→SAP event chain validation
Anomalies detected across matched pairs:
- Timing gaps: SFDC close to SAP order creation > 30 days
- Amount discrepancies: SFDC Amount vs SAP NETWR > 5% tolerance
- Sequence violations: SAP order created before SFDC close
- Missing handoffs: SFDC Closed Won with no corresponding SAP order
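These matched-pair rules can be sketched as a single check function; the thresholds follow the list above, and the record shapes are hypothetical:

```python
from datetime import date

def pair_anomalies(sfdc: dict, sap: dict, amount_tol=0.05, gap_days=30):
    """Hypothetical matched-pair checks mirroring the rules above."""
    findings = []
    gap = (sap["created"] - sfdc["closed"]).days
    if gap > gap_days:
        findings.append("TIMING_GAP")
    if gap < 0:
        findings.append("SEQUENCE_VIOLATION")  # SAP order before SFDC close
    # Relative delta against the SFDC amount (denominator is an assumption)
    if abs(sfdc["amount"] - sap["netwr"]) / sfdc["amount"] > amount_tol:
        findings.append("AMOUNT_DISCREPANCY")
    return findings

# Mirrors the AMOUNT_DIVERGENCE example: 6,250 delta on 118,750 is ~5.3%
f = pair_anomalies(
    {"closed": date(2025, 6, 1), "amount": 118750.0},
    {"created": date(2025, 7, 15), "netwr": 125000.0},
)
```

A missing-handoff check would run one level up, over Closed Won Opportunities that never matched any SAP order at all.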
The SFDC generator plants 10 detectable patterns at controlled rates:
| Pattern | Rate | What It Tests |
|---|---|---|
| Stage skip | 5% | Conformance: mandatory stages bypassed |
| Quarter-end compression | 40% of won | Temporal: period-end deal clustering |
| Ghost pipeline | 10% of late-stage | Correlation: zero activities on active deals |
| Stage regression | 3% | Conformance: backward stage movement |
| Amount inflation | 8% | Correlation: >50% amount increase at close |
| Split deal | 6% | Cross-entity: same account, duplicate deals within 7 days |
| Speed anomaly | 5% | Temporal: created to closed in <3 days |
| Stale pipeline | 15% of open | Temporal: no movement for >90 days |
| Owner swap at close | 4% of won | Conformance: owner changes in final stage |
| Cross-system gap | 6% of SAP-linked | Cross-system: >30 day SFDC→SAP timing gap |
+-----------------------------------------------------------------------------------+
| Pattern Discovery Report |
+-----------------------------------------------------------------------------------+
| Pattern: "Credit Hold Escalation" |
| ----------------------------------------------------------------------------------|
| Finding: Orders with 'CREDIT HOLD' in notes have 3.2x longer fulfillment cycles |
| |
| Occurrence: 234 orders (4.7% of dataset) |
| Sales Orgs: 1000 (64%), 2000 (36%) |
| Confidence: HIGH (p < 0.001) |
| |
| Caveat: Correlation only - does not imply causation |
+-----------------------------------------------------------------------------------+
Key Features:
- Text Pattern Discovery - Find hidden patterns in order notes, rejection reasons, and delivery instructions
- Document Flow Analysis - Trace complete order-to-cash chains with timing at each step
- Outcome Correlation - Identify text patterns that correlate with delays, partial shipments, or returns
- Evidence-Based Reporting - Every pattern links to specific documents with field-level provenance
- Privacy-First Design - PII redaction enabled by default, shareable output mode for external review
Ask questions about your SAP processes in plain English:
User: "Why are orders from sales org 1000 taking longer to ship?"
System: Based on analysis of 5,234 orders:
- Average delay: 4.2 days vs 1.8 days for other orgs
- Root cause: 73% have "CREDIT HOLD" in notes
- Recommendation: Review credit check thresholds for org 1000
Confidence: HIGH | Evidence: 847 documents analyzed
Supports multiple LLM providers:
- Ollama (local, private) - Default for air-gapped environments
- OpenAI (GPT-4) - For cloud deployments
- Anthropic (Claude) - Alternative cloud option
Export to the Object-Centric Event Log standard for advanced process mining:
{
"ocel:version": "2.0",
"ocel:objectTypes": ["order", "item", "delivery", "invoice"],
"ocel:events": [...],
"ocel:objects": [...]
}

- Captures multi-object relationships (order → items → deliveries → invoices)
- Compatible with PM4Py, Celonis, and other OCEL tools
- Export formats: JSON, XML, SQLite
Compare actual SAP processes against expected Order-to-Cash models:
Conformance Report: 94.2% (4,712 / 5,000 cases)
Deviations Detected:
├── CRITICAL: Invoice before Goods Issue (23 cases)
├── MAJOR: Skipped Delivery step (187 cases)
└── MINOR: Duplicate Order Created (78 cases)
- Pre-built O2C reference models (simple and detailed)
- Severity scoring: Critical / Major / Minor
- Deviation types: skipped steps, wrong order, missing activities
Generate process flow diagrams with bottleneck highlighting:
graph LR
A[Order Created] -->|2.1 days| B[Delivery Created]
B -->|0.5 days| C[Goods Issued]
C -->|3.2 days| D[Invoice Created]
style C fill:#f8d7da
- Output formats: Mermaid (Markdown), GraphViz (DOT), SVG
- Color-coded bottleneck severity (green/yellow/red)
- Timing annotations between process steps
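A generator of this kind can be sketched by walking (from, to, days) edges and flagging slow ones; the bottleneck threshold here is invented for the example:

```python
def to_mermaid(steps, bottleneck_days=3.0):
    """Emit a Mermaid 'graph LR' with timing annotations. 'steps' is a
    list of (from, to, days) tuples; sources of slow edges get a red
    style (illustrative single-threshold severity)."""
    ids = {}
    def nid(name):
        # Assign A, B, C, ... in first-seen order
        return ids.setdefault(name, chr(ord("A") + len(ids)))
    lines, slow = ["graph LR"], []
    for src, dst, days in steps:
        a, b = nid(src), nid(dst)
        lines.append(f"    {a}[{src}] -->|{days} days| {b}[{dst}]")
        if days >= bottleneck_days:
            slow.append(a)
    for n in slow:
        lines.append(f"    style {n} fill:#f8d7da")
    return "\n".join(lines)

chart = to_mermaid([("Order Created", "Delivery Created", 2.1),
                    ("Delivery Created", "Goods Issued", 0.5),
                    ("Goods Issued", "Invoice Created", 3.2)])
```

With these inputs the output reproduces the example diagram above, including the red highlight on the 3.2-day Goods Issued step.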
ML-based prediction for process outcomes:
Order 0000012345 - Risk Assessment:
├── Late Delivery: 78% probability (HIGH RISK)
│   └── Factors: credit_block, order_value > $50k
├── Credit Hold: 45% probability (MEDIUM RISK)
└── Est. Completion: 8.2 days
Prediction Types:
- Late Delivery - Probability based on case age, progress, stalls, rework
- Credit Hold - Likelihood based on credit check status, complexity
- Completion Time - Estimated hours remaining based on progress/pace
29 Extracted Features:
- Temporal: case age, time since last event, avg time between events
- Activity: milestones reached, rework detection, loop count, backtracks
- Resource: unique resources, handoff count
- Risk indicators: stalled cases, credit holds, rejections, blocks
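A few of the temporal features can be sketched like this (an illustrative subset, not the engine's actual feature code):

```python
from datetime import datetime

def temporal_features(events):
    """Case age, average gap between events, event count (hours).
    Uses the last event as the reference clock for determinism."""
    ts = sorted(datetime.fromisoformat(e["time"]) for e in events)
    now = ts[-1]
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(ts, ts[1:])]
    return {
        "case_age_h": (now - ts[0]).total_seconds() / 3600,
        "avg_gap_h": sum(gaps) / len(gaps) if gaps else 0.0,
        "event_count": len(ts),
    }

feats = temporal_features([
    {"time": "2025-01-01T00:00:00"},
    {"time": "2025-01-01T12:00:00"},
    {"time": "2025-01-02T00:00:00"},
])
```

Activity and resource features follow the same shape: fold an ordered event list into fixed-width numbers the classifier can consume.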
| Consideration | S/4HANA Migration | Transaction Forensics |
|---|---|---|
| Timeline | 18-36 months | Hours to first insights |
| Cost | $10M-$100M+ | Free (MIT license) |
| Risk | Business disruption | Zero - read-only access |
| Data Location | Cloud/hosted | On-premise only |
| Prerequisites | Greenfield/brownfield project | Works with existing ECC 6.0 |
| Process Visibility | After migration | Before any changes |
| Use Case | Full transformation | Process discovery & optimization |
This tool does not replace S/4HANA. It helps you understand your current processes before making migration decisions - or find optimization opportunities in your existing ECC system.
- Docker & Docker Compose (recommended)
- OR Node.js 18+ and Python 3.10+ for local development
git clone https://github.com/your-org/transaction-forensics.git
cd transaction-forensics
docker-compose up --build

See docs/adapter_guide.md for:
- RFC adapter configuration for ECC 6.0
- OData adapter configuration for S/4HANA
- CSV import from SE16 exports
- Air-gapped installation options
Configure the natural language interface in .env:
# Option 1: Local Ollama (default, private)
LLM_PROVIDER=ollama
OLLAMA_HOST=http://localhost:11434
LLM_MODEL=llama3
# Option 2: OpenAI
LLM_PROVIDER=openai
LLM_API_KEY=<YOUR_OPENAI_KEY>
LLM_MODEL=gpt-4
# Option 3: Anthropic
LLM_PROVIDER=anthropic
LLM_API_KEY=<YOUR_ANTHROPIC_KEY>
LLM_MODEL=claude-3-sonnet-20240229

For air-gapped environments, use Ollama with locally downloaded models.
Interactive demos for all v2.0 process mining tools. No SAP connection required - all demos use synthetic data.
cd mcp-server
# Natural Language Interface - ask questions in plain English
npx tsx ../demos/ask_process_demo.ts
npx tsx ../demos/ask_process_demo.ts --interactive # Interactive mode
# OCEL 2.0 Export - export to process mining standard format
npx tsx ../demos/export_ocel_demo.ts
# Conformance Checking - compare against O2C reference model
npx tsx ../demos/check_conformance_demo.ts
# Visual Process Maps - generate Mermaid flowcharts
npx tsx ../demos/visualize_process_demo.ts
# Predictive Monitoring - ML-based risk predictions
npx tsx ../demos/predict_outcome_demo.ts

| Demo | Description |
|---|---|
| `ask_process_demo.ts` | Natural language queries with LLM integration |
| `export_ocel_demo.ts` | OCEL 2.0 export with object/event breakdown |
| `check_conformance_demo.ts` | Deviation detection and severity scoring |
| `visualize_process_demo.ts` | Mermaid diagrams with bottleneck highlighting |
| `predict_outcome_demo.ts` | Risk predictions and alerts |
| `salt_adapter_demo.ts` | Real SAP O2C data from SALT dataset |
| `visualize_process_bpi_demo.ts` | Process maps with real P2P data (BPI 2019) |
| `predict_outcome_bpi_demo.ts` | Risk predictions with real P2P data (BPI 2019) |
| `ask_process_bpi_demo.ts` | Natural language queries on P2P data |
Use real SAP Purchase-to-Pay data from the BPI Challenge 2019 for testing with authentic business patterns.
# Download and convert BPI 2019 data
python scripts/download_bpi_2019.py
# Run demos with real P2P data
npx tsx demos/visualize_process_bpi_demo.ts 50
npx tsx demos/predict_outcome_bpi_demo.ts 30
npx tsx demos/ask_process_bpi_demo.ts

Dataset Statistics:
| Metric | Value |
|---|---|
| Total cases | 251,734 |
| Total events | 1.5M+ |
| Unique activities | 39 |
| Process type | Purchase-to-Pay (P2P) |
| Source | Multinational coatings company |
Activities include: SRM workflows, Purchase Orders, Goods Receipts, Service Entries, Invoice Processing, Vendor interactions
Use real SAP ERP data from SAP's SALT dataset on HuggingFace for testing with authentic business patterns.
# 1. Install Python dependencies
pip install datasets pyarrow
# 2. Download SALT dataset
python scripts/download-salt.py
# 3. Run demo with real data
cd mcp-server
npx tsx ../demos/salt_adapter_demo.ts

SALT (Sales Autocompletion Linked Business Tables) contains:
| Table | Description | Records |
|---|---|---|
| I_SalesDocument | Sales order headers | ~1M+ |
| I_SalesDocumentItem | Order line items | ~5M+ |
| I_Customer | Customer master data | ~100K |
| I_AddrOrgNamePostalAddress | Address data | ~100K |
import { SaltAdapter } from './adapters/salt/index.js';
const adapter = new SaltAdapter({
maxDocuments: 10000, // Limit for memory management
});
await adapter.initialize();
// Get real sales order data
const header = await adapter.getSalesDocHeader({ vbeln: '0000012345' });
const items = await adapter.getSalesDocItems({ vbeln: '0000012345' });
// Get dataset statistics
const stats = adapter.getStats();
console.log(`Loaded ${stats.salesDocuments} sales documents`);

SALT contains sales orders only (no deliveries or invoices). For full Order-to-Cash testing:
- Use SALT for sales order analysis and ML training
- Use synthetic adapter for complete O2C flow testing
- Combine both for comprehensive validation
| Aspect | Synthetic Data | SALT Real Data |
|---|---|---|
| Patterns | Random/artificial | Authentic business patterns |
| ML Training | Limited accuracy | Real-world feature distributions |
| Demos | Good for UI testing | Compelling for stakeholders |
| Validation | Functional testing | Business logic validation |
We've validated the MCP tools against real SAP datasets. View the detailed analysis:
| Dataset | System | Cases | Events | Key Findings | Report |
|---|---|---|---|---|---|
| SFDC Synthetic | Salesforce | 214 | 2,417 | 10 anomaly patterns, 57% QE compression, 2 cross-system gaps | Run: python3 scripts/analyze_sfdc.py |
| BPI Challenge 2019 | SAP P2P | 251,734 | 1.6M | 42 activities, 64-day median throughput | View β |
| SAP IDES O2C | SAP O2C | 646 | 5,708 | 158 variants, bottlenecks identified | View β |
| SAP IDES P2P | SAP P2P | 2,486 | 7,420 | 7 compliance violations detected | View β |
Process Diagrams: Mermaid flowcharts for O2C and P2P
Test Suite: 1,639 tests passing across 70 test suites (TypeScript + Python)
This system is designed for enterprise security requirements.
| Concern | How We Address It |
|---|---|
| Data Access | Read-only BAPIs only - no write operations, no arbitrary SQL |
| Data Location | All processing is on-premise - no cloud, no external APIs |
| Network | No outbound connections, no telemetry, no phone-home |
| PII Protection | Automatic redaction of emails, phones, names, addresses |
| Audit Trail | Every query logged with parameters, timestamps, row counts |
| Row Limits | Default 200 rows per query, max 1000 - prevents bulk extraction |
| Provenance | SHA-256 replay hashing on every extraction for independent verification |
| Handoff Integrity | Reviewer packets are self-contained and verifiable without model access |
See SECURITY.md for complete security documentation.
The RFC user requires display-only access to SD documents:
Authorization Object: S_RFC
RFC_TYPE = FUGR
RFC_NAME = STXR, 2001, 2051, 2056, 2074, 2077
ACTVT = 16 (Execute)
Authorization Object: V_VBAK_VKO
VKORG = [Your Sales Organizations]
ACTVT = 03 (Display)
Authorization Object: V_VBAK_AAT
AUART = * (or specific document types)
ACTVT = 03 (Display)
Copy-paste ready role template: See docs/SAP_AUTHORIZATION.md
| BAPI | Purpose | Tables Accessed |
|---|---|---|
| `BAPI_SALESORDER_GETLIST` | List sales orders | VBAK |
| `SD_SALESDOCUMENT_READ` | Read order header/items | VBAK, VBAP |
| `BAPI_SALESDOCU_GETRELATIONS` | Document flow (VBFA) | VBFA |
| `BAPI_OUTB_DELIVERY_GET_DETAIL` | Delivery details | LIKP, LIPS |
| `BAPI_BILLINGDOC_GETDETAIL` | Invoice details | VBRK, VBRP |
| `READ_TEXT` | Long text fields | STXH, STXL |
| `BAPI_CUSTOMER_GETDETAIL2` | Customer master (stub) | KNA1 |
| `BAPI_MATERIAL_GET_DETAIL` | Material master (stub) | MARA |
No direct table access. No RFC_READ_TABLE unless explicitly enabled.
+------------------------------------------------------------------+
| Your Network |
| +------------------------------------------------------------+ |
| | | |
| | +----------------+ +-------------------+ | |
| | | SAP ECC 6.0 | | SAP Workflow | | |
| | | | | Mining Server | | |
| | | +----------+ | | | | |
| | | | SD/MM | | RFC | +-------------+ | | |
| | | | Tables |<--------->| MCP Server | | | |
| | | +----------+ | (R/O)| +-------------+ | | |
| | | | | | | | |
| | +----------------+ | v | | |
| | | +-------------+ | | |
| | +----------------+ | | Evidence | | | |
| | | Salesforce | | | Engine | | | |
| | | | API | | +---------+ | | | |
| | | Opportunities |<------>| |Provenance| | | | |
| | | Activities | | | |Registry | | | | |
| | +----------------+ | | |Findings | | | | |
| | | | +---------+ | | | |
| | +----------------+ | +-------------+ | | |
| | | NetSuite | | | | | |
| | | | API | v | | |
| | | Users/Txns |<--->| +-------------+ | | |
| | +----------------+ | | Pattern | | | |
| | | | Engine | | | |
| | | +-------------+ | | |
| | | | | | |
| | +----------------+ | +-------------+ | | |
| | | Browser |<------>| Web Viewer | | | |
| | | (localhost) | | +-------------+ | | |
| | +----------------+ +-------------------+ | |
| | | |
| +------------------------------------------------------------+ |
| |
| NO EXTERNAL CONNECTIONS |
+------------------------------------------------------------------+
Data Flow:
- MCP Server connects to SAP via RFC, Salesforce via API, NetSuite via API (all read-only)
- Extraction Registry executes named, versioned extraction paths
- Provenance Graph records field-level evidence for every extraction
- Contradiction Engine and Reality-Gap Detector analyze cross-system data
- Finding Lifecycle Manager tracks findings from detection through resolution
- Handoff Generator produces self-contained reviewer packets
- Web Viewer displays findings on localhost
Nothing leaves your network.
No. This is an independent open-source project. It uses standard SAP BAPIs that are publicly documented.
Minimal impact. All queries are:
- Read-only (no locks)
- Row-limited (200 default, 1000 max)
- Rate-limited (configurable)
- Use standard BAPIs (not direct table access)
We recommend running initial analysis during off-peak hours.
SD (Sales & Distribution), MM (Materials Management), and FI/CO (Financial Accounting / Controlling) document flows. Cross-system analysis with Salesforce CRM and NetSuite is also supported.
Yes. The tool uses BAPIs which are database-agnostic. Works with HANA, Oracle, DB2, SQL Server, MaxDB.
Yes. The Docker images can be built offline and transferred. No external dependencies at runtime.
Every finding includes:
- Field-level provenance tracing to system/table/record/field/value/timestamp
- SHA-256 replay hashes for independent re-verification
- Sample document numbers for verification in SAP (VA03, VL03N, VF03)
- Statistical confidence intervals
- Explicit caveats about correlation vs. causation
For formal review, use generate_handoff_packet to produce a self-contained audit artifact with a 25-item reviewer checklist.
- PII redaction is enabled by default
- No data leaves your network
- Shareable mode applies additional redaction
- See SECURITY.md for compliance considerations
Yes. See CONTRIBUTING.md for guidelines. Feature requests via GitHub Issues.
The MCP server includes a governance layer based on PromptSpeak symbolic frames for pre-execution blocking and human-in-the-loop approval workflows.
When AI agents access SAP data, you need controls to:
- Prevent bulk extraction - Hold requests for large date ranges or row counts
- Protect sensitive data - Require approval for searches containing PII patterns
- Halt rogue agents - Circuit breaker to immediately stop misbehaving agents
- Audit everything - Complete trail of all operations for compliance
Every operation has a symbolic frame indicating mode, domain, action, and entity:
Frame: βββα
       │││└── Entity: α (primary agent)
       ││└─── Action: β (retrieve)
       │└──── Domain: β (operational)
       └───── Mode: β (strict)
| Symbol | Category | Meaning |
|---|---|---|
| `β` | Mode | Strict - exact compliance required |
| `β` | Mode | Neutral - standard operation |
| `β` | Mode | Flexible - allow interpretation |
| `β` | Mode | Forbidden - blocks all actions |
| `β` | Domain | Financial (invoices, values) |
| `β` | Domain | Operational (orders, deliveries) |
| `β` | Action | Retrieve data |
| `β²` | Action | Analyze/search |
| `β` | Action | Validate |
| `α β γ` | Entity | Primary/secondary/tertiary agent |
Operations are automatically held for human approval when:
| Trigger | Threshold | Example |
|---|---|---|
| Broad date range | >90 days | date_from: 2024-01-01, date_to: 2024-12-31 |
| High row limit | >500 rows | limit: 1000 |
| Sensitive patterns | SSN, credit card, password | pattern: "social security" |
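The hold triggers above amount to a simple pre-execution predicate; this sketch uses the table's thresholds, with a hypothetical parameter shape:

```python
import re
from datetime import date

SENSITIVE = re.compile(r"ssn|social security|credit card|password", re.I)

def hold_check(params: dict):
    """Pre-execution hold triggers (thresholds from the table above)."""
    reasons = []
    if "date_from" in params and "date_to" in params:
        span = (date.fromisoformat(params["date_to"])
                - date.fromisoformat(params["date_from"])).days
        if span > 90:
            reasons.append("broad_date_range")
    if params.get("limit", 0) > 500:
        reasons.append("high_row_limit")
    if SENSITIVE.search(params.get("pattern", "")):
        reasons.append("sensitive_pattern")
    return {"held": bool(reasons), "reasons": reasons}

# The full-year request from the TypeScript example below would be held
r = hold_check({"pattern": "delivery", "date_from": "2024-01-01",
                "date_to": "2024-12-31"})
```

When the check returns `held: True`, the operation is parked with a hold ID until a human approves or rejects it.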
Agent Request
      |
      v
+-------------+     +-------------+
|  Circuit    |---->|   BLOCKED   |  (if agent halted)
|  Breaker    |     +-------------+
+-------------+
      | OK
      v
+-------------+     +-------------+
|   Frame     |---->|   BLOCKED   |  (if β forbidden)
| Validation  |     +-------------+
+-------------+
      | OK
      v
+-------------+     +-------------+     +-------------+
|    Hold     |---->|    HELD     |---->|    Human    |
|   Check     |     |  (pending)  |     |  Approval   |
+-------------+     +-------------+     +-------------+
      | OK                                     |
      v                                        v
+-------------+                         +-------------+
|   EXECUTE   |<------------------------|  APPROVED   |
+-------------+                         +-------------+
| Tool | Purpose |
|---|---|
| `ps_precheck` | Dry-run: check if operation would be allowed |
| `ps_list_holds` | List pending holds awaiting approval |
| `ps_approve_hold` | Approve a held operation |
| `ps_reject_hold` | Reject a held operation with reason |
| `ps_agent_status` | Check circuit breaker state for an agent |
| `ps_halt_agent` | Immediately halt an agent (blocks all ops) |
| `ps_resume_agent` | Resume a halted agent |
| `ps_stats` | Get governance statistics |
| `ps_frame_docs` | Get PromptSpeak frame reference |
// 1. Agent makes a request that triggers hold
const result = await mcp.callTool('search_doc_text', {
pattern: 'delivery',
date_from: '2024-01-01',
date_to: '2024-12-31', // >90 days triggers hold
});
// Returns: { held: true, hold_id: 'hold_abc123', reason: 'broad_date_range' }
// 2. Supervisor reviews pending holds
const holds = await mcp.callTool('ps_list_holds', {});
// Returns: [{ holdId: 'hold_abc123', tool: 'search_doc_text', severity: 'medium' }]
// 3. Supervisor approves
const approved = await mcp.callTool('ps_approve_hold', {
hold_id: 'hold_abc123',
approved_by: 'supervisor@example.com'
});
// Returns: { allowed: true, auditId: 'audit_xyz789' }

// Immediately block a misbehaving agent
await mcp.callTool('ps_halt_agent', {
agent_id: 'agent-123',
reason: 'Excessive query rate detected'
});
// All subsequent requests from this agent are blocked
const result = await mcp.callTool('get_doc_text', {
doc_type: 'order',
doc_key: '0000000001',
_agent_id: 'agent-123' // Identifies the agent
});
// Returns: { error: 'Governance Blocked', message: 'Agent halted: Excessive query rate' }
// Resume when issue is resolved
await mcp.callTool('ps_resume_agent', { agent_id: 'agent-123' });

| Tool | Purpose | Returns |
|---|---|---|
| `search_doc_text` | Find documents by text pattern | doc_type, doc_key, snippet, match_score |
| `get_doc_text` | Get all text fields for a document | header_texts[], item_texts[] |
| `get_doc_flow` | Get order-delivery-invoice chain | chain with keys, statuses, dates |
| `get_sales_doc_header` | Order header details | sales_org, customer, dates, values |
| `get_sales_doc_items` | Order line items | materials, quantities, values |
| `get_delivery_timing` | Requested vs actual delivery | timestamps, variance analysis |
| `get_invoice_timing` | Invoice creation/posting | invoice dates, accounting refs |
| `get_master_stub` | Safe master data attributes | hashed IDs, categories (no PII) |
| Tool | Purpose | Returns |
|---|---|---|
| `ask_process` | Natural language queries | answer, confidence, evidence, recommendations |
| `export_ocel` | Export to OCEL 2.0 format | OCEL JSON/XML with objects and events |
| `check_conformance` | Compare against O2C model | conformance_rate, deviations, severity_summary |
| `visualize_process` | Generate process diagrams | Mermaid/DOT/SVG with bottleneck highlighting |
| `predict_outcome` | ML-based outcome prediction | predictions, alerts, risk_levels, factors |
| Tool | Purpose | Returns |
|---|---|---|
| `analyze_journal_entries` | Journal entry anomaly detection | anomalies, risk_scores, patterns |
| `analyze_sod` | Segregation of duties analysis | conflicts, violation_count, users |
| `analyze_gl_balances` | GL account balance analysis | balance_anomalies, trends |
| `get_fi_document` | Retrieve FI document details | header, line_items, amounts |
| `generate_fi_assessment` | FI/CO risk assessment report | assessment, findings, recommendations |
| Tool | Purpose | Returns |
|---|---|---|
| `query_provenance` | Trace evidence chain for a finding | DAG/flat/Markdown with field-level provenance |
| `list_extraction_paths` | List available extraction paths | path definitions with system, version, fields |
| `run_extraction` | Execute a named extraction path | extracted records with provenance and replay hash |
| `detect_contradictions` | Cross-system contradiction detection | typed contradictions with severity and evidence |
| `validate_schema` | Pre-flight schema validation | path compatibility, missing fields, customizations |
| `analyze_reality_gaps` | Three-way gap analysis | design gaps, compliance gaps, shadow processes |
| `manage_finding` | Create/transition/query findings | finding state, history, risk score |
| `get_finding_summary` | Aggregated finding statistics | counts by state, source, severity, avg risk |
| `generate_handoff_packet` | Produce reviewer handoff packet | executive summary, findings, manifest, checklist |
| Tool | Purpose | Returns |
|---|---|---|
| `ps_precheck` | Check if operation would be allowed | wouldAllow, wouldHold, reason |
| `ps_list_holds` | List pending holds | Array of hold requests |
| `ps_approve_hold` | Approve a held operation | Execution result with auditId |
| `ps_reject_hold` | Reject a held operation | Success boolean |
| `ps_agent_status` | Get agent circuit breaker state | isAllowed, state, haltReason |
| `ps_halt_agent` | Halt an agent immediately | halted, agent_id |
| `ps_resume_agent` | Resume a halted agent | resumed, agent_id |
| `ps_stats` | Get governance statistics | holds, haltedAgents, auditEntries |
| `ps_frame_docs` | Get PromptSpeak documentation | Frame format reference |
MIT License - See LICENSE
This is enterprise-friendly open source:
- Use commercially without restriction
- Modify and distribute freely
- No copyleft obligations
- No warranty (provided as-is)
- Documentation: docs/
- Issues: GitHub Issues
- Security: See SECURITY.md for vulnerability reporting
This project was built with Claude Code (Anthropic). All commits are co-authored as reflected in git history. The architecture, design decisions, and analysis methodology are the author's; the implementation was pair-programmed with AI assistance.
This tool is provided as-is for process analysis purposes. It does not modify SAP data. Users are responsible for:
- Ensuring compliance with organizational data access policies
- Validating findings before making business decisions
- Proper configuration of SAP authorizations
Correlation does not imply causation. All pattern findings should be verified against actual business processes.