Skip to content

Releases: CyberStrategyInstitute/ai-safe2-framework

2026-02-03 – ISHI Governance Scenarios

10 Feb 01:24
46961a9

Choose a tag to compare

ISHI Mission Command Structure with OpenClaw

This release introduces the examples/ishi/ folder to demonstrate how AI SAFE² governs a second reference agent / orchestration pattern (ISHI), expanding beyond OpenClaw-specific examples. Places ISHI in a Mission Command structure over OpenClaw to reduce risks and create a better operational environment for your personal AI assistant.

🧩 What’s New – ISHI Examples

  • Added examples/ishi/ showcasing:

    • How to model ISHI workflows as AI SAFE² assets (agents, tools, memory, orchestration steps).
    • Control implementations for:
      • Input sanitization and boundary enforcement (Pillar 1).
      • Audit trails and inventory (Pillar 2).
      • Kill switches and rollback patterns (Pillar 3).
      • Human-in-the-loop checkpoints (Pillar 4).
      • Continuous red-teaming and tuning (Pillar 5).[page:1]
  • Included scenario files that show:

    • Safe handling of Non-Human Identities for ISHI.
    • Memory/RAG safeguards for ISHI’s context sources.
    • How to register ISHI flows into an enterprise asset inventory.

📚 Documentation & Positioning

  • Referenced examples/ishi/ from the README under target scope & environment, indicating AI SAFE² is not tied to a single agent framework.[page:1]
  • Clarified how ISHI examples differ from examples/openclaw/:
    • OpenClaw: Deep integration and hardening toolkit.
    • ISHI: Generic governance and pattern-focused scenarios.
  • Use-Cases: Showed Top-20 Use-cases for Personal AI Assistants

🔄 Framework Impact

  • No changes to the core AI SAFE² control taxonomy.
  • This release expands the example corpus to help teams translate the same controls across multiple agent stacks.

2026-01-29 – OpenClaw Security Pack

10 Feb 00:58
46961a9

Choose a tag to compare

ClawdBot / MoltBot / OpenClaw 3-Tools = Memory Defense, Scanner & Gateway

This release adds a complete OpenClaw (formerly Moltbot / Clawdbot) security toolkit on top of the AI SAFE² framework, including examples, hardening guidance, and industry resource mapping.

🔐 What’s New – OpenClaw Examples

  • Added examples/openclaw/ with opinionated, ready-to-run examples for securing OpenClaw deployments under AI SAFE².
  • Showcased how to wire OpenClaw into the AI SAFE² pillars (Sanitize, Audit, Fail-Safe, Engage, Evolve) using concrete configuration patterns.
  • Included examples that demonstrate:
    • Memory safety patterns for OpenClaw’s long-lived state.
    • Use of a control gateway to enforce policies outside the agent runtime.
    • Integration points for future CI/CD and orchestration pipelines.

📚 Guides & Documentation

  • Linked OpenClaw examples from the main README under “🛡️ OpenClaw Security” for discoverability.[page:1]
  • Highlighted the 10-Minute Hardening Guide in guides/openclaw-hardening.md as the primary entrypoint for securing OpenClaw.[page:1]
  • Documented OpenClaw’s relationship to the AI SAFE² 5-layer model (L1–L5) and how OpenClaw roles map to Non-Human Identities (NHI).

🌐 Industry Resource Map

  • Added resources/openclaw_security_resource_map.md with curated links to:
    • Known OpenClaw threat models and security write-ups.
    • Recommended hardening techniques aligned with AI SAFE² controls.
    • External research relevant to memory poisoning, agentic abuse, and orchestration risks.

🔄 Framework Impact

  • No changes to the core AI SAFE² control taxonomy.
  • This release focuses on implementation patterns and examples for OpenClaw users, making the framework directly executable in real-world agent stacks.

2026-01-26 – Gateway, Scanner & Skill Runtime Drop

10 Feb 00:54
46961a9

Choose a tag to compare

This release operationalizes AI SAFE² with runtime components (Gateway, Scanner) and developer ergonomics (skill file, Docker assets), turning the framework into an executable control plane.

🧠 Skills & Developer Onramp

  • Added skill.md as the canonical “brain” file for AI assistants and IDEs (e.g., Claude Projects, Cursor, Windsurf).[page:1]
    • Encodes the core AI SAFE² context so agents can answer architecture, control, and mapping questions.
    • Used in the README “🚀 Start Securing in 5 Minutes” path to instantly turn an LLM into an AI SAFE² architect.[page:1]

🛡️ AI SAFE² Gateway

  • Added gateway/ directory containing the AI SAFE² Gateway proxy implementation.[page:1]
    • Enforces policy decisions derived from the framework at runtime.
    • Designed to sit between orchestration layers (e.g., n8n, LangGraph, Make.com, CrewAI) and downstream tools/LLMs.[page:1]
  • Introduced containerization assets:
    • Root-level Dockerfile for building the Gateway image.[page:1]
    • docker-compose.yml for local or test deployment with sensible defaults.[page:1]
  • Documented Gateway behavior and deployment patterns in README and INTEGRATIONS.md, including target environments such as MCP, coding assistants, and no-code workflows.[page:1]

🕵️ Audit Scanner CLI

  • Added scanner/ directory providing the AI SAFE² Audit Scanner CLI.[page:1]
    • Supports the 5-Minute Audit path described in QUICKSTART_5_MIN.md.[page:1]
    • Provides deeper scan modes aligned with the 5 pillars and risk domains (Agentic Swarms, NHI, Memory/RAG, Supply Chain, Universal GRC).[page:1]
  • Included Python project metadata in pyproject.toml to streamline installation and packaging.[page:1]

⚙️ Configuration & Defaults

  • Added config/default.yaml with baseline security configuration and control thresholds.[page:1]
    • Intended as a starting point for enterprises to customize control strength and enforcement rules.
  • Aligned default configuration with the AI SAFE² coverage matrix and the “Universal Rosetta Stone” mappings (NIST AI RMF, ISO 42001, OWASP LLM, MITRE ATLAS, MIT AI Risk Repo).[page:1]

🔄 Framework Impact

  • No changes to the core control taxonomy.
  • This release focuses on runtime enforcement and operational tooling that make AI SAFE² deployable as infrastructure.

2026-01-19 – Research & EFA Insights

10 Feb 00:48
46961a9

Choose a tag to compare

This release expands the research/ corpus with new dossiers, connects AI SAFE² more tightly to external threat intelligence (including EFA), and documents advanced agentic threats.

🧠 Research Dossiers

  • Expanded research/ directory with additional numbered deep-dive documents (e.g., research/00X_*).[page:1]
    • Cover topics such as:
      • Multi-agent swarm failure modes.
      • Memory poisoning campaigns and long-horizon abuse.
      • Non-Human Identity lifecycle risks and abuse patterns.[page:1]
  • Linked each research dossier to specific AI SAFE² controls and pillars, creating a traceable line from evidence → design decision.

🔗 Integration with External Frameworks & EFA

  • Strengthened explicit connections between AI SAFE² and external frameworks:
    • MIT AI Risk Repository (1,600+ risks, fully mapped in v2.1).[page:1]
    • MITRE ATLAS agentic techniques and attack chains.[page:1]
    • OWASP LLM and Google SAIF coverage, with AI SAFE² “gap filler” domains for swarm and RAG security.[page:1]
  • Added research notes documenting how AI SAFE² interacts with EFA-aligned or EFA-style risk taxonomies (e.g., evaluation of failure modes, systemic risk amplification, and emergent behaviors).
    • Clarified which AI SAFE² controls mitigate the classes of failures highlighted in EFA-aligned analyses.
    • Provided mapping tables or narrative sections tying EFA concepts to concrete S-A-F-E-E controls.

📄 Advanced Threat Guidance

  • Updated ADVANCED_AGENT_THREATS.md with:
    • New examples of swarm abuse, orchestration loops, and memory drift scenarios.[page:1]
    • References back into research/ for teams that need a deeper evidence base when justifying controls.
  • Ensured research outputs are discoverable from EVOLUTION.md and README’s Architectural Insights section.[page:1]

🔄 Framework Impact

  • Reinforces the why behind AI SAFE² controls via evidence-backed research.
  • No structural changes to the control taxonomy, but clearer mappings from real-world threat research (including EFA-oriented perspectives) into the framework’s design.

AI SAFE² v2.1: Advanced Agentic & Distributed Edition

05 Jan 19:11
b9a429b

Choose a tag to compare

AI SAFE² v2.1: Advanced Agentic & Distributed Edition

Originally Released November 2025

The transition from AI SAFE² Version 2.0 to Version 2.1 (released in November 2025) marks the framework's evolution into the Advanced Agentic & Distributed AI Edition. While v2.0 provided a detailed, enterprise-grade taxonomy of 99+ subtopics, v2.1 is an additive enhancement that introduces 35+ specialized sub-domains. These "Gap Fillers" specifically address emerging threats in Swarm Intelligence, Non-Human Identity (NHI) governance, and advanced memory security.

According to the sources, here is the summary of the transition to Version 2.1:

Pillar 1: Sanitize & Isolate (P1)

Topic 1: Sanitize → Beyond basic filtering, v2.1 adds Supply Chain Artifact Validation via OpenSSF Model Signing (OMS), real-time NHI secret scanning (integrated with GitGuardian), and specific mitigations for memory attacks like AgentPoison and RAG poisoning.

Topic 2: Isolate → Evolved to include Multi-Agent Boundary Enforcement with A2A (Agent-to-Agent) protocol restrictions and NHI Access Control featuring automated provisioning/decommissioning and Just-In-Time (JIT) privilege elevation.

Pillar 2: Audit & Inventory (P2)

Topic 3: Audit → Enhanced with Decision Traceability for autonomous reasoning, Consensus Voting Validation for swarms, and cryptographic Agent State Verification using SHA-256 hashing to detect tampering.

Topic 4: Inventory → Now mandates a Swarm Topology Map to document multi-agent communication patterns, a centralized NHI Registry for service accounts and agents, and an inventory of Model Artifacts with cryptographic fingerprints.

Pillar 3: Fail-Safe & Recovery (P3)

Topic 5: Fail-Safe → Introduces Distributed Agent Fail-Safes, including centralized kill switches for multi-agent systems, automated NHI credential revocation, and specialized incident response for memory poisoning.

Topic 6: Recovery → Expanded to cover Agent State & Memory Snapshots, RAG/Knowledge Base versioning with rollback capabilities, and HSM-integrated credential recovery for machine identities.

Pillar 4: Engage & Monitor (P4)

Topic 7: Engage → Formalizes oversight for distributed systems through Consensus Failure Escalation (human intervention when agents disagree) and NHI Privilege Elevation Review to approve high-risk agent actions.

Topic 8: Monitor → Delivers real-time Distributed Agent Health & Consensus Monitoring, behavior-driven NHI Activity Dashboards, and advanced Memory Poisoning Monitoring to detect adversarial trigger phrases and semantic drift.

Pillar 5: Evolve & Educate (P5)

Topic 9: Evolve → Focuses on the agile adaptation of Consensus Algorithms, tracking OMS specification updates, and evolving NHI security postures based on the latest machine-identity threat intelligence.

Topic 10: Educate → Specialized training now includes Swarm Manager Certification, Machine Identity Security Awareness, and deep-dives into RAG Security Best Practices to prevent memory-based injection attacks.


Strategic Advancement in 2.1: The framework now achieves Universal GRC Tagging, mapping every sub-domain to seven external frameworks, including 100% coverage of ISO/IEC 42001 and the April 2025 MIT AI Risk Repository update. This version transitions from securing AI as a tool to governing AI as an autonomous workforce.

To visualize this transition: if Version 2.0 was a high-tech security system for an office building, Version 2.1 is the advanced mission control center required to manage an entire fleet of autonomous drones, ensuring each drone (agent) has the right credentials, stays within its flight path, and can be instantly grounded if its internal logic is compromised.

AI SAFE² v2.0: Enterprise Operational Standard

05 Jan 19:08
b9a429b

Choose a tag to compare

AI SAFE² v2.0: Enterprise Operational Standard

Originally Released September 2025

The transition from Version 1.0 to Version 2.0 of the AI SAFE² framework represents a shift from a conceptual "foundational" model to a granular, "enterprise-grade" operational framework. While Version 1.0 established the five core pillars, Version 2.0 expanded these into 99 detailed subtopics, providing specific technical controls and integrating them with global standards like NIST, OWASP, and MITRE ATLAS.

Here is the summary of the framework’s evolution into Version 2.0:

Pillar 1: Sanitize & Isolate (P1)

Topic 1: Sanitize → Expanded from generic cleansing to a comprehensive input validation taxonomy, including schema enforcement, malicious prompt filtering (OWASP LLM01/02), toxic content detection, and sensitive data masking (PII/PHI).
Topic 2: Isolate → Evolved from "containing agents" to detailed boundary enforcement, utilizing agent sandboxing, network segmentation, API gateway restrictions, and credential compartmentalization.

Pillar 2: Audit & Inventory (P2)

Topic 3: Audit → Shifted from "continuous verification" to comprehensive accountability, mandating real-time activity logging, model drift detection, explainability tracking, and bias monitoring.
Topic 4: Inventory → Moved from a "full map" to detailed asset and dependency management, featuring centralized AI registries, model catalogs, agent capability documentation, and automated SBOM generation.

Pillar 3: Fail-Safe & Recovery (P3)

Topic 5: Fail-Safe → Transitioned from simple shutdowns to advanced resilience controls, incorporating circuit breaker patterns, emergency "kill switches" for runaway agents, and formal incident response playbooks.
Topic 6: Recovery → Expanded into full continuity planning, focusing on model state backups, disaster recovery drills, and strict RTO/RPO (Recovery Time/Point Objective) management.

Pillar 4: Engage & Monitor (P4)

Topic 7: Engage → Formalized "human-in-the-loop" into structured oversight, including mandatory human approval workflows for critical actions, reasoning transparency, and regular red-team exercises.
Topic 8: Monitor → Evolved from "continuous observation" to comprehensive detection systems, utilizing real-time performance dashboards, SIEM integration, and token usage/cost tracking.

Pillar 5: Evolve & Educate (P5)

Topic 9: Evolve → Moved from "adapting" to lifecycle improvement, integrating threat intelligence feeds, security patch management, and automated model retraining/refinement.
Topic 10: Educate → Transformed "training" into a culture-building program, providing specialized education for operators, safe prompt engineering training, and sharing industry best practices.


Key Strategic Upgrades in 2.0:

Five-Layer Architecture Model: Version 2.0 introduced a structured approach to securing the entire AI stack, from core models and data infrastructure to agentic workflows and non-human users.

Framework Integration: Version 2.0 achieved 100% coverage of the OWASP Top 10 for LLM Applications and the NIST AI Risk Management Framework, while integrating technical vulnerability scoring via CVE/CVSS.

Risk Scoring: Introduced the Combined Risk Score formula, which integrates technical vulnerability severity with organizational control effectiveness.

AI SAFE² v1.0: Foundational Structure

05 Jan 19:00
b9a429b

Choose a tag to compare

AI SAFE² v1.0: The Foundational Blueprint

Originally Released June 2025

The Origin Story

The AI SAFE² framework was born from a simple, stark reality: “AI is accelerating faster than our guardrails”. As enterprises rapidly integrated tools like GitHub Copilot and n8n into their workflows, they inadvertently birthed a massive, unmanaged workforce of Non-Human Identities (NHIs)—AI agents, service accounts, and CI/CD bots.

Our research revealed a critical security gap: organizations were granting these autonomous entities broad access to sensitive data and APIs without the governance, identity management, or safety nets used for human employees. v1.0 was released to provide the industry’s first foundational architecture to bridge this gap, moving beyond static checklists to a living governance strategy.

Pillar 1: Sanitize & Isolate

• **Topic 1: Sanitize** → Cleansing inputs, filtering unsafe or toxic data.
• **Topic 2: Isolate** → Containing agents, sandboxing, enforcing boundaries.

Pillar 2: Audit & Inventory

• **Topic 3: Audit** → Continuous verification, accountability, and activity tracking.
• **Topic 4: Inventory** → Maintaining a full map of assets, agents, workflows, and data.

Pillar 3: Fail-Safe & Recovery

• **Topic 5: Fail-Safe** → Pre-planned shutdowns, graceful error handling, minimizing blast radius.
• **Topic 6: Recovery** → Backups, rapid restoration, continuity planning.

Pillar 4: Engage & Monitor

• **Topic 7: Engage** → Human-in-the-loop, proactive intervention, interactive oversight.
• **Topic 8: Monitor** → Continuous observation, anomaly detection, real-time logging.

Pillar 5: Evolve & Educate

• **Topic 9: Evolve** → Adapting to new threats, updating playbooks, agile improvement.
• **Topic 10: Educate** → Training users/teams, spreading a culture of safe AI usage.

The Analogy: Supersonic Jets vs. Horse-Drawn Carriages

To understand why v1.0 was necessary, consider the speed of modern automation. Attempting to manage an autonomous enterprise with traditional, human-centric security models is akin to trying to direct a fleet of supersonic jets with traffic signals designed for horse-drawn carriages. v1.0 was built to be the engineering manual for those jets, ensuring that machine-speed actions remain within safe, governed flight paths.