
OWASP GenAI Data Security Initiative

Part of the OWASP GenAI Security Project · Data Security Initiative Page


Overview

The OWASP GenAI Data Security Initiative addresses the data security risks unique to Large Language Models, Generative AI, and Agentic AI systems. As AI introduces new data surfaces — prompts, context windows, embeddings, vector stores, agent traces, tool payloads — and new failure modes — prompt-driven extraction, cross-session bleed, inference attacks, plugin data drains — traditional data security frameworks no longer map cleanly to what needs protection.

This initiative produces community-developed, peer-reviewed guidance to help organizations understand and address these emerging challenges. All materials are released under CC BY-SA 4.0.


Key Deliverables

GenAI Data Security Risks and Mitigations 2026 (v1.0)

📄 Download PDF · Released March 2026

A comprehensive enumeration of 21 data security risks specific to GenAI systems, each with tiered mitigations (Foundational → Hardening → Advanced) designed for organizations at different maturity levels. This is not a Top 10 — it is a structured risk taxonomy following data as it moves through a GenAI system.

Cross-referenced to the OWASP Top 10 for LLM Applications and the OWASP Top 10 for Agentic Applications 2026.

DSGAI Risk Taxonomy (21 entries)
| ID | Risk |
|---------|------|
| DSGAI01 | Sensitive Data Leakage |
| DSGAI02 | Agent Identity & Credential Exposure |
| DSGAI03 | Shadow AI & Unsanctioned Data Flows |
| DSGAI04 | Data, Model & Artifact Poisoning |
| DSGAI05 | Data Integrity & Validation Failures |
| DSGAI06 | Tool, Plugin & Agent Data Exchange Risks |
| DSGAI07 | Data Governance, Lifecycle & Classification for AI Systems |
| DSGAI08 | Non-Compliance & Regulatory Violations |
| DSGAI09 | Multimodal Capture & Cross-Channel Data Leakage |
| DSGAI10 | Synthetic Data, Anonymization & Transformation Pitfalls |
| DSGAI11 | Cross-Context & Multi-User Conversation Bleed |
| DSGAI12 | Unsafe Natural-Language Data Gateways (LLM-to-SQL/Graph) |
| DSGAI13 | Vector Store Platform Data Security |
| DSGAI14 | Excessive Telemetry & Monitoring Leakage |
| DSGAI15 | Over-Broad Context Windows & Prompt Over-Sharing |
| DSGAI16 | Endpoint & Browser Assistant Overreach |
| DSGAI17 | Data Availability & Resilience Failures in AI Pipelines |
| DSGAI18 | Inference & Data Reconstruction |
| DSGAI19 | Human-in-the-Loop & Labeler Overexposure |
| DSGAI20 | Model Exfiltration & IP Replication |
| DSGAI21 | Disinformation & Integrity Attacks via Data Poisoning |

Each entry follows a consistent structure: attack scenario in GenAI-specific terms, attacker capabilities, impact, and tiered mitigations with scope annotations (Buy / Build / Both).
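As an illustration only, a single entry following that structure could be represented in machine-readable form roughly like this. The field names and control names below are hypothetical, not the published schema:

```python
# Hypothetical machine-readable shape for one taxonomy entry, mirroring the
# structure described above. Field and control names are illustrative only.
entry = {
    "id": "DSGAI01",
    "risk": "Sensitive Data Leakage",
    "attack_scenario": "A prompt-driven extraction attack surfaces PII "
                       "retained in a RAG vector store.",
    "attacker_capabilities": ["query access to the chat interface"],
    "impact": "Disclosure of regulated personal data",
    "mitigations": {
        # Tiers follow the Foundational -> Hardening -> Advanced progression;
        # each control carries a Buy / Build / Both scope annotation.
        "Foundational": [{"control": "Output filtering", "scope": "Both"}],
        "Hardening":    [{"control": "Retrieval-time redaction", "scope": "Build"}],
        "Advanced":     [{"control": "Differentially private embeddings", "scope": "Build"}],
    },
}

assert set(entry["mitigations"]) == {"Foundational", "Hardening", "Advanced"}
```

A structured shape like this also makes it straightforward to filter entries by mitigation tier or scope annotation when planning a rollout.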

LLM and GenAI Data Security Best Practices 2025 (v1.0)

📄 Download PDF · Released February 2025

The companion implementation guide covering data security principles, secure deployment architectures, monitoring and auditing guidelines, governance models, and future trends. Topics include data minimization, encryption strategies, access control for LLM pipelines, securing data flows in LLM agents, and regulatory compliance alignment.
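One of those topics, data minimization, can be sketched as a toy pre-processing step that redacts obvious identifiers before user input reaches the model. The regex patterns below are simplistic placeholders, not production-grade redaction and not the guide's prescribed method:

```python
import re

# Toy data-minimization step for an LLM pipeline: strip obvious PII from
# user input before it is sent to the model. The patterns are illustrative
# placeholders; real redaction needs far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def minimize(text: str) -> str:
    """Replace each matched identifier with a typed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(minimize("Contact jane@example.com, SSN 123-45-6789"))
# → Contact [EMAIL], SSN [SSN]
```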


Framework Crosswalk

The initiative maintains a comprehensive crosswalk of GenAI data security risks against 20 widely recognized cybersecurity and AI frameworks. The crosswalk covers all 41 entries across the OWASP LLM Top 10 2025, Agentic Top 10 2026, and DSGAI Risk Taxonomy.

The web application provides an interactive interface for exploring crosswalk mappings, framework coverage, incident data, security tooling, implementation recipes, and a glossary of key terms.

Crosswalk source files are maintained in the /crosswalk directory.
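To illustrate the kind of data such a crosswalk holds, here is a hypothetical sketch. The framework names are real, but these specific risk-to-control mappings are invented for illustration and are not the initiative's published data:

```python
# Illustrative crosswalk rows mapping DSGAI risks to framework controls.
# The mappings below are hypothetical examples, not published crosswalk data.
crosswalk = [
    {"risk": "DSGAI01", "framework": "NIST CSF 2.0",  "control": "PR.DS-01"},
    {"risk": "DSGAI01", "framework": "MITRE ATLAS",   "control": "AML.T0024"},
    {"risk": "DSGAI13", "framework": "ISO/IEC 27001", "control": "A.8.3"},
]

def controls_for(risk_id: str):
    """Return all mapped (framework, control) pairs for one DSGAI risk."""
    return [(r["framework"], r["control"]) for r in crosswalk if r["risk"] == risk_id]

print(controls_for("DSGAI01"))
# → [('NIST CSF 2.0', 'PR.DS-01'), ('MITRE ATLAS', 'AML.T0024')]
```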

Frameworks Covered

Cybersecurity & AI Standards — NIST CSF 2.0, NIST AI RMF 1.0, ISO/IEC 27001, ISO/IEC 42001, ISO/IEC 23894, ISO/IEC 5338

Threat Modeling & Adversarial — MITRE ATT&CK, MITRE ATLAS, STRIDE, FAIR

Application Security & Compliance — CIS Controls, ASVS, SOC 2, PCI DSS, ENISA, CycloneDX ML SBOM

Governance & Architecture — COBIT, OPENCRE, SAMM, BSIMM

OT / Industrial AI Security — ISA/IEC 62443


Workstreams

Data Collection — Public open call for real-world vulnerability data and incident reports related to LLMs and GenAI applications. Submit data through the Slack channel or by opening an issue in this repository.

Framework Crosswalk — Crosswalking the OWASP Top 10 for LLM Applications 2025 and Agentic Top 10 2026 to recognized cybersecurity and AI frameworks. See the Interactive Crosswalk Explorer and the /crosswalk directory for current crosswalk data.

Data Security Risks & Best Practices — Research, authoring, and maintenance of the initiative's two published white papers: the DSGAI risk taxonomy and the companion implementation guide. This workstream also maintains alignment with the broader OWASP GenAI Security Project deliverables.

Community Datasets — Development of open, community-contributed datasets for research, benchmarking, and security testing. See the /datasets directory for current and proposed datasets.

Data Validation — Automated and peer-reviewed validation of all contributed data. Scripts in /data_validation ensure integrity and accuracy.
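The kind of check such a validation script might run can be sketched as follows. The field names and ID pattern here are assumptions made for illustration, not the actual /data_validation implementation:

```python
import re

# Minimal sketch of a contributed-record check: verify required fields are
# present and the risk ID is a valid DSGAI01–DSGAI21 identifier.
# Field names and schema are illustrative assumptions.
DSGAI_ID = re.compile(r"^DSGAI(0[1-9]|1[0-9]|2[01])$")
REQUIRED_FIELDS = {"id", "risk", "source"}

def validate(record: dict) -> list[str]:
    """Return a list of problems found in one contributed record."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - record.keys())]
    if "id" in record and not DSGAI_ID.match(record["id"]):
        problems.append(f"invalid DSGAI id: {record['id']}")
    return problems

assert validate({"id": "DSGAI21", "risk": "Model Exfiltration", "source": "report-42"}) == []
```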


Datasets

Current

  • Vulnerability Dataset — Real-world vulnerabilities affecting LLM applications
  • Exploit Dataset — Documented exploits and attack techniques targeting LLMs
  • Risk Assessment Dataset — Mapped risk assessments for various LLM deployments

Proposed Community Datasets

We are looking for contributors to help build datasets for the broader AI security community. Priority areas include:

  • GenAI Data Security Incident Database — Structured, anonymized catalog of real-world data security incidents in GenAI deployments (leakage events, cross-tenant bleed, RAG exfiltration, agent credential exposure), focused on failure modes enumerated in DSGAI01–DSGAI21.
  • Prompt Injection & Data Extraction Test Cases — Adversarial prompts and extraction techniques mapped to the DSGAI risk taxonomy for red-teaming and regression testing.
  • RAG Poisoning & Retrieval Integrity Dataset — Benign and adversarial document sets for testing vector store integrity, retrieval-time redaction, and poisoning detection.
  • Cross-Framework Control Crosswalk Dataset — Machine-readable crosswalk of DSGAI risks to controls across NIST CSF 2.0, NIST AI RMF, MITRE ATLAS, ISO/IEC 42001, and OWASP LLM Top 10 for automated compliance gap analysis.
  • Agent Data Flow & Tool Exchange Traces — Sanitized traces of agentic AI tool calls, plugin data exchanges, and delegation chains supporting research into DSGAI06 and related agent security patterns.
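As a sketch of the automated compliance gap analysis a machine-readable crosswalk dataset could support (the specific mappings below are invented for illustration, not real crosswalk data):

```python
# Given machine-readable risk-to-framework mappings, find frameworks with no
# mapped control for a DSGAI risk. The mappings here are illustrative only.
FRAMEWORKS = {"NIST CSF 2.0", "NIST AI RMF", "MITRE ATLAS",
              "ISO/IEC 42001", "OWASP LLM Top 10"}

mappings = {
    "DSGAI06": {"NIST CSF 2.0", "MITRE ATLAS"},
    "DSGAI18": {"NIST AI RMF", "MITRE ATLAS", "ISO/IEC 42001"},
}

def coverage_gaps(risk_id: str) -> list[str]:
    """Frameworks with no mapped control for the given risk."""
    return sorted(FRAMEWORKS - mappings.get(risk_id, set()))

print(coverage_gaps("DSGAI06"))
# → ['ISO/IEC 42001', 'NIST AI RMF', 'OWASP LLM Top 10']
```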

If you are interested in contributing, join the Slack channel or reach out directly.


AI Risk Database Collaboration

The initiative collaborates with leading AI risk authorities to consolidate efforts and avoid fragmented approaches to risk identification.

Community members are encouraged to report new GenAI data security risks to these organizations as well as to this initiative.


How to Contribute

All contributions are welcome — from security practitioners, AI engineers, researchers, compliance professionals, and anyone working to secure GenAI systems.


OWASP GenAI Security Project — Initiatives

This initiative is one of several under the OWASP GenAI Security Project:

| Initiative | Description | Link |
|------------|-------------|------|
| Agentic App Security | Securing autonomous and agentic AI systems, including the Top 10 for Agentic Applications 2026 | Initiative Page |
| AI Red Teaming & Evaluation | Methodology, benchmarks, and tools for adversarial testing of GenAI systems | Initiative Page |
| AI Security Solutions Landscape | Vendor-agnostic mapping of the GenAI security tooling ecosystem | Solutions Directory |
| AIBOM Generator | Open-source tool for generating AI Bills of Materials for supply chain transparency | Initiative Page |
| Data Security | GenAI data security risks, mitigations, best practices, and framework crosswalks (this initiative) | Initiative Page |
| Governance Checklist (COMPASS) | Cybersecurity and governance checklist for LLM and GenAI deployments | Resource Page |
| Secure AI Adoption | Center of Excellence guidance for safe, ethical, and secure organizational AI adoption | Initiative Page |
| Threat Intelligence | Research into LLM-enabled exploit generation and deepfake threat preparation | Initiative Page |

Acknowledgments

Initiative Lead: Emmanuel Guilherme Junior

This initiative is made possible by the contributions of its authors, contributors, and reviewers from across the global AI security community. Thank you to everyone who has helped build and shape this community resource. Full contributor lists are included in each published document.


License

All materials produced by this initiative are licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

You are free to share and adapt the material for any purpose, including commercial, under the following terms: provide appropriate attribution including the project name and asset name, and distribute any derivative works under the same license.
