LLM Penetration Testing Framework - Discover vulnerabilities before attackers do
PromptXploit is a comprehensive security testing framework for LLM applications. Test your AI systems for vulnerabilities before deployment.
Key Features:
- π― 147 attack vectors across 17 categories
- π§ AI-powered adaptive mode - Learns defenses and crafts new attacks
- π Intelligence-based recon - Novel attack discovery
- π JSON reporting - Detailed vulnerability analysis
- π Framework-agnostic - Works with any LLM
git clone https://github.com/Neural-alchemy/promptxploit
cd promptxploit
pip install -e .# my_target.py
def run(prompt: str) -> str:
# Your LLM here
return your_llm(prompt)python -m promptxploit.main \
--target my_target.py \
--attacks attacks/ \
--output scan.jsonDone! Check scan.json for vulnerabilities.
PromptXploit tests 147 attacks across these categories:
LLM Attack Surface
βββ Prompt Injection (8 variants)
β βββ Direct instruction override
β βββ Context confusion
β βββ Delimiter exploitation
βββ Jailbreaks (10 variants)
β βββ DAN (Do Anything Now)
β βββ Developer mode
β βββ Persona manipulation
βββ System Extraction (8 variants)
β βββ Prompt leakage
β βββ Configuration disclosure
β βββ Training data extraction
βββ Encoding Attacks (8 variants)
β βββ Base64 obfuscation
β βββ ROT13/Caesar
β βββ Unicode tricks
βββ Multi-Agent Exploitation (10 variants)
β βββ Tool hijacking
β βββ Agent confusion
β βββ Coordination attacks
βββ RAG Poisoning (8 variants)
β βββ Context injection
β βββ Retrieval manipulation
β βββ Source confusion
βββ [11 more categories...]
Test with all 147 pre-built attacks:
python -m promptxploit.main \
--mode static \
--target my_app.py \
--attacks attacks/ \
--output results.jsonEvolve attacks if blocked:
python -m promptxploit.main \
--mode adaptive \
--adaptive-strategy mutation \
--adaptive-api "YOUR_OPENAI_KEY" \
--max-iterations 3 \
--target my_app.py \
--attacks attacks/jailbreak \
--output adaptive.jsonIntelligence-based attack crafting:
python -m promptxploit.main \
--mode adaptive \
--adaptive-strategy recon \
--probe-diversity 10 \
--adaptive-api "YOUR_OPENAI_KEY" \
--target my_app.py \
--attacks attacks/ \
--output recon.jsonHow recon works:
- Phase 1: Tests diverse attacks to learn defenses
- Phase 2: AI analyzes patterns and weaknesses
- Phase 3: Crafts brand new attacks tailored to bypass defenses
- Phase 4: Validates crafted attacks
# 1. Test unprotected version
python -m promptxploit.main --target unprotected.py --attacks attacks/ --output before.json
# 2. Review vulnerabilities
cat before.json | jq '.[] | select(.verdict.verdict=="fail")'
# 3. Add PromptShield (our defense framework)
# See: https://github.com/Neural-alchemy/promptshield
# 4. Re-test protected version
python -m promptxploit.main --target protected.py --attacks attacks/ --output after.json
# 5. Verify fixes
diff before.json after.json# Deep scan with adaptive recon
python -m promptxploit.main \
--mode adaptive \
--adaptive-strategy recon \
--probe-diversity 15 \
--max-iterations 5 \
--adaptive-api $OPENAI_KEY \
--target production.py \
--attacks attacks/ \
--output audit_$(date +%Y%m%d).json{
"attack_id": "JB-003",
"category": "jailbreak",
"verdict": {
"verdict": "fail", // β οΈ VULNERABLE
"confidence": 0.9,
"severity": 0.9,
"rationale": "Successfully bypassed restrictions"
},
"risk": {
"risk_score": 0.81,
"risk_level": "critical" // π¨ Fix immediately
}
}Verdict types:
- FAIL = Vulnerable (attack succeeded) π¨
- PARTIAL = Uncertain (needs review)
β οΈ - PASS = Safe (attack blocked) β
Create your own attack patterns:
[
{
"id": "CUSTOM-001",
"category": "my_category",
"description": "My custom attack",
"prompt": "Your attack prompt here"
}
]python -m promptxploit.main --target X --attacks my_attacks.json --output YSee CUSTOM_ATTACKS.md for details.
Perfect combo: Test with PromptXploit β Fix with PromptShield
# Before: Vulnerable
def vulnerable_llm(prompt):
return openai.chat(prompt)
# After: Protected
from promptshield import Shield
shield = Shield(level=5)
def protected_llm(prompt):
check = shield.protect_input(prompt, "context")
if check["blocked"]:
return "Invalid input"
return openai.chat(check["secured_context"])Test again with PromptXploit β Verify 100% protection β
vs. Other Tools:
- β Most comprehensive - 147 attacks (others: ~20)
- β AI-powered adaptive - Unique recon-based intelligence
- β Framework-agnostic - Any LLM (OpenAI, Claude, local, custom)
- β Easy to extend - JSON-based attacks
- β Production-ready - JSON reporting, CI/CD integration
vs. Manual testing:
- β‘ Automated
- π― Comprehensive coverage
- π Consistent methodology
- π Repeatable
- β Test your own applications
- β Authorized penetration testing
- β Security research
- β Unauthorized access
- β Malicious attacks
See DISCLAIMER.md for full ethical guidelines.
- Attack Taxonomy - All 147 attacks explained
- Custom Attacks - Create your own tests
- Responsible Use - Ethical guidelines
- Examples - Usage examples
We welcome contributions! See CONTRIBUTING.md.
Security researchers: Please follow responsible disclosure practices.
MIT License - see LICENSE
@software{promptxploit2024,
title={PromptXploit: LLM Penetration Testing Framework},
author={Neural Alchemy},
year={2024},
url={https://github.com/Neural-alchemy/promptxploit}
}Built by Neural Alchemy
Test with PromptXploit | Protect with PromptShield