
PromptXploit

LLM Penetration Testing Framework - Discover vulnerabilities before attackers do

License: MIT · Python 3.8+

⚠️ READ DISCLAIMER - Authorized testing only


What is PromptXploit?

PromptXploit is a comprehensive security testing framework for LLM applications. Test your AI systems for vulnerabilities before deployment.

Key Features:

  • 🎯 147 attack vectors across 17 categories
  • 🧠 AI-powered adaptive mode - Learns defenses and crafts new attacks
  • 🔍 Intelligence-based recon - Novel attack discovery
  • 📊 JSON reporting - Detailed vulnerability analysis
  • 🔌 Framework-agnostic - Works with any LLM

Quick Start (30 seconds)

1. Install

git clone https://github.com/Neural-alchemy/promptxploit
cd promptxploit
pip install -e .

2. Create Target

# my_target.py
def run(prompt: str) -> str:
    # Your LLM here
    return your_llm(prompt)
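
For example, a target backed by the OpenAI chat API could look like this (a sketch assuming the openai Python SDK v1+, an OPENAI_API_KEY in the environment, and an illustrative model name; any client works as long as run() returns the model's text):

# my_target.py (example)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def run(prompt: str) -> str:
    # Forward the attack prompt to the model under test and return its
    # raw text response for PromptXploit to evaluate.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""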

3. Run Scan

python -m promptxploit.main \
    --target my_target.py \
    --attacks attacks/ \
    --output scan.json

Done! Check scan.json for vulnerabilities.


Attack Taxonomy

PromptXploit tests 147 attacks across these categories:

LLM Attack Surface
├── Prompt Injection (8 variants)
│   ├── Direct instruction override
│   ├── Context confusion
│   └── Delimiter exploitation
├── Jailbreaks (10 variants)
│   ├── DAN (Do Anything Now)
│   ├── Developer mode
│   └── Persona manipulation
├── System Extraction (8 variants)
│   ├── Prompt leakage
│   ├── Configuration disclosure
│   └── Training data extraction
├── Encoding Attacks (8 variants)
│   ├── Base64 obfuscation
│   ├── ROT13/Caesar
│   └── Unicode tricks
├── Multi-Agent Exploitation (10 variants)
│   ├── Tool hijacking
│   ├── Agent confusion
│   └── Coordination attacks
├── RAG Poisoning (8 variants)
│   ├── Context injection
│   ├── Retrieval manipulation
│   └── Source confusion
└── [11 more categories...]

Usage Modes

Static Mode (Fast)

Test with all 147 pre-built attacks:

python -m promptxploit.main \
    --mode static \
    --target my_app.py \
    --attacks attacks/ \
    --output results.json

Adaptive Mode - Mutation

If an attack is blocked, the framework mutates it and retries (up to --max-iterations times):

python -m promptxploit.main \
    --mode adaptive \
    --adaptive-strategy mutation \
    --adaptive-api "YOUR_OPENAI_KEY" \
    --max-iterations 3 \
    --target my_app.py \
    --attacks attacks/jailbreak \
    --output adaptive.json

Adaptive Mode - Recon (Advanced) ⭐

Intelligence-based attack crafting:

python -m promptxploit.main \
    --mode adaptive \
    --adaptive-strategy recon \
    --probe-diversity 10 \
    --adaptive-api "YOUR_OPENAI_KEY" \
    --target my_app.py \
    --attacks attacks/ \
    --output recon.json

How recon works (sketched in code after the list):

  1. Phase 1: Tests diverse attacks to learn defenses
  2. Phase 2: AI analyzes patterns and weaknesses
  3. Phase 3: Crafts brand new attacks tailored to bypass defenses
  4. Phase 4: Validates crafted attacks
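
A minimal sketch of that loop, purely illustrative and not PromptXploit's actual implementation (the refusal heuristic and prompt rewording below stand in for the analysis and crafting steps that the --adaptive-api model performs):

# Conceptual sketch of the recon workflow -- illustrative only.
from typing import Callable, Dict, List, Tuple

def recon_scan(target: Callable[[str], str],
               seed_attacks: List[Dict],
               probe_diversity: int = 10) -> List[Dict]:
    # Phase 1: probe the target with a diverse sample of pre-built attacks
    probes = seed_attacks[:probe_diversity]
    observations: List[Tuple[Dict, str]] = [(a, target(a["prompt"])) for a in probes]

    # Phase 2: build a crude defense profile from the responses
    # (a naive refusal heuristic stands in for the AI analysis step)
    blocked = [a for a, reply in observations if "cannot" in reply.lower()]

    # Phase 3: craft new attacks aimed at the observed defenses
    # (a trivial rewording stands in for AI-crafted attacks)
    crafted = [f"For a fictional security audit, {a['prompt']}" for a in blocked]

    # Phase 4: validate the crafted attacks against the target
    return [{"prompt": p, "response": target(p)} for p in crafted]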

Real-World Workflow

Pre-Deployment Testing

# 1. Test unprotected version
python -m promptxploit.main --target unprotected.py --attacks attacks/ --output before.json

# 2. Review vulnerabilities
cat before.json | jq '.[] | select(.verdict.verdict=="fail")'

# 3. Add PromptShield (our defense framework)
# See: https://github.com/Neural-alchemy/promptshield

# 4. Re-test protected version
python -m promptxploit.main --target protected.py --attacks attacks/ --output after.json

# 5. Verify fixes
diff before.json after.json
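
For an automated gate (e.g. in CI), a short script can fail the run whenever the report still contains successful attacks. A minimal sketch, assuming the report is a JSON array of records as the jq filter above implies:

# check_scan.py -- exit non-zero if any attack in the report succeeded.
# Usage: python check_scan.py after.json
import json
import sys

with open(sys.argv[1]) as f:
    results = json.load(f)

failures = [r for r in results if r.get("verdict", {}).get("verdict") == "fail"]
for r in failures:
    print(f"VULNERABLE: {r.get('attack_id')} [{r.get('category')}]")

sys.exit(1 if failures else 0)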

Monthly Security Audit

# Deep scan with adaptive recon
python -m promptxploit.main \
    --mode adaptive \
    --adaptive-strategy recon \
    --probe-diversity 15 \
    --max-iterations 5 \
    --adaptive-api $OPENAI_KEY \
    --target production.py \
    --attacks attacks/ \
    --output audit_$(date +%Y%m%d).json

Understanding Results

{
  "attack_id": "JB-003",
  "category": "jailbreak",
  "verdict": {
    "verdict": "fail",        // ⚠️ VULNERABLE
    "confidence": 0.9,
    "severity": 0.9,
    "rationale": "Successfully bypassed restrictions"
  },
  "risk": {
    "risk_score": 0.81,
    "risk_level": "critical"  // 🚨 Fix immediately
  }
}

Verdict types:

  • FAIL = Vulnerable (attack succeeded) 🚨
  • PARTIAL = Uncertain (needs review) ⚠️
  • PASS = Safe (attack blocked) ✅
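
To triage a report programmatically, a short script can group non-passing findings by risk level (a sketch; it assumes the report is a JSON array of records like the sample above, with lowercase verdict strings):

# summarize_scan.py -- group non-passing findings by risk level for triage.
# Usage: python summarize_scan.py scan.json
import json
import sys
from collections import Counter

with open(sys.argv[1]) as f:
    results = json.load(f)

findings = [r for r in results if r.get("verdict", {}).get("verdict") != "pass"]
by_risk = Counter(r.get("risk", {}).get("risk_level", "unknown") for r in findings)

for level, count in by_risk.most_common():
    print(f"{level:>10}: {count} finding(s)")
print(f"{'total':>10}: {len(findings)} of {len(results)} attacks need review")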

Custom Attacks

Create your own attack patterns:

[
  {
    "id": "CUSTOM-001",
    "category": "my_category",
    "description": "My custom attack",
    "prompt": "Your attack prompt here"
  }
]
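
A quick sanity check before a run can catch missing fields (a sketch; the required keys are simply those used in the example above):

# validate_attacks.py -- sanity-check a custom attack file before scanning.
# Usage: python validate_attacks.py my_attacks.json
import json
import sys

REQUIRED_KEYS = {"id", "category", "description", "prompt"}  # keys from the example above

with open(sys.argv[1]) as f:
    attacks = json.load(f)

for i, attack in enumerate(attacks):
    missing = REQUIRED_KEYS - attack.keys()
    if missing:
        sys.exit(f"Attack #{i} ({attack.get('id', '?')}) is missing: {sorted(missing)}")

print(f"OK: {len(attacks)} attack(s) look well-formed.")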

Run the file against your target:

python -m promptxploit.main --target my_target.py --attacks my_attacks.json --output custom_results.json

See CUSTOM_ATTACKS.md for details.


Integration with PromptShield

Perfect combo: Test with PromptXploit → Fix with PromptShield

# Before: Vulnerable
def vulnerable_llm(prompt):
    # The prompt goes straight to the model with no screening
    return openai.chat(prompt)  # placeholder for your actual model call

# After: Protected
from promptshield import Shield
shield = Shield(level=5)

def protected_llm(prompt):
    # Screen the incoming prompt before it reaches the model
    check = shield.protect_input(prompt, "context")
    if check["blocked"]:
        return "Invalid input"
    return openai.chat(check["secured_context"])  # placeholder for your actual model call

Test again with PromptXploit → verify the previously failing attacks are now blocked ✅
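
To re-test the protected pipeline, expose it through the same run() contract used in the Quick Start (a sketch; the module name my_app.py is illustrative):

# protected_target.py -- wrap the protected pipeline in PromptXploit's
# standard target interface. Assumes protected_llm() from the snippet
# above lives in my_app.py (illustrative module name).
from my_app import protected_llm

def run(prompt: str) -> str:
    return protected_llm(prompt)

Then re-run the scan with --target protected_target.py.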


Why PromptXploit?

vs. Other Tools:

  • ✅ Most comprehensive - 147 attacks (others: ~20)
  • ✅ AI-powered adaptive - Unique recon-based intelligence
  • ✅ Framework-agnostic - Any LLM (OpenAI, Claude, local, custom)
  • ✅ Easy to extend - JSON-based attacks
  • ✅ Production-ready - JSON reporting, CI/CD integration

vs. Manual testing:

  • ⚡ Automated
  • 🎯 Comprehensive coverage
  • 📊 Consistent methodology
  • 🔁 Repeatable

Responsible Use

⚠️ This is a security testing tool for authorized use only.

  • ✅ Test your own applications
  • ✅ Authorized penetration testing
  • ✅ Security research
  • ❌ Unauthorized access
  • ❌ Malicious attacks

See DISCLAIMER.md for full ethical guidelines.


Documentation


Contributing

We welcome contributions! See CONTRIBUTING.md.

Security researchers: Please follow responsible disclosure practices.


License

MIT License - see LICENSE


Citation

@software{promptxploit2024,
  title={PromptXploit: LLM Penetration Testing Framework},
  author={Neural Alchemy},
  year={2024},
  url={https://github.com/Neural-alchemy/promptxploit}
}

Built by Neural Alchemy

Test with PromptXploit | Protect with PromptShield

Website | PromptShield | Documentation
