Skip to content

"Universal AI security framework - Protect LLM applications from prompt injection, jailbreaks, and adversarial attacks. Works with OpenAI, Anthropic, LangChain, and any LLM."

License

Notifications You must be signed in to change notification settings

Neural-alchemy/promptshield

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

PromptShield

Universal AI Security Framework - Protect LLM applications from prompt injection and adversarial attacks

License: MIT Python 3.8+


What is PromptShield?

PromptShield is a lightweight security framework that protects AI applications from:

  • 🚫 Prompt injection attacks
  • πŸ”“ Jailbreak attempts
  • πŸ“€ System prompt extraction
  • πŸ” PII leakage
  • 🎭 Dozens of attack variants

Key Features:

  • ⚑ Fast: Pattern matching in ~0.1ms (semantic mode: ~20-30ms)
  • πŸ”Œ Framework-agnostic: Works with any LLM (OpenAI, Anthropic, local models)
  • 🎯 Simple: 3 lines of code to integrate
  • πŸ›‘οΈ Comprehensive: Multiple attack categories + semantic generalization

Installation

# Install from source (PyPI package coming soon)
git clone https://github.com/Neural-alchemy/promptshield
cd promptshield
pip install -e .

Quick Start

from promptshield import Shield

# Initialize shield
shield = Shield(level=5)  # Production security

# Protect your LLM
def safe_llm(user_input: str):
    # 1. Validate input
    result = shield.protect_input(
        user_input=user_input,
        system_context="You are a helpful AI"
    )
    
    if result["blocked"]:
        return "⚠️ Security issue detected"
    
    # 2. Safe LLM call
    response = your_llm(result["secured_context"])
    
    # 3. Sanitize output
    output = shield.protect_output(response, result["metadata"])
    
    return output["safe_response"]

That's it! Your AI is now protected.


Examples

OpenAI

from openai import OpenAI
from promptshield import Shield

client = OpenAI()
shield = Shield(level=5)

def secure_chat(prompt: str):
    check = shield.protect_input(prompt, "GPT Assistant")
    if check["blocked"]:
        return "Blocked"
    
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": check["secured_context"]}]
    )
    
    output = shield.protect_output(
        response.choices[0].message.content,
        check["metadata"]
    )
    return output["safe_response"]

LangChain

from langchain.llms import OpenAI
from promptshield import Shield

llm = OpenAI()
shield = Shield(level=5)

def secure_chain(query: str):
    check = shield.protect_input(query, "Assistant")
    if check["blocked"]:
        return "Blocked"
    
    result = llm(check["secured_context"])
    output = shield.protect_output(result, check["metadata"])
    return output["safe_response"]

Anthropic Claude

import anthropic
from promptshield import Shield

client = anthropic.Anthropic()
shield = Shield(level=5)

def secure_claude(prompt: str):
    check = shield.protect_input(prompt, "Claude")
    if check["blocked"]:
        return "Blocked"
    
    message = client.messages.create(
        model="claude-3-opus-20240229",
        messages=[{"role": "user", "content": check["secured_context"]}]
    )
    
    output = shield.protect_output(message.content[0].text, check["metadata"])
    return output["safe_response"]

See examples/ for more integrations.


Security Levels

Choose the right level for your needs:

Level Protection Latency Use Case
L3 Pattern-based ~0.1ms Fast, pattern matching only
L5 Production ~0.1-30ms Recommended ⭐ Pattern + semantic (if enabled)
Shield(level=3)  # Fast pattern-only protection
Shield(level=5)  # Production (pattern + optional semantic)

Performance breakdown:

  • Pattern matching: ~0.1ms
  • Semantic matching (optional): +20-30ms
  • PII detection: +1-5ms
  • Output sanitization: ~1-2ms

Attack Protection

PromptShield detects and blocks:

  • Prompt injection ("Ignore all previous instructions")
  • Jailbreaks ("You are DAN, an AI without restrictions")
  • System prompt extraction ("What are your instructions?")
  • PII leakage (emails, SSNs, credit cards)
  • Encoding attacks (base64, ROT13, unicode)
  • Context manipulation
  • Output manipulation
  • And 40+ more attack types

Performance

Pattern-only mode (L3):

  • Latency: ~0.1ms per check
  • Throughput: 10,000+ req/s
  • Memory: <5MB

Production mode (L5):

  • Pattern matching: ~0.1ms
  • Semantic (if enabled): +20-30ms
  • Total: ~0.1-30ms depending on features
  • Memory: <10MB (or +500MB if semantic models loaded)

Honest benchmarks: Pattern matching is extremely fast. Semantic matching adds latency but improves detection. Choose based on your latency requirements.


Documentation


Why PromptShield?

vs. LLM Guard

  • ⚑ 10x faster (0.05ms vs 0.5ms)
  • πŸ”Œ Framework-agnostic (they're FastAPI-only)

vs. Guardrails AI

  • 🎯 Attack-focused (they're validation-focused)
  • πŸš€ Simpler (3 lines vs complex schemas)

vs. DIY Solutions

  • βœ… Battle-tested (51 attack patterns)
  • ⚑ Optimized (<0.1ms latency)
  • πŸ”„ Maintained (regular updates)

Contributing

We welcome contributions! See CONTRIBUTING.md.


License

MIT License - see LICENSE


Citation

@software{promptshield2024,
  title={PromptShield: Universal AI Security Framework},
  author={Neural Alchemy},
  year={2024},
  url={https://github.com/neuralalchemy/promptshield}
}

About

"Universal AI security framework - Protect LLM applications from prompt injection, jailbreaks, and adversarial attacks. Works with OpenAI, Anthropic, LangChain, and any LLM."

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages