PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022
machine-learning agi language-models ai-safety adversarial-attacks ai-alignment ml-safety gpt-3 large-language-models prompt-engineering chain-of-thought agi-alignment
-
Updated
Feb 26, 2024 - Python