jailbreaking
Here are 32 public repositories matching this topic...
Frida script to bypass the iOS application Jailbreak Detection
-
Updated
Mar 5, 2019 - JavaScript
Does Refusal Training in LLMs Generalize to the Past Tense? [NeurIPS 2024 Safe Generative AI Workshop (Oral)]
-
Updated
Oct 13, 2024 - Python
An extensive prompt to make a friendly persona from a chatbot-like model like ChatGPT
-
Updated
Apr 19, 2023
Security Kit is a lightweight framework that helps to achieve a security layer
-
Updated
Sep 9, 2023 - Swift
During the Development of Suave7 and it's Predecessors, we've created a lot of Icons and UI-Images and we would like to share them with you. The Theme Developer Kit contains nearly 5.600 Icons, more than 380 Photoshop-Templates and 100 Pixelmator-Documents. With this Package you can customize every App from the App Store …
-
Updated
Oct 16, 2024
iOS APT distribution repository for rootful and rootless jailbreaks
-
Updated
Dec 13, 2024 - JavaScript
Customizable Dark Mode Extension for iOS 13+
-
Updated
Nov 10, 2020 - Logos
Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety and robustness (jailbreaking, oversight, uncertainty), representations, interpretability (circuits), etc.
-
Updated
Dec 18, 2024
Source code for bypass tweaks hosted under https://github.com/hekatos/repo. Licensed under 0BSD except submodules
-
Updated
Feb 17, 2022 - Logos
LV-Crew.org_(LVC)_-_Howto_-_iPhones
-
Updated
Jul 6, 2017
This repository contains the code for the paper "Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks" by Abhinav Rao, Sachin Vashishta*, Atharva Naik*, Somak Aditya, and Monojit Choudhury, accepted at LREC-CoLING 2024
-
Updated
May 22, 2024 - Jupyter Notebook
Your best llm security paper library
-
Updated
Sep 18, 2024
Updater script for iOS-OTA-Downgrader.
-
Updated
Mar 23, 2023 - Shell
HITC reborn: faster, better and prettier
-
Updated
Jan 12, 2021 - HTML
"ChatGPT Evil Confidant Mode" delves into a controversial and unethical use of AI, highlighting how specific prompts can generate harmful and malicious responses from ChatGPT.
-
Updated
Jun 7, 2024
Improve this page
Add a description, image, and links to the jailbreaking topic page so that developers can more easily learn about it.
Add this topic to your repo
To associate your repository with the jailbreaking topic, visit your repo's landing page and select "manage topics."