Update index.html
AmanPriyanshu authored Aug 29, 2024
1 parent ebc43ef commit a4385d7
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion index.html
@@ -95,6 +95,13 @@ <h1>FRACTURED-SORRY-Bench</h1>
 <p class="subtitle">Framework for Revealing Attacks in Conversational Turns Undermining Refusal Efficacy and Defenses over SORRY-Bench</p>
 </header>

+<div class="section">
+<h2>Get Started</h2>
+<a href="https://github.com/AmanPriyanshu/FRACTURED-SORRY-Bench/" class="button">View on GitHub</a>
+<a href="https://huggingface.co/datasets/AmanPriyanshu/FRACTURED-SORRY-Bench" class="button">Check out the Dataset</a>
+<a href="https://amanpriyanshu.github.io/FRACTURED-SORRY-Bench/FRACTURED_SORRY_Bench.pdf" class="button">Read the Paper</a>
+</div>
+
 <div class="section">
 <h2>About FRACTURED-SORRY-Bench</h2>
 <p>FRACTURED-SORRY-Bench is a framework for evaluating the safety of Large Language Models (LLMs) against <span class="highlight">multi-turn conversational attacks</span>. Building upon the SORRY-Bench dataset, we propose a simple yet effective method for generating adversarial prompts by breaking down harmful queries into seemingly innocuous sub-questions.</p>
@@ -206,8 +213,9 @@ <h3>Decomposed Responses</h3>
 </div>

 <div class="section">
-<h2>Get Started</h2>
+<h2>Important Links</h2>
 <a href="https://github.com/AmanPriyanshu/FRACTURED-SORRY-Bench/" class="button">View on GitHub</a>
 <a href="https://huggingface.co/datasets/AmanPriyanshu/FRACTURED-SORRY-Bench" class="button">Check out the Dataset</a>
+<a href="https://amanpriyanshu.github.io/FRACTURED-SORRY-Bench/FRACTURED_SORRY_Bench.pdf" class="button">Read the Paper</a>
 </div>
