Commit

Website update
rand0musername committed Jun 24, 2024
1 parent 917fa09 commit 2f9c9ab
Showing 2 changed files with 14 additions and 14 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -1 +1,3 @@
.history/
.history/
.jekyll-cache/
_site/
24 changes: 11 additions & 13 deletions index.html
@@ -74,7 +74,7 @@ <h2 class="title publication-title"><a href=""><img class="logo" src="static/ima
<span class="icon">
<i class="fas fa-file-pdf"></i>
</span>
<span>Paper</span>
<span>Paper (ICML 2024)</span>
</a>
</span>

@@ -211,8 +211,8 @@ <h2 class="title is-3">What are <span class="devilish">spoofing attacks</span>?<
aligned to refuse any harmful prompts. We show some <a href="#examples">examples</a> below.
<li> In our <a
href="https://files.sri.inf.ethz.ch/website/papers/jovanovic2024watermarkstealing.pdf">experiments</a>
we additionally demonstrate similar success across several other schemes, study how our attack scales
with query cost, and show success in the setting where the attacker paraphrases existing (<span
we additionally demonstrate similar success across several other schemes and experimental settings, study how our attack
scales with query cost, and show success in the setting where the attacker paraphrases existing (<span
class="red">non-watermarked</span>) text.
</ul>
</p>
@@ -230,8 +230,8 @@ <h2 class="title is-3">What are <span class="devilish">scrubbing attacks</span>?
<img class="figure-scrubbing" src="static/images/scrubbing.png" alt="Scrubbing" class="">
<div class="content has-text-centered has-text-justified">
<div class="caption">
Our stealing attacker can also strip the watermark from LLM outputs even in challenging settings (85%
success rate, 1% before our work), concealing misuse such as plagiarism.
Our attacker can also strip the watermark from LLM outputs even in challenging settings (>80%
success, below 25% before our work), concealing misuse such as plagiarism.
</div>
<p>
<ul>
@@ -246,10 +246,10 @@ <h2 class="title is-3">What are <span class="devilish">scrubbing attacks</span>?
<li> We show that this is not the case under the threat of watermark stealing. Our attacker can apply its
partial knowledge of the watermark rules (<img class="intext"
src="static/images/intext/knowledge_full.png">) to significantly boost the success rate of scrubbing
on long texts with no need for additional queries to the server. Notably, we boost scrubbing success
<b>from 1% to 85%</b> for the <a href="https://arxiv.org/abs/2306.04634"
class="textsc">KGW2-SelfHash</a> scheme. Similar results are obtained for several other schemes, as we
show in our experimental evaluation in <a
on long texts with no need for additional queries to the server. Notably, we boost the scrubbing success of a popular
paraphraser <b>from 1% to 85%</b> for the <a href="https://arxiv.org/abs/2306.04634"
class="textsc">KGW2-SelfHash</a> scheme. The best baseline we are aware of achieves <b>below 25%</b>.
Similar results are obtained for several other schemes, as we show in our experimental evaluation in <a
href="https://files.sri.inf.ethz.ch/website/papers/jovanovic2024watermarkstealing.pdf">the paper</a>.
Below, we also show several <a href="#examples">examples</a>.
<li> Our results challenge the common belief that robustness to <span class="devilish">spoofing
@@ -522,10 +522,8 @@ <h2 class="title is-5">Citation</h2>
<pre id="BibTeX">@article{jovanovic2024watermarkstealing,
title = {Watermark Stealing in Large Language Models},
author = {Jovanović, Nikola and Staab, Robin and Vechev, Martin},
year = {2024},
eprint={2402.19361},
archivePrefix={arXiv},
primaryClass={cs.LG}
journal = {{ICML}},
year = {2024}
}</pre>
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">