Update references.bib
kristiankersting committed Nov 12, 2023
1 parent e0d26f8 commit dd7bc00
Showing 1 changed file with 1 addition and 1 deletion.
references.bib: 2 changes (1 addition & 1 deletion)
@@ -281,7 +281,7 @@ @inproceedings{brack2023distilling
   Anote = {./images/brack2023distilling.png},
   title={Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge},
   author={Manuel Brack and Patrick Schramowski and Kristian Kersting},
-  booktitle = {Working Notes of the AACL Workshop on the ART of Safety: Workshop on Adversarial testing and Red-Teaming for generative AI)},
+  booktitle = {Working Notes of the AACL Workshop on the ART of Safety(ARTS): Workshop on Adversarial testing and Red-Teaming for generative AI)},
   OPTbooktitle = {Proceedings of the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 13th International Joint Conference on Natural Language Processing (IJCNLP-AACL)},
   year = {2023},
   Note = {Text-conditioned image generation models have recently achieved astonishing image quality and alignment results. Consequently, they are employed in a fast-growing number of applications. Since they are highly data-driven, relying on billion-sized datasets randomly scraped from the web, they also produce unsafe content. As a contribution to the Adversarial Nibbler challenge, we distill a large set of over 1,000 potential adversarial inputs from existing safety benchmarks. Our analysis of the gathered prompts and corresponding images demonstrates the fragility of input filters and provides further insights into systematic safety issues in current generative image models.},
