Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contribute new test - base64_injection! #19

Merged
merged 14 commits into from
Apr 16, 2024
Merged

Conversation

guy-ps
Copy link
Contributor

@guy-ps guy-ps commented Apr 16, 2024

Overview

This pull request introduces a new test, base64_injection. The purpose of this test is to enhance the robustness of LLMs by evaluating their response to encoded prompt injections. This test specifically addresses the scenario where prompts are encoded as base64 strings, which can be a potential vector for security vulnerabilities if not properly handled by the model.

Changes

  1. New Test Implementation: A new test named base64_injection has been added to our testing suite. This test utilizes a dataset of 203 encoded injection prompts to assess the model's ability to process and respond to base64-encoded inputs without executing unintended actions or exposing sensitive information. The new test implementation can be found in file base64_injection.py
  2. Dataset Integration: The injection prompts are stored in a .parquet file for efficient access and processing. The file is saved in the ps_fuzz/attack_data directory. I've introduced fastparquet as a dependency to facilitate the reading of this file format within our testing framework.
  3. Test Coverage: The test covers a diverse set of base64-encoded injections, ensuring comprehensive coverage across different potential security vulnerabilities in LLMs. The test was added to the attack_loader.py file.

Added Dependencies

fastparquet: This library has been added to efficiently handle reading from the .parquet file containing our prompt injection dataset. I updated setup.py as a result to ensure seamless integration and deployment.

Impact

The introduction of the base64_injection test is expected to significantly improve the security posture of our LLMs by providing a systematic approach to detect and mitigate prompt injection attacks. This will contribute to the overall reliability and trustworthiness of our models in production environments.

Testing

The new test has been integrated into our existing test suite and has been validated for correctness and performance impact. Detailed test results and logs can be found attached to this pull request.

@guy-ps guy-ps requested a review from vitaly-ps April 16, 2024 14:39
@vitaly-ps vitaly-ps merged commit b6ca4e1 into main Apr 16, 2024
2 checks passed
@vitaly-ps vitaly-ps deleted the contribute-new-test branch April 16, 2024 17:16
@vitaly-ps vitaly-ps restored the contribute-new-test branch April 16, 2024 17:16
@lior-ps lior-ps deleted the contribute-new-test branch April 17, 2024 06:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants