This corpus of questions and answers is made available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0). Inquiries about commercial licensing should be directed to Prof. Norman Sadeh (sadeh@cs.cmu.edu). If you use the corpus (i.e., the questions, the generated answers, or both) in a publication, you must cite the following paper:
Lea Duesterwald, Ian Yang, Norman Sadeh, "Can a Cybersecurity Question Answering Assistant Help Change User Behavior? An In Situ Study", The Symposium on Usable Security and Privacy (USEC 2025), Feb 2025.
The above paper provides important context for understanding the contents of the corpus and how the questions and answers were collected. The full paper is available at www.usableprivacy.org/static/files/duesterwald_2024.pdf
This corpus contains 1,045 questions about everyday cybersecurity that were asked by participants in an in situ user study. Each question has two answers automatically generated by GPT-4: one answer with extra prompt engineering (for specific details about the prompt engineering, see the paper above) and one answer without extra prompt engineering. The data is formatted as a .csv file (security_qa_questions_and_answers.csv) with three columns: question_asked, answer_with_prompt_engineering, and answer_no_prompt_engineering.
- question_asked: the original question asked by the study participant.
- answer_with_prompt_engineering: the answer generated by GPT-4 when extra prompt engineering was applied to the question.
- answer_no_prompt_engineering: the answer generated by GPT-4 without extra prompt engineering applied to the question.
Each row of the .csv file represents one question and its corresponding answers, both with and without extra prompt engineering.
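As a minimal sketch of how the file can be loaded, the snippet below uses pandas and assumes the .csv file is in the current working directory; the column names are those listed above.

    import pandas as pd

    # Load the corpus; assumes security_qa_questions_and_answers.csv is in the working directory.
    df = pd.read_csv("security_qa_questions_and_answers.csv")

    # Each row pairs one participant question with its two GPT-4 answers.
    for _, row in df.head(3).iterrows():
        print(row["question_asked"])
        print(row["answer_with_prompt_engineering"])
        print(row["answer_no_prompt_engineering"])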
This research has been supported in part by grants from the National Science Foundation under the SaTC program (grant CNS-1914486) and under the REU program, the latter in part through CMU's RE-USE Program (NSF grant 2150217). Additional support was also provided by CMU's Block Center under its Responsible AI initiative.