prompt.txt
There are seven main types of PII (Personally Identifiable Information):
(1) NAME_STUDENT - The full or partial name of a student, who is not necessarily the author of the essay. This excludes instructors, authors, and other person names.
(2) EMAIL - A student's email address.
(3) USERNAME - A student's username on any platform.
(4) ID_NUM - A number or sequence of characters that could be used to identify a student, such as a student ID or a social security number.
(5) PHONE_NUM - A phone number associated with a student.
(6) URL_PERSONAL - A URL that might be used to identify a student.
(7) STREET_ADDRESS - A full or partial street address that is associated with the student, such as their home address.
Token labels are presented in BIO (Beginning, Inner, Outer) format. The PII type is prefixed with "B-" when the token is the beginning of an entity and with "I-" when the token continues an entity. Tokens that are not PII are labeled "O". Labels therefore look like "B-NAME_STUDENT" or "I-USERNAME", and there are 15 possible labels in total (7 * 2 + 1).
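The label count above can be checked with a minimal Python sketch. The names `PII_TYPES` and `build_label_set` are illustrative assumptions and not part of the original prompt.

```python
# The seven PII types listed in this prompt.
PII_TYPES = [
    "NAME_STUDENT", "EMAIL", "USERNAME", "ID_NUM",
    "PHONE_NUM", "URL_PERSONAL", "STREET_ADDRESS",
]


def build_label_set(pii_types):
    """Return "O" plus a "B-" and an "I-" label for every PII type."""
    labels = ["O"]
    for pii_type in pii_types:
        labels.append(f"B-{pii_type}")
        labels.append(f"I-{pii_type}")
    return labels


LABELS = build_label_set(PII_TYPES)
print(len(LABELS))  # 7 * 2 + 1 = 15
```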
I now need you to generate a complex English text of about 500 words. The text should include the various labels described above, and it should be tokenized as in the example below. Note: commas, periods, and other connecting symbols each count as a separate token.
text example:
Design Thinking for innovation reflexion-Avril 2021-Nathalie Sylla\n\nChallenge & selection\n\nThe tool I use to help all stakeholders finding their way through the complexity of a project is the mind map.\n\nWhat exactly is a mind map?
token example:
["Design","Thinking","for","innovation","reflexion","-","Avril","2021","-","Nathalie","Sylla","\n\n","Challenge","&","selection","\n\n","The","tool","I","use","to","help","all","stakeholders","finding","their","way","through","the","complexity","of","a","project","is","the"," ","mind","map",".","\n\n","What","exactly","is","a","mind","map","?"]
label example:
["O","O","O","O","O","O","O","O","O","B-NAME_STUDENT","I-NAME_STUDENT","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O","O"]
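As a rough illustration of what a well-formed token and label pair looks like, the sketch below checks that both lists have the same length and that every "I-" label continues an entity of the same type. The helper `check_bio` is a hypothetical name, and the short lists are an excerpt of the example above.

```python
def check_bio(tokens, labels):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if len(tokens) != len(labels):
        problems.append(
            f"length mismatch: {len(tokens)} tokens vs {len(labels)} labels"
        )
    previous = "O"
    for index, label in enumerate(labels):
        if label.startswith("I-"):
            entity = label[2:]
            # An "I-" label must follow "B-" or "I-" of the same entity type.
            if previous not in (f"B-{entity}", f"I-{entity}"):
                problems.append(
                    f"label {index}: {label} does not continue an entity"
                )
        previous = label
    return problems


# Excerpt of the example: "- Nathalie Sylla" followed by a paragraph break.
tokens = ["-", "Nathalie", "Sylla", "\n\n"]
labels = ["O", "B-NAME_STUDENT", "I-NAME_STUDENT", "O"]
print(check_bio(tokens, labels))  # prints []
```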
Now generate a completely new text and return its labels. The text should cover diverse topics and include every tag described above, and the PII labels should be spread evenly across the text rather than clustered in one place. Return only the text, tokens, and labels; do not output anything else.
Return format:
text: {}
token: {}
label: {}
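A reply in this format could be read back with a sketch like the following. It assumes the model returns the token and label lists as JSON-style arrays with double-quoted strings, as in the examples above, and that each field starts on its own line. `parse_response` is a hypothetical helper, not part of the original prompt.

```python
import json


def parse_response(response):
    """Split a "text: / token: / label:" reply into its three parts."""
    token_pos = response.rfind("\ntoken:")
    label_pos = response.rfind("\nlabel:")
    if (not response.startswith("text:")
            or token_pos == -1
            or label_pos < token_pos):
        raise ValueError("response does not follow the text/token/label format")
    text = response[len("text:"):token_pos].strip()
    # json.loads tolerates the whitespace around each list.
    tokens = json.loads(response[token_pos + len("\ntoken:"):label_pos])
    labels = json.loads(response[label_pos + len("\nlabel:"):])
    if len(tokens) != len(labels):
        raise ValueError("token and label lists differ in length")
    return text, tokens, labels


# Made-up reply used only to exercise the parser.
reply = (
    "text: Contact Ana Ruiz.\n"
    'token: ["Contact", "Ana", "Ruiz", "."]\n'
    'label: ["O", "B-NAME_STUDENT", "I-NAME_STUDENT", "O"]'
)
text, tokens, labels = parse_response(reply)
print(text)
```

Rejecting replies whose lists differ in length catches a common failure mode before the sample is added to a training set.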