# rag_prompt_template.py
relation_list = [
    "temporally follows",
    "after",
    "due to",
    "has realization",
    "associated with",
    "has definitional manifestation",
    "associated finding",
    "associated aetiologic finding",
    "associated etiologic finding",
    "interprets",
    "associated morphology",
    "causative agent",
    "course",
    "finding site",
    "temporally related to",
    "pathological process",
    "direct morphology",
    "is modification of",
    "measures",
    "direct substance",
    "has active ingredient",
    "using",
    "part of",
]
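# A minimal sketch, not part of the original module: the prompts below spell out
# the relation list by hand. Assuming that text is meant to match relation_list,
# a hypothetical helper such as format_relation_list could render it from any
# list of relation names.
def format_relation_list(relations):
    """Return the relations as a bracketed, comma-separated string."""
    return "[" + ", ".join(relations) + "]"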
# ======================================
# snomed concepts extraction zbeta
# ======================================
snomed_extraction_prompt_zbeta = """\
Extract the most likely concepts with type from the given context with the format of "(concept; type)".
Here is the context: {text}.
concepts:

Note: Only the SNOMED CT concepts are allowed; Compound phrase first; Remove repetition; Only output concepts and types.
"""
# ======================================
# snomed concepts extraction
# ======================================
snomed_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the SNOMED CT triplets from the given context with the format of (concept ; is ; type).
Here is the optional type list: [disorder, clinical finding, substance, morphologically abnormal structures, organism].
The steps are as follows:
1. extract the concept from the given context sentence, using the retrieved sub-graph.
2. select the most likely type from the list for the extracted concept.
3. output the triplets in the format of (concept ; is ; type) strictly.

triplets:

Note: Only output the triplets.
"""
# ======================================
# snomed relation extraction
# ======================================
snomed_relation_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the SNOMED CT triples from the given context with the format of (concept 1 ; relation ; concept 2).
Here is the optional relation list: [temporally follows, after, due to, has realization, associated with, has definitional manifestation,
associated finding, associated aetiologic finding, associated etiologic finding, interprets, associated morphology, causative agent, course,
finding site, temporally related to, pathological process, direct morphology, is modification of, measures, direct substance, has active ingredient, using, part of].
The steps are as follows:
1. extract the concept 1 and concept 2 from the given context sentence, using the retrieved sub-graph.
2. select ONE most likely relation from the list for the extracted concepts.
3. output the triples in the format of (concept 1 ; relation ; concept 2) strictly.

Provide your answer as follows:
Answer:::
Triples: (The extracted triples)
Answer End:::
You MUST provide values for 'Triples:' in your answer.
"""
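# A minimal sketch, not part of the original module: one way to read back a model
# reply that follows the "Answer::: ... Answer End:::" layout requested in the
# prompt above. parse_triples is a hypothetical name, and the regular expressions
# assume that concepts and relations contain no parentheses or semicolons.
import re


def parse_triples(answer):
    """Return (concept 1, relation, concept 2) tuples found in a reply."""
    # Keep only the text between the two markers when both are present.
    block = re.search(r"Answer:::(.*?)Answer End:::", answer, re.DOTALL)
    body = block.group(1) if block else answer
    # Match "(a ; b ; c)" and strip the whitespace around each field.
    pattern = r"\(([^();]+);([^();]+);([^();]+)\)"
    return [tuple(part.strip() for part in m) for m in re.findall(pattern, body)]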
# ======================================
# snomed description generation
# ======================================
snomed_description_generation_prompt = """\
Here is the context: {text}.
Here is the optional relation list: [temporally follows, after, due to, has realization, associated with, has definitional manifestation,
associated finding, associated aetiologic finding, associated etiologic finding, interprets, associated morphology, causative agent, course,
finding site, temporally related to, pathological process, direct morphology, is modification of, measures, direct substance, has active ingredient, using, part of].
Task: Generate the SNOMED CT descriptions for the given concept.
The steps are as follows:
1. extract the concept 1 from the given context sentence, using the retrieved sub-graph.
2. generate the concept 2 that can describe the concept 1, and select ONE most likely relation from the list for the concept 1.
3. output (concept 1 ; relation ; concept 2) strictly as one generated description.
4. Each extracted concept could have multiple descriptions.
Provide your answer as follows:
Answer:::
[Extracted Concept] (The generated description) (The generated description)
Answer End:::
You MUST provide values for 'Extracted Concept' and 'The generated description' in your answer.
"""
# ======================================
# BC5CDR entity-type extraction
# ======================================
BC5CDR_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the entity-type pairs from the given context with the format of (entity ; type).
Here is the type list: [Disorder, Substance].
The steps are as follows:
1. extract the entity from the given context abstract, using the retrieved sub-graph.
2. select ONE most likely type from the list for the extracted entity.
3. output the pairs in the format of (entity ; type) strictly.
4. repeat the step 1 to step 3.

Provide your answer as follows:
Answer:::
Pairs: (All extracted pairs)
Answer End:::
Requirements:
You MUST provide values for 'Pairs:' in your answer.
ONLY use the type in the type list: [Disorder, Substance].
Extract as many valid entity-type pairs as possible from the given context abstract.
"""
# ======================================
# BC5CDR entity-type extraction with additional entities
# ======================================
BC5CDR_extraction_prompt_with_entities = """\
You are a medical professional working in a hospital. You have been given a medical abstract, a list of entities, and a list of types. Your task is to link the entities to the most likely type from the type list.
Here is the abstract: {text}.
Here is the type list: [Disorder, Substance].
Here is the list of entities for consideration: {entities}.
Task: link the entity and the type and output entity-type pairs with the format of (entity ; type).
The steps are as follows:
1. for each entity in {entities}, link it to the most likely type from the type list. if you cannot find a suitable type, ignore the entity.
2. if you find more entities in the abstract, extract them and link them to the most likely type.
3. output the pairs in the format of (entity ; type) strictly.

Provide your answer as follows:
Answer:::
Pairs: (entity ; type)
Answer End:::
Requirements:
You MUST provide values for 'Pairs:' in your answer.
ONLY use the type in the type list: [Disorder, Substance].
ONLY output valid entity-type pairs without any reasoning.
"""
# ======================================
# NCBIdevelopset entity-type extraction
# ======================================
NCBIdevelopset_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the entity-type pairs from the given context with the format of (entity ; type).
Here is the type list: ['DiseaseClass', 'SpecificDisease', 'Modifier', 'CompositeMention'].
The steps are as follows:
1. extract the entity from the given context abstract, using the retrieved sub-graph.
2. select ONE most likely type from the list for the extracted entity.
3. output the pairs in the format of (entity ; type) strictly.
4. repeat the step 1 to step 3.

Provide your answer as follows:
Answer:::
Pairs: (All extracted pairs)
Answer End:::
Requirements:
You MUST provide values for 'Pairs:' in your answer.
ONLY use the type in the type list: ['DiseaseClass', 'SpecificDisease', 'Modifier', 'CompositeMention'].
Extract as many valid entity-type pairs as possible from the given context abstract.
"""
# ======================================
# MIMIC-IV snomed entity extraction
# ======================================
MIMICIV_entity_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the SNOMED CT concepts from the given context.
The steps are as follows:
1. extract the concepts from the given context sentence, using the retrieved triplets.
2. there may be abbreviations or acronyms in the context, extract them as concepts as well if they are related to the concepts.
3. output the concepts in a list [] strictly, each concept is separated by a comma.

Provide your answer as follows:
Answer:::
Concepts: []
Answer End:::
Requirements:
You MUST provide values for 'Concepts:' in your answer.
ONLY extract concepts, DO NOT include the type of the concept, reasoning, or any other information.
DO NOT include mark numbers or ordinal numbers in your answer.
Extract as many unique concepts as possible from the given context.
"""
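# A minimal sketch, not part of the original module: one way to read the
# "Concepts: [...]" line that the prompt above asks for. parse_concepts is a
# hypothetical name; the sketch assumes concepts are comma-separated and contain
# no square brackets, and it drops repeats while keeping first-seen order.
import re


def parse_concepts(answer):
    """Return the unique concepts listed after 'Concepts:' in a reply."""
    match = re.search(r"Concepts:\s*\[(.*?)\]", answer, re.DOTALL)
    if match is None:
        return []
    seen = []
    for item in match.group(1).split(","):
        # Remove surrounding whitespace and any quotes the model may add.
        concept = item.strip().strip("'\"")
        if concept and concept not in seen:
            seen.append(concept)
    return seen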
# ======================================
# MIMIC-IV entity-type extraction
# ======================================
MIMICIV_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the entity-type pairs from the given context with the format of (entity ; type).
Here is the type list: [finding, disorder, procedure, regime/therapy, morphologic abnormality, body structure, cell structure].
The steps are as follows:
1. extract the entity from the given context, using the retrieved sub-graph.
2. select ONE most likely type from the list for the extracted entity.
3. output the pairs in the format of (entity ; type) strictly.
4. repeat the step 1 to step 3.

Provide your answer as follows:
Answer:::
Pairs: (All extracted pairs)
Answer End:::

Requirements:
You MUST provide values for 'Pairs:' in your answer.
output the pairs in the format of (entity ; type) strictly.
"""
# ======================================
# MIMIC-IV entity-type extraction with additional entities
# ======================================
MIMICIV_extraction_prompt_with_entities = """\
You are a medical professional working in a hospital. You have been given a discharge note, a list of entities, and a list of types. Your task is to link the entities to the most likely type from the type list.
Here is the discharge note: {text}.
Here is the type list: [finding, disorder, procedure, regime/therapy, morphologic abnormality, body structure, cell structure].
Here is the list of entities for consideration: {entities}.
Task: link the entity and the type and output entity-type pairs with the format of (entity ; type).
The steps are as follows:
1. for each entity in {entities}, link it to the most likely type from the type list. if you cannot find a suitable type, ignore the entity.
2. if you find more entities in the discharge note, extract them and link them to the most likely type.
3. output the pairs in the format of (entity ; type) strictly.

Provide your answer as follows:
Answer:::
Pairs: (entity ; type)
Answer End:::
Requirements:
You MUST provide values for 'Pairs:' in your answer.
ONLY output valid entity-type pairs without any reasoning.
"""
# ======================================
# Pubmed snomed triple extraction
# ======================================
Pubmed_snomed_triple_extraction_prompt = """\
Here is the context: {text}.
Task: Extract the SNOMED CT triples from the given context with the format of (concept 1 ; relation ; concept 2).
Here is the optional relation list: [temporally follows, after, due to, has realization, associated with, has definitional manifestation,
associated finding, associated aetiologic finding, associated etiologic finding, interprets, associated morphology, causative agent, course,
finding site, temporally related to, pathological process, direct morphology, is modification of, measures, direct substance, has active ingredient, using, part of].
The steps are as follows:
1. extract the concept 1 and concept 2 from the given context sentence, using the retrieved sub-graph.
2. select ONE most likely relation from the list for the extracted concepts.
3. output the triples in the format of (concept 1 ; relation ; concept 2) strictly.

Provide your answer as follows:
Answer:::
Triples: (The extracted triples)
Answer End:::
Requirements:
You MUST provide values for 'Triples:' in your answer.
ONLY output the triples without any other information.
"""
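# ======================================
# Pubmed snomed triple extraction with additional entities
# ======================================
Pubmed_snomed_triple_extraction_prompt_with_entities = """\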
Here is the context: {text}.
Task: Extract the SNOMED CT triples from the given context with the format of (concept 1 ; relation ; concept 2).
Here is the optional relation list: [temporally follows, after, due to, has realization, associated with, has definitional manifestation,
associated finding, associated aetiologic finding, associated etiologic finding, interprets, associated morphology, causative agent, course,
finding site, temporally related to, pathological process, direct morphology, is modification of, measures, direct substance, has active ingredient, using, part of].
Here is the list of entities for consideration: {entities}.
The steps are as follows:
1. extract the concept 1 and concept 2 from the given context sentence, using the retrieved sub-graph.
2. select ONE most likely relation from the list for the extracted concepts.
3. output the triples in the format of (concept 1 ; relation ; concept 2) strictly.

Provide your answer as follows:
Answer:::
Triples: (The extracted triples)
Answer End:::
Requirements:
You MUST provide values for 'Triples:' in your answer.
ONLY output the triples without any other information.
"""
prompt_var_mappings = {"text": "text"}
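# A minimal sketch, not part of the original module: assuming the templates are
# filled with str.format, a hypothetical render_prompt helper can check that
# every placeholder has a value before formatting. The *_with_entities templates
# use {entities} in addition to {text}, so both values are needed for them.
from string import Formatter


def render_prompt(template, **values):
    """Fill a template, raising KeyError if a placeholder has no value."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = needed - set(values)
    if missing:
        raise KeyError(f"missing prompt variables: {sorted(missing)}")
    return template.format(**values)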