src/unitxt/llm_as_judge_chat_templates.py (13 additions, 13 deletions)
```diff
@@ -3,20 +3,20 @@
 direct_template_dict = {
     "assessment": InputOutputTemplate(
         input_format="""
-You are presented with a response generated subject to a context.
-The context includes information relevant to the nature or generation of the response.
-You will assess the quality of the response subject to an evaluation criteria.
+You are presented with a {response_variable_name} generated subject to a context.
+The context includes information relevant to the nature or generation of the {response_variable_name}.
+You will assess the quality of the {response_variable_name} subject to an evaluation criteria.
 ###Context:
 {context_variables}
 
-###Response:
+###{response_variable_name_title}:
 {response}
 
 ###Evaluation criteria:
 {criteria_description}
 {display_options_instruction}
 
-Briefly assess the quality of the response subject to the evaluation criteria.
+Briefly assess the quality of the {response_variable_name} subject to the evaluation criteria.
 Focus on the evaluation criteria during assessment, do not provide a general assessment.
 Assessment:
 
@@ -29,7 +29,7 @@
 Summary:"""
     ),
     "answer": InputOutputTemplate(
-        input_format="""Now consider the evaluation criteria and choose a final answer. Only include the chosen answer in the response.
+        input_format="""Now consider the evaluation criteria and choose a final answer. Only include the chosen answer in the {response_variable_name}.
 ###Evaluation criteria:
 {criteria_description}
 {score_option_instruction}
```
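The change replaces the hard-coded word "response" with a `{response_variable_name}` placeholder, plus a title-cased `{response_variable_name_title}` variant for headings, so the direct-judgment prompts can refer to whatever artifact is being evaluated. As a minimal sketch of the effect, the snippet below fills the new placeholders with plain `str.format`; the sample values are hypothetical, and in unitxt this rendering would happen inside `InputOutputTemplate` rather than by hand:

```python
# Minimal sketch, not the unitxt pipeline: fill the new placeholders with
# plain str.format. The values below are hypothetical examples.
template = (
    "You are presented with a {response_variable_name} generated subject to a context.\n"
    "###{response_variable_name_title}:\n"
    "{response}"
)

prompt = template.format(
    response_variable_name="model output",        # hypothetical artifact name
    response_variable_name_title="Model output",  # title-cased variant for the heading
    response="The capital of France is Paris.",
)
print(prompt)
# You are presented with a model output generated subject to a context.
# ###Model output:
# The capital of France is Paris.
```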
```diff
@@ -41,8 +41,8 @@
 
 pairwise_template_dict = {
     "assessment": InputOutputTemplate(
-        input_format="""You are provided a pair of responses (Response {option_a} and Response {option_b}) generated subject to a context.
-You will choose the better quality response subject to the evaluation criteria.
+        input_format="""You are provided a pair of {response_variable_name}s ({response_variable_name_title} {option_a} and {response_variable_name_title} {option_b}) generated subject to a context.
+You will choose the better quality {response_variable_name} subject to the evaluation criteria.
 
 This is the context:
 {context_variables}
@@ -51,25 +51,25 @@
 {criteria_name}
 {criteria_description}
 
-Response {option_a}:
+{response_variable_name_title} {option_a}:
 {response_a}
-Response {option_b}:
+{response_variable_name_title} {option_b}:
 {response_b}
 
-Keeping the evaluation criteria in mind, briefly assess which response is better.
+Keeping the evaluation criteria in mind, briefly assess which {response_variable_name} is better.
 Focus on the evaluation criteria during assessment, do not provide a general assessment.
 Assessment:
 
 Lets think step by step """
     ),
     "summarization": InputOutputTemplate(
-        input_format="""Transform the following assessment into a concise summary that focuses on the key details, excluding references to the assessment itself. The summary must clearly state which response won.
+        input_format="""Transform the following assessment into a concise summary that focuses on the key details, excluding references to the assessment itself. The summary must clearly state which {response_variable_name} won.
 
 Assessment: {assessment}
 Summary:"""
     ),
     "answer": InputOutputTemplate(
-        input_format="""Now considering the evaluation criteria, which response is better quality? Only include the chosen response.
+        input_format="""Now considering the evaluation criteria, which {response_variable_name} is better quality? Only include the chosen {response_variable_name}.
```