Thanks for your feedback. We use the same code you posted for evaluation. Is there something wrong with your evaluation data? Alternatively, you can try another model to see whether the error persists.
Hi
Could you share scripts that reproduce the results in the paper? Thanks.
I ran the generation and evaluation for safety using the following script on an Nvidia GPU. The results are:
total:
0.18
single:
{'fixed sentence': 0.13, 'no_punctuation': 0.15, 'programming': 0.14, 'cou': 0.24, 'Refusal sentence prohibition': 0.12, 'cot': 0.28, 'scenario': 0.2, 'multitask': 0.14, 'no_long_word': 0.15, 'url_encode': 0.21, 'without_the': 0.23, 'json_format': 0.17, 'leetspeak': 0.21, 'bad words': 0.15}
They are not close to the results shown in Table 17 for the model.
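As a sanity check on the numbers above, the reported total is consistent with an unweighted mean of the 14 per-category scores. The sketch below assumes equal weighting per category; that is an assumption, not something confirmed from the evaluation code, and the variable names are only illustrative.

```python
# Per-category scores copied verbatim from the output above.
single = {
    'fixed sentence': 0.13, 'no_punctuation': 0.15, 'programming': 0.14,
    'cou': 0.24, 'Refusal sentence prohibition': 0.12, 'cot': 0.28,
    'scenario': 0.2, 'multitask': 0.14, 'no_long_word': 0.15,
    'url_encode': 0.21, 'without_the': 0.23, 'json_format': 0.17,
    'leetspeak': 0.21, 'bad words': 0.15,
}
reported_total = 0.18

# Assumption: the total is a plain (unweighted) mean over categories.
mean_score = sum(single.values()) / len(single)

print(f"categories: {len(single)}")
print(f"unweighted mean: {mean_score:.2f}")
print(f"matches reported total: {abs(mean_score - reported_total) < 0.005}")
```

If this weighting assumption holds, the aggregation step is unlikely to explain the gap to Table 17, which would point toward the generation or per-sample scoring stage instead.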