Commit

remove figure reference
slobentanzer committed Feb 8, 2024
1 parent fa211c5 commit 9150bd3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion content/20.results.md
@@ -71,7 +71,7 @@ To achieve this goal, we implemented an encrypted pipeline that contains the ben

Current results confirm the prevailing opinion of OpenAI's leading role in LLM performance (Figure @fig:benchmark A).
Since the benchmark datasets were created to specifically cover functions relevant in BioChatter's application domain, the benchmark results are primarily a measure for the LLMs' usefulness in our applications.
-OpenAI's GPT models (gpt-4 and gpt-3.5-turbo) lead by some margin on overall performance and consistency, but several open-source models reach high performance in specific tasks (Sup Figure).
+OpenAI's GPT models (gpt-4 and gpt-3.5-turbo) lead by some margin on overall performance and consistency, but several open-source models reach high performance in specific tasks.
Of note, performance in open-source models appears to depend on their quantisation level, i.e., the bit precision used to represent the model's parameters.
For models that offer quantisation options, 4- and 5-bit models perform best, while 2- and 8-bit models appear to perform worse (Figure @fig:benchmark A).
