Log input tokens, output tokens and token details #642

Merged: 9 commits into main from usage, Nov 20, 2024
Conversation

simonw (Owner) commented Nov 20, 2024

Refs:

TODO:

  • Log input/output/details to new columns on responses table.
  • llm prompt -u/--usage option
  • Add token usage information to markdown llm logs output
  • Implement this in at least one other plugin to check it makes sense
  • Update plugin docs to explain response.set_usage() (a sketch follows this list)
  • Document how to use this in the Python API (I'll need this myself for the datasette-llm package). I need to document Response generally; I'll do that in a new issue.
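
For illustration, here is a minimal sketch of what a plugin's execute() could look like once it reports usage. The toy model and its word-count token numbers are hypothetical; the response.set_usage() call matches the diffs later in this thread.

import llm

class EchoModel(llm.Model):
    # Hypothetical toy model that echoes the prompt back
    model_id = "echo-usage-demo"

    def execute(self, prompt, stream, response, conversation):
        text = prompt.prompt or ""
        yield text
        # Record token counts so llm can log them to the new columns.
        # A real plugin would take these from its API's usage data;
        # word counts stand in for tokens here.
        response.set_usage(input=len(text.split()), output=len(text.split()))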

simonw (Owner, Author) commented Nov 20, 2024

I'm going to omit the token information from llm logs markdown unless the user specifies -u/--usage (I'll keep it on the JSON by default though).
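
For example, assuming the flags land as described here:

llm logs --json   # token usage stays in the JSON output by default
llm logs -u       # opt in to token usage in the markdown output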

simonw (Owner, Author) commented Nov 20, 2024

End output of llm logs -u now:

...

Example Command:

If you have a SQLite database named texts.db with a table documents containing a text column content, the command would look like this:

llm embed-multi my-texts \
  --sql "SELECT id, content FROM documents" \
  --model ada-002 \
  --store

Replace ada-002 with the embedding model that you wish to use for processing the text. Adjust the SQL query to fit your actual table structure.

This will process all entries in the documents table and store the embeddings in the my-texts collection.

Token usage:

30,791 input, 30,791 output, {"prompt_tokens_details": {"cached_tokens": 30592}}
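
For a quick look at what gets logged, a hypothetical sqlite3 snippet; the column names are an assumption based on the TODO item about new columns on the responses table:

import sqlite3

# The database lives wherever `llm logs path` points
db = sqlite3.connect("logs.db")
rows = db.execute(
    "select model, input_tokens, output_tokens, token_details "
    "from responses order by id desc limit 3"
)
for row in rows:
    print(row)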

simonw (Owner, Author) commented Nov 20, 2024

This diff to llm-claude-3 logged token counts correctly:

diff --git a/llm_claude_3.py b/llm_claude_3.py
index a05b01b..281084e 100644
--- a/llm_claude_3.py
+++ b/llm_claude_3.py
@@ -240,16 +240,23 @@ class ClaudeMessages(_Shared, llm.Model):
     def execute(self, prompt, stream, response, conversation):
         client = Anthropic(api_key=self.get_key())
         kwargs = self.build_kwargs(prompt, conversation)
+        usage = None
         if stream:
             with client.messages.stream(**kwargs) as stream:
                 for text in stream.text_stream:
                     yield text
                 # This records usage and other data:
                 response.response_json = stream.get_final_message().model_dump()
+                usage = response.response_json.pop("usage")
         else:
             completion = client.messages.create(**kwargs)
             yield completion.content[0].text
             response.response_json = completion.model_dump()
+            usage = response.response_json.pop("usage")
+        if usage:
+            response.set_usage(
+                input=usage.get("input_tokens"), output=usage.get("output_tokens")
+            )
 
 
 class ClaudeMessagesLong(ClaudeMessages):

simonw (Owner, Author) commented Nov 20, 2024

Better Claude diff, factoring the usage handling into a shared set_usage() helper so both the sync and async models use it:

diff --git a/llm_claude_3.py b/llm_claude_3.py
index a05b01b..0a6e236 100644
--- a/llm_claude_3.py
+++ b/llm_claude_3.py
@@ -231,6 +231,13 @@ class _Shared:
             kwargs["extra_headers"] = self.extra_headers
         return kwargs
 
+    def set_usage(self, response):
+        usage = response.response_json.pop("usage")
+        if usage:
+            response.set_usage(
+                input=usage.get("input_tokens"), output=usage.get("output_tokens")
+            )
+
     def __str__(self):
         return "Anthropic Messages: {}".format(self.model_id)
 
@@ -250,6 +257,7 @@ class ClaudeMessages(_Shared, llm.Model):
             completion = client.messages.create(**kwargs)
             yield completion.content[0].text
             response.response_json = completion.model_dump()
+        self.set_usage(response)
 
 
 class ClaudeMessagesLong(ClaudeMessages):
@@ -270,6 +278,7 @@ class AsyncClaudeMessages(_Shared, llm.AsyncModel):
             completion = await client.messages.create(**kwargs)
             yield completion.content[0].text
             response.response_json = completion.model_dump()
+        self.set_usage(response)
 
 
 class AsyncClaudeMessagesLong(AsyncClaudeMessages):
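
With a diff like this applied, a rough end-to-end check from Python might look like the sketch below. The usage() accessor here is an assumption, pending the Python API documentation mentioned in the TODO above:

import llm

# claude-3.5-sonnet is registered by the llm-claude-3 plugin
model = llm.get_model("claude-3.5-sonnet")
response = model.prompt("Two short names for a pet pelican")
print(response.text())
# Assumed accessor for the recorded input/output token counts
print(response.usage())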

simonw marked this pull request as ready for review November 20, 2024 04:15
simonw merged commit cfb10f4 into main Nov 20, 2024
61 checks passed
simonw deleted the usage branch November 20, 2024 04:22