Replies: 9 comments 10 replies
-
Yes, correct.
-
I have the same question. Looking into the code, I didn't find any built-in methods. If anyone knows how to do it, please let us know.
-
Has anyone figured out a solution for this?
-
LiteLlm has cost counters.
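For example, here is a minimal sketch using the standalone litellm package (the model name is only an illustration):

```python
import litellm

# Minimal sketch: litellm can report the cost of a single completion call.
response = litellm.completion(
    model="gpt-4o-mini",  # illustrative model name; substitute whatever you use
    messages=[{"role": "user", "content": "Hello!"}],
)

# Cost is computed from litellm's built-in pricing table for known models.
cost = litellm.completion_cost(completion_response=response)
usage = response.usage  # prompt_tokens / completion_tokens / total_tokens
print(f"total tokens: {usage.total_tokens}, cost: ${cost:.6f}")
```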
-
I had the same problem since I wanted to track some cost/usage stats, so I tried to intercept the LiteLLM calls. I finally came up with something like this:

```python
import litellm
from litellm import ModelResponse, CustomStreamWrapper

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm, LiteLLMClient


def track_llm_usage(res: ModelResponse | CustomStreamWrapper, model):
    # Log token usage and cost before returning.
    try:
        usage = res.get("usage", {})
        prompt_tokens = usage.get("prompt_tokens", 0)
        completion_tokens = usage.get("completion_tokens", 0)
        total_tokens = usage.get("total_tokens", prompt_tokens + completion_tokens)
        # Calculate cost if possible.
        cost = litellm.completion_cost(completion_response=res) or 0.0
        print(f"[acompletion] Model: {model}")
        print(f"[acompletion] Prompt tokens: {prompt_tokens}")
        print(f"[acompletion] Completion tokens: {completion_tokens}")
        print(f"[acompletion] Total tokens: {total_tokens}")
        print(f"[acompletion] Cost: ${cost:.6f}")
        print(f"[acompletion] Time: {(res._response_ms / 1000):.2f}s")
    except Exception as e:
        print(f"Error logging token usage: {e}")
    return res


class KLiteLLMClient(LiteLLMClient):
    async def acompletion(self, model, messages, tools, **kwargs):
        return track_llm_usage(
            await super().acompletion(
                model=model,
                messages=messages,
                tools=tools,
                **kwargs,
            ),
            model=model,
        )


root_agent = LlmAgent(
    name="weather_time_agent",
    model=LiteLlm(
        model="gemini/gemini-2.5-flash-preview-04-17",
        llm_client=KLiteLLMClient(),
    ),
    # [...]
)
```

Hope it helps!
-
Search within the terminal using Ctrl+F for "total_tokens". Hope that helps.
-
If you are using LiteLlm as an agent model, add

```python
LiteLlm(
    'azure/gpt-4.1',
    stream_options={"include_usage": True},
)
```

and you can get the usage tokens with

```python
events = runner.run_async(
    session_id=session.id,
    user_id=session.user_id,
    new_message=self.__messages_to_agent_new_message(messages),
    run_config=RunConfig(streaming_mode=StreamingMode.SSE),
)
async for event in events:
    # Example output:
    # cache_tokens_details=None cached_content_token_count=None candidates_token_count=None
    # candidates_tokens_details=None prompt_token_count=172
    # prompt_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=172)]
    # thoughts_token_count=44 tool_use_prompt_token_count=None tool_use_prompt_tokens_details=None
    # total_token_count=216 traffic_type=None
    print(event.usage_metadata)
```
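If you want running totals rather than per-event prints, you can accumulate the counts across events. A rough sketch using the same fields shown in the printed output above (assumed to run inside the same async method):

```python
# Rough sketch: accumulate token counts across all streamed events.
# Field names match the usage_metadata printed above; some events may
# carry no usage metadata, so guard against None.
total_prompt_tokens = 0
total_response_tokens = 0

async for event in events:
    md = event.usage_metadata
    if md is None:
        continue
    total_prompt_tokens += md.prompt_token_count or 0
    total_response_tokens += md.candidates_token_count or 0

print(f"prompt: {total_prompt_tokens}, response: {total_response_tokens}")
```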
-
@adk-bot what is the current design?
-
|
Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: You can track token usage directly in Python by inspecting the usage metadata emitted with each model response event.

Hello! As an AI assistant, I can provide some information on how to track token and cost usage with the ADK in Python.

**Built-in Usage Tracking**

The ADK provides a built-in mechanism to track token usage for each call to a language model. When you run an agent, usage metadata is included in the events the runner yields. Here is an example of how you can access this information in your Python code:

```python
from google.adk.runtime import runner

# Your ADK setup and agent definition here...

if __name__ == "__main__":
    for event_type, event_data in runner.run_live(agent):
        if event_type == "usage_metadata":
            print(f"Token Usage: {event_data}")
```

The usage metadata looks something like this:

```json
{
  "prompt_token_count": 120,
  "candidates_token_count": 250,
  "total_token_count": 370
}
```

This allows you to programmatically track token usage without relying on the dashboard. You can then calculate the estimated cost based on the pricing of the models you are using (see the sketch below).

**Advanced Monitoring and Observability**

For more advanced monitoring and analytics, you can explore additional observability options.

I hope this helps you understand how to track your ADK usage and costs!

Footnotes: [1] https://google.github.io/adk-docs/streaming/dev-guide/part3/
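As a rough illustration of the cost calculation mentioned above, converting token counts into an estimate (the per-million-token prices below are placeholders, not real pricing):

```python
# Hypothetical per-million-token prices: placeholders only; check the
# official pricing page for the model you actually use.
PRICE_PER_M_INPUT_TOKENS = 0.10
PRICE_PER_M_OUTPUT_TOKENS = 0.40

def estimate_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Turn token counts (e.g. from usage_metadata) into an estimated cost in USD."""
    return (
        prompt_tokens / 1_000_000 * PRICE_PER_M_INPUT_TOKENS
        + output_tokens / 1_000_000 * PRICE_PER_M_OUTPUT_TOKENS
    )

# Using the counts from the example above: 120 prompt + 250 response tokens.
print(f"${estimate_cost(120, 250):.6f}")
```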
-
Are there any built-in functions to determine the cost/usage of queries made through the ADK? At the moment, I am using the free experimental models (gemini-2.0-flash-exp). I need to understand what the costs/usage will be for my flows when put into production.