Client-side token counting and price estimation for LLM apps and AI agents. Tokencost estimates the USD cost of using major Large Language Model (LLM) APIs by calculating the cost of prompts and completions.
Building AI agents? Check out AgentOps
- **LLM Price Tracking**: Major LLM providers frequently add new models and update pricing. This repo helps track the latest price changes.
- **Token counting**: Accurately count prompt tokens before sending OpenAI requests.
- **Easy integration**: Get the cost of a prompt or completion with a single function call.
```python
from tokencost import calculate_prompt_cost, calculate_completion_cost

model = "gpt-3.5-turbo"
prompt = [{"role": "user", "content": "Hello world"}]
completion = "How may I assist you today?"

prompt_cost = calculate_prompt_cost(prompt, model)
completion_cost = calculate_completion_cost(completion, model)

print(f"{prompt_cost} + {completion_cost} = {prompt_cost + completion_cost}")
# 0.0000135 + 0.000014 = 0.0000275
```
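The arithmetic behind that output can be reproduced by hand from the per-token rates in the pricing table below. A minimal sketch, with the gpt-3.5-turbo rates copied into constants (`estimate_cost` is an illustrative helper, not part of tokencost):

```python
from decimal import Decimal

# Per-token USD rates for gpt-3.5-turbo, copied from the pricing table below.
PROMPT_RATE = Decimal("0.0000015")
COMPLETION_RATE = Decimal("0.000002")

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Multiply token counts by the per-token rates and sum."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# The "Hello world" chat prompt counts as 9 tokens and the
# 7-token completion gives the same total as the snippet above:
print(estimate_cost(9, 7))  # 0.0000275
```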
Recommended: PyPI:

```shell
pip install tokencost
```
Calculating the cost of prompts and completions from OpenAI requests
```python
from openai import OpenAI
from tokencost import calculate_prompt_cost, calculate_completion_cost

client = OpenAI()
model = "gpt-3.5-turbo"
prompt = [{"role": "user", "content": "Say this is a test"}]

chat_completion = client.chat.completions.create(
    messages=prompt, model=model
)

completion = chat_completion.choices[0].message.content
# "This is a test."

prompt_cost = calculate_prompt_cost(prompt, model)
completion_cost = calculate_completion_cost(completion, model)
print(f"{prompt_cost} + {completion_cost} = {prompt_cost + completion_cost}")
# 0.0000180 + 0.000010 = 0.0000280
```
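The API response also carries server-side token counts in `chat_completion.usage`; comparing them with the client-side estimate is a quick sanity check for tokenizer mismatches. A sketch with a stand-in usage object, no network call (`check_estimate` is a hypothetical helper):

```python
from types import SimpleNamespace

# Stand-in for the `usage` object returned by the Chat Completions API.
usage = SimpleNamespace(prompt_tokens=12, completion_tokens=5, total_tokens=17)

def check_estimate(estimated_prompt_tokens: int, usage, tolerance: int = 2) -> bool:
    """Return True when the client-side estimate is within `tolerance`
    tokens of the server-reported prompt token count."""
    return abs(estimated_prompt_tokens - usage.prompt_tokens) <= tolerance

print(check_estimate(12, usage))  # True
print(check_estimate(20, usage))  # False
```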
Calculating cost using string prompts instead of messages:
```python
from tokencost import calculate_prompt_cost

prompt_string = "Hello world"
model = "gpt-3.5-turbo"

prompt_cost = calculate_prompt_cost(prompt_string, model)
print(f"Cost: ${prompt_cost}")
# Cost: $3e-06
```
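The `$3e-06` above is just Python's default scientific-notation rendering of a very small number; fixed-point formatting, or scaling to cost per 1,000 calls, is often more readable:

```python
from decimal import Decimal

cost = Decimal("3E-6")  # the value printed above

print(f"Cost: ${cost:.6f}")                 # Cost: $0.000003
print(f"Per 1K calls: ${cost * 1000:.3f}")  # Per 1K calls: $0.003
```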
Counting tokens
```python
from tokencost import count_message_tokens, count_string_tokens

message_prompt = [{"role": "user", "content": "Hello world"}]

# Counting tokens in prompts formatted as message lists
print(count_message_tokens(message_prompt, model="gpt-3.5-turbo"))
# 9

# Alternatively, counting tokens in string prompts
print(count_string_tokens(prompt="Hello world", model="gpt-3.5-turbo"))
# 2
```
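The gap between 9 and 2 comes from chat formatting: each message is wrapped in control tokens. Assuming the overhead OpenAI documents for gpt-3.5-turbo-0613-style models (3 tokens per message, every field value counted, plus 3 reply-priming tokens), the message count can be approximated from string counts alone (`estimate_message_tokens` is an illustrative helper, not part of tokencost):

```python
def estimate_message_tokens(messages, count_string, tokens_per_message=3, reply_priming=3):
    """Approximate a chat prompt's token count from per-string counts
    plus the fixed framing overhead of the chat format."""
    total = reply_priming  # every reply is primed with <|start|>assistant<|message|>
    for message in messages:
        total += tokens_per_message
        for value in message.values():
            total += count_string(value)
    return total

# Stand-in counter: here "user" is 1 token and "Hello world" is 2,
# which happens to match whitespace splitting.
words = lambda s: len(s.split())
print(estimate_message_tokens([{"role": "user", "content": "Hello world"}], words))
# 9  (3 per-message + 3 priming + 1 role + 2 content)
```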
Units denominated in USD. All prices can be found in model_prices.json.
- Prices last updated Jan 30, 2024 from: https://openai.com/pricing and https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json
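Such a price file can also be read directly. A sketch with a stand-in entry (the field names follow the litellm schema the prices are sourced from, but check model_prices.json for the real layout):

```python
import json

# Stand-in for one model_prices.json entry (per-token USD rates).
raw = """{
  "gpt-3.5-turbo": {
    "input_cost_per_token": 1.5e-06,
    "output_cost_per_token": 2e-06
  }
}"""

rates = json.loads(raw)["gpt-3.5-turbo"]
cost = 9 * rates["input_cost_per_token"] + 7 * rates["output_cost_per_token"]
print(f"${cost:.7f}")  # $0.0000275
```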
Model Name | Prompt Cost (USD) | Completion Cost (USD) | Max Prompt Tokens |
---|---|---|---|
gpt-4 | $0.00003000 | $0.00006000 | 8192 |
gpt-4-0314 | $0.00003000 | $0.00006000 | 8192 |
gpt-4-0613 | $0.00003000 | $0.00006000 | 8192 |
gpt-4-32k | $0.00006000 | $0.00012000 | 32768 |
gpt-4-32k-0314 | $0.00006000 | $0.00012000 | 32768 |
gpt-4-32k-0613 | $0.00006000 | $0.00012000 | 32768 |
gpt-4-1106-preview | $0.00001000 | $0.00003000 | 128000 |
gpt-4-0125-preview | $0.00001000 | $0.00003000 | 128000 |
gpt-4-vision-preview | $0.00001000 | $0.00003000 | 128000 |
gpt-3.5-turbo | $0.00000150 | $0.00000200 | 4097 |
gpt-3.5-turbo-0301 | $0.00000150 | $0.00000200 | 4097 |
gpt-3.5-turbo-0613 | $0.00000150 | $0.00000200 | 4097 |
gpt-3.5-turbo-1106 | $0.00000050 | $0.00000150 | 16385 |
gpt-3.5-turbo-0125 | $0.00000050 | $0.00000150 | 16385 |
gpt-3.5-turbo-16k | $0.00000300 | $0.00000400 | 16385 |
gpt-3.5-turbo-16k-0613 | $0.00000300 | $0.00000400 | 16385 |
text-embedding-ada-002 | $0.00000010 | $0.00000000 | 8191 |
text-embedding-3-small | $0.00000002 | $0.00000000 | 8191 |
text-embedding-3-large | $0.00000013 | $0.00000000 | 8191 |
azure/gpt-4-1106-preview | $0.00001000 | $0.00003000 | 128000 |
azure/gpt-4-0613 | $0.00003000 | $0.00006000 | 8192 |
azure/gpt-4-32k-0613 | $0.00006000 | $0.00012000 | 32768 |
azure/gpt-4-32k | $0.00006000 | $0.00012000 | 32768 |
azure/gpt-4 | $0.00003000 | $0.00006000 | 8192 |
azure/gpt-35-turbo-16k-0613 | $0.00000300 | $0.00000400 | 16385 |
azure/gpt-35-turbo-1106 | $0.00000150 | $0.00000200 | 16384 |
azure/gpt-35-turbo-16k | $0.00000300 | $0.00000400 | 16385 |
azure/gpt-35-turbo | $0.00000150 | $0.00000200 | 4097 |
azure/text-embedding-ada-002 | $0.00000010 | $0.00000000 | 8191 |
text-davinci-003 | $0.00000200 | $0.00000200 | 4097 |
text-curie-001 | $0.00000200 | $0.00000200 | 2049 |
text-babbage-001 | $0.00000040 | $0.00000040 | 2049 |
text-ada-001 | $0.00000040 | $0.00000040 | 2049 |
babbage-002 | $0.00000040 | $0.00000040 | 16384 |
davinci-002 | $0.00000200 | $0.00000200 | 16384 |
gpt-3.5-turbo-instruct | $0.00000150 | $0.00000200 | 8192 |
claude-instant-1 | $0.00000160 | $0.00000550 | 100000 |
mistral/mistral-tiny | $0.00000010 | $0.00000040 | 8192 |
mistral/mistral-small | $0.00000060 | $0.00000190 | 8192 |
mistral/mistral-medium | $0.00000270 | $0.00000820 | 8192 |
claude-instant-1.2 | $0.00000010 | $0.00000050 | 100000 |
claude-2 | $0.00000800 | $0.00002400 | 100000 |
claude-2.1 | $0.00000800 | $0.00002400 | 200000 |
text-bison | $0.00000010 | $0.00000010 | 8192 |
text-bison@001 | $0.00000010 | $0.00000010 | 8192 |
text-unicorn | $0.00001000 | $0.00002800 | 8192 |
text-unicorn@001 | $0.00001000 | $0.00002800 | 8192 |
chat-bison | $0.00000010 | $0.00000010 | 4096 |
chat-bison@001 | $0.00000010 | $0.00000010 | 4096 |
chat-bison@002 | $0.00000010 | $0.00000010 | 4096 |
chat-bison-32k | $0.00000010 | $0.00000010 | 32000 |
code-bison | $0.00000010 | $0.00000010 | 6144 |
code-bison@001 | $0.00000010 | $0.00000010 | 6144 |
code-gecko@001 | $0.00000010 | $0.00000010 | 2048 |
code-gecko@002 | $0.00000010 | $0.00000010 | 2048 |
code-gecko | $0.00000010 | $0.00000010 | 2048 |
codechat-bison | $0.00000010 | $0.00000010 | 6144 |
codechat-bison@001 | $0.00000010 | $0.00000010 | 6144 |
codechat-bison-32k | $0.00000010 | $0.00000010 | 32000 |
gemini-pro | $0.00000020 | $0.00000050 | 30720 |
gemini-pro-vision | $0.00000020 | $0.00000050 | 30720 |
palm/chat-bison | $0.00000010 | $0.00000010 | 4096 |
palm/chat-bison-001 | $0.00000010 | $0.00000010 | 4096 |
palm/text-bison | $0.00000010 | $0.00000010 | 8196 |
palm/text-bison-001 | $0.00000010 | $0.00000010 | 8196 |
palm/text-bison-safety-off | $0.00000010 | $0.00000010 | 8196 |
palm/text-bison-safety-recitation-off | $0.00000010 | $0.00000010 | 8196 |
command-nightly | $0.00001500 | $0.00001500 | 4096 |
command | $0.00001500 | $0.00001500 | 4096 |
command-light | $0.00001500 | $0.00001500 | 4096 |
command-medium-beta | $0.00001500 | $0.00001500 | 4096 |
command-xlarge-beta | $0.00001500 | $0.00001500 | 4096 |
openrouter/openai/gpt-3.5-turbo | $0.00000150 | $0.00000200 | 4095 |
openrouter/openai/gpt-3.5-turbo-16k | $0.00000300 | $0.00000400 | 16383 |
openrouter/openai/gpt-4 | $0.00003000 | $0.00006000 | 8192 |
openrouter/anthropic/claude-instant-v1 | $0.00000160 | $0.00000550 | 100000 |
openrouter/anthropic/claude-2 | $0.00001100 | $0.00003260 | 100000 |
openrouter/google/palm-2-chat-bison | $0.00000050 | $0.00000050 | 8000 |
openrouter/google/palm-2-codechat-bison | $0.00000050 | $0.00000050 | 8000 |
openrouter/meta-llama/llama-2-13b-chat | $0.00000020 | $0.00000020 | 4096 |
openrouter/meta-llama/llama-2-70b-chat | $0.00000150 | $0.00000150 | 4096 |
openrouter/meta-llama/codellama-34b-instruct | $0.00000050 | $0.00000050 | 8096 |
openrouter/nousresearch/nous-hermes-llama2-13b | $0.00000020 | $0.00000020 | 4096 |
openrouter/mancer/weaver | $0.00000560 | $0.00000560 | 8000 |
openrouter/gryphe/mythomax-l2-13b | $0.00000180 | $0.00000180 | 8192 |
openrouter/jondurbin/airoboros-l2-70b-2.1 | $0.00001380 | $0.00001380 | 4096 |
openrouter/undi95/remm-slerp-l2-13b | $0.00000180 | $0.00000180 | 6144 |
openrouter/pygmalionai/mythalion-13b | $0.00000180 | $0.00000180 | 4096 |
openrouter/mistralai/mistral-7b-instruct | $0.00000000 | $0.00000000 | 4096 |
j2-ultra | $0.00001500 | $0.00001500 | 8192 |
j2-mid | $0.00001000 | $0.00001000 | 8192 |
j2-light | $0.00000300 | $0.00000300 | 8192 |
dolphin | $0.00002000 | $0.00002000 | 4096 |
chatdolphin | $0.00002000 | $0.00002000 | 4096 |
luminous-base | $0.00003000 | $0.00003300 | 2048 |
luminous-base-control | $0.00003740 | $0.00004120 | 2048 |
luminous-extended | $0.00004500 | $0.00004940 | 2048 |
luminous-extended-control | $0.00005620 | $0.00006180 | 2048 |
luminous-supreme | $0.00017500 | $0.00019250 | 2048 |
luminous-supreme-control | $0.00021870 | $0.00024060 | 2048 |
ai21.j2-mid-v1 | $0.00001250 | $0.00001250 | 8191 |
ai21.j2-ultra-v1 | $0.00001880 | $0.00001880 | 8191 |
amazon.titan-text-lite-v1 | $0.00000030 | $0.00000040 | 8000 |
amazon.titan-text-express-v1 | $0.00000130 | $0.00000170 | 8000 |
anthropic.claude-v1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/us-east-1/anthropic.claude-v1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/us-west-2/anthropic.claude-v1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/ap-northeast-1/anthropic.claude-v1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/eu-central-1/anthropic.claude-v1 | $0.00000800 | $0.00002400 | 100000 |
anthropic.claude-v2 | $0.00000800 | $0.00002400 | 100000 |
bedrock/us-east-1/anthropic.claude-v2 | $0.00000800 | $0.00002400 | 100000 |
bedrock/us-west-2/anthropic.claude-v2 | $0.00000800 | $0.00002400 | 100000 |
bedrock/ap-northeast-1/anthropic.claude-v2 | $0.00000800 | $0.00002400 | 100000 |
bedrock/eu-central-1/anthropic.claude-v2 | $0.00000800 | $0.00002400 | 100000 |
anthropic.claude-v2:1 | $0.00000800 | $0.00002400 | 200000 |
bedrock/us-east-1/anthropic.claude-v2:1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/us-west-2/anthropic.claude-v2:1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/ap-northeast-1/anthropic.claude-v2:1 | $0.00000800 | $0.00002400 | 100000 |
bedrock/eu-central-1/anthropic.claude-v2:1 | $0.00000800 | $0.00002400 | 100000 |
anthropic.claude-instant-v1 | $0.00000160 | $0.00000550 | 100000 |
bedrock/us-east-1/anthropic.claude-instant-v1 | $0.00000080 | $0.00000240 | 100000 |
bedrock/us-west-2/anthropic.claude-instant-v1 | $0.00000080 | $0.00000240 | 100000 |
bedrock/ap-northeast-1/anthropic.claude-instant-v1 | $0.00000220 | $0.00000750 | 100000 |
bedrock/eu-central-1/anthropic.claude-instant-v1 | $0.00000240 | $0.00000830 | 100000 |
cohere.command-text-v14 | $0.00000150 | $0.00000200 | 4096 |
cohere.command-light-text-v14 | $0.00000030 | $0.00000060 | 4000 |
cohere.embed-english-v3 | $0.00000010 | $0.00000000 | 512 |
cohere.embed-multilingual-v3 | $0.00000010 | $0.00000000 | 512 |
meta.llama2-13b-chat-v1 | $0.00000070 | $0.00000100 | 4096 |
meta.llama2-70b-chat-v1 | $0.00000190 | $0.00000250 | 4096 |
sagemaker/meta-textgeneration-llama-2-7b | $0.00000000 | $0.00000000 | 4096 |
sagemaker/meta-textgeneration-llama-2-7b-f | $0.00000000 | $0.00000000 | 4096 |
sagemaker/meta-textgeneration-llama-2-13b | $0.00000000 | $0.00000000 | 4096 |
sagemaker/meta-textgeneration-llama-2-13b-f | $0.00000000 | $0.00000000 | 4096 |
sagemaker/meta-textgeneration-llama-2-70b | $0.00000000 | $0.00000000 | 4096 |
sagemaker/meta-textgeneration-llama-2-70b-b-f | $0.00000000 | $0.00000000 | 4096 |
together-ai-7.1b-20b | $0.00000040 | $0.00000040 | 1000 |
ollama/llama2 | $0.00000000 | $0.00000000 | 4096 |
ollama/llama2:13b | $0.00000000 | $0.00000000 | 4096 |
ollama/llama2:70b | $0.00000000 | $0.00000000 | 4096 |
ollama/llama2-uncensored | $0.00000000 | $0.00000000 | 4096 |
ollama/mistral | $0.00000000 | $0.00000000 | 8192 |
ollama/codellama | $0.00000000 | $0.00000000 | 4096 |
ollama/orca-mini | $0.00000000 | $0.00000000 | 4096 |
ollama/vicuna | $0.00000000 | $0.00000000 | 2048 |
deepinfra/meta-llama/Llama-2-70b-chat-hf | $0.00000070 | $0.00000090 | 4096 |
deepinfra/codellama/CodeLlama-34b-Instruct-hf | $0.00000060 | $0.00000060 | 4096 |
deepinfra/meta-llama/Llama-2-13b-chat-hf | $0.00000030 | $0.00000030 | 4096 |
deepinfra/meta-llama/Llama-2-7b-chat-hf | $0.00000020 | $0.00000020 | 4096 |
deepinfra/mistralai/Mistral-7B-Instruct-v0.1 | $0.00000020 | $0.00000020 | 4096 |
deepinfra/jondurbin/airoboros-l2-70b-gpt4-1.4.1 | $0.00000070 | $0.00000090 | 4096 |
perplexity/pplx-7b-chat | $0.00000000 | $0.00000000 | 8192 |
perplexity/pplx-70b-chat | $0.00000000 | $0.00000000 | 4096 |
perplexity/pplx-7b-online | $0.00000000 | $0.00050000 | 4096 |
perplexity/pplx-70b-online | $0.00000000 | $0.00050000 | 4096 |
perplexity/llama-2-13b-chat | $0.00000000 | $0.00000000 | 4096 |
perplexity/llama-2-70b-chat | $0.00000000 | $0.00000000 | 4096 |
perplexity/mistral-7b-instruct | $0.00000000 | $0.00000000 | 4096 |
perplexity/replit-code-v1.5-3b | $0.00000000 | $0.00000000 | 4096 |
anyscale/mistralai/Mistral-7B-Instruct-v0.1 | $0.00000010 | $0.00000010 | 16384 |
anyscale/HuggingFaceH4/zephyr-7b-beta | $0.00000010 | $0.00000010 | 16384 |
anyscale/meta-llama/Llama-2-7b-chat-hf | $0.00000010 | $0.00000010 | 4096 |
anyscale/meta-llama/Llama-2-13b-chat-hf | $0.00000020 | $0.00000020 | 4096 |
anyscale/meta-llama/Llama-2-70b-chat-hf | $0.00000100 | $0.00000100 | 4096 |
anyscale/codellama/CodeLlama-34b-Instruct-hf | $0.00000100 | $0.00000100 | 16384 |
cloudflare/@cf/meta/llama-2-7b-chat-fp16 | $0.00000190 | $0.00000190 | 3072 |
cloudflare/@cf/meta/llama-2-7b-chat-int8 | $0.00000190 | $0.00000190 | 2048 |
cloudflare/@cf/mistral/mistral-7b-instruct-v0.1 | $0.00000190 | $0.00000190 | 8192 |
cloudflare/@hf/thebloke/codellama-7b-instruct-awq | $0.00000190 | $0.00000190 | 4096 |
voyage/voyage-01 | $0.00000010 | $0.00000000 | 4096 |
voyage/voyage-lite-01 | $0.00000010 | $0.00000000 | 4096 |
You may also calculate token costs in LLM wrapper/framework libraries using callbacks.
```shell
pip install 'tokencost[llama-index]'
```
To use the base callback handler, import it:

```python
from tokencost.callbacks.llama_index import BaseCallbackHandler
```

and pass it to your framework's callback handler.
(Coming Soon)
Installation via GitHub:

```shell
git clone git@github.com:AgentOps-AI/tokencost.git
cd tokencost
pip install -e .
```
- Install `pytest` if you don't have it already:

```shell
pip install pytest
```

- Run the `tests/` folder while in the parent directory:

```shell
pytest tests
```
This repo also supports `tox`; simply run `python -m tox`.
Contributions to TokenCost are welcome! Feel free to create an issue for any bug reports, complaints, or feature suggestions.
TokenCost is released under the MIT License.