Code for benchmarking the speed of DeepSeek R1 from different providers' APIs.
Read the full report: DeepSeek R1: Comparing Pricing and Speed Across Providers
Currently supported:
- DeepSeek
- DeepInfra
- Fireworks
- Together
- Chutes
- Hyperbolic
- Azure AI Foundry
- Nebius
- Nvidia NIM
- Kluster
- Novita
Nvidia NIM requests hang while streaming the response and never time out, so the provider is skipped for now.
TODO:
- Sambanova Cloud (waiting list)
- Replicate
- SiliconFlow
Providers that I am not able to test due to high costs or lack of open access:
- Awesome Cloud (Contact sales)
- AWS Bedrock (Requires dedicated EC2 instance)
- Featherless (Requires subscription)
- Avian (Requires dedicated deployment with 4 GPUs)
Watch list for DeepSeek R1 support:
Speed statistics for each API, generated automatically by running analyze-speed.js.
=== Overall Speed Statistics (tokens/second) ===
Using latest 50 benchmark runs
Fireworks : Median/Mean: 23.80/29.93, Range: 6.77-78.17 ±19.34, Error rate: 0.00%, Success/Error: 32/0
Kluster : Median/Mean: 20.25/20.66, Range: 14.44-33.75 ± 6.11, Error rate: 12.50%, Success/Error: 7/1
DeepSeek : Median/Mean: 18.02/23.96, Range: 11.28-67.60 ±15.56, Error rate: 15.63%, Success/Error: 27/5
Together : Median/Mean: 16.98/23.74, Range: 7.57-92.77 ±22.28, Error rate: 3.13%, Success/Error: 31/1
Novita : Median/Mean: 15.03/16.29, Range: 9.75-23.85 ± 4.86, Error rate: 0.00%, Success/Error: 9/0
Hyperbolic: Median/Mean: 14.45/15.11, Range: 4.49-31.26 ± 7.77, Error rate: 17.24%, Success/Error: 24/5
Nvidia : Median/Mean: 11.87/14.13, Range: 4.50-34.26 ± 9.63, Error rate: 33.33%, Success/Error: 12/6
Nebius : Median/Mean: 9.17/10.68, Range: 3.21-26.72 ± 7.42, Error rate: 0.00%, Success/Error: 22/0
DeepInfra : Median/Mean: 7.80/ 7.97, Range: 3.09-12.01 ± 1.75, Error rate: 0.00%, Success/Error: 32/0
Azure : Median/Mean: 6.61/11.01, Range: 1.89-41.85 ±12.36, Error rate: 3.85%, Success/Error: 25/1
=== Daily Statistics ===
Date: 21/03/2025
Together : Median/Mean: 86.10/88.27, Range: 85.93-92.77 ± 3.19, Error rate: 25.00%, Success/Error: 3/1
Fireworks : Median/Mean: 76.19/74.04, Range: 65.62-78.17 ± 4.95, Error rate: 0.00%, Success/Error: 4/0
Azure : Median/Mean: 39.81/39.03, Range: 34.65-41.85 ± 2.67, Error rate: 0.00%, Success/Error: 4/0
Nebius : Median/Mean: 24.02/24.58, Range: 23.53-26.72 ± 1.29, Error rate: 0.00%, Success/Error: 4/0
Hyperbolic: Median/Mean: 23.68/23.26, Range: 20.09-25.59 ± 2.15, Error rate: 0.00%, Success/Error: 4/0
Novita : Median/Mean: 22.00/20.55, Range: 14.34-23.85 ± 3.73, Error rate: 0.00%, Success/Error: 4/0
Kluster : Median/Mean: 20.25/18.70, Range: 14.44-21.40 ± 3.05, Error rate: 25.00%, Success/Error: 3/1
DeepSeek : Median/Mean: 15.57/15.69, Range: 14.06-17.57 ± 1.61, Error rate: 0.00%, Success/Error: 4/0
DeepInfra : Median/Mean: 10.40/10.14, Range: 7.73-12.01 ± 1.73, Error rate: 0.00%, Success/Error: 4/0
Date: 11/03/2025
Novita : Median/Mean: 13.57/12.88, Range: 9.75-15.39 ± 2.28, Error rate: 0.00%, Success/Error: 5/0
Date: 12/02/2025
Together : Median/Mean: 38.42/38.42, Range: 35.93-40.91 ± 2.49, Error rate: 0.00%, Success/Error: 2/0
Nvidia : Median/Mean: 33.92/33.92, Range: 33.59-34.26 ± 0.33, Error rate: 0.00%, Success/Error: 2/0
Fireworks : Median/Mean: 27.05/27.05, Range: 21.84-32.25 ± 5.21, Error rate: 0.00%, Success/Error: 2/0
Kluster : Median/Mean: 19.59/22.14, Range: 15.62-33.75 ± 7.31, Error rate: 0.00%, Success/Error: 4/0
DeepSeek : Median/Mean: 16.07/16.07, Range: 14.13-18.02 ± 1.94, Error rate: 0.00%, Success/Error: 2/0
Nebius : Median/Mean: 9.43/ 9.43, Range: 8.43-10.42 ± 1.00, Error rate: 0.00%, Success/Error: 2/0
DeepInfra : Median/Mean: 8.80/ 8.80, Range: 8.52- 9.09 ± 0.29, Error rate: 0.00%, Success/Error: 2/0
Azure : Median/Mean: 6.04/ 6.04, Range: 5.50- 6.59 ± 0.54, Error rate: 0.00%, Success/Error: 2/0
Hyperbolic: Median/Mean: 5.40/ 5.40, Range: 4.49- 6.31 ± 0.91, Error rate: 0.00%, Success/Error: 2/0
Date: 05/02/2025
Hyperbolic: Median/Mean: 21.13/21.13, Range: 16.03-26.24 ± 5.10, Error rate: 33.33%, Success/Error: 2/1
DeepSeek : Median/Mean: 20.00/20.00, Range: 16.35-23.64 ± 3.64, Error rate: 33.33%, Success/Error: 2/1
Fireworks : Median/Mean: 13.44/13.67, Range: 8.68-18.90 ± 4.18, Error rate: 0.00%, Success/Error: 3/0
Together : Median/Mean: 12.67/14.44, Range: 10.88-19.78 ± 3.84, Error rate: 0.00%, Success/Error: 3/0
DeepInfra : Median/Mean: 7.45/ 7.89, Range: 7.18- 9.03 ± 0.82, Error rate: 0.00%, Success/Error: 3/0
Nebius : Median/Mean: 6.75/ 6.76, Range: 3.21-10.31 ± 2.90, Error rate: 0.00%, Success/Error: 3/0
Azure : Median/Mean: 3.68/ 5.33, Range: 3.67- 8.64 ± 2.34, Error rate: 0.00%, Success/Error: 3/0
Nvidia : Error rate: 100.00%, Success/Error: 0/3
Date: 03/02/2025
Fireworks : Median/Mean: 31.70/31.34, Range: 29.53-32.79 ± 1.36, Error rate: 0.00%, Success/Error: 3/0
Together : Median/Mean: 16.87/16.51, Range: 15.67-16.98 ± 0.59, Error rate: 0.00%, Success/Error: 3/0
DeepSeek : Median/Mean: 16.34/16.34, Range: 16.34-16.34 ± 0.00, Error rate: 66.67%, Success/Error: 1/2
Nebius : Median/Mean: 9.91/ 8.80, Range: 3.25-13.23 ± 4.15, Error rate: 0.00%, Success/Error: 3/0
DeepInfra : Median/Mean: 7.83/ 7.90, Range: 7.30- 8.56 ± 0.52, Error rate: 0.00%, Success/Error: 3/0
Azure : Median/Mean: 6.71/ 6.69, Range: 6.61- 6.75 ± 0.06, Error rate: 0.00%, Success/Error: 3/0
Hyperbolic: Median/Mean: 6.66/ 6.85, Range: 6.44- 7.46 ± 0.44, Error rate: 0.00%, Success/Error: 3/0
Nvidia : Error rate: 100.00%, Success/Error: 0/3
Date: 02/02/2025
Fireworks : Median/Mean: 26.83/27.46, Range: 25.77-29.77 ± 1.69, Error rate: 0.00%, Success/Error: 3/0
Together : Median/Mean: 14.14/14.16, Range: 12.79-15.55 ± 1.13, Error rate: 0.00%, Success/Error: 3/0
Hyperbolic: Median/Mean: 13.64/13.77, Range: 12.41-15.26 ± 1.17, Error rate: 0.00%, Success/Error: 3/0
DeepSeek : Median/Mean: 13.44/13.37, Range: 13.21-13.45 ± 0.11, Error rate: 0.00%, Success/Error: 3/0
DeepInfra : Median/Mean: 8.89/ 8.80, Range: 8.23- 9.28 ± 0.43, Error rate: 0.00%, Success/Error: 3/0
Nvidia : Median/Mean: 6.76/ 9.39, Range: 6.60-14.82 ± 3.84, Error rate: 0.00%, Success/Error: 3/0
Azure : Median/Mean: 6.67/ 6.69, Range: 6.62- 6.78 ± 0.07, Error rate: 0.00%, Success/Error: 3/0
Nebius : Median/Mean: 4.80/ 6.23, Range: 3.73-10.15 ± 2.81, Error rate: 0.00%, Success/Error: 3/0
Date: 01/02/2025
Fireworks : Median/Mean: 27.15/27.15, Range: 26.94-27.35 ± 0.21, Error rate: 0.00%, Success/Error: 2/0
Together : Median/Mean: 20.88/20.88, Range: 20.79-20.97 ± 0.09, Error rate: 0.00%, Success/Error: 2/0
Nvidia : Median/Mean: 15.43/15.43, Range: 14.76-16.09 ± 0.67, Error rate: 0.00%, Success/Error: 2/0
Azure : Median/Mean: 6.92/ 6.92, Range: 6.92- 6.92 ± 0.00, Error rate: 50.00%, Success/Error: 1/1
DeepInfra : Median/Mean: 6.81/ 6.81, Range: 6.46- 7.17 ± 0.35, Error rate: 0.00%, Success/Error: 2/0
Nebius : Median/Mean: 4.83/ 4.83, Range: 4.16- 5.49 ± 0.67, Error rate: 0.00%, Success/Error: 2/0
DeepSeek : Error rate: 100.00%, Success/Error: 0/2
Hyperbolic: Error rate: 100.00%, Success/Error: 0/2
Date: 31/01/2025
Fireworks : Median/Mean: 28.99/32.51, Range: 13.24-54.62 ±16.21, Error rate: 0.00%, Success/Error: 5/0
DeepSeek : Median/Mean: 26.14/37.87, Range: 17.55-67.60 ±19.86, Error rate: 0.00%, Success/Error: 5/0
Together : Median/Mean: 18.22/18.58, Range: 15.44-20.66 ± 1.91, Error rate: 0.00%, Success/Error: 5/0
Nvidia : Median/Mean: 9.41/ 8.53, Range: 4.50-13.45 ± 3.37, Error rate: 0.00%, Success/Error: 5/0
DeepInfra : Median/Mean: 7.56/ 6.07, Range: 3.09- 8.45 ± 2.29, Error rate: 0.00%, Success/Error: 5/0
Hyperbolic: Median/Mean: 6.56/ 9.16, Range: 5.65-15.28 ± 4.34, Error rate: 40.00%, Success/Error: 3/2
Nebius : Median/Mean: 5.80/ 8.56, Range: 4.10-17.24 ± 4.87, Error rate: 0.00%, Success/Error: 5/0
Azure : Median/Mean: 5.64/ 4.64, Range: 1.89- 6.90 ± 2.09, Error rate: 0.00%, Success/Error: 5/0
Date: 30/01/2025
Fireworks : Median/Mean: 20.39/17.56, Range: 6.77-21.41 ± 5.18, Error rate: 0.00%, Success/Error: 6/0
DeepSeek : Median/Mean: 20.30/23.64, Range: 11.28-46.72 ±11.04, Error rate: 0.00%, Success/Error: 6/0
Together : Median/Mean: 15.54/15.03, Range: 8.69-21.72 ± 4.87, Error rate: 0.00%, Success/Error: 6/0
Hyperbolic: Median/Mean: 13.19/16.31, Range: 7.21-31.26 ± 8.14, Error rate: 0.00%, Success/Error: 5/0
DeepInfra : Median/Mean: 7.22/ 7.33, Range: 6.87- 7.89 ± 0.36, Error rate: 0.00%, Success/Error: 6/0
Azure : Median/Mean: 5.49/ 5.22, Range: 3.97- 5.95 ± 0.75, Error rate: 0.00%, Success/Error: 4/0
Date: 29/01/2025
Hyperbolic: Median/Mean: 22.84/22.84, Range: 20.22-25.47 ± 2.63, Error rate: 0.00%, Success/Error: 2/0
DeepSeek : Median/Mean: 20.49/31.09, Range: 15.80-67.60 ±21.17, Error rate: 0.00%, Success/Error: 4/0
Fireworks : Median/Mean: 16.73/16.98, Range: 14.15-20.32 ± 2.27, Error rate: 0.00%, Success/Error: 4/0
DeepInfra : Median/Mean: 9.46/ 8.82, Range: 6.61- 9.76 ± 1.28, Error rate: 0.00%, Success/Error: 4/0
Together : Median/Mean: 8.76/ 8.57, Range: 7.57- 9.18 ± 0.63, Error rate: 0.00%, Success/Error: 4/0
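The summary lines above can be approximated with a small helper. This is a minimal sketch, not the code in analyze-speed.js: the function name `summarize` and its return fields are hypothetical. It assumes the ± figure is a population standard deviation and that the error rate is errors divided by total runs. Both assumptions are consistent with the reported numbers (for example, the two Together runs on 12/02/2025, 35.93 and 40.91, give ± 2.49, and Kluster's 7/1 gives 12.50%).

```javascript
// Hypothetical helper: summarize one provider's runs.
// `speeds` holds tokens/second for successful runs; `errorCount` is the number of failed runs.
function summarize(speeds, errorCount) {
  const sorted = [...speeds].sort((a, b) => a - b);
  const n = sorted.length;
  const total = n + errorCount;
  const errorRate = total === 0 ? 0 : (errorCount / total) * 100;

  // With no successful runs, only the error figures can be reported.
  if (n === 0) {
    return { errorRate, success: 0, errors: errorCount };
  }

  const mid = Math.floor(n / 2);
  // For an even number of runs, the median is the average of the two middle values.
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  // Assumption: population variance (divide by n, not n - 1).
  const variance = sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / n;

  return {
    median,
    mean,
    min: sorted[0],
    max: sorted[n - 1],
    stdDev: Math.sqrt(variance),
    errorRate,
    success: n,
    errors: errorCount,
  };
}

const together = summarize([35.93, 40.91], 0);
console.log(together.mean.toFixed(2), together.stdDev.toFixed(2)); // 38.42 2.49
```

Dividing by n rather than n - 1 also explains why a provider with a single successful run shows ± 0.00 instead of an undefined value.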
=== Final Benchmark Results ===
Current time: 2025-01-30T07:23:37.790Z
Test prompt: What is the capital of France?
DeepSeek : Speed: 24.34 tokens/s, Total: 424 tokens, Prompt: 12 tokens, Completion: 412 tokens, Time: 16.93s, Latency: 1.11s, Length: 1904 chars
Fireworks : Speed: 21.41 tokens/s, Total: 373 tokens, Prompt: 10 tokens, Completion: 363 tokens, Time: 16.96s, Latency: 2.16s, Length: 1785 chars
Together : Speed: 18.65 tokens/s, Total: 393 tokens, Prompt: 10 tokens, Completion: 383 tokens, Time: 20.54s, Latency: 0.56s, Length: 1782 chars
DeepInfra : Speed: 6.87 tokens/s, Total: 73 tokens, Prompt: 10 tokens, Completion: 63 tokens, Time: 9.16s, Latency: 1.04s, Length: 297 chars
Azure : Speed: 5.54 tokens/s, Total: 372 tokens, Prompt: 10 tokens, Completion: 362 tokens, Time: 65.34s, Latency: 5.35s, Length: 1783 chars
Full outputs:
- Check outputs directory for full outputs
A timeout of 10 seconds is used for all providers.
If the API does not respond within 10 seconds, the provider is skipped for that run.
This is why some providers are missing data.
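One way to implement this kind of skip-on-timeout behaviour is to race each provider call against a timer. The sketch below is illustrative only: `withTimeout` and `runOrSkip` are hypothetical names, and the benchmark script may implement its timeout differently.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical wrapper: a provider that times out or fails yields null
// instead of aborting the whole benchmark run.
async function runOrSkip(name, callProvider, ms = 10_000) {
  try {
    return await withTimeout(callProvider(), ms);
  } catch (err) {
    console.warn(`${name}: skipped (${err.message})`);
    return null;
  }
}
```

Note that `Promise.race` only stops waiting for the request; it does not cancel it. Cancelling the underlying HTTP call would need something like an `AbortController` passed to the client, if the client supports one.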
- Install dependencies:
npm install
- Create a .env file in the root directory. Follow the sample in the .env.example file to set up your API keys.
- Make sure you have Node.js version 20 or higher installed.
Run the benchmark:
npm run benchmark # Regular benchmark
npm run benchmark-show-output # Show the API response while benchmarking
npm run analyze-speed # Analyze the speed of the API
The script will measure:
- Total tokens generated
- Response time
- First response latency
- Tokens per second
- Prompt and completion token counts
The benchmark script sends a standardized prompt to each provider's DeepSeek R1 API and measures:
- The time taken to receive the complete response
- The number of tokens in both the prompt and response
- The overall processing speed in tokens per second
This helps in understanding the real-world performance of the DeepSeek API in your specific environment and use case.
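The sample results above are consistent with speed being the completion token count divided by the total response time, latency included (for DeepSeek, 412 / 16.93 ≈ 24.34 tokens/s). The sketch below derives the reported metrics from three timestamps and the token counts under that assumption. The function and field names are hypothetical and are not taken from the benchmark script.

```javascript
// Hypothetical helper: derive the reported metrics from raw measurements.
// startMs:      when the request was sent
// firstChunkMs: when the first streamed chunk arrived
// endMs:        when the response finished
function computeMetrics({ startMs, firstChunkMs, endMs, promptTokens, completionTokens }) {
  const timeSec = (endMs - startMs) / 1000;
  const latencySec = (firstChunkMs - startMs) / 1000;
  return {
    totalTokens: promptTokens + completionTokens,
    timeSec,
    latencySec,
    // Assumption: speed counts completion tokens only, over the full response time.
    tokensPerSecond: timeSec > 0 ? completionTokens / timeSec : 0,
  };
}

// Values taken from the DeepSeek line in the sample results.
const m = computeMetrics({
  startMs: 0,
  firstChunkMs: 1110,
  endMs: 16930,
  promptTokens: 12,
  completionTokens: 412,
});
console.log(m.totalTokens, m.tokensPerSecond.toFixed(2)); // 424 24.34
```

Because the denominator includes the first-response latency, a provider with a slow start but fast streaming scores lower here than it would on a pure generation-speed measure.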