From 2bfa234f05ec4c738f62067b5f5ea5f703b8e9f5 Mon Sep 17 00:00:00 2001
From: Casper
Date: Thu, 2 Nov 2023 19:28:15 +0100
Subject: [PATCH] Benchmark info (#138)

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 5e40f1f5..ca631ef5 100644
--- a/README.md
+++ b/README.md
@@ -197,6 +197,9 @@ generation_output = model.generate(
 
 ## Benchmarks
 
+- GPU: RTX 3090
+- Command: `python examples/benchmark --model_path `
+
 | Model Name | Version | Batch Size | Prefill Length | Decode Length | Prefill tokens/s | Decode tokens/s | Memory (VRAM) |
 |---------------|---------|------------|----------------|---------------|------------------|-----------------|------------------|
 | Vicuna 7B | GEMM | 1 | 64 | 64 | 2618.88 | 125.428 | 4.57 GB (19.31%) |