Fix Microbenchmark Profiling Memory Issues #597
Conversation
Force-pushed from 8253dbb to 3f28ee3.
Overall LGTM, but I don't think we need the inference_utils.py file.
def main(config):
  engine = maxengine.MaxEngine(config)
  params = engine.load_params()
-  prefill_lengths = [64, 128, 256, 512, 1024]
-  benchmark_loop_iters = 10
+  prefill_lengths = [int(l) for l in config.inference_microbenchmark_prefill_lengths.split(",")]
Does this work if you pass in a command line param like --inference-microbenchmark-prefill-lengths="512,1024"
or something similar?
Yeah, example commands:
- to run a single prefill length: `inference_microbenchmark_prefill_lengths=1024`
- to run a single stage: `inference_microbenchmark_stages=generate`
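For context, a rough sketch of how such overrides can be consumed, mirroring the diff above; the helper name and the stage handling here are assumptions, not the PR's exact code.

```python
# Rough sketch, not the PR's exact code: parse_microbenchmark_config and the
# stage handling below are assumptions based on the option names above.
def parse_microbenchmark_config(config):
  # "1024" -> [1024]; "64,128,256,512,1024" -> [64, 128, 256, 512, 1024]
  prefill_lengths = [int(l) for l in config.inference_microbenchmark_prefill_lengths.split(",")]
  # "generate" -> {"generate"}; "prefill,generate" -> {"prefill", "generate"}
  stages = set(config.inference_microbenchmark_stages.split(","))
  return prefill_lengths, stages
```

With overrides like the ones above, passing `inference_microbenchmark_stages=generate` would presumably skip the prefill benchmarks entirely.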
MaxText/inference_utils.py
I don't think we need to make a file solely for inference_utils at this time, especially since these functions are not unique to inference. I would add them to max_utils.py, since they cover fairly generic use cases.
inference_utils.py is an existing file, but I can move these functions to max_utils.py. I am working on batch inference, which needs some common utility functions.
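As an illustration of the kind of generic helpers in question (size accounting that could feed the cache_size/model_size arguments below, and freeing a prefill result as mentioned in the commit message), here is a hypothetical sketch; the names and signatures are assumptions, not the functions actually added to max_utils.py.

```python
# Hypothetical helpers of the "fairly generic" kind discussed above; names and
# exact behavior are illustrative assumptions, not the PR's code.
import jax

def summarize_pytree_bytes(pytree, name="pytree"):
  """Print and return the total element count and bytes held by a pytree of arrays."""
  leaves = jax.tree_util.tree_leaves(pytree)
  num_elements = sum(x.size for x in leaves)
  num_bytes = sum(x.size * x.dtype.itemsize for x in leaves)
  print(f"{name}: {num_elements} elements, {num_bytes / 1e9:.3f} GB")
  return num_elements, num_bytes

def delete_pytree(pytree):
  """Free device buffers held by a pytree, e.g. a prefill result that is no longer needed."""
  for leaf in jax.tree_util.tree_leaves(pytree):
    if hasattr(leaf, "delete"):
      leaf.delete()
```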
benchmark_results["AutoRegressive"], decode_state = ar_benchmark( | ||
config, engine, params, decode_state, iters=benchmark_loop_iters, cache_size=cache_size, model_size=model_size) |
For running just the generate benchmark, do you still need to populate the KV cache to produce proper perf numbers?
You will still need to initialize a decode_state for the generate step calculation.
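For concreteness, a minimal sketch of that initialization, assuming MaxEngine exposes an init_decode_state() entry point as in the JetStream engine API (the exact method name is an assumption):

```python
# Minimal sketch, assuming MaxEngine follows the JetStream engine API and
# exposes init_decode_state(); the exact method name is an assumption.
import maxengine  # as imported by the benchmark script

def make_initial_decode_state(config):
  engine = maxengine.MaxEngine(config)
  params = engine.load_params()
  decode_state = engine.init_decode_state()  # allocates decode buffers / KV cache
  return engine, params, decode_state
```

The generate-only benchmark can then step this decode_state repeatedly without running a prefill first.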
@@ -261,3 +261,8 @@ vertex_tensorboard_project: ""
# Region to create Vertex AI Tensorboard in for GCE, blank if running via XPK
# Vertex AI supported regions: https://cloud.google.com/vertex-ai/docs/general/locations#available-regions
vertex_tensorboard_region: ""
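The added lines themselves are collapsed in this view; based on the option names used in this conversation, the new base.yml entries presumably look something like the following (keys and defaults are assumptions):

```yaml
# Inference microbenchmark settings (sketch; exact keys and defaults are assumptions)
inference_microbenchmark_prefill_lengths: "64,128,256,512,1024"
inference_microbenchmark_stages: "prefill,generate"
inference_microbenchmark_loop_iters: 10
```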
Not in this PR, but I see a need for separate inference-specific config files in the future -- both base.yml and model-specific config.
Force-pushed from 210d7b2 to 38831be.
looks good
Force-pushed from f522aba to 50ae199.
Please remember to squash your commits.
Force-pushed from 3a3a1c5 to c8ed884.
Commit message:
- allow running specified stages
- allow running specific prefill length(s)
- delete prefill result
- print out prefill result
- added funcs in max_utils
Force-pushed from c8ed884 to b46783c.
Below are the changes in this PR.