
Add llama2 configs for GPU A3 #576

Merged · 1 commit · Apr 8, 2024

Conversation

michelle-yooh
Collaborator

No description provided.



# 1 node, DATA_DP=1, ICI_FSDP=8
python3 xpk/xpk.py workload create --cluster ${CLUSTER_NAME} --workload ${WORKLOAD_NAME} \
Collaborator

We don't put the XPK command inside the script -- please match the style we use elsewhere.

Collaborator

Also, unless I'm missing something, we could break this into 5 separate env files and 1 script?

Collaborator Author

I believe we need 5 different env files and configs.
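Following the reviewers' suggestion, the split could look roughly like the sketch below. The file name (1vm.env) and variable names are illustrative assumptions based on the quoted snippet above, not the PR's actual files:

```shell
# Illustrative sketch: one env file per topology, sourced by a single shared launcher.
# (File and variable names are hypothetical, not the PR's actual contents.)

# A per-topology env file, e.g. for the 1-node case:
cat > 1vm.env <<'EOF'
export NUM_NODES=1
export DATA_DP=1
export ICI_FSDP=8
EOF

# The shared launcher sources the selected env file, then builds the xpk command
# from these variables instead of hard-coding it per topology:
source ./1vm.env
echo "nodes=${NUM_NODES} dp=${DATA_DP} fsdp=${ICI_FSDP}"
# prints: nodes=1 dp=1 fsdp=8
```

With this layout, the other four topologies would only need their own env files; the launcher stays unchanged.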

@rwitten rwitten removed their assignment Apr 5, 2024
Collaborator

Shall we update --xla_dump_to=gs://runner-maxtext-logs/yooh/llama2-70b-$(date +%Y-%m-%d-%H-%M)/HLO_dumps/ to get rid of the personal directory and change 70b to 7b?
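Applying that suggestion, the flag would read roughly as below. The bucket path is taken from the comment above; the final form is of course the author's call:

```shell
# Suggested form: personal "yooh/" segment dropped, 70b changed to 7b
# to match the llama_2_7b config this file belongs to.
XLA_FLAG="--xla_dump_to=gs://runner-maxtext-logs/llama2-7b-$(date +%Y-%m-%d-%H-%M)/HLO_dumps/"
echo "$XLA_FLAG"
```

The $(date ...) suffix keeps each run's HLO dumps in a separate directory.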

Collaborator

The same applies to MaxText/configs/a3/llama_2_7b/8vm.sh.

Collaborator Author

Updated!

Collaborator

@yangyuwei yangyuwei left a comment

Thanks Michelle for the changes!

@michelle-yooh michelle-yooh force-pushed the yooh/gpu_llama_configs branch from 600eca9 to 1200dd7 Compare April 8, 2024 21:07
@copybara-service copybara-service bot merged commit 78daad1 into main Apr 8, 2024
9 checks passed
@copybara-service copybara-service bot deleted the yooh/gpu_llama_configs branch April 8, 2024 23:21
4 participants