
Add llama2 configs for GPU A3 #576

Merged · 1 commit · Apr 8, 2024

Conversation

michelle-yooh
Collaborator

No description provided.



# 1 node, DATA_DP=1, ICI_FSDP=8
python3 xpk/xpk.py workload create --cluster ${CLUSTER_NAME} --workload ${WORKLOAD_NAME} \
Collaborator

We don't put the XPK command inside the script -- please match the style we use elsewhere.

Collaborator

Also, unless I'm missing something, we could break this into 5 separate env files and 1 script?

Collaborator Author

I believe we need 5 different env files and configs.
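Following the reviewers' suggestion, the split could look roughly like the sketch below. The file name (1vm.env) and variable names are illustrative assumptions based on the quoted snippet above, not the PR's actual files:

```shell
# Illustrative sketch: one env file per topology, sourced by a single shared launcher.
# (File and variable names are hypothetical, not the PR's actual contents.)

# A per-topology env file, e.g. for the 1-node case:
cat > 1vm.env <<'EOF'
export NUM_NODES=1
export DATA_DP=1
export ICI_FSDP=8
EOF

# The shared launcher sources the selected env file, then builds the xpk command
# from these variables instead of hard-coding it per topology:
source ./1vm.env
echo "nodes=${NUM_NODES} dp=${DATA_DP} fsdp=${ICI_FSDP}"
# prints: nodes=1 dp=1 fsdp=8
```

With this layout, the other four topologies would only need their own env files; the launcher stays unchanged.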

@rwitten rwitten removed their assignment Apr 5, 2024
Collaborator

Shall we update --xla_dump_to=gs://runner-maxtext-logs/yooh/llama2-70b-$(date +%Y-%m-%d-%H-%M)/HLO_dumps/ to get rid of the personal directory and change 70b to 7b?
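Applying that suggestion, the flag would read roughly as below. The bucket path is taken from the comment above; the final form is of course the author's call:

```shell
# Suggested form: personal "yooh/" segment dropped, 70b changed to 7b
# to match the llama_2_7b config this file belongs to.
XLA_FLAG="--xla_dump_to=gs://runner-maxtext-logs/llama2-7b-$(date +%Y-%m-%d-%H-%M)/HLO_dumps/"
echo "$XLA_FLAG"
```

The $(date ...) suffix keeps each run's HLO dumps in a separate directory.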

Collaborator

The same applies to MaxText/configs/a3/llama_2_7b/8vm.sh.

Collaborator Author

Updated!

Collaborator

@yangyuwei yangyuwei left a comment

Thanks Michelle for the changes!

@michelle-yooh michelle-yooh force-pushed the yooh/gpu_llama_configs branch from 600eca9 to 1200dd7 Compare April 8, 2024 21:07
@copybara-service copybara-service bot merged commit 78daad1 into main Apr 8, 2024
9 checks passed
@copybara-service copybara-service bot deleted the yooh/gpu_llama_configs branch April 8, 2024 23:21
4 participants