Model training guide #1406
base: main
Here is the summary of changes. You are about to add 14 region tags.
This comment is generated by snippet-bot.
Thanks for this pull-request, @ganochenkodg!
Overall, looking good. I'm guessing this is still work-in-progress?
Let me know when you're ready to merge. :)
I left some comments.
@@ -0,0 +1,32 @@
---
Issue: Looks like we'll need "Copyright 2024 Google LLC" Apache license headers for most of these files.
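For reference, a sketch of what such a header might look like at the top of a Terraform file. This is the standard Apache 2.0 boilerplate notice in `#` comment syntax; the exact wording and URL scheme this repository requires are an assumption and should be checked against its contributing guidelines.

```terraform
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```

The same text should work unchanged in the YAML and Python files, since both also use `#` for comments.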
updated
}

# [START gke_model_train_standard_private_regional_cluster]
module "training_cluster" {
Question: Will the Terraform (from the repo) be embedded in the tutorial as well?
What do you mean by "embedded"? In the guide, we run "git clone" and "terraform apply" with this code.
image: quay.io/jupyter/pytorch-notebook:cuda11-python-3.11
#image: tensorflow/tensorflow:2.17.0-gpu-jupyter
#command: [ "/bin/bash", "-c", "--" ]
#args: [ "while true; do sleep 30; done;" ]
Question: Is this (chunk of commented-out code) intentional?
fixed
Suggest: Judging from the requirements listed here, we'll need a README.md file for the ai-ml/model-train folder. Could you please add a README.md file with a link to the cloud.google.com tutorial where these samples will be used?
Added, but the link is empty; we won't know it until the guide is published.
@NimJay thank you for all your comments. We will try to fix them soon.
Terraform, YAML, and Python code for the model training guide.