[GSoC] Add New Parameter in tune
#2369
ref issue: #2340
PTAL👀 @andreyvelich @johnugeorge
@andreyvelich I added my code here: katib/pkg/webhook/v1beta1/pod/inject_webhook.go Lines 140 to 142 in db17214

    // Pass env variable KATIB_TRIAL_NAME to training containers using fieldPath.
    for idx := range mutatedPod.Spec.Containers {
        if mutatedPod.Spec.Containers[idx].Env == nil {
            mutatedPod.Spec.Containers[idx].Env = []v1.EnvVar{}
        }
        mutatedPod.Spec.Containers[idx].Env = append(
            mutatedPod.Spec.Containers[idx].Env,
            v1.EnvVar{
                Name: consts.EnvTrialName,
                ValueFrom: &v1.EnvVarSource{
                    FieldRef: &v1.ObjectFieldSelector{
                        FieldPath: fmt.Sprintf("metadata.labels['%s']", consts.LabelTrialName),
                    },
                },
            },
        )
    }

PTAL👀. Thank you for your review!
/area gsoc
I don't think, we should mutate
Please rebase your PR as well @Electronic-Waste
@Electronic-Waste Additionally, I think we forgot to add this line to the post-gen script of Katib SDK:

    # Import Katib report metrics functions
    from kubeflow.katib.api.report_metrics import report_metrics

Similar to how we include other imports to the
@Electronic-Waste Please update this script, otherwise after we re-generate the SDK, your import will be deleted.
Signed-off-by: Electronic-Waste <2690692950@qq.com>
3811459 to e73826f
@andreyvelich Done! I've rebased the branch, modified the code to pass the env variable only to the primary container, and added the import to the post-gen script of the Katib SDK.
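For readers following the thread, the change from "all containers" to "primary container only" can be sketched roughly as below. This is a minimal illustration, not the code merged in this PR: the types are local stand-ins for the Kubernetes core/v1 types so the snippet compiles on its own, the function name injectTrialNameEnv is hypothetical, and the label key and container names are assumptions.

```go
package main

import "fmt"

// Local stand-ins for the Kubernetes core/v1 types (assumption: only the
// fields used in this sketch are modeled).
type ObjectFieldSelector struct{ FieldPath string }
type EnvVarSource struct{ FieldRef *ObjectFieldSelector }
type EnvVar struct {
	Name      string
	ValueFrom *EnvVarSource
}
type Container struct {
	Name string
	Env  []EnvVar
}
type Pod struct{ Containers []Container }

const (
	envTrialName   = "KATIB_TRIAL_NAME"
	labelTrialName = "katib.kubeflow.org/trial" // assumed label key
)

// injectTrialNameEnv is a hypothetical helper: it appends KATIB_TRIAL_NAME,
// resolved from the Pod label via fieldPath, to the container named primary
// and leaves every other container untouched.
func injectTrialNameEnv(pod *Pod, primary string) error {
	for i := range pod.Containers {
		c := &pod.Containers[i]
		if c.Name != primary {
			continue
		}
		c.Env = append(c.Env, EnvVar{
			Name: envTrialName,
			ValueFrom: &EnvVarSource{
				FieldRef: &ObjectFieldSelector{
					FieldPath: fmt.Sprintf("metadata.labels['%s']", labelTrialName),
				},
			},
		})
		return nil
	}
	return fmt.Errorf("primary container %q not found", primary)
}

func main() {
	// Container names here are illustrative only.
	pod := &Pod{Containers: []Container{{Name: "training-container"}, {Name: "sidecar"}}}
	if err := injectTrialNameEnv(pod, "training-container"); err != nil {
		panic(err)
	}
	fmt.Println(len(pod.Containers[0].Env), len(pod.Containers[1].Env))
}
```

Returning an error when the primary container is missing is a design choice of this sketch; the real webhook may handle that case differently.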
@andreyvelich Thank you for your detailed review!
@@ -140,15 +141,37 @@ func (s *SidecarInjector) Mutate(pod *v1.Pod, namespace string) (*v1.Pod, error)
    // Add Katib Trial labels to the Pod metadata.
    mutatePodMetadata(mutatedPod, trial)

    // Pass env variable KATIB_TRIAL_NAME to the primary container using fieldPath.
@Electronic-Waste Additionally, since we always pass the Trial name to the training container, we should not allow passing the Trial name and namespace via the template generator.
I guess it's not friendly to users. Since junior users are unfamiliar with that, they may be puzzled about why they can't pass Trial's namespace and name to the training container. It's more straightforward for them to simply define a trialParameter trialName and use it in the pod env, as the example suggests.
Also, it's more backward-compatible to just allow users to pass the Trial name and namespace via trialParameter and pod env. WDYT👀 @andreyvelich @tenzen-y @johnugeorge.
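To make the trialParameter approach discussed above concrete, here is a simplified sketch of placeholder substitution in a Trial template. It assumes the ${trialParameters.<name>} placeholder form and is not Katib's actual template generator; the helper name renderTrialTemplate and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// renderTrialTemplate is a hypothetical helper: it replaces every
// ${trialParameters.<name>} placeholder in tmpl with the matching value.
func renderTrialTemplate(tmpl string, params map[string]string) string {
	for name, value := range params {
		placeholder := fmt.Sprintf("${trialParameters.%s}", name)
		tmpl = strings.ReplaceAll(tmpl, placeholder, value)
	}
	return tmpl
}

func main() {
	// The user declares a trialParameter named trialName and references it
	// in the container env of the Trial template.
	tmpl := "env:\n- name: TRIAL_NAME\n  value: ${trialParameters.trialName}\n"
	fmt.Print(renderTrialTemplate(tmpl, map[string]string{"trialName": "example-trial"}))
}
```

Because the substitution works on the whole template text, the same parameter could also appear in other parts of the Pod spec, which is what makes this route more general than an injected environment variable.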
Since junior users are unfamiliar with that, they may be puzzled about why they can't pass Trial's namespace and name to the training container.
Since we always pass KATIB_TRIAL_NAME to the Trial's Pods, users will always be able to read this environment variable in their training code, won't they?
However, this can be useful for other parts of the Pod spec: what if a user wants to generate a different volume that contains the Trial name?
@Electronic-Waste Can you please create an issue to track this discussion? I think we should discuss the future of Trial metadata parameters.
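As a rough illustration of the point that training code can always read the injected variable, the sketch below looks up KATIB_TRIAL_NAME and fails clearly when it is absent. Training code for Katib is often written in Python; Go is used here only to match the webhook code in this PR. The function name trialName and the injectable lookup parameter are assumptions made to keep the snippet easy to test.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

const envTrialName = "KATIB_TRIAL_NAME"

// trialName is a hypothetical helper: it resolves the Trial name through the
// given lookup function (os.LookupEnv in production) and returns an error
// when the variable is unset or empty.
func trialName(lookup func(string) (string, bool)) (string, error) {
	name, ok := lookup(envTrialName)
	if !ok || name == "" {
		return "", errors.New(envTrialName + " is not set")
	}
	return name, nil
}

func main() {
	name, err := trialName(os.LookupEnv)
	if err != nil {
		fmt.Println("not running inside a Katib Trial:", err)
		return
	}
	fmt.Println("running as trial", name)
}
```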
@andreyvelich I've wrapped the code into
Thank you @Electronic-Waste!
just a small comment from me.
I think we should be ready to merge it.
Thanks for your contribution @tenzen-y 🎉
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: andreyvelich
* chore: add metrics_collector_config in tune function.
* rebase: rebase feat/new-param-tune to master.
* chore: add metrics collector kind list in comment.
* fix: always pass Trial name to the training container.
* fix: delete passing env variable logics in katib_client.py
* fix: passing env variable KATIB_TRIAL_NAME in the webhook of pod.
* fix: pass env variable KATIB_TRIAL_NAME only to the primary container.
* chore: add report_metrics in post_gen.py.
* fix: change nil error to allErrs(deleted by accident).
* fix: fix lint error in inject_webhook.go.
* fix: wrap env variables passing logics into mutatePodEnv.
* chore: add unit tests for mutatePodEnv.
* fix: delete protocmp.
---------
Signed-off-by: Electronic-Waste <2690692950@qq.com>
What this PR does / why we need it:
I added a new parameter, metrics_collector_config, to the tune function. Design details: https://github.com/kubeflow/katib/blob/cc95ef03cb25df3d86a1a1c20c9c69bad17fce92/docs/proposals/push-based-metrics-collection.md#add-new-parameter-in-tune
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
Checklist: