Job-like workload supported by WorkloadSpread #1838
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #1838 +/- ##
==========================================
+ Coverage 47.91% 50.95% +3.04%
==========================================
Files 162 194 +32
Lines 23491 25288 +1797
==========================================
+ Hits 11256 12886 +1630
- Misses 11014 11099 +85
- Partials 1221 1303 +82
02f7b9c to 02df344
0123426 to c5c9fa3
c5c9fa3 to 969bbea
/lgtm
@@ -120,6 +121,7 @@ func (r *ControllerFinder) GetPodsForRef(apiVersion, kind, ns, name string, acti
FieldSelector: fields.SelectorFromSet(fields.Set{fieldindex.IndexNameForOwnerRefUID: string(uid)}),
}
pods, err := listPods(&listOption)
klog.V(5).InfoS("result of list pods with owner ref uid", "pods", len(pods), "err", err, "refUid", uid)
A UID is hard to read in logs. Please change this to log a more log-friendly key.
It is very useful for checking whether the pods are listed properly while debugging.
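One way to address both points is to keep the UID and log a readable workload key next to it. The following is a minimal sketch, not code from this PR: `refKey` is a hypothetical helper, the TFJob name and UID are placeholders, and the standard-library `log/slog` package stands in for klog so the example runs without extra dependencies. In the real code, `klog.KRef(namespace, name)` could supply a similar namespace/name key.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
)

// refKey is a hypothetical helper (not part of Kruise). It renders a workload
// reference as "Kind/namespace/name" so a log line can be matched to a
// workload without looking up its UID.
func refKey(kind, namespace, name string) string {
	if namespace == "" {
		// Cluster-scoped objects have no namespace segment.
		return fmt.Sprintf("%s/%s", kind, name)
	}
	return fmt.Sprintf("%s/%s/%s", kind, namespace, name)
}

func main() {
	// slog is used here only as a stand-in for klog's structured logging.
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))

	uid := "example-uid" // placeholder, not a real object UID
	logger.Info("result of list pods with owner ref uid",
		"ref", refKey("TFJob", "default", "mnist"), // readable key
		"refUid", uid, // UID kept for exact matching
		"pods", 3,
	)
}
```

The UID stays in the log line because it is what the field selector actually matches on; the readable key only makes the line easier to find.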
@@ -91,6 +91,7 @@ func (r *ControllerFinder) GetPodsForRef(apiVersion, kind, ns, name string, acti
labelSelector = obj.Selector
workloadUIDs = append(workloadUIDs, obj.UID)
}
klog.V(5).InfoS("find pods and replicas result", "workloadReplicas", workloadReplicas, "workloadUIDs", workloadUIDs, "labelSelector", labelSelector)
A UID is hard to read in logs. Please change this to log a more log-friendly key.
It is very useful for checking whether the pods are listed properly while debugging.
@@ -791,11 +800,13 @@ func initializeWorkloadsInWhiteList(c client.Client) {
})
}
}
klog.InfoS("initialized workload list", "workloads", workloads)
Also add "workloadSpread", klog.KObj(ws) to this log call.
This function has a global effect and runs only once, so there is no single WorkloadSpread to log here.
…ge only a part of Pods owned by a target workload to support AI workloads like TFJob. And it also provides support for workloads without replicas. Signed-off-by: AiRanthem <zhongtianyun.zty@alibaba-inc.com>
Signed-off-by: AiRanthem <zhongtianyun.zty@alibaba-inc.com>
5d99bfa to cda1e9b
/lgtm
* A TargetFilter is added to WorkloadSpread to make it possible to manage only a part of Pods owned by a target workload to support AI workloads like TFJob. And it also provides support for workloads without replicas.
  Signed-off-by: AiRanthem <zhongtianyun.zty@alibaba-inc.com>
* fix some logs
  Signed-off-by: AiRanthem <zhongtianyun.zty@alibaba-inc.com>
---------
Signed-off-by: AiRanthem <zhongtianyun.zty@alibaba-inc.com>
AI scenario is supported by WorkloadSpread
Ⅰ. Describe what this PR does
A targetFilter is added to WorkloadSpread so that it can manage only a subset of the Pods owned by a target workload. It also adds support for workloads without replicas.
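To illustrate the idea of managing only part of a workload's Pods, here is a minimal sketch, assuming the filter selects Pods by label. It is not the implementation in this PR: the `Pod` type, `filterPods`, `matchesAll`, and the `role` label are all hypothetical, simplified stand-ins for the Kubernetes objects involved.

```go
package main

import "fmt"

// Pod is a simplified stand-in for a Kubernetes Pod, holding only the
// fields this sketch needs.
type Pod struct {
	Name   string
	Labels map[string]string
}

// matchesAll reports whether labels contains every key/value pair in want.
func matchesAll(labels, want map[string]string) bool {
	for k, v := range want {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// filterPods (hypothetical) keeps only the Pods whose labels match
// matchLabels. An empty or nil filter keeps every Pod.
func filterPods(pods []Pod, matchLabels map[string]string) []Pod {
	var out []Pod
	for _, p := range pods {
		if matchesAll(p.Labels, matchLabels) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Pods owned by one job-like workload, split by an assumed "role" label.
	pods := []Pod{
		{Name: "job-ps-0", Labels: map[string]string{"role": "ps"}},
		{Name: "job-worker-0", Labels: map[string]string{"role": "worker"}},
		{Name: "job-worker-1", Labels: map[string]string{"role": "worker"}},
	}
	// Only the worker Pods would be handed to the spreading logic.
	for _, p := range filterPods(pods, map[string]string{"role": "worker"}) {
		fmt.Println(p.Name)
	}
}
```

Run as written, this prints `job-worker-0` and `job-worker-1`; the `ps` Pod is left out even though the same workload owns it.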
Ⅱ. Does this pull request fix one issue?
fixes #1818
Ⅲ. Describe how to verify it
Jobs like TFJob are now supported. To verify, apply a WorkloadSpread to one of them.
Ⅳ. Special notes for reviews