Implement extension plugin framework #588

hzxuzhonghu · 2025-12-24T06:54:19Z

What type of PR is this?

/kind enhancement

What this PR does / why we need it:

Which issue(s) this PR fixes:
For #563

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

gemini-code-assist · 2025-12-24T06:54:38Z

Summary of Changes

Hello @hzxuzhonghu, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request proposes and details a new plugin framework for the ModelServing controller. The primary goal is to provide a flexible and extensible mechanism for users to customize the Pods generated by the controller, accommodating diverse inference engines and accelerator types without requiring direct modifications to the core ModelServing API. This framework introduces a plugins field in the ModelServingSpec, defines specific hook points for plugin invocation during the Pod lifecycle, and enables controlled mutation of Pod specifications.

Highlights

New spec.plugins field: Introduces a new optional field in ModelServingSpec to enable one or more plugins, each with a name, type, and configuration.
Defined Hook Points: Specifies initial hook points (OnPodCreate, OnPodReady) where plugins can be invoked, allowing for Pod mutation before creation and side-effect-free actions on readiness.
Pod Spec Mutation: Plugins gain the ability to mutate various aspects of the Pod spec, including labels, annotations, environment variables, volumes, node affinity, and scheduler hints.
Deterministic Plugin Execution: Plugins will execute in the order they are listed, with each plugin seeing the Pod object after previous mutations, ensuring predictable behavior.
Enhanced Observability: The framework includes mechanisms for observability, such as emitting Kubernetes Events for plugin application and errors, and extending ModelServing Status conditions.
Security and Stability Considerations: The proposal addresses security by initially restricting to built-in plugins and plans for future webhook/WASM support, along with strategies for API versioning and debuggability.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

hzxuzhonghu · 2025-12-24T06:55:29Z

cc @VanderChen @LiZhenCheng9527 @JesseStutler @zl-cheng

This is much like the Volcano Job plugin framework.

gemini-code-assist

Code Review

This pull request introduces a well-structured design proposal for a ModelServing plugin framework. The proposal is comprehensive, covering motivation, API changes, hook points, and implementation details. My review focuses on improving the clarity and consistency of the proposed API, ensuring that the design is robust and easy for developers to understand and use. The feedback includes suggestions to formally document all API fields shown in examples, clarify ambiguous fields, resolve inconsistencies in function signatures, and enhance the context provided to plugins.

gemini-code-assist · 2025-12-24T06:55:47Z

docs/proposal/modelserving-plugin-framework.md

+  - `name` (string) – plugin identifier
+  - `type` (string) – plugin kind (e.g., `BuiltIn`, `Webhook`), to allow future extension
+  - `config` (raw JSON) – plugin-specific configuration (opaque to the API server)
+  - `scope` (optional) – apply to entry/worker, roles, or all


The scope field is a great addition for targeting plugins. To improve clarity, it would be beneficial to better define its possible values and behavior. For example:

What are the exact enum values? e.g., All, Role, PodType.

If scope is Role, how does a user specify which roles to apply the plugin to? Perhaps scope could be an object like scope: { roles: ["role-a", "role-b"] }.

What is the distinction between entry/worker and roles? Does entry/worker apply to all roles but only to specific pod types within them?

Clarifying these aspects will make the feature easier to understand and use correctly.
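
One possible answer to the questions above is to model scope as an object that combines a role list with a pod-type target. The sketch below is purely illustrative: `PluginScope`, `PodTarget`, the field names, and the enum values are assumptions made for this example and are not taken from the proposal or the merged API.

```go
package main

import "fmt"

// PodTarget is a hypothetical enum selecting which pods within a role match.
type PodTarget string

const (
	TargetAll    PodTarget = "All"
	TargetEntry  PodTarget = "Entry"
	TargetWorker PodTarget = "Worker"
)

// PluginScope is a hypothetical scope object: an optional role filter
// combined with an optional pod-type filter.
type PluginScope struct {
	Roles  []string  // empty means every role
	Target PodTarget // empty is treated like TargetAll
}

// Matches reports whether a pod with the given role and entry/worker kind
// falls inside the scope. A nil scope matches everything.
func (s *PluginScope) Matches(roleName string, isEntry bool) bool {
	if s == nil {
		return true
	}
	if len(s.Roles) > 0 {
		found := false
		for _, r := range s.Roles {
			if r == roleName {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	switch s.Target {
	case TargetEntry:
		return isEntry
	case TargetWorker:
		return !isEntry
	default:
		return true
	}
}

func main() {
	scope := &PluginScope{Roles: []string{"role-a"}, Target: TargetWorker}
	fmt.Println(scope.Matches("role-a", false)) // true
	fmt.Println(scope.Matches("role-a", true))  // false
	fmt.Println(scope.Matches("role-b", false)) // false
}
```

Under this reading, roles and entry/worker are independent filters that are ANDed together, which is one way to resolve the ambiguity raised in the third question.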

gemini-code-assist · 2025-12-24T06:55:47Z

docs/proposal/modelserving-plugin-framework.md

+        gpuCount: 1
+    - name: huawei-ascend-defaults
+      type: BuiltIn
+      disabled: true


The example YAML includes a disabled: true field, which is a very useful feature for toggling plugins without removing them from the spec. However, this field is not mentioned in the API changes description (lines 65-69). It should be formally documented as part of the plugin API to ensure it is a supported and well-understood feature.

hzxuzhonghu · 2026-01-05

Good catch, will remove.

gemini-code-assist · 2025-12-24T06:55:48Z

docs/proposal/modelserving-plugin-framework.md

+
+Initial hook points:
+
+- `OnPodCreate(ctx, req) -> (podMutations, error)`


There's a discrepancy in the described signature for the OnPodCreate hook between this section and the Go interface draft.

Here on line 106, the signature is -> (podMutations, error), suggesting it returns a patch.

In the Go interface draft on line 194, it's ... error, implying in-place mutation of the Pod object within the HookRequest.

For built-in plugins, in-place mutation is generally more straightforward. It would be good to clarify this. For instance, you could state that the (podMutations, error) return signature is intended for future remote plugins (like webhooks), while built-in plugins perform in-place mutations. Please update the documentation to be consistent.

docs/proposal/modelserving-plugin-framework.md

Copilot

Pull request overview

This PR introduces a design proposal for a ModelServing plugin framework that enables users to customize Pods generated by the ModelServing controller for different inference engines and accelerators through an extensible plugin mechanism.

Key changes:

Proposes a new spec.plugins field in ModelServing API for configuring optional plugins
Defines hook points (OnPodCreate, OnPodReady) for plugin lifecycle integration
Outlines implementation strategy with built-in plugins as the initial approach

docs/proposal/modelserving-plugin-framework.md

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 12 comments.

pkg/model-serving-controller/controller/model_serving_controller.go

client-go/applyconfiguration/workload/v1alpha1/pluginscope.go

client-go/applyconfiguration/workload/v1alpha1/modelservingspec.go

pkg/model-serving-controller/plugins/manager.go

pkg/model-serving-controller/controller/model_serving_controller.go

client-go/applyconfiguration/workload/v1alpha1/pluginscope.go

pkg/apis/workload/v1alpha1/model_serving_types.go

pkg/model-serving-controller/plugins/manager.go

pkg/model-serving-controller/controller/model_serving_controller.go

hzxuzhonghu · 2026-01-05T01:35:16Z

ping @VanderChen @LiZhenCheng9527 @JesseStutler @zl-cheng ptal

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

pkg/model-serving-controller/controller/model_serving_controller.go:1238

The controller integration for plugin hooks lacks test coverage. While the plugin manager itself has comprehensive unit tests in manager_test.go, there are no tests validating that the controller correctly calls the plugin hooks during pod creation (CreatePodsByRole) and readiness (handleReadyPod). Tests should verify that plugins are invoked at the correct lifecycle points and that errors are properly handled.

func (c *ModelServingController) CreatePodsByRole(ctx context.Context, role workloadv1alpha1.Role, mi *workloadv1alpha1.ModelServing, roleIndex int, servingGroupOrdinal int, revision string) error {
	servingGroupName := utils.GenerateServingGroupName(mi.Name, servingGroupOrdinal)
	// TODO: build the plugin chain only once for the ModelServing
	chain, err := c.buildPluginChain(mi)
	if err != nil {
		return fmt.Errorf("build plugin chain: %w", err)
	}
	roleID := utils.GenerateRoleID(role.Name, roleIndex)

	entryPod := utils.GenerateEntryPod(role, mi, servingGroupName, roleIndex, revision)
	taskName := c.gangManager.GenerateTaskName(role.Name, roleIndex)
	c.gangManager.AnnotatePodWithPodGroup(entryPod, mi, servingGroupName, taskName)
	entryReq := &plugins.HookRequest{
		ModelServing: mi,
		ServingGroup: servingGroupName,
		RoleName:     role.Name,
		RoleID:       roleID,
		IsEntry:      true,
		Pod:          entryPod,
	}
	if err := chain.OnPodCreate(ctx, entryReq); err != nil {
		return fmt.Errorf("execute OnPodCreate failed for entry pod %s: %v", entryPod.Name, err)
	}
	_, err = c.kubeClientSet.CoreV1().Pods(mi.Namespace).Create(ctx, entryPod, metav1.CreateOptions{})
	if err != nil && !apierrors.IsAlreadyExists(err) {
		return fmt.Errorf("failed to create entry pod %s: %v", entryPod.Name, err)
	}

	serviceSelector := map[string]string{
		workloadv1alpha1.GroupNameLabelKey: servingGroupName,
		workloadv1alpha1.RoleLabelKey:      role.Name,
		workloadv1alpha1.RoleIDKey:         utils.GenerateRoleID(role.Name, roleIndex),
		workloadv1alpha1.EntryLabelKey:     utils.Entry,
	}
	if role.WorkerTemplate != nil {
		if err := utils.CreateHeadlessService(ctx, c.kubeClientSet, mi, serviceSelector, servingGroupName, role.Name, roleIndex); err != nil {
			return fmt.Errorf("failed to create headless service: %v", err)
		}
	}

	for i := 1; i <= int(role.WorkerReplicas); i++ {
		workerPod := utils.GenerateWorkerPod(role, mi, entryPod, servingGroupName, roleIndex, i, revision)
		c.gangManager.AnnotatePodWithPodGroup(workerPod, mi, servingGroupName, taskName)
		workerReq := &plugins.HookRequest{
			ModelServing: mi,
			ServingGroup: servingGroupName,
			RoleName:     role.Name,
			RoleID:       roleID,
			IsEntry:      false,
			Pod:          workerPod,
		}
		if err := chain.OnPodCreate(ctx, workerReq); err != nil {
			return fmt.Errorf("execute OnPodCreate failed for worker pod %s: %v", workerPod.Name, err)
		}
		_, err := c.kubeClientSet.CoreV1().Pods(mi.Namespace).Create(ctx, workerPod, metav1.CreateOptions{})
		if err != nil && !apierrors.IsAlreadyExists(err) {
			return fmt.Errorf("failed to create worker pod %s: %v", workerPod.Name, err)
		}
	}
	return nil
}

pkg/model-serving-controller/controller/model_serving_controller.go:790

The controller integration for plugin hooks lacks test coverage. While the plugin manager itself has comprehensive unit tests, there are no tests validating that the controller correctly calls the plugin ready hooks and handles errors appropriately. Tests should verify the hook invocation and error propagation in the ready pod handling flow.

func (c *ModelServingController) handleReadyPod(mi *workloadv1alpha1.ModelServing, servingGroupName string, newPod *corev1.Pod) error {
	chain, err := c.buildPluginChain(mi)
	if err != nil {
		return fmt.Errorf("build plugin chain: %w", err)
	}
	if chain != nil {
		if err := chain.OnPodReady(context.TODO(), &plugins.HookRequest{
			ModelServing: mi,
			ServingGroup: servingGroupName,
			RoleName:     utils.PodRoleName(newPod),
			RoleID:       utils.PodRoleID(newPod),
			IsEntry:      newPod.Labels[workloadv1alpha1.EntryLabelKey] == utils.Entry,
			Pod:          newPod,
		}); err != nil {
			return err
		}
	}

	// Add the running pod to the global storage and try to update the ServingGroup status
	c.store.AddRunningPodToServingGroup(types.NamespacedName{
		Namespace: mi.Namespace,
		Name:      mi.Name,
	}, servingGroupName, newPod.Name, utils.PodRevision(newPod), utils.PodRoleName(newPod), utils.PodRoleID(newPod))
	ready, err := c.checkServingGroupReady(mi, servingGroupName)
	if err != nil {
		return fmt.Errorf("failed to check ServingGroup status, err: %v", err)
	}
	if ready {
		// All pods in the ServingGroup are running, so the ServingGroup status also needs to be set to running
		err = c.store.UpdateServingGroupStatus(utils.GetNamespaceName(mi), servingGroupName, datastore.ServingGroupRunning)
		if err != nil {
			return fmt.Errorf("failed to set ServingGroup %s status: %v", servingGroupName, err)
		}
		klog.V(2).Infof("Update ServingGroup %s status to Running", servingGroupName)
		c.enqueueModelServing(mi)
	} else {
		klog.V(4).Infof("ServingGroup %s still creating", servingGroupName)
	}
	return nil
}

pkg/model-serving-controller/controller/model_serving_controller.go

Copilot · 2026-01-05T03:38:23Z

pkg/model-serving-controller/controller/model_serving_controller.go

+		return fmt.Errorf("build plugin chain: %w", err)
+	}
+	if chain != nil {
+		if err := chain.OnPodReady(context.TODO(), &plugins.HookRequest{


Using context.TODO() is inappropriate here. The parent context (ctx) should be passed through to maintain proper context propagation, cancellation signals, and deadline handling. This is especially important for plugin hooks that may perform external operations.

Suggested change

-		if err := chain.OnPodReady(context.TODO(), &plugins.HookRequest{
+		if err := chain.OnPodReady(context.Background(), &plugins.HookRequest{

LiZhenCheng9527 · 2026-01-06T08:31:30Z

pkg/apis/workload/v1alpha1/model_serving_types.go


+	// Plugins defines optional plugin chain to customize serving pods.
+	// +optional
+	Plugins []PluginSpec `json:"plugins,omitempty"`


nit: Should we create a separate plugin.go file to store plugin-related definitions? I believe this approach would be more concise and easier to extend.

LiZhenCheng9527 · 2026-01-06T08:43:04Z

pkg/model-serving-controller/plugins/manager.go

+
+// OnPodCreate executes plugins in order. Mutations are applied to req.Pod.
+func (c *Chain) OnPodCreate(ctx context.Context, req *HookRequest) error {
+	if c == nil {


If c == nil, a panic will already occur when this method is called, so I think the check here is unnecessary.

hzxuzhonghu · 2026-01-06

Nope, if c is nil, calling c.OnPodCreate will not panic. Only accessing one of its fields (for example c.fieldA) will panic.
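
The Go behaviour behind this exchange can be checked with a small program. The `Chain` type here is a toy with a single `plugins` field, and `SafeLen`, `UnsafeLen`, and `panics` are hypothetical helpers written only for this demonstration.

```go
package main

import "fmt"

// Chain is a toy type with one field, used only to demonstrate nil receivers.
type Chain struct {
	plugins []string
}

// SafeLen guards against a nil receiver before touching any field.
func (c *Chain) SafeLen() int {
	if c == nil {
		return 0
	}
	return len(c.plugins)
}

// UnsafeLen dereferences the receiver without a guard.
func (c *Chain) UnsafeLen() int {
	return len(c.plugins)
}

// panics reports whether calling f panicked.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	var c *Chain // nil pointer

	// Calling a pointer-receiver method on nil is fine; the guard handles it.
	fmt.Println(c.SafeLen()) // 0

	// The panic comes from the field access inside the method, not the call.
	fmt.Println(panics(func() { c.UnsafeLen() })) // true
}
```

This is why a `c == nil` guard at the top of a pointer-receiver method is meaningful: it lets callers hold a nil chain (for example, when no plugins are configured) and still call the hooks safely.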

LiZhenCheng9527 · 2026-01-06T08:44:12Z

pkg/model-serving-controller/plugins/manager.go

+
+// OnPodReady executes plugins' ready hooks in order.
+func (c *Chain) OnPodReady(ctx context.Context, req *HookRequest) error {
+	if c == nil {


hzxuzhonghu · 2026-01-12T06:55:28Z

@VanderChen @LiZhenCheng9527 @zl-cheng any other comments?

LiZhenCheng9527 · 2026-01-12T07:19:56Z

pkg/model-serving-controller/plugins/types.go

+	RoleName     string
+	RoleID       string
+	IsEntry      bool
+	Pod          *corev1.Pod


I have a question. The Pod specified here is the one waiting to be processed. Is IsEntry used to distinguish how entry and worker Pods are handled? Could we either target specific Pods for processing, or target the entry Pods or worker Pods of a particular role?

LiZhenCheng9527 · 2026-01-12T07:38:18Z

/lgtm

LiZhenCheng9527 · 2026-01-16T08:55:13Z

Is there any update on this PR?

hzxuzhonghu · 2026-01-19T01:27:57Z

Will resolve conflict today

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.

pkg/apis/workload/v1alpha1/model_serving_types.go

pkg/model-serving-controller/controller/model_serving_controller.go

pkg/apis/workload/v1alpha1/model_serving_types.go

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

LiZhenCheng9527 · 2026-01-22T07:00:04Z

/lgtm
/approve

volcano-sh-bot · 2026-01-22T07:00:14Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LiZhenCheng9527

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~charts/OWNERS~~ [LiZhenCheng9527]
~~client-go/OWNERS~~ [LiZhenCheng9527]
~~docs/OWNERS~~ [LiZhenCheng9527]
~~pkg/OWNERS~~ [LiZhenCheng9527]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI review requested due to automatic review settings December 24, 2025 06:54

volcano-sh-bot added the kind/enhancement New feature or request label Dec 24, 2025

volcano-sh-bot requested review from LiZhenCheng9527 and YaoZengzeng December 24, 2025 06:54

volcano-sh-bot added the size/L label Dec 24, 2025

Copilot started reviewing on behalf of hzxuzhonghu December 24, 2025 06:54

gemini-code-assist bot reviewed Dec 24, 2025

Copilot AI reviewed Dec 24, 2025

docs/proposal/modelserving-plugin-framework.md (resolved)

docs/proposal/modelserving-plugin-framework.md (outdated, resolved)

docs/proposal/modelserving-plugin-framework.md (outdated, resolved)

Copilot AI review requested due to automatic review settings January 4, 2026 06:27

hzxuzhonghu force-pushed the add-extension-proposal branch from 4320be5 to 6f07111 Compare January 4, 2026 06:27

volcano-sh-bot added size/XL and removed size/L labels Jan 4, 2026

Copilot started reviewing on behalf of hzxuzhonghu January 4, 2026 06:28

hzxuzhonghu changed the title ~~Add extension plugin framework design~~ Implement extension plugin framework Jan 4, 2026

Copilot AI reviewed Jan 4, 2026

volcano-sh-bot added size/XXL and removed size/XL labels Jan 4, 2026

Copilot AI review requested due to automatic review settings January 5, 2026 03:30

Copilot started reviewing on behalf of hzxuzhonghu January 5, 2026 03:31

Copilot AI reviewed Jan 5, 2026

LiZhenCheng9527 reviewed Jan 6, 2026

hzxuzhonghu force-pushed the add-extension-proposal branch from 55d4548 to 3b5c611 Compare January 7, 2026 02:23

LiZhenCheng9527 reviewed Jan 12, 2026

volcano-sh-bot assigned LiZhenCheng9527 Jan 12, 2026

volcano-sh-bot added the lgtm label Jan 12, 2026

hzxuzhonghu added 2 commits January 19, 2026 11:57

Add extension plugin framework design

92a022e

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

Cleanup webhook

cd5abe8

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

Copilot AI review requested due to automatic review settings January 19, 2026 04:11

hzxuzhonghu force-pushed the add-extension-proposal branch from 3b5c611 to cd5abe8 Compare January 19, 2026 04:11

volcano-sh-bot removed the lgtm label Jan 19, 2026

Copilot started reviewing on behalf of hzxuzhonghu January 19, 2026 04:12

Copilot AI reviewed Jan 19, 2026

update plugin target

2b10c4d

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

volcano-sh-bot added the lgtm label Jan 22, 2026

volcano-sh-bot added the approved label Jan 22, 2026

volcano-sh-bot merged commit 9ae5872 into volcano-sh:main Jan 22, 2026
15 checks passed

hzxuzhonghu deleted the add-extension-proposal branch January 22, 2026 09:48


