Skip to content

Conversation

@yashisrani
Copy link
Contributor

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #647

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
@gemini-code-assist
Copy link

Summary of Changes

Hello @yashisrani, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces crucial functionality for reporting the operational status of Gateway API resources (Gateways and HTTPRoutes) and custom InferencePool resources within the kthena-router system. By updating the status subresources of these objects, users gain immediate feedback on whether their configurations have been successfully processed and are active, significantly improving the observability and debuggability of the routing plane.

Highlights

  • Gateway API Status Updates: Introduced status updates for Gateway API Gateway resources, including Accepted and Programmed conditions for the Gateway itself and its listeners, providing real-time feedback on their operational state.
  • HTTPRoute Status Updates: Implemented status updates for Gateway API HTTPRoute resources, setting Accepted and ResolvedRefs conditions for parent Gateways managed by kthena-router, enhancing visibility into route resolution.
  • InferencePool Status Updates: Added status update capabilities for custom InferencePool resources, including a generic Accepted condition, which is updated via the dynamic Kubernetes client.
  • Controller Refactoring: Refactored controller constructors (NewGatewayController, NewHTTPRouteController, NewInferencePoolController) to accept respective Kubernetes client interfaces (gatewayclientset.Interface and dynamic.Interface), enabling direct API interaction for status updates.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces status updates for Gateway API resources (Gateway, HTTPRoute, and InferencePool). The changes correctly pass the required clients to the controllers and add the logic to update the status subresource after processing an object. The status update logic for Gateway and HTTPRoute looks good. However, I've found several issues in the InferencePool status update implementation, which appears to be incomplete and contains dead code. I've also made minor suggestions to replace context.TODO() with context.Background() for better code clarity.

Comment on lines 168 to 234
func (c *InferencePoolController) updateInferencePoolStatus(inferencePool *inferencev1.InferencePool) error {
inferencePool = inferencePool.DeepCopy()

// In version 1.2.0, InferencePool status is per-parent.
// For now, we'll maintain a generic parent status if it's referenced by any HTTPRoute.
// This is a simplified implementation.

acceptedCond := metav1.Condition{
Type: "Accepted",
Status: metav1.ConditionTrue,
Reason: "Accepted",
Message: "InferencePool has been accepted by kthena-router",
LastTransitionTime: metav1.Now(),
ObservedGeneration: inferencePool.Generation,
}

// We don't easily have the parent Gateway here, so we skip adding parents for now
// but we could implement a logic to find them from the store.
// To satisfy the lint and provide some status, we'll just ensure the object is updated.

// If the user wants specific fields, we can add a dummy parent or find real ones.
// For now, let's just update the object to trigger any observers.

_ = acceptedCond

// Convert back to unstructured to update status
content, err := runtime.DefaultUnstructuredConverter.ToUnstructured(inferencePool)
if err != nil {
return fmt.Errorf("failed to convert InferencePool to unstructured: %w", err)
}

unstructuredObj := &unstructured.Unstructured{Object: content}
gvr := inferencev1.SchemeGroupVersion.WithResource("inferencepools")
_, err = c.dynamicClient.Resource(gvr).Namespace(inferencePool.Namespace).UpdateStatus(context.TODO(), unstructuredObj, metav1.UpdateOptions{})
return err
}

func (c *InferencePoolController) setInferencePoolCondition(inferencePool *inferencev1.InferencePool, parentRef inferencev1.ParentReference, newCond metav1.Condition) {
// Find or create parent status
var parentStatus *inferencev1.ParentStatus
for i := range inferencePool.Status.Parents {
if inferencePool.Status.Parents[i].ParentRef.Name == parentRef.Name {
parentStatus = &inferencePool.Status.Parents[i]
break
}
}

if parentStatus == nil {
inferencePool.Status.Parents = append(inferencePool.Status.Parents, inferencev1.ParentStatus{
ParentRef: parentRef,
ControllerName: inferencev1.ControllerName(ControllerName),
})
parentStatus = &inferencePool.Status.Parents[len(inferencePool.Status.Parents)-1]
}

// Update conditions in parent status
for i, cond := range parentStatus.Conditions {
if cond.Type == newCond.Type {
if cond.Status == newCond.Status && cond.Reason == newCond.Reason {
newCond.LastTransitionTime = cond.LastTransitionTime
}
parentStatus.Conditions[i] = newCond
return
}
}
parentStatus.Conditions = append(parentStatus.Conditions, newCond)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The implementation for updating InferencePool status has several issues:

  1. The updateInferencePoolStatus function is incomplete. It defines acceptedCond but doesn't use it, resulting in a call to UpdateStatus with an unmodified object. This is inefficient and confusing.
  2. The setInferencePoolCondition function is never called, making it dead code.
  3. The parent reference comparison in setInferencePoolCondition (line 209) is incorrect. It only compares the Name, but should also compare SectionName and Port to uniquely identify the parent, similar to isSameParentRef in httproute_controller.go.

As the comments note, finding the parent Gateway is the main challenge. Given these issues, I recommend either fully implementing the status update logic (including finding parents and fixing the comparison) or removing this partial implementation for now to avoid confusion. If you choose to remove it, a // TODO comment would be appropriate.

}
}

_, err := c.gatewayClient.GatewayV1().Gateways(gateway.Namespace).UpdateStatus(context.TODO(), gateway, metav1.UpdateOptions{})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using context.TODO() is generally discouraged as it's a placeholder for when the correct context is unknown. For background tasks like this that are not tied to a specific incoming request, context.Background() is a more appropriate choice. While a larger refactoring to pass a cancellable context down from the Run method would be ideal, switching to context.Background() is a good immediate improvement.

Suggested change
_, err := c.gatewayClient.GatewayV1().Gateways(gateway.Namespace).UpdateStatus(context.TODO(), gateway, metav1.UpdateOptions{})
_, err := c.gatewayClient.GatewayV1().Gateways(gateway.Namespace).UpdateStatus(context.Background(), gateway, metav1.UpdateOptions{})

c.setHTTPRouteParentStatus(httpRoute, parentStatus)
}

_, err := c.gatewayClient.GatewayV1().HTTPRoutes(httpRoute.Namespace).UpdateStatus(context.TODO(), httpRoute, metav1.UpdateOptions{})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using context.TODO() is generally discouraged as it's a placeholder for when the correct context is unknown. For background tasks like this that are not tied to a specific incoming request, context.Background() is a more appropriate choice. While a larger refactoring to pass a cancellable context down from the Run method would be ideal, switching to context.Background() is a good immediate improvement.

Suggested change
_, err := c.gatewayClient.GatewayV1().HTTPRoutes(httpRoute.Namespace).UpdateStatus(context.TODO(), httpRoute, metav1.UpdateOptions{})
_, err := c.gatewayClient.GatewayV1().HTTPRoutes(httpRoute.Namespace).UpdateStatus(context.Background(), httpRoute, metav1.UpdateOptions{})


unstructuredObj := &unstructured.Unstructured{Object: content}
gvr := inferencev1.SchemeGroupVersion.WithResource("inferencepools")
_, err = c.dynamicClient.Resource(gvr).Namespace(inferencePool.Namespace).UpdateStatus(context.TODO(), unstructuredObj, metav1.UpdateOptions{})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using context.TODO() is generally discouraged as it's a placeholder for when the correct context is unknown. For background tasks like this that are not tied to a specific incoming request, context.Background() is a more appropriate choice. While a larger refactoring to pass a cancellable context down from the Run method would be ideal, switching to context.Background() is a good immediate improvement.

Suggested change
_, err = c.dynamicClient.Resource(gvr).Namespace(inferencePool.Namespace).UpdateStatus(context.TODO(), unstructuredObj, metav1.UpdateOptions{})
_, err = c.dynamicClient.Resource(gvr).Namespace(inferencePool.Namespace).UpdateStatus(context.Background(), unstructuredObj, metav1.UpdateOptions{})

Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
c.setGatewayCondition(gateway, programmedCond)

// Update listener status
gateway.Status.Listeners = make([]gatewayv1.ListenerStatus, len(gateway.Spec.Listeners))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not write a setGatewayListenerConditions function?

Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
Copy link
Member

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read gateway status setting, some abnormal case like: no error is reported even we cannot startup a specific listener.

Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
portInfo.ShutdownFunc = cancel

// Start the server
lm.store.SetListenerStatus(config.GatewayKey, config.ListenerName, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain why set nil(// Clear listener status in store)

Comment on lines 460 to 470
// Update all listeners on this port with the error
lm.mu.RLock()
if pi, ok := lm.portListeners[p]; ok {
pi.mu.Lock()
pi.LastError = err
for _, l := range pi.Listeners {
lm.store.SetListenerStatus(l.GatewayKey, l.ListenerName, err)
}
pi.mu.Unlock()
}
lm.mu.RUnlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can only have one listener here, because L418. BTW, the lock in lock looks super risky.

	portInfo, exists := lm.portListeners[port]
	if !exists {


// Listener status tracking
listenerStatusMutex sync.RWMutex
listenerStatuses map[string]map[string]error // gatewayKey -> listenerName -> error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you donot actually cleanup when gw or listener deleted.

return c.updateGatewayStatus(gateway)
}

func (c *GatewayController) updateGatewayStatus(gateway *gatewayv1.Gateway) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add corresponding E2E to make sure the status setting for Gateway/HTTPRoute/InferencePool all work properly.

Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feat] Implement status fields of API: Gateway, HTTPRoute and InferencePool

5 participants