Skip to content

Conversation

cjqzhao
Copy link
Contributor

@cjqzhao cjqzhao commented Aug 22, 2025

Add graceful switch load balancing policy.

@dfawley @easwars

@cjqzhao cjqzhao force-pushed the add-gracefulswitch branch 2 times, most recently from 551c605 to ecd0f18 Compare August 22, 2025 17:16
@dfawley dfawley changed the title feat(grpc): add gracefulswitch feat(grpc): add gracefulswitch load balancing policy Aug 22, 2025
@cjqzhao cjqzhao force-pushed the add-gracefulswitch branch from ecd0f18 to 28869a7 Compare August 24, 2025 05:35
rename mock picker and remove spaces

make test picker private
@cjqzhao cjqzhao force-pushed the add-gracefulswitch branch from 28869a7 to 962fb29 Compare August 24, 2025 05:52
Comment on lines +46 to +48
/**
Struct for Graceful Switch.
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write a good docstring for this.

Struct for Graceful Switch.
*/
pub struct GracefulSwitchPolicy {
subchannel_to_policy: HashMap<WeakSubchannel, ChildKind>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not use the ChildManager for this policy? I was really hoping we could.

The goal was to make it so that no policy would need to manage its own tracking of children.

Ideally graceful switch could do something with the incoming resolver update so that the update sharder would produce 1 or 2 children with the proper configs.

If this is too much to change at the last minute for you, then we can figure out what TODOs to add.

Copy link
Contributor Author

@cjqzhao cjqzhao Aug 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it a lot easier to manage children directly when I looked into using ChildManager but I can look into redesigning. The challenge about this is that the ResolverUpdate from shard_update would not be used and there would have to be some other design. I think that changes will need to be made for the Child Manager due to the Child manager forwarding resolver updates to children even when unnecessary. I will draft an appropriate design and get approval from you and Easwar.

Comment on lines +67 to +70
match config.unwrap().convert_to::<Arc<GracefulSwitchLbConfig>>() {
Ok(cfg) => (*cfg).clone(),
Err(e) => panic!("convert_to failed: {e}"),
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the match on the Result instead of .unwrap()?

if update.service_config.as_ref().is_ok_and(|sc| sc.is_some()) {
return Err("can't do service configs yet".into());
}
let cfg: Arc<GracefulSwitchLbConfig> =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should be able to elide this type?

Comment on lines +63 to +65
if update.service_config.as_ref().is_ok_and(|sc| sc.is_some()) {
return Err("can't do service configs yet".into());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong. Graceful switch shouldn't need to do anything special with service configs. It should just forward it to all children.

Comment on lines +74 to +76
// Determine if we can switch the new policy in. If there is no children
// yet or the new policy isn't the same as the latest policy, then
// we can swap.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Determine if we can switch the new policy in. If there is no children
// yet or the new policy isn't the same as the latest policy, then
// we can swap.
// Determine if we can switch to the new policy. If there is no child
// yet or the new policy isn't the same as the previous policy, then
// we can swap.

if needs_switch {
target_child_kind = self.switch_to(config);
}
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add a scope here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to locking. I wanted to lock the managing_policy just to call resolve_update and then release it for resolve_child_controller.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants