Add CM switch_controller service timeout as parameter to spawner.py #1790

Open
wants to merge 14 commits into
base: master
from

christophfroehlich
Contributor

This is a workaround for ros-controls/gz_ros2_control#421 and ros-controls/gazebo_ros2_control#380

If the simulation is started in paused mode, one can now also configure the timeout for the switch_controller service call to complete (the simulation has to be unpaused before the controller switches can be performed).

Also fixes docs after changes to the spawners in #1562

@christophfroehlich christophfroehlich added backport-humble This label should be used by maintainers only! Label triggers PR backport to ROS2 humble. backport-iron This label should be used by maintainers only! Label triggers PR backport to ROS2 Iron. labels Oct 13, 2024

codecov bot commented Oct 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.01%. Comparing base (ec70ae1) to head (f68c491).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1790   +/-   ##
=======================================
  Coverage   88.01%   88.01%           
=======================================
  Files         121      121           
  Lines       12412    12416    +4     
  Branches     1109     1109           
=======================================
+ Hits        10924    10928    +4     
  Misses       1083     1083           
  Partials      405      405           
Flag Coverage Δ
unittests 88.01% <100.00%> (+<0.01%) ⬆️

Files with missing lines Coverage Δ
controller_manager/controller_manager/spawner.py 72.22% <100.00%> (+0.44%) ⬆️
controller_manager/controller_manager/unspawner.py 69.23% <100.00%> (+1.66%) ⬆️

Contributor

@fmauch fmauch left a comment

From a user's perspective, it isn't really clear what the difference between these two timeouts is.

Also: Wouldn't it make sense to default to 0.0 for both of them? This can probably also happen when activation of a hardware component fails, right?

@christophfroehlich
Contributor Author

Thanks for the feedback.

From a user's perspective, it isn't really clear what the difference between these two timeouts is.

The old one, --controller-manager-timeout, is the timeout for waiting until the service is available:

@param service_timeout Timeout (in seconds) to wait until the service is available. 0 means
waiting forever, retrying every 10 seconds.

while the new one, --controller-manager-switch-timeout, applies to the switch_controller service. For example, the service call fails if the controller does not activate within the given timeout when the RT update method of the CM is called.

Any suggestions for a better name?

Also: Wouldn't it make sense to default to 0.0 for both of them?

5 s was the hardcoded value before. A switch_controller timeout of 0 s will be overridden with 1 s in the CM, so I'm not sure which value is best here.

This can probably also happen when activation of a hardware component fails, right?

No, this value is only used for the switch_controller service.
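
To make the distinction between the two timeouts concrete, here is a minimal sketch of how the two options could sit side by side in an argparse parser. It is not the actual spawner.py code: the function names `build_parser` and `effective_switch_timeout` are hypothetical, the 0.0 default for the first option is an assumption, and the 5.0 default and the 0 s to 1 s override only mirror what is described in this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser only; not the real spawner.py implementation."""
    parser = argparse.ArgumentParser(description="Sketch of the spawner timeouts")
    parser.add_argument("controller_names", nargs="+", help="Controllers to spawn")
    # How long to wait until the controller manager services are available.
    # The 0.0 default is an assumption; 0 means waiting forever.
    parser.add_argument(
        "--controller-manager-timeout",
        type=float,
        default=0.0,
        help="Seconds to wait until the service is available (0 waits forever)",
    )
    # How long the switch_controller request itself may take to complete.
    # 5.0 is the previously hardcoded value mentioned in this thread.
    parser.add_argument(
        "--controller-manager-switch-timeout",
        type=float,
        default=5.0,
        help="Seconds allowed for the controller switch to complete",
    )
    return parser


def effective_switch_timeout(requested: float, cm_fallback: float = 1.0) -> float:
    """Model the behavior described above: the CM replaces a 0 s request with 1 s."""
    return cm_fallback if requested == 0.0 else requested
```

With this sketch, a spawner started against a paused simulation would raise only the second option, leaving the service-availability timeout untouched.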

@fmauch
Contributor

fmauch commented Oct 13, 2024

while the new one, --controller-manager-switch-timeout, applies to the switch_controller service. For example, the service call fails if the controller does not activate within the given timeout when the RT update method of the CM is called.

I see, that does make sense. I was not aware of that, sorry. So the problem is not that this particular service isn't available, but the CM itself doesn't manage to do the requested switch within the requested time. I think the name might be fine, but the docstring / the output could explain that better. My main motivation is: If "the average user" would stumble into ros-controls/gz_ros2_control#421, would she know that this timeout has to be increased? And wouldn't it make sense to allow an infinite amount of time for such a case? This way, users could specify the switch time for their spawners to infinity in launch files that start gz in paused mode.
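
As a rough illustration of the launch-file use case, the sketch below builds the spawner command line with a generous switch timeout. The helper `spawner_command` is hypothetical, and the flag name is the one proposed in this PR at this point of the discussion, so the final name may differ.

```python
from typing import List


def spawner_command(controller: str, switch_timeout_s: float) -> List[str]:
    """Build an argv list for the spawner (hypothetical helper).

    The flag name follows this PR's proposal and is an assumption.
    """
    return [
        "ros2", "run", "controller_manager", "spawner",
        controller,
        "--controller-manager-switch-timeout", str(switch_timeout_s),
    ]


# Example: allow a full minute for the switch when gz starts paused.
cmd = spawner_command("joint_state_broadcaster", 60.0)
```

In a launch file, the elements after `spawner` would typically be passed as the node's arguments rather than run as a shell command.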

@christophfroehlich
Contributor Author

I see, that does make sense. I was not aware of that, sorry. So the problem is not that this particular service isn't available, but the CM itself doesn't manage to do the requested switch within the requested time. I think the name might be fine, but the docstring / the output could explain that better. My main motivation is: If "the average user" would stumble into ros-controls/gz_ros2_control#421, would she know that this timeout has to be increased?

You are totally right, I'll try to come up with something.

And wouldn't it make sense to allow an infinite amount of time for such a case? This way, users could specify the switch time for their spawners to infinity in launch files that start gz in paused mode.

@saikishor what do you think, can we switch the CM behavior to wait infinitely if the time is set to 0?

@saikishor
Member

@saikishor what do you think, can we switch the CM behavior to wait infinitely if the time is set to 0?

Switching this wouldn't be a big issue in terms of code, but I think it is better to stick to a constant value and let people play with it. If we leave it at infinity and the switch is blocked for some reason, we cannot access any other services at all. This would be hugely cumbersome to debug, in my opinion.
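
The trade-off can be sketched with a generic bounded wait. This is not the controller manager's code, only an illustration of why a finite timeout keeps the caller responsive; `wait_for_switch` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_for_switch(
    is_done: Callable[[], bool], timeout_s: float, poll_s: float = 0.01
) -> bool:
    """Poll is_done until it returns True or timeout_s elapses.

    Returns False on timeout, so the caller regains control and can
    report an error instead of blocking indefinitely.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_done():
            return True
        time.sleep(poll_s)
    # One final check in case the condition became true at the deadline.
    return is_done()
```

With an unbounded loop in place of the deadline, a switch that never completes would hold the caller forever, which is the debugging scenario described above.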

@saikishor
Member

Any suggestions for a better name?

@christophfroehlich @fmauch How about --switch-timeout? I don't think we have to be that explicit in the arg naming; instead, we can add the necessary information to the documentation of the arg.

@christophfroehlich
Contributor Author

Switching this wouldn't be a big issue in terms of code, but I think it is better to stick to a constant value and let people play with it. If we leave it at infinity and the switch is blocked for some reason, we cannot access any other services at all. This would be hugely cumbersome to debug, in my opinion.

OK, so do you prefer to set the default in the spawner to the same value as in the CM?

@saikishor
Member

OK, so do you prefer to set the default in the spawner to the same value as in the CM?

It can be different, I don't mind. What I would prefer to avoid is infinite waiting with no timeout at all. Usually, the switch needs to happen immediately in the same cycle.

@christophfroehlich
Contributor Author

I just tried it with unpaused gz, and with a timeout of 1 s it fails: it needs more time from the plugin loading until the simulation runs.
I changed it to 5.0 s again, which was the hardcoded behavior before.

christophfroehlich and others added 2 commits October 14, 2024 10:48
Co-authored-by: Felix Exner (fexner) <exner@fzi.de>
Co-authored-by: Felix Exner (fexner) <exner@fzi.de>
Contributor

@fmauch fmauch left a comment

Looks good to me. Thanks for all the iterations!

saikishor previously approved these changes Oct 14, 2024
Member

@saikishor saikishor left a comment

LGTM. Thank you for checking everything

Contributor

mergify bot commented Oct 17, 2024

This pull request is in conflict. Could you fix it @christophfroehlich?

saikishor previously approved these changes Oct 26, 2024
Member

@saikishor saikishor left a comment

LGTM

Contributor

mergify bot commented Oct 30, 2024

This pull request is in conflict. Could you fix it @christophfroehlich?

Contributor

mergify bot commented Oct 31, 2024

This pull request is in conflict. Could you fix it @christophfroehlich?

saikishor previously approved these changes Oct 31, 2024
Member

@saikishor saikishor left a comment

LGTM

Contributor

mergify bot commented Nov 6, 2024

This pull request is in conflict. Could you fix it @christophfroehlich?
