Skip to content

Conversation

@ShravanDeva5327
Copy link
Contributor

Continuation of #191

Tests for both Trace launch action and ros2 trace command are added with dual_session enabled

@ShravanDeva5327
Copy link
Contributor Author

@christophebedard can you please review this?

@christophebedard christophebedard self-requested a review September 7, 2025 15:15
Copy link
Member

@christophebedard christophebedard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I just have some minor-ish requests!

<launch>
<arg name="session-name" default="my-session-name" />
<arg name="snapshot-mode" default="false" />
<arg name="dual-session" default="false" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and snapshop-mode, which I didn't notice in #195) should be used below with <trace>, e.g.:

snapshot-mode="$(var snapshot-mode)"
dual-session="$(var dual-session)"

Same for the YAML launch file in test_action_substitutions_frontend_yaml below.


# Pre-configure dual session, start nodes and check that the snapshot session exists
ls, launch_task = self.pre_configure_dual_session_and_run_nodes(tmpdir, session_name)
self.assertTracingSessionExist(session_name+'-snapshot', snapshot_mode=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests often have more hardcoded values/literals (e.g., strings, numbers) because it makes test code more explicit and easier to read, but I think in this case we should have variables for the snapshot & runtime session names, since they're used so many times in this test:

session_name_snapshot = session_name + '-snapshot'
session_name_runtime = session_name + '-runtime'

Comment on lines 813 to 814
self.assertTracingSessionExist(session_name+'-snapshot', snapshot_mode=True)
self.assertTracingSessionExist(session_name+'-runtime')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check here (and not after pausing below) that the snapshot trace contains initialization events/data. You can simply use the list with topic_name -> /ping//pong below.


However, this brings a potential problem (which doesn't mean we shouldn't do the above!). We first start the launch file, which configures the snapshot session and starts the nodes, and then we trigger a snapshot here. This creates a race: If we end up triggering the snapshot before the nodes have had time to properly start, the snapshot trace will not contain the expected initialization events. You added a 1-second sleep, but we've seen in nightly CI jobs that nodes can be really slow to start, so we try to avoid relying on hardcoded sleeps. The more robust solution here is to wait for the publishers/subscriptions (/ping, /pong) to exist before running ros2 start here.

To do that, you can use rclpy (which is already an indirect dependency of test_ros2trace, but you'll need to make it an explicit <test_depend> in the package.xml) to:

  1. Create a node (e.g., using this test's name).
  2. Create a /ping subscription and a /pong subscription to match the publishers created by test_ping & test_pong.
  3. Use Subscription.get_publisher_count() to wait for >= 1 publisher for each of the 2 subscriptions. You'll have to write some logic that waits for sub.get_publisher_count() >= 1 and sleeps just a bit (e.g., 01 second) in between checking for publishers, but eventually times out if it takes too long, e.g., more than 10 seconds. Some similar code: https://github.com/ros2/rosbag2/blob/311921bf488018aa3d05d7f042bd6f5038338c2a/rosbag2_py/test/common.py#L43-L54 used for example here https://github.com/ros2/rosbag2/blob/311921bf488018aa3d05d7f042bd6f5038338c2a/rosbag2_py/test/test_transport.py#L114-L117.

Comment on lines +821 to +865
expected_trace_data = [
('topic_name', '/ping'),
('topic_name', '/pong'),
]
num_events = self.assertTraceContains(trace_dir, expected_field_value=expected_trace_data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should check the runtime trace data specifically as opposed to the whole trace, just to make sure the runtime trace contains what we think it should contain.

That means you can't use topic_name here, since that's from an init tracepoint (that wouldn't be recorded by the runtime trace, since it starts recording too late), and runtime events don't contain a lot of strings like topic names. I think you could instead check the trace event names using _name:

        expected_trace_data = [
            ('_name', 'ros2:rcl_publish'),
            ('_name', 'ros2:rmw_publish'),
        ]

Comment on lines 838 to 842
# Resume tracing
ret = self.run_trace_subcommand(['resume', session_name, '--dual-session'])
self.assertEqual(0, ret)
self.assertTracingSessionExist(session_name+'-snapshot', snapshot_mode=True)
self.assertTracingSessionExist(session_name+'-runtime')
Copy link
Member

@christophebedard christophebedard Sep 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to check that resuming with --dual-session triggers a new snapshot. However, you can't really check the trace data for that new/second snapshot, since it's probably empty because there are no init events later at runtime. You can simply check that there's 1 directory under ${session_name}/snapshot/ before resuming and 2 directories after resuming.

For example, I just tested it:

$ ll ~/.ros/tracing/my-dual-tracing-session/snapshot/
total 16
drwxrwx--- 4 christophe.bedard christophe.bedard 4096 Sep  7 12:21 ./
drwxrwx--- 4 christophe.bedard christophe.bedard 4096 Sep  7 12:20 ../
drwxrwx--- 3 christophe.bedard christophe.bedard 4096 Sep  7 12:20 snapshot-20250907-122029-0/
drwxrwx--- 3 christophe.bedard christophe.bedard 4096 Sep  7 12:21 snapshot-20250907-122110-1/

@christophebedard
Copy link
Member

Note: Shravan will pick this back up after finishing the GSoC final submission/report.

@ros-discourse
Copy link

This pull request has been mentioned on Open Robotics Discourse. There might be relevant details there:

https://discourse.openrobotics.org/t/gsoc-2025-improvements-to-ros-2-tracing/50062/1

@ros-discourse
Copy link

This pull request has been mentioned on Open Robotics Discourse. There might be relevant details there:

https://discourse.openrobotics.org/t/gsoc-2025-improvements-to-ros-2-tracing/50063/1

Signed-off-by: Shravan Deva <devashravan7@gmail.com>
Signed-off-by: Shravan Deva <devashravan7@gmail.com>
Signed-off-by: Shravan Deva <devashravan7@gmail.com>
@christophebedard
Copy link
Member

As mentioned offline, I will give it another shot next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants