Causes of vastly different ICP runtimes #596
Hi @kam3k! While it is possible that the runtimes oscillate, the processing times you are reporting are oddly long. As a reference, we normally process point clouds two orders of magnitude larger while still being real-time (although at 10 Hz). Are you running the code on some edge hardware or a standard computer? Is it natively installed or a containerized application?
@boxanm Thanks for the reply. This is running natively installed (not containerized) on a developer laptop. I built both libpointmatcher (commit […]) and libnabo from source.

I'd like to add that I did some more testing that shows that the runtime varies significantly despite identical inputs and setup. That is, just calling ICP in a loop and measuring the runtime produces very different results (e.g., 20 ms vs 200 ms). But your response is very encouraging, suggesting that there is a solution somewhere that will easily make this runnable in real time.
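Roughly, the loop I'm timing looks like this; a minimal sketch, with placeholder file names:

```cpp
#include <chrono>
#include <iostream>
#include "pointmatcher/PointMatcher.h"

typedef PointMatcher<float> PM;
typedef PM::DataPoints DP;

int main()
{
    // Placeholder file names; the two clouds are identical on every iteration.
    const DP ref(DP::load("reference.csv"));
    const DP data(DP::load("reading.csv"));

    PM::ICP icp;
    icp.setDefault();

    for (int i = 0; i < 10; ++i)
    {
        const auto start = std::chrono::steady_clock::now();
        icp(data, ref); // same inputs every time, yet the runtime varies widely
        const auto stop = std::chrono::steady_clock::now();
        std::cout << "call " << i << ": "
                  << std::chrono::duration<double>(stop - start).count()
                  << " s\n";
    }
}
```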
Can you please also report the output of […]?
No change in behaviour when using […].

One thing I'm doing "different" is that the configuration is loaded into the […] every time ICP is called.
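To make that concrete, here's the difference between the two setups; a sketch, with a hypothetical YAML path:

```cpp
#include <fstream>
#include "pointmatcher/PointMatcher.h"

typedef PointMatcher<float> PM;

// What I was doing: rebuilding the pipeline from YAML before every call.
PM::TransformationParameters alignReloading(const PM::DataPoints& reading,
                                            const PM::DataPoints& reference)
{
    PM::ICP icp;
    std::ifstream config("icp_config.yaml"); // hypothetical path
    icp.loadFromYaml(config); // parses the YAML and rebuilds every module
    return icp(reading, reference);
}

// The alternative: configure one ICP object at startup and reuse it.
PM::TransformationParameters alignReusing(PM::ICP& icp,
                                          const PM::DataPoints& reading,
                                          const PM::DataPoints& reference)
{
    return icp(reading, reference);
}
```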
I've just run some tests where the configuration isn't done every time, and it had no effect. Sounds like there is something going on with the build. Perhaps one of the dependencies is built in debug? But none of them (boost, eigen) are built from source; they're just installed via […].
For another data point, I completely uninstalled libpointmatcher and libnabo and then instead installed them using […].
@kam3k I would suggest removing all nonessential filters, both input and outlier. Then, switch to the Point2Point minimizer to maximally simplify the setup (see the sketch below). From the logs you reported, it seems that most of the time is taken by the ICP loop itself. I suppose the fluctuating runtimes come mostly from a varying number of iterations. What value is your CounterTransformationChecker set to? Do the registration results you are getting make sense? Are the scans well aligned? Running a profiler and identifying the function call that takes the most time might be our best option here. @nicolaslauzon did you experience anything similar when testing on the f1tenth?
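A minimal version of that simplification, sketched with the registrar API (parameter values are placeholders, and I'm assuming an empty outlier-filter list keeps all matches):

```cpp
#include "pointmatcher/PointMatcher.h"

typedef PointMatcher<float> PM;

// Strip the ICP chain down to the bare minimum for debugging runtimes.
void simplify(PM::ICP& icp)
{
    icp.setDefault();

    // No input filters at all.
    icp.readingDataPointsFilters.clear();
    icp.referenceDataPointsFilters.clear();

    // Assumption: an empty outlier-filter list keeps all matches.
    icp.outlierFilters.clear();

    // Simplest minimizer: plain point-to-point.
    icp.errorMinimizer =
        PM::get().ErrorMinimizerRegistrar.create("PointToPointErrorMinimizer");

    // Bound the iteration count explicitly; "40" is a placeholder value.
    icp.transformationCheckers.clear();
    icp.transformationCheckers.push_back(
        PM::get().TransformationCheckerRegistrar.create(
            "CounterTransformationChecker", {{"maxIterationCount", "40"}}));
}
```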
That would be @aguenette who tested libpointmatcher on the f1tenth.
To really try and narrow things down, I've just used the default configuration (i.e., […]) and timed ten consecutive ICP calls on each cloud pair. What happened was that sometimes all ten calls would be very fast. But about 40% of the time, just the first call would be slow, and the remaining nine would be fast. For example, most of the time things would look good for the ten calls:

[…]
But very often, just the first call would be substantially slower (even though the number of iterations didn't change):

[…]
Note that it was always the first call that was slower. So when I remove the loop (and only call ICP once per cloud pair), it's always "the first call", and therefore I get sporadic runtimes.
Alright, that's interesting! Seeing values in the range of milliseconds is the expected result. I'll run a similar test on my side, but probably not before the end of the week.
Here's a test that anyone can run using the examples that come with libpointmatcher. After building, go inside the directory […] and run […], which will run the […]. For example:

[…]
@boxanm Yes, I can confirm I was seeing the same issue you're fixing in #598, and after using the new flag the issue is fixed. Thanks for that; the durations in the log make much more sense when using your new branch. Here is my plot (1000 samples) for the same dataset:

[…]

The mean was 0.0053 s with a standard deviation of 0.0228 s. I get a median that is still slower than your slowest time (0.00042 s), but the same order of magnitude. I don't know what to make of these massive and frequently occurring outliers.
I'm looking around for anything else I can think of that could be the cause. One is the […].

I still get these warnings during compilation:

[…]
@kam3k For the OpenMP warnings, I think CMake has a problem finding the full OpenMP installation. Here's what the output looks like when […]:

[…]
Issue found! I started profiling the above libnabo program and found that the CPU time taken by OpenMP threads could vary pretty significantly, so I started looking into that and found this: https://stackoverflow.com/a/70134444. Setting the environment variable […] makes the problem go away. But right now I don't know the consequences of setting it.
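For anyone wanting to check their own setup: assuming the variable in question is OMP_WAIT_POLICY (that is what the linked answer discusses; I'm flagging this as an assumption since the exact name didn't survive above), here is a tiny diagnostic. The OpenMP runtime reads the variable once at startup, so it has to be set before the program launches.

```cpp
#include <cstdlib>
#include <iostream>
#include <omp.h>

int main()
{
    // OMP_WAIT_POLICY (assumed name, per the linked answer) controls whether
    // idle OpenMP worker threads spin ("active": low wake-up latency, burns
    // CPU) or sleep ("passive"). Set it before launching, e.g.:
    //   OMP_WAIT_POLICY=passive ./icp_timing_test
    const char* policy = std::getenv("OMP_WAIT_POLICY");
    std::cout << "OMP_WAIT_POLICY    = " << (policy ? policy : "(unset)") << '\n'
              << "max OpenMP threads = " << omp_get_max_threads() << '\n';
}
```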
What would cause the ICP runtime to vary significantly, despite very similar inputs?
More specifically, I am running scan matching on a sequence of scans from the same 2D lidar. There is < 20 cm of displacement and < 2 deg of rotation between the two point clouds, and I provide ICP with an initial guess from odometry. In other words, the reading and reference clouds begin nearly aligned, and ICP is just making a small adjustment.
My data points filters (applied to both reading and reference) are:

[…]
The matcher is KDTreeMatcher, the outlier filters are TrimmedDistOutlierFilter and SurfaceNormalOutlierFilter, the error minimizer is PointToPlaneWithCovErrorMinimizer, the transformation checkers are DifferentialTransformationChecker and CounterTransformationChecker, and a RigidTransformation transformation is specified.
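For reference, the same pipeline can be assembled programmatically rather than through YAML; a sketch modelled on libpointmatcher's icp_customized example, with placeholder parameter values and the data points filters left out (note that the point-to-plane minimizer and SurfaceNormalOutlierFilter need surface normals, which one of my data points filters provides):

```cpp
#include "pointmatcher/PointMatcher.h"

typedef PointMatcher<float> PM;

void buildPipeline(PM::ICP& icp)
{
    icp.setDefault();

    // The data points filters listed above would be added here, e.g.:
    // icp.readingDataPointsFilters.push_back(
    //     PM::get().DataPointsFilterRegistrar.create(/* ... */));

    icp.matcher = PM::get().MatcherRegistrar.create(
        "KDTreeMatcher", {{"knn", "1"}}); // placeholder parameter

    icp.outlierFilters.clear();
    icp.outlierFilters.push_back(PM::get().OutlierFilterRegistrar.create(
        "TrimmedDistOutlierFilter", {{"ratio", "0.85"}})); // placeholder
    icp.outlierFilters.push_back(PM::get().OutlierFilterRegistrar.create(
        "SurfaceNormalOutlierFilter"));

    icp.errorMinimizer = PM::get().ErrorMinimizerRegistrar.create(
        "PointToPlaneWithCovErrorMinimizer");

    icp.transformationCheckers.clear();
    icp.transformationCheckers.push_back(
        PM::get().TransformationCheckerRegistrar.create(
            "DifferentialTransformationChecker"));
    icp.transformationCheckers.push_back(
        PM::get().TransformationCheckerRegistrar.create(
            "CounterTransformationChecker", {{"maxIterationCount", "40"}}));

    icp.transformations.clear();
    icp.transformations.push_back(
        PM::get().TransformationRegistrar.create("RigidTransformation"));
}
```

The odometry prior is then passed as the third argument to the ICP call, i.e. icp(reading, reference, initialGuess).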
Both the reading and reference clouds have ~1000 points each (these could be downsampled further if needed).
When I run ICP using this setup, the alignment time varies significantly, fluctuating between ~8 ms and ~300 ms. My lidar data is coming in at 25 Hz, which means this could run online if ICP took < 40 ms (which it does sometimes).
I was wondering, what is likely the cause of these big fluctuations in runtime?
I took a look at the logs, and the pre-processing times can vary, but not as much as ICP itself (sometimes it is ~50x slower than other times).
As an aside, the log reports pre-processing times of the clouds, and of ICP, but the times don't make sense. For example, it will say:
PointMatcher::icp - reference pre-processing took 0.112477 [s]
PointMatcher::icp - reading pre-processing took 0.133762 [s]
PointMatcher::icp - 4 iterations took 0.905763 [s]
which, if you sum it up, comes to more than one second. But measuring how long the call to the ICP function took (with std::chrono::steady_clock) gives a much smaller number; in this case it was 120 ms.

At the end of the day, my goal is to have ICP run quickly (< 40 ms) but, perhaps more importantly, consistently. Are there data filters and/or parts of the pipeline that can be configured, added, or removed to help achieve a more consistent, fast runtime?
Any help would be greatly appreciated. Thank you.