Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] After Boost 1.83 upgrade murmur hash is no longer faster than Boost hash #2574

Open
droberts195 opened this issue Sep 29, 2023 · 0 comments
Labels

Comments

@droberts195
Copy link
Contributor

This is not a bug, more of an observation that's interesting to make searchable in an issue.

Since we upgraded to Boost 1.83 the test that asserts that murmur hash lookup performance is better than the default Boost hash lookup performance regularly fails, even with an 8% tolerance on the timings.

BOOST_TEST_REQUIRE(murmurLookupTime < (defaultLookupTime * 108) / 100);

Some examples of timings from before the Boost upgrade:

2023-08-16 09:08:30,805755 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1692174718318253470/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1855ms, murmur insert runtime = 1467ms
2023-08-16 09:08:30,805765 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1692174718318253470/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 5110ms, murmur lookup runtime = 2882ms

2023-06-12 14:12:54,350673 UTC [4280] DEBUG /buildkite/builds/bk-agent-elastic-k8s-1686576616312423140/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 2067ms, murmur insert runtime = 1683ms
2023-06-12 14:12:54,350683 UTC [4280] DEBUG /buildkite/builds/bk-agent-elastic-k8s-1686576616312423140/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 6868ms, murmur lookup runtime = 4048ms

2023-09-21 08:36:59,722971 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695283442532840168/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1214ms, murmur insert runtime = 864ms
2023-09-21 08:36:59,722979 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695283442532840168/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 3493ms, murmur lookup runtime = 1779ms

The most obvious thing to note here is the massive variation in performance that the BuildKite Kubernetes cluster that we run on delivers. So we cannot compare the 3 sample runs with each other. But within each run it's pretty clear that murmur hash is significantly faster for lookups than Boost hash.

Since the Boost upgrade from 1.77 to 1.83 we have these:

2023-09-26 11:25:45,478041 UTC [4311] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695724698789569410/elastic/ml-cpp-snapshot-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1998ms, murmur insert runtime = 2143ms
2023-09-26 11:25:45,478067 UTC [4311] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695724698789569410/elastic/ml-cpp-snapshot-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 2193ms, murmur lookup runtime = 2250ms

2023-09-29 09:53:44,909002 UTC [4286] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695978320865836751/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1889ms, murmur insert runtime = 1971ms
2023-09-29 09:53:44,909012 UTC [4286] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695978320865836751/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 2147ms, murmur lookup runtime = 2342ms

The first is an example of where the test passed due to the 8% tolerance, but Boost hash was still faster. The second is an example of a test failure as Boost hash was faster by more than 8%.

Based on these figures it would be going too far to say Boost hash is now better. The performance of the two hash algorithms is roughly the same now. This means there's no value in removing murmur hash from our codebase, but equally no benefit in jumping through hoops to use it in preference to Boost hash in future code changes.

droberts195 added a commit to droberts195/ml-cpp that referenced this issue Sep 29, 2023
Since the upgrade from Boost 1.77 to 1.83 murmur hash is
no longer significantly faster than the default Boost hash.

More details in elastic#2574
droberts195 added a commit that referenced this issue Oct 3, 2023
Since the upgrade from Boost 1.77 to 1.83 murmur hash is
no longer significantly faster than the default Boost hash.

More details in #2574
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant