Use loole to improve channel performance #228
Conversation
Very interesting. I'm curious how it's possible to be faster than flume. Is there something I can read about how loole achieves this performance? :)
Thanks, not yet. Currently, the only source of information is the source code itself. To summarize, potential reasons for the performance difference include the absence of cloning, minimal abstraction, minimal moves, and possibly a more efficient algorithm (although I haven't reviewed Flume's source code). In the future, it's conceivable that …
Cool, thanks :) I'll hopefully have a closer look at this tomorrow. Thanks so far :)
You are welcome.
Hello, thank you for reviewing my pull request. I appreciate your feedback. However, I have discovered an issue with the benchmark code that causes inconsistent and unreliable results: even with flume itself, repeated runs come out randomly faster or slower. The same happens with loole. Thanks again.
If this is correct, I think this PR is no longer valid. |
Hmm, interesting. Sorry, I didn't yet have the time to look into this PR; please accept my apologies. I'd like to run both benchmarks myself, and I'll report what happens on my machine.
-- loole --
test read_f16_as_f16_uncompressed_1thread ... bench: 8,388,010 ns/iter (+/- 3,397,072)
test read_f16_as_f16_zip_1thread ... bench: 17,489,190 ns/iter (+/- 3,253,988)
test read_f16_as_f16_zip_nthreads ... bench: 9,099,960 ns/iter (+/- 3,757,732)
test read_f16_as_f32_uncompressed_1thread ... bench: 8,234,400 ns/iter (+/- 3,470,654)
test read_f16_as_f32_zip_1thread ... bench: 17,849,420 ns/iter (+/- 1,432,790)
test read_f16_as_f32_zip_nthreads ... bench: 8,874,635 ns/iter (+/- 3,389,003)
test read_f16_as_u32_uncompressed_1thread ... bench: 8,163,710 ns/iter (+/- 2,859,007)
test read_f32_as_f16_uncompressed_1thread ... bench: 29,717,520 ns/iter (+/- 2,255,945)
test read_f32_as_f16_zips_1thread ... bench: 87,762,410 ns/iter (+/- 2,405,089)
test read_f32_as_f16_zips_nthreads ... bench: 32,962,650 ns/iter (+/- 2,531,739)
test read_f32_as_f32_uncompressed_1thread ... bench: 29,741,710 ns/iter (+/- 2,462,639)
test read_f32_as_f32_zips_1thread ... bench: 88,301,830 ns/iter (+/- 1,456,787)
test read_f32_as_f32_zips_nthreads ... bench: 33,265,970 ns/iter (+/- 1,701,066)
test read_f32_as_u32_uncompressed_1thread ... bench: 29,956,220 ns/iter (+/- 2,818,988)
-- flume --
test read_f16_as_f16_uncompressed_1thread ... bench: 8,104,510 ns/iter (+/- 3,180,752)
test read_f16_as_f16_zip_1thread ... bench: 17,785,760 ns/iter (+/- 2,046,426)
test read_f16_as_f16_zip_nthreads ... bench: 9,142,180 ns/iter (+/- 3,819,020)
test read_f16_as_f32_uncompressed_1thread ... bench: 8,028,740 ns/iter (+/- 2,876,322)
test read_f16_as_f32_zip_1thread ... bench: 17,656,880 ns/iter (+/- 1,436,841)
test read_f16_as_f32_zip_nthreads ... bench: 8,718,960 ns/iter (+/- 3,392,016)
test read_f16_as_u32_uncompressed_1thread ... bench: 8,115,830 ns/iter (+/- 2,718,979)
test read_f32_as_f16_uncompressed_1thread ... bench: 29,787,260 ns/iter (+/- 2,147,523)
test read_f32_as_f16_zips_1thread ... bench: 87,121,620 ns/iter (+/- 3,145,823)
test read_f32_as_f16_zips_nthreads ... bench: 33,109,410 ns/iter (+/- 2,087,108)
test read_f32_as_f32_uncompressed_1thread ... bench: 30,033,080 ns/iter (+/- 40,816,735)
test read_f32_as_f32_zips_1thread ... bench: 88,689,050 ns/iter (+/- 6,816,531)
test read_f32_as_f32_zips_nthreads ... bench: 33,697,520 ns/iter (+/- 9,957,318)
test read_f32_as_u32_uncompressed_1thread ... bench: 30,318,000 ns/iter (+/- 4,575,068)

These are the numbers on my machine. Your insights seem plausible; there is no measurable difference.
That makes sense, thanks for clarifying. |
Hello, this pull request replaces Flume with Loole, my recent safe MPMC channel implementation, which has successfully passed all Flume tests for the APIs it implements. Consequently, both channel-related benchmarks in read.rs have demonstrated the following improvements:
Loole's GitHub: https://github.com/mahdi-shojaee/loole
Thanks