Updated benchmark with newer compilers and hardware
wqking committed Dec 30, 2023
1 parent 60761df commit 7a186e3
Showing 7 changed files with 110 additions and 88 deletions.
171 changes: 87 additions & 84 deletions doc/benchmark.md
@@ -1,8 +1,11 @@
# Benchmarks

Hardware: Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
Software: Windows 10, MSVC 2017, MinGW GCC 7.2.0
Time unit: milliseconds (unless explicitly specified)
Hardware: HP laptop, Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz, 16 GB RAM
Software: Windows 10, MinGW GCC 11.3.0, MSVC 2022
Time unit: milliseconds (unless explicitly specified)

Unless otherwise specified, the compiler is GCC.
The benchmark hardware was mid-range to low-end at the time of benchmarking (December 2023).

## EventQueue enqueue and process -- single threading

@@ -22,26 +25,26 @@ Time unit: milliseconds (unless explicitly specified)
<td>10M</td>
<td>100</td>
<td>100</td>
<td>401</td>
<td>1146</td>
<td>289</td>
<td>939</td>
</tr>
<tr>
<td>100k</td>
<td>1000</td>
<td>100M</td>
<td>100</td>
<td>100</td>
<td>4012</td>
<td>11467</td>
<td>2822</td>
<td>9328</td>
</tr>
<tr>
<td>100k</td>
<td>1000</td>
<td>100M</td>
<td>1000</td>
<td>1000</td>
<td>4102</td>
<td>11600</td>
<td>2923</td>
<td>9502</td>
</tr>
</table>

@@ -68,7 +71,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>2283</td>
<td>1824</td>
</tr>
<tr>
<td>SpinLock</td>
@@ -77,7 +80,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>1692</td>
<td>1303</td>
</tr>

<tr>
@@ -87,7 +90,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>3446</td>
<td>2989</td>
</tr>
<tr>
<td>SpinLock</td>
@@ -96,7 +99,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>3025</td>
<td>3186</td>
</tr>

<tr>
@@ -106,7 +109,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>4000</td>
<td>3151</td>
</tr>
<tr>
<td>SpinLock</td>
@@ -115,7 +118,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>3076</td>
<td>3049</td>
</tr>

<tr>
@@ -125,7 +128,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>1971</td>
<td>1657</td>
</tr>
<tr>
<td>SpinLock</td>
@@ -134,7 +137,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>1755</td>
<td>1659</td>
</tr>

<tr>
@@ -144,7 +147,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>928</td>
<td>708</td>
</tr>
<tr>
<td>SpinLock</td>
@@ -153,7 +156,7 @@ The EventQueue is processed in one thread. The Single/Multi threading in the tab
<td>10M</td>
<td>100</td>
<td>100</td>
<td>2082</td>
<td>1891</td>
</tr>
</table>

@@ -164,7 +167,7 @@ When there are fewer threads (about around the number of CPU cores which is 4 he
## CallbackList append/remove callbacks

The benchmark loops 100K times. In each loop it appends 1000 empty callbacks to a CallbackList, then removes all 1000 callbacks, for a total of 100M append/remove operations.
The total benchmarked time is about 21000 milliseconds. That is, about 5000 append/remove operations complete in 1 millisecond.
The total benchmarked time is about 16000 milliseconds. That is, about 6000 append/remove operations complete in 1 millisecond.
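
The per-millisecond figures above are rounded: 100M operations over 16000 ms is 6250 operations per millisecond, and over 21000 ms it is roughly 4762. The sketch below reproduces that arithmetic and the shape of the append/remove loop. It is only an illustration under stated assumptions: `StandInList` is a plain `std::list` used as a hypothetical stand-in for the real CallbackList, and `runAppendRemove` and `operationsPerMillisecond` are names invented here, not part of the library or its benchmark code. Timings from it say nothing about eventpp itself.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <list>

// ASSUMPTION: a plain std::list stands in for the real CallbackList.
// It only mimics the append-then-remove pattern, not the library's behavior.
using StandInList = std::list<std::function<void()>>;

// Runs `iterateCount` loops; each loop appends `callbackCount` empty
// callbacks and then removes all of them. Returns elapsed milliseconds.
long long runAppendRemove(const std::size_t iterateCount, const std::size_t callbackCount)
{
    const auto start = std::chrono::steady_clock::now();
    for(std::size_t i = 0; i < iterateCount; ++i) {
        StandInList callbackList;
        for(std::size_t k = 0; k < callbackCount; ++k) {
            callbackList.push_back([]() {});
        }
        while(! callbackList.empty()) {
            callbackList.pop_front();
        }
    }
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}

// Throughput in operations per millisecond.
constexpr double operationsPerMillisecond(const double totalOperations, const double milliseconds)
{
    return totalOperations / milliseconds;
}
```

Calling `operationsPerMillisecond(100e6, 16000.0)` gives the exact throughput behind the "about 6000" figure, and `runAppendRemove(1000 * 100, 1000)` would run the same loop counts as the benchmark against the stand-in container.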

## CallbackList invoking VS native function invoking

@@ -181,114 +184,114 @@ Iterations: 100,000,000

<tr>
<td rowspan="2">Inline global function</td>
<td>MSVC 2017</td>
<td>217</td>
<td>1501</td>
<td>6921</td>
<td>MSVC</td>
<td>139</td>
<td>1267</td>
<td>3058</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>187</td>
<td>1489</td>
<td>4463</td>
<td>GCC</td>
<td>141</td>
<td>1149</td>
<td>2563</td>
</tr>

<tr>
<td rowspan="2">Non-inline global function</td>
<td>MSVC 2017</td>
<td>241</td>
<td>1526</td>
<td>6544</td>
<td>MSVC</td>
<td>143</td>
<td>1273</td>
<td>3047</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>233</td>
<td>1488</td>
<td>4787</td>
<td>GCC</td>
<td>132</td>
<td>1218</td>
<td>2583</td>
</tr>

<tr>
<td rowspan="2">Function object</td>
<td>MSVC 2017</td>
<td>194</td>
<td>1498</td>
<td>6433</td>
<td>MSVC</td>
<td>139</td>
<td>1198</td>
<td>2993</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>212</td>
<td>1485</td>
<td>4951</td>
<td>GCC</td>
<td>141</td>
<td>1107</td>
<td>2633</td>
</tr>

<tr>
<td rowspan="2">Member virtual function</td>
<td>MSVC 2017</td>
<td>207</td>
<td>1533</td>
<td>6558</td>
<td>MSVC</td>
<td>159</td>
<td>1221</td>
<td>3076</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>212</td>
<td>1485</td>
<td>4489</td>
<td>GCC</td>
<td>140</td>
<td>1231</td>
<td>2691</td>
</tr>

<tr>
<td rowspan="2">Member non-virtual function</td>
<td>MSVC 2017</td>
<td>214</td>
<td>1533</td>
<td>6390</td>
<td>MSVC</td>
<td>140</td>
<td>1266</td>
<td>3054</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>211</td>
<td>1486</td>
<td>4872</td>
<td>GCC</td>
<td>140</td>
<td>1193</td>
<td>2701</td>
</tr>

<tr>
<td rowspan="2">Member non-inline virtual function</td>
<td>MSVC 2017</td>
<td>206</td>
<td>1522</td>
<td>6578</td>
<td>MSVC</td>
<td>158</td>
<td>1223</td>
<td>3103</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>182</td>
<td>1666</td>
<td>4593</td>
<td>GCC</td>
<td>133</td>
<td>1231</td>
<td>2676</td>
</tr>

<tr>
<td rowspan="2">Member non-inline non-virtual function</td>
<td>MSVC 2017</td>
<td>206</td>
<td>1491</td>
<td>6992</td>
<td>MSVC</td>
<td>134</td>
<td>1266</td>
<td>3028</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>205</td>
<td>1486</td>
<td>4490</td>
<td>GCC</td>
<td>134</td>
<td>1205</td>
<td>2652</td>
</tr>

<tr>
<td rowspan="2">All functions</td>
<td>MSVC 2017</td>
<td>1374</td>
<td>10951</td>
<td>29973</td>
<td>MSVC</td>
<td>91</td>
<td>903</td>
<td>2214</td>
</tr>
<tr>
<td>GCC 7.2</td>
<td>1223</td>
<td>9770</td>
<td>22958</td>
<td>GCC</td>
<td>89</td>
<td>858</td>
<td>1852</td>
</tr>

</table>
2 changes: 2 additions & 0 deletions tests/benchmark/b1_callbacklist_invoking_vs_cpp.cpp
@@ -65,6 +65,8 @@ struct FunctionObject

TEST_CASE("b1, CallbackList invoking vs C++ invoking")
{
std::cout << std::endl << "b1, CallbackList invoking vs C++ invoking" << std::endl;

constexpr int iterateCount = 1000 * 1000 * 10;
constexpr int callbackCount = 10;

6 changes: 4 additions & 2 deletions tests/benchmark/b2_map_vs_unordered_map.cpp
@@ -54,6 +54,8 @@ std::string generateRandomString(const int length){

TEST_CASE("b2, std::map vs std::unordered_map")
{
std::cout << std::endl << "b2, std::map vs std::unordered_map" << std::endl;

constexpr int stringCount = 1000 * 1000;
std::vector<std::string> stringList(stringCount);
for(auto & s : stringList) {
@@ -99,7 +101,7 @@ TEST_CASE("b2, std::map vs std::unordered_map")
}
});
}
std::cout << mapInsertTime << " " << mapLookupTime << std::endl;
std::cout << unorderedMapInsertTime << " " << unorderedMapLookupTime << std::endl;
std::cout << "Map: insert " << mapInsertTime << " lookup " << mapLookupTime << std::endl;
std::cout << "UnorderedMap: insert " << unorderedMapInsertTime << " lookup " << unorderedMapLookupTime << std::endl;
}

6 changes: 6 additions & 0 deletions tests/benchmark/b3_b5_eventqueue.cpp
@@ -166,6 +166,8 @@ struct B3PoliciesSingleThreading {

TEST_CASE("b3, EventQueue, one thread")
{
std::cout << std::endl << "b3, EventQueue, one thread" << std::endl;

doExecuteEventQueue<B3PoliciesMultiThreading>("Multi threading", 100, 1000 * 100, 100);
doExecuteEventQueue<B3PoliciesMultiThreading>("Multi threading", 1000, 1000 * 100, 100);
doExecuteEventQueue<B3PoliciesMultiThreading>("Multi threading", 1000, 1000 * 100, 1000);
@@ -181,6 +183,8 @@ struct B4PoliciesMultiThreading {

TEST_CASE("b4, EventQueue, multi threads, mutex")
{
std::cout << std::endl << "b4, EventQueue, multi threads, mutex" << std::endl;

doMultiThreadingExecuteEventQueue<B4PoliciesMultiThreading>("Mutex", 1, 1, 1000 * 1000 * 10, 100);
doMultiThreadingExecuteEventQueue<B4PoliciesMultiThreading>("Mutex", 1, 3, 1000 * 1000 * 10, 100);
doMultiThreadingExecuteEventQueue<B4PoliciesMultiThreading>("Mutex", 2, 2, 1000 * 1000 * 10, 100);
@@ -194,6 +198,8 @@ struct B5PoliciesMultiThreading {

TEST_CASE("b5, EventQueue, multi threads, spinlock")
{
std::cout << std::endl << "b5, EventQueue, multi threads, spinlock" << std::endl;

doMultiThreadingExecuteEventQueue<B5PoliciesMultiThreading>("Spinlock", 1, 1, 1000 * 1000 * 10, 100);
doMultiThreadingExecuteEventQueue<B5PoliciesMultiThreading>("Spinlock", 1, 3, 1000 * 1000 * 10, 100);
doMultiThreadingExecuteEventQueue<B5PoliciesMultiThreading>("Spinlock", 2, 2, 1000 * 1000 * 10, 100);
2 changes: 2 additions & 0 deletions tests/benchmark/b6_callbacklist_add_remove_callbacks.cpp
@@ -16,6 +16,8 @@

TEST_CASE("b6, CallbackList add/remove callbacks")
{
std::cout << std::endl << "b6, CallbackList add/remove callbacks" << std::endl;

using CL = eventpp::CallbackList<void ()>;
constexpr size_t callbackCount = 1000;
constexpr size_t iterateCount = 1000 * 100;