@brian90013

Thank you for this repo. I use it as a sanity check when running on a new platform. Once it helped me detect a misconfigured server where the DIMMs were not in the optimal slots.

I recently obtained a server with 24 cores (48 threads), and the OpenMP SIZE assertion failed on it. This MR updates SIZE so the assertion holds for every count that is a multiple of 8 from 8 to 128 (8, 16, 24, ..., 112, 120, 128).

@hholst80

@brian90013 Can you explain the rationale here? The new SIZE is (1*SIZE_OF_GB + 32504*1024LL). What is this residual term, and how did you determine it?

@brian90013
Author

I want to support CPU counts from 8 to 128 in steps of 8, so making SIZE divisible by LCM(8, 16, ..., 120, 128) takes care of assert(SIZE % omp_get_max_threads() == 0);. The code mentions "Have PAGE_SIZE buffering", so I also want SIZE % PAGE_SIZE == 0. Combining both gives LCM(8, 16, ..., 120, 128, 4096) = 184504320. Dividing 1024 * 1024 * 1024 by that gives about 5.8. Rounding up to 6 yields 6 * 184504320 = 1107025920, which is my new proposed SIZE. That is the same value as 1*SIZE_OF_GB + 32504*1024, so the residual term is just the distance from 1 GB to the next multiple of the LCM.
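
For anyone who wants to double-check the arithmetic above, here is a minimal sketch in C. The helper names (gcd_u64, lcm_u64, size_granule, round_up_to_multiple) are made up for illustration and are not part of the repo, and the page size of 4096 is the assumption taken from the LCM in this comment.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor via Euclid's algorithm. */
uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple; divide first to keep intermediate values small. */
uint64_t lcm_u64(uint64_t a, uint64_t b) {
    return a / gcd_u64(a, b) * b;
}

/* LCM of the counts 8, 16, ..., 128 and an assumed page size of 4096. */
uint64_t size_granule(void) {
    uint64_t l = 4096; /* assumption: PAGE_SIZE == 4096 */
    for (uint64_t threads = 8; threads <= 128; threads += 8)
        l = lcm_u64(l, threads);
    return l;
}

/* Smallest multiple of granule that is greater than or equal to target. */
uint64_t round_up_to_multiple(uint64_t target, uint64_t granule) {
    return (target + granule - 1) / granule * granule;
}
```

Calling round_up_to_multiple(1024ULL * 1024 * 1024, size_granule()) should reproduce the proposed SIZE of 1107025920. Note that this only covers multiples of 8 up to 128; a count such as 22 or 21 would still trip the assert.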

@hholst80

hholst80 commented Mar 28, 2025

I am not convinced that this should be a compile-time constant. It would be better to allocate 1 GB, then trim the size that is actually used, setting it dynamically based on the number of cores allocated to the test.

Let

10**3 >= chunksize * cores --> chunksize = floor(10**3 / cores).

Then we use that as the effective buffer size:

buffersize = floor(10**3 / cores) * cores

The buffer can stay at 1GB and we just use as much as we can to avoid the modulo assert.

This would also work fine with an odd processor count like 22 or even 21.

EDIT: I overlooked that the chunksize must also be divisible by PAGE_SIZE. I fixed that in the PR.
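
A rough sketch of the idea with the PAGE_SIZE correction folded in, not necessarily what the PR ends up doing. The function name effective_size and its parameters are hypothetical, and sizes are assumed to be in bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest usable size <= capacity that splits into `threads` equal chunks,
 * each a whole number of pages. Hypothetical helper, sizes in bytes.
 */
size_t effective_size(size_t capacity, size_t threads, size_t page_size) {
    size_t chunk = capacity / threads; /* floor(capacity / threads) */
    chunk -= chunk % page_size;        /* round the chunk down to a page multiple */
    return chunk * threads;            /* divisible by threads by construction */
}
```

In the benchmark, threads would come from omp_get_max_threads() and capacity would be the 1 GB allocation. With 48 threads and 4096-byte pages, a 1073741824-byte buffer gives an effective size of 1073676288 bytes, so only about 64 KiB at the end goes unused.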
