Report CPU frequency as measured by rdtsc() instead of 3 GHz #301
It makes timer accounting issues like rurban#241 more visible
Sounds good, I'll test. Thanks.
Thanks! I'm looking forward to your comments and/or approval so I can continue submitting a few more chunks from my patch stack :-)
Also available as branch darkk-cpu-mhz for testing on more machines.
👍 Would it be useful if I uploaded results from my machines to this PR as well?
On a machine where /proc/cpuinfo reports 3400 MHz, it calculates 2712. I'd rather read /proc/cpuinfo first and only fall back to this measurement. And on a bigger machine of mine at 3550 MHz (AMD Ryzen 9 7950X3D 16-Core) it says:
No, look closely at how we perform the timer start and stop. It's better than a mere rdtsc.
That's true, it implements barriers and that's important. However, as far as I see, it still uses the same 64-bit MSR read by I'm looking at Intel 64 and IA-32 Architectures Software Developer Manual v083 at 19.17 TIME-STAMP COUNTER paragraph of the volume 3A (page 3763 in the PDF). It states the following:
As far as I understand, it's quite different from a cycle counter coming from the performance registers. Am I getting it wrong and/or looking at the wrong place altogether?
I've tested the branch on It's an old Intel CPU that comes from era of I've tuned
I assume the frequency fluctuates a bit in the 600/1500 case because dynamic scaling kicks in at startup. The measurement is quite stable across runs in the other cases.
@rurban, please suggest how I should proceed with this branch. I'm thinking of integrating djb's
As I said:
It makes timer accounting issues like #241 more visible.
The overall logic is to sample (wall-clock-ns, cycle-counter) pairs and take the longest possible valid interval out of 2,999 SpeedTest() calls during --test=SpeedBulk. The code has to provide reasonable MHz estimates accounting for:
donothing32
sha3-256
GetCpuFreqMHz()
That's why the code samples the pairs on every SpeedTest() call and not just twice in some "good" places.
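To make the sampling idea concrete, here is a minimal sketch in C; it is not the code from this branch. The `tick_sample` type and the `estimate_mhz()` function are hypothetical names, and the definition of a "valid" interval (both clocks strictly advancing between consecutive samples) is an assumption made for illustration. The real patch may apply different validity rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample type: one (wall-clock ns, cycle counter) pair. */
typedef struct {
    uint64_t ns;
    uint64_t cycles;
} tick_sample;

/*
 * Estimate the counter frequency in MHz from the longest run of samples
 * in which both clocks strictly advance (assumed validity rule).
 * Returns 0.0 when no valid interval exists.
 */
double estimate_mhz(const tick_sample *s, size_t n)
{
    assert(s != NULL || n == 0);

    uint64_t best_ns = 0, best_cycles = 0;
    size_t start = 0; /* first sample of the current valid run */

    for (size_t i = 1; i < n; i++) {
        if (s[i].ns <= s[i - 1].ns || s[i].cycles <= s[i - 1].cycles) {
            start = i; /* a clock stalled or went backwards: start a new run */
            continue;
        }
        uint64_t dns = s[i].ns - s[start].ns;
        if (dns > best_ns) {
            best_ns = dns;
            best_cycles = s[i].cycles - s[start].cycles;
        }
    }

    if (best_ns == 0)
        return 0.0;
    /* cycles per nanosecond is GHz; multiply by 1000 to get MHz */
    return (double)best_cycles * 1000.0 / (double)best_ns;
}
```

Collecting a pair on every call and choosing the interval afterwards means a single bad sample (for example, a counter jump) only shortens the usable interval instead of corrupting the whole estimate.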