Make most (if not all) float operations single precision #4
Line-fr merged 7 commits into Line-fr:main
|
Thank you for the useful pull request! I am sure it will help lower-end NVIDIA GPUs that have limited fp64 compute capabilities. On the other hand, I think I will remove the CPU fp64 translations, since they don't really bring any benefit: without AVX, fp64 and fp32 have the same performance when not using too much bandwidth. I will work on the commit tonight (in my time zone, so in about 4 hours).
|
It did generate different code in my tests, since without an f at the end of a float literal the compiler assumes double precision (it's also a habit of mine, since I usually work with ancient compilers where it does matter). Edit: I'm not sure if this matters for exact numbers, but like I said, it's just a habit of mine.
|
Also, I think powf is Windows-only, so I cannot allow that.
|
I don't think powf is Windows-only? It should be part of the standard library.
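For reference, `powf` is declared in the C99 `<math.h>` header, and C++ additionally provides a `float` overload of `std::pow` in `<cmath>`. The sketch below is host-side C++ with hypothetical helper names (`apply_gamma`, `apply_gamma_c`), assuming a standard-conforming toolchain; it is not code from this PR.

```cpp
#include <cmath>
#include <math.h>
#include <type_traits>

// Hypothetical helpers for illustration only, not code from this PR.

// C++ spelling: with two float arguments, std::pow selects the float overload.
inline float apply_gamma(float v, float gamma) { return std::pow(v, gamma); }

// C99 spelling: powf takes and returns float.
inline float apply_gamma_c(float v, float gamma) { return ::powf(v, gamma); }

static_assert(std::is_same<decltype(std::pow(1.0f, 2.0f)), float>::value,
              "std::pow(float, float) returns float");
static_assert(std::is_same<decltype(::powf(1.0f, 2.0f)), float>::value,
              "powf returns float");
```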
|
I reviewed it and it seems alright to me. When you are ready, I will be able to merge.
|
Finally done with this. Both ssimulacra2 and butter should output the same as your code, with a ~5% speed increase (on lower-end hardware).
|
It really sucks that I basically have only one system I can test this on, but in all the runs I did on an RTX 4060 Mobile there were some improvements. What exactly did you use to output these graphs?
|
These graphs should be about the same as the ReadMeGraph in the usageScript folder.
|
Phew, I am sorry to have taken so much time to do it!


Considering that the final results and the operations on the GPU are meant to be single precision, it doesn't make much sense to do some of them in double precision (that information gets lost anyway). Making everything single precision is slightly faster.
I also noticed that abs was sometimes used instead of fabs or fabsf, so I replaced it with the float versions (unless this was intentional or it gets overloaded?).
If there's anything wrong with this, let me know!
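On the abs versus fabs/fabsf question, a small host-side C++ sketch of how the overloads resolve. It assumes `<cmath>` is included and the `std::`-qualified names are used; what an unqualified `abs` resolves to depends on which headers are in scope, and in plain C `abs` takes an `int`. The helper name `magnitude` is hypothetical.

```cpp
#include <cmath>
#include <cstdlib>
#include <math.h>
#include <type_traits>

// With <cmath>, std::abs and std::fabs both have float overloads,
// so a float argument is not promoted to double.
static_assert(std::is_same<decltype(std::abs(-1.5f)), float>::value,
              "std::abs(float) returns float");
static_assert(std::is_same<decltype(std::fabs(-1.5f)), float>::value,
              "std::fabs(float) returns float");

// A double argument selects the double overload.
static_assert(std::is_same<decltype(std::abs(-1.5)), double>::value,
              "std::abs(double) returns double");

// fabsf is the C99 single-precision spelling and is always float.
static_assert(std::is_same<decltype(::fabsf(-1.5f)), float>::value,
              "fabsf returns float");

// Hypothetical helper for illustration only, not code from this PR.
inline float magnitude(float x) { return std::fabs(x); }
```

Spelling out `fabsf` or `std::fabs` removes any dependence on which overload set happens to be visible, which is the conservative choice when the intent is single precision.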