Update q_math.c Q_rsqrt 10x #6
Removing unnecessary variables makes the function 10x faster, cutting the run time from 1000 ns to 100 ns.
|
Have you benchmarked this with optimizations turned on? Doubtful it will make a difference in the end. |
Yes, I ran a performance test. The speedup comes from the smaller generated assembly, which also reduces how often a value is stored to memory and then loaded again for the next calculation. That makes it 10 times faster. |
|
I ran the tests in online tools, comparing the original against the new version.
OnlineGDB:
New:
OnlineGDB:
I don't see much difference using |
In theory you can skip storing 'y' in memory and put the final expression directly in the return statement, which saves a step. And yes, it is true that some compilers already perform this optimization. |
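For anyone who wants to reproduce the measurements discussed above, here is a minimal timing sketch. It is not the harness used in this thread: `rsqrt_under_test` and `ns_per_call` are hypothetical names, and the function body uses `int32_t` with `memcpy` instead of the original `long` and pointer casts so it behaves the same where `long` is 64 bits wide.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the function being measured (assumption:
   int32_t + memcpy replace the original's long and pointer casts). */
static float rsqrt_under_test(float number)
{
    float x2 = number * 0.5f;
    int32_t i;
    float y;

    memcpy(&i, &number, sizeof i);      /* reinterpret the float's bits */
    i = 0x5f3759df - (i >> 1);          /* initial guess */
    memcpy(&y, &i, sizeof y);
    return y * (1.5f - x2 * y * y);     /* one Newton-Raphson step */
}

/* Returns the average time per call in nanoseconds, measured with clock(). */
static double ns_per_call(float (*fn)(float), long iterations)
{
    volatile float sink = 0.0f;         /* volatile store keeps each call alive */
    clock_t start = clock();

    for (long n = 0; n < iterations; ++n) {
        /* vary the input so the result cannot be computed once and reused */
        sink = fn(1.0f + (float)(n & 1023));
    }

    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iterations;
}
```

Calling `ns_per_call(rsqrt_under_test, 100000000L)` from a build with `-O0` and again from a build with `-O2` shows whether a difference between two variants survives once the optimizer is on.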
|
I see, but it doesn't change anything with

float Q_rsqrt( float number )
{
    long i;
    float x2;
    x2 = number * 0.5F;
    i = 0x5f3759df - ( * ( long * ) &number >> 1 );
    return * ( float * ) &i * ( 1.5F - ( x2 * (* ( float * ) &i) * (* ( float * ) &i) ) );
}

Even with the very reduced version:

float Q_rsqrt( float number )
{
    float x2 = number * 0.5f;
    long i = 0x5f3759df - ( *( long* )&number >> 1 );
    number = *( float* )&i;
    return ( number * ( 1.5f - ( x2 * number * number ) ) );
}

Nothing changes. The result is the same. |
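The claim that both versions give the same result can be checked directly. The sketch below is an illustration, not code from the PR: it restates the two versions with `int32_t` and `memcpy` (an assumption made so the code is well defined where `long` is 64 bits), and the names `rsqrt_first`, `rsqrt_reduced`, and `count_mismatches` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

static int32_t float_to_bits(float f) { int32_t i; memcpy(&i, &f, sizeof i); return i; }
static float bits_to_float(int32_t i) { float f; memcpy(&f, &i, sizeof f); return f; }

/* First version: reinterprets i as a float at every use. */
static float rsqrt_first(float number)
{
    int32_t i;
    float x2;
    x2 = number * 0.5F;
    i = 0x5f3759df - (float_to_bits(number) >> 1);
    return bits_to_float(i) * (1.5F - (x2 * bits_to_float(i) * bits_to_float(i)));
}

/* Reduced version: reuses the parameter to hold the initial guess. */
static float rsqrt_reduced(float number)
{
    float x2 = number * 0.5f;
    int32_t i = 0x5f3759df - (float_to_bits(number) >> 1);
    number = bits_to_float(i);
    return number * (1.5f - (x2 * number * number));
}

/* Counts sample inputs for which the two versions differ bit for bit. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int n = 1; n <= 100000; ++n) {
        float x = (float)n * 0.01f;
        if (float_to_bits(rsqrt_first(x)) != float_to_bits(rsqrt_reduced(x)))
            ++mismatches;
    }
    return mismatches;
}
```

Both functions evaluate the same arithmetic expression in the same order, so a mismatch count of zero is what one would expect from this comparison.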
|
@andreselcientifico You're better off looking at _mm_rsqrt_ss (x86/x64) or vrsqrte_f32 (ARM), those provide a serious speed improvement and they are more precise |
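As a rough sketch of the x86 suggestion, the wrapper below calls `_mm_rsqrt_ss` from `<xmmintrin.h>` when SSE is available. The name `hw_rsqrt`, the `HAVE_SSE_RSQRT` macro, and the non-SSE fallback are assumptions added for illustration; the ARM intrinsic is left out here.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_AMD64)
#include <xmmintrin.h>
#define HAVE_SSE_RSQRT 1   /* hypothetical feature macro for this sketch */
#endif

/* Approximate 1/sqrt(x) using the hardware instruction where possible. */
static float hw_rsqrt(float x)
{
#ifdef HAVE_SSE_RSQRT
    /* load x into the low lane, approximate its reciprocal square root,
       then extract the low lane as a plain float */
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    /* fallback (assumption): the bit trick with one Newton-Raphson step */
    int32_t i;
    float y;
    memcpy(&i, &x, sizeof i);
    i = 0x5f3759df - (i >> 1);
    memcpy(&y, &i, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);
#endif
}
```

The intrinsic returns an approximation rather than an exact result, so callers that need full single precision would still refine it or use `1.0f / sqrtf(x)`.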
|
Using an inline union seems to be much faster than normal, although it could just be the way that I'm timing them. These run about 9x faster than both the source and the changes proposed in this PR when I run them: https://godbolt.org/z/TTPK8Yvc6
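The linked code is not reproduced in this thread, so the following is only a guess at what a union-based variant might look like. The name `rsqrt_union` and the use of `uint32_t` are assumptions; reading a union member other than the one last written is permitted in C, though not in C++.

```c
#include <stdint.h>

/* Hypothetical union-based variant: the float and its bit pattern share
   storage, so no pointer casts are needed. */
static float rsqrt_union(float number)
{
    union { float f; uint32_t u; } conv = { .f = number };

    conv.u = 0x5f3759dfu - (conv.u >> 1);   /* initial guess from the bits */
    /* one Newton-Raphson step, same arithmetic as the original */
    return conv.f * (1.5f - (number * 0.5f * conv.f * conv.f));
}
```

With optimizations enabled, a compiler may well emit the same instructions for this as for the pointer-cast version, so any timing gap is worth confirming on an optimized build.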
|
|
On Fri, July 11, 2025 at 12:18, hydrophobis ***@***.***> wrote:
… *hydrophobis* left a comment (id-Software/Quake-III-Arena#6)
<#6 (comment)>
Using an inline union seems to be much faster than normal, although it
could just be the way that I'm timing them. these get about 9x faster than
the source and the proposed changes in this PR when I run them
https://godbolt.org/z/TTPK8Yvc6
https://onlinegdb.com/OQt6Qwekn
And it runs about 3x faster on my local Win11 22H2 (0.018 vs 0.007)
Screenshot.2025-07-11.121626.png (view on web) <https://github.com/user-attachments/assets/6f952c42-8a36-4a17-b5e3-48b418eba51b>
Screenshot.2025-07-11.121635.png (view on web) <https://github.com/user-attachments/assets/6a228b3e-2b95-4c68-9490-fcbee385fac5>
image.png (view on web) <https://github.com/user-attachments/assets/1ab3abf2-97a9-499f-8ec2-da131a7d41d4>
|


