TESTING NEEDED: JIT Sparse Function Table, by riperiperi #83
Setup: Ryzen 5 5600. We need better evidence to understand what the issue might be; so far, I've only seen benefits in this PR. Claiming a 40 fps loss without information on your setup, game version, a log, and a video showing it doesn't help at all. Here’s a sample of this game; it may not match exactly, but if anything it's even better now. Before (1.2.59): Main.mp4 After: Pr.83.mp4 |
So it wouldn’t be an issue in the PR but rather something related to the game being on Unreal, maybe stuttering. |
Did you remember to delete the PPTC cache for this game? I myself had some crashes before I thought of deleting the cache, and you didn't mention whether you remembered to perform this vital step. Let us know if you have trouble figuring out how to do this. |
Tried this build out on my M2 Max MacBook Pro last night and experienced random stuttering and more frequent crashes, despite rebuilding PPTC cache for all games. FPS was definitely higher, but it was far less stable. |
I tested this on an M3 Air 24GB; at least in BOTW, FPS is lower. I used the same portable folder and just loaded the save in Kakariko without doing anything else, with a rebuilt PPTC cache of course and VSync turned off. Unrestricted FPS is usually 42-44 fps, and 15-16 fps when energy saver is on (restricting power draw to 7.5 W). With the PR, it is 40-42 fps, and 13-14 fps with battery saver. EDIT: A video for reference: https://youtu.be/TfpgtIKdKJo |
What game(s)? |
Setup: I'm seeing identical system performance compared to Ryujinx r.6253fe1 ("Mirror Build"). Before each launch, I purged the shader cache, deleted the contents of the PPTC folder, and told it to Queue PPTC Rebuild. Maybe I'm missing something? Ryujinx is using:
|
For anyone who isn't seeing any performance improvement from the builds linked above, could you try the older builds from which this sparse JIT change was merged? Here is the older sparse JIT build for Windows. It is possible that we're not seeing the expected performance boost from the GreemDev version because accuracy improvements made since then have slowed down the emulator and may be the real bottleneck. EDIT: also, don't forget to test at 1x resolution to minimise the chances of running into a GPU bottleneck rather than the CPU bottleneck we are testing. |
Do you happen to have a link for the Mac version? |
Unfortunately I don't own a Mac and therefore have no way of compiling the old version for macOS. However, even if such a build existed, it would only show performance improvements on x86 Macs and not on Apple Silicon ones like the M1. As far as we can tell, riperiperi didn't update the ARM64 JIT with the same sparse JIT changes as he did with the x86 one. |
Ok, thank you for the heads up! |
Only comments in English here, please. You need to test without a filter, as it seems that FSR is active. |
Arch Linux user here! All 16 games in my library saw performance gains from the update. Keep up the good work! The only game to stutter was TOTK
|
After a virus clean-up, this PR has been 100% stable with no crashes for me. I vote to merge. |
This comment was marked as off-topic.
Can you explain what you mean by "virus clean up"? |
Setup: TOTK: JIT Sparse: Some FPS improvements, but random stuttering and flashing |
What are your settings you tested this with? |
Base settings, no filtering |
Weird, I am unable to reproduce it on Mac. By base settings, you still mean Hypervisor turned off, right? |
VSync > Disabled |
Don't worry. I previously reported here that this PR was leaking memory (holding 15 GB and about to crash), but I found out that my Windows 11 install was somehow infected by Nvidia's telemetry bot. After denying Nvidia's caches once, it was fixed. |
I've tested it with "The Legend of Zelda: Tears of the Kingdom" using UltraCam mod. This PR had the best performance:
Scene: For each version I loaded the same save, teleported to Lookout Landing, removed all armor and weapons. In-game time is 5:40-5:55 PM. Game was running in fullscreen mode.
Execution: In the scene the frame rate was varying by around 5 fps over time. Before taking the screenshots I was visually monitoring the fps counter and took each screenshot when the frame rate was at its peak. It probably would have been better to use the UltraCam internal benchmark functionality + CapFrameX instead of relying on screenshots...
Hardware:
Software:
Screenshots:
Ryujinx v1.2.69: 75 fps (no UltraCam, V-SYNC off)
Ryujinx v1.2.0+0833a59: 78 fps
I have now used the UltraCam built-in benchmark for Lookout Landing.
Improvements to average FPS: 4.8%
Averaged results from multiple runs:
Raw data for v1.2.69...
Discarded results of first 2 runs:
Results of 16 runs were used:
Raw data for v1.2.0+0833a59...
Discarded results of first 2 runs:
Results of 13 runs were used:
v1.2.69 Log...
|
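For anyone who wants to repeat the averaging described above (discard the first warm-up runs, average the rest, compare the two builds), here is a minimal Python sketch. The function names and the FPS values are made up for illustration; they are not the measurements from this thread, and this is not necessarily how the numbers above were computed.

```python
from statistics import mean


def average_fps(runs, warmup=2):
    """Mean FPS over the runs that remain after dropping the first `warmup` runs."""
    kept = runs[warmup:]
    if not kept:
        raise ValueError("no runs left after discarding warm-up runs")
    return mean(kept)


def improvement_percent(baseline, candidate):
    """Relative change of `candidate` over `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Hypothetical per-run average FPS values, for illustration only.
baseline_runs = [50.0, 55.0, 60.0, 62.0, 58.0]
pr_runs = [52.0, 57.0, 63.0, 65.0, 61.0]

baseline_avg = average_fps(baseline_runs)
pr_avg = average_fps(pr_runs)
change = improvement_percent(baseline_avg, pr_avg)
print(f"baseline {baseline_avg:.1f} fps, PR {pr_avg:.1f} fps, change {change:+.1f}%")
```

Dropping the first runs keeps shader compilation and cache warm-up from skewing the average, which is the same reason the first two runs were discarded above.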
@usr20210909 you can get more consistent results by running a few of the benchmarks built into UltraCam. just run 3 and average them. |
Done - added more data to my comment. |
Great job, guys! I tested the Ryujinx v1.2.69 yesterday on this configuration:
Ryujinx v1.2.69 (Vsync OFF, 1x, no mods)
i3-1115G4
Intel UHD Xe G4
16 GB (8+8)
SATA SSD
I tested some games like Mario Party Jamboree, the new ML:B, EoW and other older games, and all of them performed BETTER than the latest released build of Ryujinx. One of the games that I couldn't run stably and had a lot of frame drops, even with the 60 fps mod, was Disney Epic Mickey Rebrushed. Now, it works without frame drops and always above 30 fps. It's much more stable! I was really impressed and you guys are on the right track. Ryujinx 1403 is known for running with a lot of stutters here and your fork ran wonderfully well, even achieving a constant 60 fps in some older games (SM3DW+BF). I didn't use any mods in the games.
Some examples:
Mario Party: Jamboree: Even without mods, it ran better than the last Sudachi build + performance mods. About 20% overall and even better in some areas.
Zelda EoW: Amazing! It ran so much better than the last Ryujinx v1.1.403 build. It had higher frame rates and was more stable even running without any mods, versus Ryujinx v1.1.403 with mods (no AA, DOF off, 60 fps potato). |
More testing: I have updated my Windows 11 from 23H2 to 24H2, which is supposed to improve the performance of AMD Zen 3/4/5 CPUs. Also, instead of using the older Ryujinx version 1.2.69, I ran the new test with the latest 1.2.72. I see no changes between 1.2.69 and 1.2.72 that would affect performance, so the results of those two versions should be comparable. See here for the HW and SW used: #83 (comment) 1st conclusion: 24H2 provides better performance than 23H2
2nd conclusion: This PR is only slightly faster than 1.2.72 when running under 24H2
Here is the updated graph with the added new test results: Raw data for v1.2.72 (24H2)...
Raw data for v1.2.0+0833a59 (24H2)...
|
My specs: Intel Core i5-9600KF OC at 4.8 GHz, Gigabyte GeForce RTX 3060 with factory OC, OS: Windows Results: With JITSparse That patch is really good in all cases (better FPS, fewer lag spikes) |
Continuation of #83 (comment): Benchmark results using CapFrameX instead of UltraCam's internal benchmark results, so the frame times can be visualized in a graph. The first two runs were discarded and only the results of the third run were kept. There's a lot of FPS variance in each run (probably due to many NPCs in the area doing their thing), so this comparison is not very robust. Maybe running the benchmark in a different area of the map with fewer NPCs would provide a more stable result. Conclusion:
|
And here we go again with more TotK benchmarking! Since there's a lot of FPS variance in the Lookout Landing area for each run (probably due to several free-roaming NPCs), which leads to inconsistent results, I now ran the other four UltraCam benchmarks and recorded the data with CapFrameX. I collected the data of three runs for each area and discarded the first run since it has the worst performance. So each data set contains the result of four runs - two for each Ryujinx version. Adding any more results makes the data hard to read.
Kakariko: There's lower FPS in the first seconds and a peak in the frametimes towards the end in one of the runs recorded with 1.2.72. Overall this PR has a slightly better performance.
Great Sky Island: Two big frametime spikes in a run recorded with 1.2.72. This PR also caused a frametime spike in the middle of a run. No clear winner here.
Goron City: A clearly better performance in this PR - even with a single huge frametime spike in the middle of one of its runs.
Korok Forest: No huge spikes in any of those runs. This PR shows better average and 1% FPS results. |
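For reference, the average and 1% figures mentioned above can be derived from a list of per-frame times roughly like this. This is only a sketch under one common definition of "1% low" (average FPS over the slowest 1% of frames); CapFrameX offers several metrics and may compute its values differently. The function names and the sample frametimes are hypothetical.

```python
def average_fps(frametimes_ms):
    """Average FPS: number of frames divided by total time in seconds."""
    return 1000.0 * len(frametimes_ms) / sum(frametimes_ms)


def one_percent_low_fps(frametimes_ms):
    """Average FPS over the slowest 1% of frames (at least one frame)."""
    slowest_first = sorted(frametimes_ms, reverse=True)
    count = max(1, len(slowest_first) // 100)
    worst = slowest_first[:count]
    return 1000.0 * len(worst) / sum(worst)


# Hypothetical capture: 99 frames at 20 ms plus a single 100 ms spike.
frames = [20.0] * 99 + [100.0]
print(f"average: {average_fps(frames):.1f} fps")
print(f"1% low:  {one_percent_low_fps(frames):.1f} fps")
```

A single large spike barely moves the average but dominates the 1% value, which is why a run can look fine on average FPS and still feel stuttery.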
An improved variant of this PR has been merged. Closing. |
When testing this PR, please clear your PTC cache for a game.
Makes CPU-heavy games faster at the cost of more memory mappings.
Testing needed.