RichTextLabel performance observations #7510
-
You can use a C++ profiler in a debug build of the engine.
-
Edit: This apparently just offloads the work to be done, probably deferring it so that it isn't picked up by this timing method. RTL nodes really do just seem to be this slow.
Just a little follow-up, as I've stumbled onto a bit more information about RTL performance. Enabling Fit Content on an RTL causes it to take almost 500% longer to be added to the scene than if it's disabled. MRP results in
My use case is stacking RTLs vertically, interspersed with other Control nodes, inside of a ScrollContainer (for complex tooltips), which requires Fit Content to work. I dynamically build the tooltip by calling functions like
What I will do is find a way to never actually remove RTLs from the tooltip and simply hide / move them to where they need to be to avoid this performance issue. For what it's worth setting
Heads up: rapid flickering
rtl_add_child_mrp.mp4
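For anyone who wants to try the hide-instead-of-remove idea, here is a rough, untested sketch of what it could look like in GDScript for Godot 4.x. The script and its names (`_pool`, `_in_use`, `acquire`, `release_all`) are hypothetical and not taken from the MRP. It assumes the tooltip body is a VBoxContainer and that the expensive step is adding the RTL to the scene; setting the text still has to re-shape the label, so this only avoids the repeated `add_child` cost.

```gdscript
# Hypothetical pooling helper; attach to the VBoxContainer that holds the tooltip body.
extends VBoxContainer

var _pool: Array[RichTextLabel] = []
var _in_use: int = 0

# Return a label from the pool, creating one only when the pool is exhausted.
func acquire(bbcode: String) -> RichTextLabel:
	var rtl: RichTextLabel
	if _in_use < _pool.size():
		rtl = _pool[_in_use]
	else:
		rtl = RichTextLabel.new()
		rtl.bbcode_enabled = true
		rtl.fit_content = true  # Required for stacking inside a ScrollContainer.
		add_child(rtl)          # Paid once per pooled label, not once per tooltip.
		_pool.append(rtl)
	_in_use += 1
	rtl.text = bbcode
	move_child(rtl, -1)  # Move to the end, after whatever was placed last.
	rtl.show()
	return rtl

# Hide every pooled label instead of removing it from the scene.
func release_all() -> void:
	for rtl in _pool:
		rtl.hide()
	_in_use = 0
```

Calling `release_all()` before rebuilding a tooltip and `acquire(...)` for each text block keeps the RTLs in the tree between tooltips. Hidden children take no space in a VBoxContainer, so the unused labels should not affect layout.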
-
Posting my findings here so I can stop opening and editing issues that I'm not sure are even bugs. No idea if this is useful information, but I have seen a few "text is very laggy" issues opened since 4.0 launched.
Rendering BBCode with no line breaks
Significantly slower in 4.X than 3.5. This seems to be somewhat expected behaviour based on godotengine/godot#79599 (comment) but the disparity is very large.
Of note, it seems like all the time is spent in RichTextLabel::_get_item_at_pos if I'm reading the profiler correctly. (VerySleepy 0.9.1 profiling 4.0.3 with debugging symbols)
Godot_v4.2-dev3_win64_2023-08-15_23-08-00.mp4
x264.4.1_text_rendering.mp4
x264.3.5_text_rendering.mp4
Clearing and Setting BBCode text every frame
Significantly slower in 4.X than 3.5. This is not an efficient way to use RTLs but it is a very big performance drop compared to what it was before. Of course, RTLs are also much more functional now so maybe this is a required tradeoff.
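If the text does not actually change on every frame, one cheap mitigation is to skip the clear-and-set when the BBCode string is unchanged. This is only a sketch under that assumption, not the MRP itself; `_last_bbcode` and `set_bbcode_if_changed` are made-up names.

```gdscript
# Hypothetical wrapper script attached to a RichTextLabel.
extends RichTextLabel

var _last_bbcode := ""

func _ready() -> void:
	bbcode_enabled = true

# Call this every frame instead of clear() + append_text().
# The expensive re-parse only happens when the string differs from the last one.
func set_bbcode_if_changed(bbcode: String) -> void:
	if bbcode == _last_bbcode:
		return
	_last_bbcode = bbcode
	clear()
	append_text(bbcode)
```

This does nothing for text that genuinely changes every frame, which is the case the profiler results are about.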
Here are the profiler results for this MRP. (VerySleepy 0.9.1 profiling 4.0.3 with debugging symbols, for 10 seconds total)
External module / thread waiting stuff
Including external modules
![image](https://private-user-images.githubusercontent.com/3820082/260994745-5e308f21-153b-4779-a054-742b4205b3c3.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0MzYyMzYsIm5iZiI6MTczOTQzNTkzNiwicGF0aCI6Ii8zODIwMDgyLzI2MDk5NDc0NS01ZTMwOGYyMS0xNTNiLTQ3NzktYTA1NC03NDJiNDIwNWIzYzMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTNUMDgzODU2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9N2E2M2YwZWYyODQ2NDAyOTVkYzZkYzY3NmU1MTFkMTQ3YWE5MWFjOGVjMjNiYjk0MDIzOGFjNTQ1NDY2Yjk4ZSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.A0elSRp-YxG-IGsbX5DcrR-FrPunFXgcJ0A-aedehgw)
Lots of time spent starting up Threads maybe? I can't really understand these results.
I think this is just waiting for thread joins.
I noticed ntdll was filtered out, so here's a completely unfiltered (afaik) list:
![image](https://private-user-images.githubusercontent.com/3820082/261004462-8a064290-0a34-46f3-bef0-f5878fb4c1ad.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0MzYyMzYsIm5iZiI6MTczOTQzNTkzNiwicGF0aCI6Ii8zODIwMDgyLzI2MTAwNDQ2Mi04YTA2NDI5MC0wYTM0LTQ2ZjMtYmVmMC1mNTg3OGZiNGMxYWQucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTNUMDgzODU2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YzE0OGI1MmI2MmFhN2Q3OTRiMWM4M2JjNzU4NjBmMmZkYmEwZjNkYTViZDI1OGVlNGE0MDAwYTI3MmMwYjdhYiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.7RRAs78zJrrXFGfgbKODKIWhegX-GzmYPovE8Y4lzGI)
Of note, the call to Clear itself seems to take a while when there are a lot of tags in the RTL already.
x264.2023-08-16.02-06-43.mp4
Threading
I couldn't get any useful-looking profiler information on where clearing and setting spends its time, because most of the time was spent doing stuff with threads.
I didn't notice any threading stuff when it came to rendering BBCode with no line breaks.
Enabling threading to mitigate these stalls isn't an option if you're changing text every frame due to godotengine/godot#80613.