Description
Tested versions
v4.5.1
System information
Godot v4.5.1.mono (eb5a059c3) - Windows 11 (build 26200) - Multi-window, 2 monitors - Vulkan (Forward+) - dedicated NVIDIA GeForce RTX 4090 (NVIDIA; 32.0.15.7680) - 13th Gen Intel(R) Core(TM) i9-13900K (32 threads) - 127.7 GiB memory
Issue description
My team is working on loading user-supplied glTF models at runtime for an XR application. We are experimenting with loading the glTF model on a background thread to reduce frame stalling.
We are also experimenting with the "Separate" thread model (despite the warnings) to understand what problems we might have.
With the "Separate" thread model, we see load times that are roughly 30x slower than if we use a background thread with the "Safe" thread model. I have instrumented the Godot engine to isolate the problem and have found it is due to lock contention in CommandQueueMT. Specifically, this is what I see:
- GLTFDocument's generate_scene will import each node.
  - As part of import, convert_importer_mesh_instance_3d will create MeshInstance3D objects and set their mesh.
  - Creating the MeshInstance3D object often takes 10+ milliseconds. Setting the mesh also often takes 10+ milliseconds. With "Safe" thread model these operations are instant. More on what causes this later...
  - So if there are tens of thousands of nodes (we're working with complex industrial models that can be hundreds of megabytes) we can see this take 17 minutes for one of our models. BTW, Godot does wonderfully rendering with good performance once it is loaded, and does wonderfully loading quickly if we use "Safe" thread model.
- When the "Separate" thread model is used, the RenderingServer's _draw() function will be queued on the CommandQueueMT and execute on the RenderingServer's render pump thread. _draw() is very slow because it includes the swapchain synchronization. If I turn off vsync (which we do for XR), then it is "only" 2-3x slower than when using "Safe" thread model.
  - While _draw() is running (or any queued work for that matter), a mutex in the CommandQueueMT is acquired which prevents new work from being queued.
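To make the contention pattern concrete, here is a minimal, self-contained C++ sketch of a queue that holds one mutex for the whole flush. It is a toy model of the behavior described above, not Godot's actual CommandQueueMT code: the names SingleLockQueue, try_push and producer_blocked_during_flush are hypothetical, and the slow command is only a stand-in for _draw().

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Toy model (hypothetical, not engine code): one mutex guards both
// enqueueing and flushing, so a slow command locks out every producer.
class SingleLockQueue {
public:
    // Blocking enqueue: waits until the mutex is free.
    void push(std::function<void()> cmd) {
        std::lock_guard<std::mutex> lock(mutex_);
        commands_.push_back(std::move(cmd));
    }

    // Non-blocking enqueue: returns false if the mutex is already held,
    // which lets the contention be observed without timing measurements.
    bool try_push(std::function<void()> cmd) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock()) return false;
        commands_.push_back(std::move(cmd));
        return true;
    }

    // Runs every queued command while still holding the mutex.
    void flush() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto &cmd : commands_) cmd();
        commands_.clear();
    }

private:
    std::mutex mutex_;
    std::vector<std::function<void()>> commands_;
};

// Returns true if a producer was locked out while a slow command ran.
bool producer_blocked_during_flush() {
    SingleLockQueue queue;
    std::promise<void> started, release;
    std::future<void> started_future = started.get_future();
    std::future<void> release_future = release.get_future();

    // Stand-in for a slow _draw(): signal that it began, then wait.
    queue.push([&] { started.set_value(); release_future.wait(); });
    std::thread render_thread([&] { queue.flush(); });

    started_future.wait();                      // flush is now mid-command
    bool pushed = queue.try_push([] {});        // mutex is held by flush
    release.set_value();                        // let the slow command end
    render_thread.join();

    // Once the flush has finished, the queue accepts and runs work again.
    bool ran_after = false;
    queue.push([&] { ran_after = true; });
    queue.flush();
    assert(ran_after);
    return !pushed;
}
```

In this model every enqueue attempted during the slow command has to wait for the command to finish, which matches the four blocked pushes per node seen during import.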
I instrumented Godot and have this timeline which shows what is happening:

I drew a box around a single node/mesh being imported, and there are four places where it pushes into the CommandQueueMT. Each of these four pushes blocks on acquiring a mutex, because the _draw() method (shown on a separate lower track in that screenshot) is being invoked from the command queue's flush, which holds that lock for the duration of the call.
I am still new to Godot and its architecture but one possible solution comes to mind: CommandQueueMT gets smarter around locking. Perhaps it could have separate pending and flushing queues with separate locks? Flush would swap the queues atomically to minimize contention with queuing work.
Safe thread model doesn't have this problem because the _draw() command does not run via CommandQueueMT.
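The separate pending and flushing queues idea could look roughly like the following self-contained C++ sketch. It is an illustration under stated assumptions, not a patch for CommandQueueMT: SwapQueue and swap_queue_demo are hypothetical names, commands are plain std::function objects, and the sketch ignores anything the real queue must also handle, such as commands whose callers wait for completion.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical double-buffered queue: producers only touch pending_,
// the flusher only runs flushing_, and the two are swapped under a
// lock that is held for the swap alone, never while a command runs.
class SwapQueue {
public:
    void push(std::function<void()> cmd) {
        std::lock_guard<std::mutex> lock(pending_mutex_);
        pending_.push_back(std::move(cmd));
    }

    void flush() {
        // Allow only one flusher at a time; producers never take this lock.
        std::lock_guard<std::mutex> flush_lock(flush_mutex_);
        {
            std::lock_guard<std::mutex> lock(pending_mutex_);
            flushing_.swap(pending_);  // brief critical section
        }
        for (auto &cmd : flushing_) cmd();  // runs without pending_mutex_
        flushing_.clear();
    }

private:
    std::mutex pending_mutex_;
    std::mutex flush_mutex_;
    std::vector<std::function<void()>> pending_;
    std::vector<std::function<void()>> flushing_;
};

// Pushes a second command while a slow first command is still running,
// then returns the order in which the commands executed.
std::vector<int> swap_queue_demo() {
    SwapQueue queue;
    std::vector<int> log;
    std::promise<void> started, release;
    std::future<void> started_future = started.get_future();
    std::future<void> release_future = release.get_future();

    // Stand-in for a slow _draw(): record, signal start, then wait.
    queue.push([&] {
        log.push_back(1);
        started.set_value();
        release_future.wait();
    });
    std::thread render_thread([&] { queue.flush(); });

    started_future.wait();                    // flush is now mid-command
    queue.push([&] { log.push_back(2); });    // returns without waiting
    release.set_value();
    render_thread.join();

    assert(log.size() == 1);  // the second command waits for the next flush
    queue.flush();
    return log;
}
```

The trade-off in this sketch is that work queued during a flush is deferred to the next flush instead of being picked up by the one in progress, so whether that ordering is acceptable for the rendering server would need to be checked.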
Steps to reproduce
Here is the loading code I use, which I invoke from a button press:
public async void _on_button_pressed()
{
    Node3D? sceneNode = null;
    // Parse the file and build the scene on a thread-pool thread.
    await Task.Run(() =>
    {
        var state = new GltfState();
        var doc = new GltfDocument();
        Error readErr = doc.AppendFromFile("path to very large GLB file", state);
        if (readErr != Error.Ok)
        {
            GD.PrintErr($"Failed to read glTF file: {readErr}");
            return;
        }
        sceneNode = (Node3D)doc.GenerateScene(state);
    });
    await ToSignal(GetTree(), SceneTree.SignalName.ProcessFrame); // hop back to main thread
    if (sceneNode != null)
    {
        AddChild(sceneNode);
    }
}
Minimal reproduction project (MRP)
separate-thread-model-slow.zip
Note that this doesn't include the GLB file that I am testing with. The "PATH_TO_GLB_GOES_HERE.glb" line of code in node_3d.gd will need to be changed to point at a large GLB file.