Is it not better to queue only render and binding commands? #3862

BrodaJarek · 2021-04-27T13:40:35Z

Hi,

Wouldn't it be better to queue only render (glDraw*, glBlit*, glClear*, glDispatch*) and binding commands, and leave the others on the main thread? I ran into this question because of glMapBufferRange and persistent mapping. With the current approach there are three ways to achieve it:

1. Upload the data from the main thread to the render thread, but this is CPU-intensive for large data sizes.
2. Allocate directly in a command buffer, write the data through that pointer, and then memcpy that buffer into the mapped GPU one (but why allocate a new buffer when we only want the pointer from the GPU?).
3. Create a pointer, pass it to the function that maps the buffer, and wait (one or more frames) until it becomes valid.

Wouldn't it be simpler and faster to just call (say we need a uniform buffer) ubo = driver.createUniformBuffer(size, nullptr, WRITE_BIT | PERSISTENT_BIT | COHERENT_BIT); on the main thread, and then
void* mappedBuffer = driver.mapRange(ubo, size, offset, bindingPoint, WRITE_BIT | PERSISTENT_BIT | COHERENT_BIT); also on the main thread?

After that, it is very easy to update the mapped buffer efficiently.

romainguy (Maintainer) · 2021-04-27T16:02:51Z

How would this work exactly? All GL commands must be submitted on the same thread, or we would have to use shared contexts or change the thread ownership of the context. Either way we do not want to ever expose our GL context to the main thread as this would cause the application to mess with the GL state.

For UBOs in particular there is absolutely no need to use map buffer. You can instead do what we do for textures: pass a pointer in the command queue and have the driver thread invoke a callback once the call to glBuffer(Sub)Data has been issued, to free the pointer (or do whatever else is needed). Note that map buffer isn't always a win in terms of performance (as we have experienced ourselves on various drivers/architectures). I've even seen MapBuffer(Range) implementations cause 1 to N copies inside the GPU drivers 👎
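
To make the callback idea concrete, here is a minimal sketch, not Filament's actual implementation. `FakeGpuBuffer`, `BufferDescriptor`, `CommandQueue`, `updateBuffer` and `runCallbackExample` are hypothetical names invented for illustration, a `memcpy` stands in for glBufferSubData, and the queue is drained inline rather than on a separate driver thread so the example runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU-side buffer object.
struct FakeGpuBuffer {
    std::vector<unsigned char> storage;
};

// Hypothetical descriptor: a CPU pointer, its size, and a callback that is
// invoked once the driver has consumed the data.
struct BufferDescriptor {
    const void* data;
    std::size_t size;
    std::function<void(const void*)> onConsumed;
};

// Minimal command queue. In a real engine execute() would run on the
// driver thread that owns the GL context; here it is called inline.
class CommandQueue {
public:
    void push(std::function<void()> cmd) { mCommands.push(std::move(cmd)); }

    void execute() {
        while (!mCommands.empty()) {
            std::function<void()> cmd = std::move(mCommands.front());
            mCommands.pop();
            cmd();
        }
    }

private:
    std::queue<std::function<void()>> mCommands;
};

// Queues a buffer update. Only the pointer is stored in the queue, so the
// caller must keep the data alive until onConsumed fires.
void updateBuffer(CommandQueue& queue, FakeGpuBuffer& buffer,
                  std::size_t offset, BufferDescriptor desc) {
    queue.push([&buffer, offset, desc] {
        assert(offset + desc.size <= buffer.storage.size());
        // Stand-in for glBufferSubData(target, offset, size, data).
        std::memcpy(buffer.storage.data() + offset, desc.data, desc.size);
        if (desc.onConsumed) desc.onConsumed(desc.data);
    });
}

// Usage sketch: returns true if the data was released only after the
// queue executed and the upload landed in the fake buffer.
bool runCallbackExample() {
    CommandQueue queue;
    FakeGpuBuffer ubo{std::vector<unsigned char>(64)};

    bool released = false;
    float* source = new float[4]{0.1f, 0.2f, 0.3f, 1.0f};
    updateBuffer(queue, ubo, 0, {source, 4 * sizeof(float),
        [&released](const void* p) {
            delete[] static_cast<const float*>(p);
            released = true;
        }});

    const bool releasedBeforeExecute = released;  // still false: nothing ran yet
    queue.execute();

    float first = 0.0f;
    std::memcpy(&first, ubo.storage.data(), sizeof(float));
    return !releasedBeforeExecute && released && first == 0.1f;
}
```

The main thread never touches GL state in this scheme: it only hands over a pointer and learns through the callback when that memory can be reused or freed.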

BrodaJarek (Author) · 2021-04-27T20:27:54Z

Unfortunately, using shared contexts or changing the thread ownership of the context does not sound encouraging. I did not mean to call MapBuffer every frame, though, because as you pointed out glBufferSubData is better for that. I think mapping the buffer once in a constructor and reusing it, with double or triple buffering for synchronization, is better than glBufferSubData, even with space preallocated through glBufferStorage. Pseudocode:

void main()
{
    struct Material {
        vec4 ambientColor;
        vec4 diffuseColor;
        vec4 specularColor;
        int specularPower;
    };

    const int numMaterials = 100;
    const int numDraws = 16384;
    const int numChunks = 2; // double-buffering

    // This is executed on the main thread.
    // Allocate room for every chunk so the mapped range fits inside the buffer.
    ubo = CreateUniformBuffer(sizeof(Material) * numMaterials * numChunks, WRITE_BIT | PERSISTENT_BIT | COHERENT_BIT);

    // Map only once; the range covers all chunks.
    Material* materials = (Material*)ubo.MapRange(sizeof(Material) * numMaterials * numChunks, WRITE_BIT | PERSISTENT_BIT | COHERENT_BIT);

    int bufferOffset = 0;
    while (true) {
        // The first frame writes the first chunk, the second frame writes the second chunk,
        // and the third frame writes the first chunk again.
        AddDataToMappedMaterial(bufferOffset);

        MultiDrawArrays(numDraws, otherParams);

        bufferOffset = (bufferOffset + 1) % numChunks;
    }
}

More or less. With this approach the GPU driver has minimal overhead, because the buffer is mapped only once.
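
The chunk rotation still needs a check that the GPU has finished reading a chunk before the CPU overwrites it. Below is a minimal sketch of that bookkeeping under stated assumptions: `PersistentRing` and `runRingExample` are hypothetical names, a `std::vector` stands in for the pointer that glMapBufferRange would return once, and a frame counter stands in for the sync objects that real code would create with glFenceSync and test with glClientWaitSync. The `Material` struct is padded to 64 bytes because std140 rounds the array stride of a struct up to a multiple of 16 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// CPU-side mirror of the Material block, padded to a 16-byte multiple.
struct alignas(16) Material {
    float ambientColor[4];
    float diffuseColor[4];
    float specularColor[4];
    std::int32_t specularPower;
};
static_assert(sizeof(Material) == 64, "array stride must be a multiple of 16 bytes");

// Hypothetical ring over one persistently mapped allocation.
class PersistentRing {
public:
    PersistentRing(std::size_t chunkSize, std::size_t chunkCount)
        : mStorage(chunkSize * chunkCount),  // stand-in for the mapped pointer
          mChunkSize(chunkSize),
          mFences(chunkCount, 0) {}          // 0 means "never submitted"

    // Succeeds when the GPU is done with the current chunk. Real code would
    // call glClientWaitSync on the chunk's sync object instead of comparing
    // frame numbers.
    bool tryAcquire(std::uint64_t gpuCompletedFrame, std::size_t& outOffset) const {
        if (mFences[mCurrent] > gpuCompletedFrame) return false;
        outOffset = mCurrent * mChunkSize;
        return true;
    }

    // Writes straight into the mapped memory; no map or unmap call per frame.
    void write(std::size_t offset, const void* data, std::size_t size) {
        assert(offset + size <= mStorage.size());
        std::memcpy(mStorage.data() + offset, data, size);
    }

    // Records which frame reads the current chunk (real code: glFenceSync
    // after the draw calls), then moves on to the next chunk.
    void submit(std::uint64_t frame) {
        mFences[mCurrent] = frame;
        mCurrent = (mCurrent + 1) % mFences.size();
    }

private:
    std::vector<unsigned char> mStorage;
    std::size_t mChunkSize;
    std::vector<std::uint64_t> mFences;
    std::size_t mCurrent = 0;
};

// Usage sketch: three frames over two chunks; returns the offset used per frame.
std::vector<std::size_t> runRingExample() {
    PersistentRing ring(sizeof(Material) * 100, 2);
    std::vector<std::size_t> offsets;
    std::uint64_t gpuCompletedFrame = 0;  // simulated GPU progress

    Material material{};
    material.specularPower = 32;

    for (std::uint64_t frame = 1; frame <= 3; ++frame) {
        std::size_t offset = 0;
        while (!ring.tryAcquire(gpuCompletedFrame, offset)) {
            ++gpuCompletedFrame;  // pretend the GPU finished another frame
        }
        ring.write(offset, &material, sizeof(material));
        offsets.push_back(offset);
        ring.submit(frame);
    }
    return offsets;
}
```

In this simulation the third frame has to wait until frame 1 has completed before it can reuse the first chunk, which is the stall that triple buffering is meant to make less likely.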

dontweaks May 2, 2021

Yeah, that seems kind of odd to me too: why does it also queue buffer creation and everything else, when both the NV_command_list extension and the Vulkan API only queue the commands you described?
