[WIP] Added skeleton of batch based GPU assignment #2820

Open · wants to merge 1 commit into master

Conversation

@spectre-ns (Contributor) commented Jan 4, 2025

Checklist

DO NOT MERGE

  • The title and commit message(s) are descriptive.
  • Small commits made to fix your PR have been squashed to avoid history pollution.
  • Tests have been added for new features or bug fixes.
  • The API of new functions and classes is documented.

Description

@JohanMabille This is a skeleton for how to move simple operations to the GPU using a strategy similar to the one used with xsimd. I'm curious whether this would be an extensible strategy. I know the code doesn't compile; I have taken many shortcuts to demonstrate the concept.
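
As a rough, hypothetical sketch of what such a batch-based dispatch could look like (none of these names, such as gpu_batch, load_gpu, store_gpu or add_kernel, are taken from this PR), each batch would shadow a block of host memory on the device, and each operation would map to a kernel launch, much like an xsimd batch maps to SIMD registers and instructions:

```cpp
// Hypothetical sketch only: illustrates the xsimd-like "load / operate / store"
// pattern transposed to the GPU. Error handling and freeing of the operand
// allocations are omitted for brevity.
#include <cstddef>
#include <cuda_runtime.h>

// A "batch" that shadows a contiguous block of host memory on the device.
template <class T, std::size_t N>
struct gpu_batch
{
    T* device_ptr = nullptr;

    // Copy N elements from the host into a fresh device allocation.
    static gpu_batch load_gpu(const T* host_src)
    {
        gpu_batch b;
        cudaMalloc(reinterpret_cast<void**>(&b.device_ptr), N * sizeof(T));
        cudaMemcpy(b.device_ptr, host_src, N * sizeof(T), cudaMemcpyHostToDevice);
        return b;
    }

    // Copy the result back to the host and release the device allocation.
    void store_gpu(T* host_dst)
    {
        cudaMemcpy(host_dst, device_ptr, N * sizeof(T), cudaMemcpyDeviceToHost);
        cudaFree(device_ptr);
    }
};

// Element-wise kernel backing operator+ for two batches.
template <class T>
__global__ void add_kernel(const T* lhs, const T* rhs, T* out, std::size_t n)
{
    std::size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        out[i] = lhs[i] + rhs[i];
    }
}

// Each batch operation is its own kernel launch (the overhead noted below).
template <class T, std::size_t N>
gpu_batch<T, N> operator+(const gpu_batch<T, N>& lhs, const gpu_batch<T, N>& rhs)
{
    gpu_batch<T, N> result;
    cudaMalloc(reinterpret_cast<void**>(&result.device_ptr), N * sizeof(T));
    const unsigned int threads = 256;
    const unsigned int blocks = static_cast<unsigned int>((N + threads - 1) / threads);
    add_kernel<<<blocks, threads>>>(lhs.device_ptr, rhs.device_ptr, result.device_ptr, N);
    return result;
}
```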

Points of Concern:

  • Containers are copied multiple times when they are referenced in multiple expressions, rather than being kept as a single immutable shadow copy.
    • GPU memory allocations and host-device transfers are expensive.
  • Expressions are evaluated serially through the expression tree, where multiple streams/threads could be used in a reduction tree.
  • Each batch is essentially a kernel launch, which has overhead, i.e. no kernel fusion. Fusion would require us to generate kernels with template metaprogramming, which would likely mean implementing the assignment operation as a single kernel launch across a thread grid (see the fused-kernel sketch after this list).
  • I am currently proposing we use Thrust or something similar from AMD/Intel, which has a cost as well, but it eliminates the need to worry about launching kernels, streams, and synchronization (see the Thrust sketch after this list).
  • The current method 'dispatches' work from the host to a device in an opaque way. We could also create a gpu_container for the public interface and attempt to implement the assignment as a CUDA kernel.
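
To make the kernel-fusion point concrete, here is a minimal sketch (not code from this PR; the names are made up) of what a fused assignment would look like: the whole expression a = b + c * d is evaluated in a single launch over a thread grid instead of one launch per batch/operation, and no temporary is materialized in global memory.

```cpp
// Hypothetical illustration of kernel fusion for a = b + c * d.
#include <cstddef>

__global__ void fused_assign(float* a, const float* b,
                             const float* c, const float* d, std::size_t n)
{
    std::size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        // The intermediate c * d never touches global memory.
        a[i] = b[i] + c[i] * d[i];
    }
}

// "Implementing the assignment operation as a kernel launch across a
// thread grid" amounts to launching one such kernel over the flattened
// container.
void assign(float* a, const float* b, const float* c, const float* d, std::size_t n)
{
    const unsigned int threads = 256;
    const unsigned int blocks = static_cast<unsigned int>((n + threads - 1) / threads);
    fused_assign<<<blocks, threads>>>(a, b, c, d, n);
}
```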
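
And as a rough illustration of the Thrust option (again not code from this PR), an element-wise assignment can be expressed without any explicit kernel launch, stream, or synchronization management; Thrust chooses the launch configuration itself:

```cpp
// Illustrative only: dispatching an element-wise a = b + c through Thrust.
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>

int main()
{
    thrust::host_vector<float> b(1024, 1.0f);
    thrust::host_vector<float> c(1024, 2.0f);

    // Host-to-device transfers (the expensive copies noted above).
    thrust::device_vector<float> d_b = b;
    thrust::device_vector<float> d_c = c;
    thrust::device_vector<float> d_a(1024);

    // No hand-written kernel: Thrust handles launch and synchronization.
    thrust::transform(d_b.begin(), d_b.end(), d_c.begin(),
                      d_a.begin(), thrust::plus<float>());

    // Device-to-host transfer of the result.
    thrust::host_vector<float> a = d_a;
    return 0;
}
```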

#192
