I started working on making the operators device-independent. For CUDA (and, I think, for other GPU libraries as well), scalar indexing is the cause of the incompatibility, so rewriting the operators to do their work with high-level matrix operations and range indexing would allow the same source code to be used for both CPU and GPU arrays.
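To make the incompatibility concrete (toy functions, not code from this PR): both implementations below shift a vector by one sample, but only the second one runs on a `CuArray` once `CUDA.allowscalar(false)` is in effect.

```julia
# Scalar indexing: works for Array, but each y[i]/x[i] access is a
# separate GPU round-trip and errors under CUDA.allowscalar(false).
function shift_scalar!(y, x)
    for i in 2:length(x)
        y[i] = x[i-1]
    end
    y[1] = 0
    return y
end

# The same task with range indexing and broadcasting: one bulk
# operation, unchanged for Array and CuArray alike.
function shift_ranges!(y, x)
    y[2:end] .= @view x[1:end-1]
    y[1:1] .= 0
    return y
end
```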
In the case of `Conv` and `Eye`, it was enough to extend the list of type parameters and to replace the `for` loops inside `mul!` that used scalar indexing with slightly more sophisticated array indexing.
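Concretely, the pattern looks something like this (a made-up `MyEye`, much simpler than the real `Eye`):

```julia
import LinearAlgebra: mul!

# The extra type parameter A records the storage type (Array, CuArray, ...),
# so dispatch and printing can tell CPU and GPU operators apart.
struct MyEye{T, N, A <: AbstractArray{T, N}}
    dim::NTuple{N, Int}
end

# mul! written with a whole-array operation only (no y[i] = x[i] loop),
# so the very same method works for CPU and GPU arrays.
mul!(y::A, ::MyEye{T, N, A}, x::A) where {T, N, A <: AbstractArray{T, N}} = copyto!(y, x)
```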
I also created the `storageTypeDisplayString` property and the `AbstractOperatorsCudaExt` extension so that printing an operator can optionally display its storage device.
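A rough sketch of how such a hook can be wired up (only `storageTypeDisplayString` and `AbstractOperatorsCudaExt` are the names used above; the rest is simplified):

```julia
# In AbstractOperators itself: a neutral fallback, so CPU output is unchanged.
storageTypeDisplayString(::Type{<:AbstractArray}) = ""

# In ext/AbstractOperatorsCudaExt.jl, loaded automatically when CUDA.jl is
# present (declared under [extensions] in Project.toml):
module AbstractOperatorsCudaExt

using CUDA, AbstractOperators

# Operators backed by CuArray storage advertise it when printed.
AbstractOperators.storageTypeDisplayString(::Type{<:CuArray}) = " (CuArray)"

end
```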
Finally, I created an abstract type for both `Eye` and `Conv` to allow the specialization of one or more functions for other GPU array types. For example, some GPU libraries might not support real FFTs, only complex FFTs; a custom Conv struct subtyping `AbstractConv`, together with a customized constructor, might be enough to add support for such a library.
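A sketch of what such a specialization could look like (everything here except `AbstractConv` is hypothetical):

```julia
# Abstract supertype, so other GPU backends can plug in their own Conv.
abstract type AbstractConv end

# The default Conv (real-FFT based) would subtype it:
# struct Conv{...} <: AbstractConv ... end

# Hypothetical variant for a backend that only provides complex FFTs:
struct ComplexFFTConv{A <: AbstractArray} <: AbstractConv
    h::A  # filter taps, stored as a complex array so plain fft/ifft suffice
end

# Customized constructor: promote a real filter to complex up front;
# a specialized mul! (not shown) would then use fft/ifft instead of rfft/irfft.
ComplexFFTConv(h::AbstractArray{<:Real}) = ComplexFFTConv(complex.(h))
```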
Is it a good idea to take the same steps for the rest of the operators as well (except for `Filt` and `MIMOFilt`, which should be left CPU-only)?