FoldsCUDA.jl provides a Transducers.jl-compatible fold (reduce) implemented using CUDA.jl. This brings the transducers and reducing function combinators implemented in Transducers.jl to the GPU. Furthermore, using FLoops.jl, you can write parallel `for` loops that run on the GPU.
FoldsCUDA exports `CUDAEx`, a parallel loop executor. It can be used with the parallel `for` loop created with `FLoops.@floop`, the `Base`-like high-level parallel API in Folds.jl, and extensible transducers provided by Transducers.jl.
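As a minimal sketch of the `Base`-like API (assuming a CUDA-capable GPU is available; the input array and values here are illustrative), Folds.jl functions take the executor as their last positional argument:

```julia
using FoldsCUDA, CUDA, Folds

# A small illustrative input; any CuArray works.
xs = CUDA.ones(Float32, 10)

# Pass the CUDAEx executor to run the summation on the GPU.
Folds.sum(xs, CUDAEx())
```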
You can pass the CUDA executor `FoldsCUDA.CUDAEx()` to `@floop` to run a parallel `for` loop on the GPU:
```julia
julia> using FoldsCUDA, CUDA, FLoops

julia> using GPUArrays: @allowscalar

julia> xs = CUDA.rand(10^8);

julia> @allowscalar xs[100] = 2;

julia> @allowscalar xs[200] = 2;

julia> @floop CUDAEx() for (x, i) in zip(xs, eachindex(xs))
           @reduce() do (imax = -1; i), (xmax = -Inf32; x)
               if xmax < x
                   xmax = x
                   imax = i
               end
           end
       end

julia> xmax
2.0f0

julia> imax  # the *first* position for the largest value
100
```
```julia
julia> using Transducers, Folds

julia> @allowscalar xs[300] = -0.5;

julia> Folds.reduce(TeeRF(min, max), xs, CUDAEx())
(-0.5f0, 2.0f0)

julia> Folds.reduce(TeeRF(min, max), (2x for x in xs), CUDAEx())  # iterator comprehension works
(-1.0f0, 4.0f0)

julia> Folds.reduce(TeeRF(min, max), Map(x -> 2x)(xs), CUDAEx())  # equivalent, using a transducer
(-1.0f0, 4.0f0)
```
For more examples, see the examples section in the documentation.