Adding SIMD support #48

Open
jackmott opened this issue Jul 25, 2016 · 2 comments

Comments

@jackmott

I've been going through the Streams code trying to figure out how to add SIMD support. For example, this SIMD-enhanced fold performs very well compared to an inlined version of the core library fold.

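   // Requires `open System.Numerics`. 'T must be a type that Vector<'T> supports.
   // `start` is broadcast to every lane, so it should be the identity of the operation (e.g. 0 for a sum).
   static member inline SIMDFold folder combiner (start:'T) (values : 'T[]) =
        let mutable i = 0
        let mutable v = Vector<'T>(start)
        // Vector loop: `<=` so the last full chunk is not skipped
        while i <= values.Length - Vector<'T>.Count do
            v <- folder v (Vector<'T>(values,i))
            i <- i + Vector<'T>.Count
        // Combine the lanes of the vector accumulator into a scalar
        let mutable result = start
        for lane in 0 .. Vector<'T>.Count - 1 do
            result <- combiner result v.[lane]
        // Scalar tail: elements that did not fill a whole vector
        while i < values.Length do
            result <- combiner result values.[i]
            i <- i + 1
        result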
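For a concrete case, here is a minimal sketch of the same pattern specialised to summing `float32` values. The name `simdSum` is made up for illustration and is not part of Streams; the sketch only relies on `System.Numerics.Vector<T>`. It processes full vector-width chunks first, adds up the lanes, and then handles the leftover elements one at a time.

```fsharp
open System.Numerics

// Hypothetical helper, not part of Streams: sums a float32 array with Vector<float32>.
let simdSum (values : float32[]) : float32 =
    let count = Vector<float32>.Count
    let mutable acc = Vector<float32>.Zero
    let mutable i = 0
    // Full vector-width chunks
    while i <= values.Length - count do
        acc <- acc + Vector<float32>(values, i)
        i <- i + count
    // Horizontal add: fold the lanes of the accumulator into one scalar
    let mutable result = 0.0f
    for lane in 0 .. count - 1 do
        result <- result + acc.[lane]
    // Scalar tail: elements that did not fill a whole vector
    while i < values.Length do
        result <- result + values.[i]
        i <- i + 1
    result
```

Usage would look like `simdSum [| 1.0f .. 10.0f |]`. Note that the lanes are added in a different order than a plain left-to-right sum, so floating-point results can differ slightly on inputs where rounding matters.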

Adding support to streams has a few considerations:

  • I think it would only make sense to allow it with Arrays and maybe ResizeArrays
  • The elements in the array have to be valid SIMD types
  • I'd love for it to be available in parallel and non parallel streams
  • Can it be added to Streams and ParStreams in some way? Or should there be a separate SIMDStream and ParSIMDStream?
  • Can we handle a mix of scalar and vector operations on the stream? (example below)

Ideally, say we had an array of floats, values.
We would want to be able to do something like:

values
|> SIMDStream.simdMap (fun e -> e * e)  // vector operation on Vector<float>
|> SIMDStream.map (fun e -> if e < 5.0 then 0.0 else 3.0)  // scalar operation on each float
|> SIMDStream.simdSum

This way people could mix and match SIMD operations with scalar ones, as sometimes has to be done.

I'm going through the Streams code trying to see how this could be done, and it isn't entirely clear. Some parts of the composed function would need to operate on a Vector, while others would need to iterate Vector.Count times to operate on individual elements of the array?
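Written by hand as a single loop, that split between vector and scalar stages might look like the sketch below. It assumes `float` elements and uses only `System.Numerics.Vector<T>`; the function name `squareThresholdSum` is hypothetical and says nothing about how Streams would compose such stages.

```fsharp
open System.Numerics

// Hypothetical sketch: square with a vector op, threshold each lane with a
// scalar op, then sum. Not part of Streams.
let squareThresholdSum (values : float[]) : float =
    let count = Vector<float>.Count
    let mutable total = 0.0
    let mutable i = 0
    while i <= values.Length - count do
        // Vector stage: one multiply for `count` elements
        let v = Vector<float>(values, i)
        let squared = v * v
        // Scalar stage: iterate Vector.Count times over the lanes
        for lane in 0 .. count - 1 do
            total <- total + (if squared.[lane] < 5.0 then 0.0 else 3.0)
        i <- i + count
    // Scalar tail: elements that did not fill a whole vector
    while i < values.Length do
        let squared = values.[i] * values.[i]
        total <- total + (if squared < 5.0 then 0.0 else 3.0)
        i <- i + 1
    total
```

The scalar stage gives up the vector width for that step, so the benefit depends on how much of the pipeline stays in vector form.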

If there is interest in this I'd love to contribute but need some guidance as I don't fully understand how the streams work yet.

@palladin
Contributor

Hi Jack,
It is certainly exciting to have vectorized streams, but with the current design I don't think it is possible. Something like array |> Stream.ofArray |> Stream.filter (fun ...) |> Stream.simdfold is definitely problematic. Plus we need perfect stream fusion, or else the virtual calls will dominate performance-wise. One possible direction for vectorized loops is a new Streams library targeting perfect stream fusion, which we would then compile with .NET Native in the hope that the C++ backend vectorizes our loop.
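As a rough illustration of what "perfect stream fusion" aims for, the sketch below shows a map, filter, and sum pipeline collapsed by hand into one loop with no intermediate collections or closure calls. The function name and the particular lambdas are assumptions chosen for the example; this is not generated by Streams, and whether any backend would vectorize this exact loop is not something the sketch claims.

```fsharp
// Hypothetical hand-fused equivalent of:
//   values |> Seq.map (fun x -> x * x) |> Seq.filter (fun x -> x > 10.0) |> Seq.sum
let fusedPipeline (values : float[]) : float =
    let mutable total = 0.0
    for i in 0 .. values.Length - 1 do
        // map stage, inlined
        let squared = values.[i] * values.[i]
        // filter stage, inlined as a branch
        if squared > 10.0 then
            // sum stage, inlined as an accumulator update
            total <- total + squared
    total
```

The filter is also where a vectorized fold gets awkward: once elements are dropped, the survivors no longer line up with fixed-width vector lanes.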

@jackmott
Author

I see. It is too bad that RyuJIT doesn't do any automatic vectorizing. It would be quite nice to get the same kind of auto-vectorization that C compilers do, but at runtime, so it can target the available instruction sets.
