Add sections talking about Swift SIMD types and interop with them #313

Merged
merged 6 commits into from
Jan 29, 2024
17 changes: 16 additions & 1 deletion proposed/swift-interop.md
@@ -118,7 +118,13 @@ When calling a function that returns an opaque struct, the Swift ABI always requ

At the lowest level of the calling convention, we do not consider Library Evolution to be a different calling convention than the Swift calling convention. Library Evolution requires that some types are passed by a pointer/reference, but it does not fundamentally change the calling convention. Effectively, Library Evolution forces the least optimizable choice to be taken at every possible point. As a result, we should not handle Library Evolution as a separate calling convention and instead we can manually handle it at the projection layer.

For frozen structs and enums, Swift has a complicated lowering process where the struct or enum type's layout are recursively flattened to a sequence of primitives. If this sequence is length 4 or less, the values of this type are split into the elements of this sequence for parameter passing instead of passing the struct as a whole. Structs and enums that cannot be broken down in this way are passed by-reference to their specified frozen layout. Due to high implementation cost in the RyuJIT, in particular in the `UnmanagedCallersOnly` scenario, we should implement this first pass of lowering in the projection layer; the only types allowed for `CallConvSwift` calling convention in method or function pointer signatures are primitives, our special Swift register types, and pointer types. For reference, this lowering pass is done in the Swift compiler when lowering from Swift IL to LLVM IR. This design decision reinforces our direction of having the Runtime layer of Swift interop support similar features as the LLVM IR representation of Swift.
For frozen structs and enums, Swift has a complicated lowering process where the struct or enum type's layout is recursively flattened to a sequence of primitives. If this sequence has 4 or fewer elements, values of this type are split into the elements of the sequence for parameter passing instead of passing the struct as a whole. Structs and enums that cannot be broken down in this way are passed by reference to their specified frozen layout. When a frozen struct or enum with a valid primitive sequence of 4 or fewer elements is returned from a function, it is returned as if it were a structure of the elements of the primitive sequence. Due to the high implementation cost in RyuJIT, in particular in the `UnmanagedCallersOnly` scenario, we should implement this first pass of lowering in the projection layer. The only types allowed for the `CallConvSwift` calling convention in method or function pointer parameters are primitives, our special Swift register types, and pointer types. In return types, we will also allow structure types to support returning the primitive type sequences correctly. For reference, this lowering pass is done in the Swift compiler when lowering from Swift IL to LLVM IR. This design decision reinforces our direction of having the Runtime layer of Swift interop support similar features as the LLVM IR representation of Swift.
Member @jkotas, Jan 19, 2024

Does this match how LLVM deals with it? Are arguments handled in Swift IL lowering, but return values left to codegen?

Member Author

Both parameters and return values are inspected in Swift IL lowering and lowered to primitive type sequences of 4 or fewer primitives if possible. Parameters that are lowered to this sequence are passed as separate parameters, one for each element of the sequence. If there is a valid primitive sequence for the return type, the actual return type in LLVM IR is a struct of the elements of the type sequence, not the original struct type. LLVM then decides which registers the struct is returned in, or whether it is returned through a return buffer.

I've validated this by looking at the IR emitted by the Swift compiler on Compiler Explorer: https://godbolt.org/z/o1h6Y5de8
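To make the "flatten, then count" rule concrete, here is a minimal C# sketch. It is a simplified model and not the Swift compiler's algorithm: it only flattens nested aggregates into a list of primitive names and applies the "4 or fewer" check. The real lowering also deals with padding, enum payloads, and merging of adjacent fields, none of which is modeled here. All type and member names (`LayoutNode`, `Primitive`, `Aggregate`, `LoweringSketch`) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical layout model: a type is either a primitive or an aggregate of fields.
abstract class LayoutNode { }

sealed class Primitive : LayoutNode
{
    public string Name { get; }
    public Primitive(string name) => Name = name;
}

sealed class Aggregate : LayoutNode
{
    public IReadOnlyList<LayoutNode> Fields { get; }
    public Aggregate(params LayoutNode[] fields) => Fields = fields;
}

static class LoweringSketch
{
    // The threshold quoted in the discussion above.
    public const int MaxSequenceLength = 4;

    // Recursively flattens a layout into its sequence of primitives.
    public static List<string> Flatten(LayoutNode node)
    {
        var result = new List<string>();
        Walk(node, result);
        return result;
    }

    private static void Walk(LayoutNode node, List<string> output)
    {
        switch (node)
        {
            case Primitive p:
                output.Add(p.Name);
                break;
            case Aggregate a:
                foreach (var field in a.Fields)
                    Walk(field, output);
                break;
            default:
                throw new ArgumentException("Unknown layout node", nameof(node));
        }
    }

    // True when the value would be split into separate primitives;
    // false when it would be passed by reference in this simplified model.
    public static bool IsPassedAsPrimitives(LayoutNode node) =>
        Flatten(node).Count <= MaxSequenceLength;
}
```

For example, a struct of two nested two-`Double` structs flattens to four primitives and would be split, while adding a fifth primitive field would push it to by-reference passing under this model.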

Member

Is there a good reason for the different handling of the return values vs. arguments in Swift/LLVM toolchain? Does this difference show up in the Swift public surface or is it just an internal implementation detail of the Swift toolchain that can change in future without breaking the Swift ABI?

It looks weird to standardize the different handling of the return values vs. arguments in public surface. On the other hand, we should be able to add the struct handling for arguments in future if needed, without breaking anything. So I guess it is ok to start with it.

Member Author

I think this is the only way that Swift could implement the "lower to a type sequence of primitives" consistently for return values and parameters while still allowing enregistering return values.

Member

Why would it be the only way? They could have done all lowering in the LLVM codegen part as part of Swift calling convention handling. Is there anything fundamental preventing that?

Member

As I understand it, the algorithm you are describing here is part of the ABI: https://github.com/apple/swift/blob/d1d9fd1a2e478189e6eec7c48a0b952d9063859b/docs/ABI/CallingConvention.rst#L926-L993
It sounds like we are going to end up with ABI specific handling within the projection tooling regardless of whether we handle structs within the runtime or not. If that is the case, should we go with the tried-and-tested LLVM approach until we have a good grasp around the exact details of this handling and are confident that it all maps reasonably well to structs that can be described in IL? Are we already confident enough about this?

Member

My understanding now is that we are going to be running the first part of the ABI twice. We need to run it in projection tooling regardless, because it is necessary for types that cannot be represented in IL. It is going to result in a struct of primitives. We are then also running the algorithm in the runtime, once more, and hoping that the results of running the first part of the algorithm twice are the same as what Swift+LLVM end up implementing.

What is the benefit of this compared to doing what Swift+LLVM does and avoiding wrapping the primitive sequence in a struct unless necessary (returns)?
I can see one benefit, which is that Swift types that are directly representable in C# can be defined in C# and used directly in interop. However, because there are Swift types that cannot be represented in C#, the general guidance is always going to be to use the projection tooling.

Am I understanding this correctly? Does the diversion from Swift+LLVM make sense in this light?

Member Author

Here's another example that I'm going to reference below: https://godbolt.org/z/Y8Yxvdc3W

Here's how I view it:

The projection will handle the struct/enum layout. So it will lower X in the example above to something like:

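using System.Runtime.InteropServices;

// Foo and Bar are the projected payload types from the example above.
// Both fields sit at offset 0, so the enum cases share the same storage.
[StructLayout(LayoutKind.Explicit)]
struct X
{
    [FieldOffset(0)]
    private Foo f;

    private struct B
    {
        private Bar b;
        private int i;
        private int discriminator;
    }

    [FieldOffset(0)]
    private B b;
}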

The projection layer does not need to lower X to a primitive type sequence; it just needs to determine the layout that Swift uses to represent each case and the discriminator.

Then the JIT/VM would handle lowering the X struct to a primitive type sequence.

Basically, the CallConvSwift signatures will always be able to use named types like Swift, and the JIT will handle all of the primitive type sequence logic and the register allocation logic in a combined pass that better fits RyuJIT's architecture.
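One way the projection tooling could sanity-check a generated layout of this shape is to compare the size and field offsets that .NET computes against the values expected from the Swift side. The sketch below assumes hypothetical stand-ins (`FooSketch`, `BarSketch`, `EnumSketch`) for the types in the linked example, whose real definitions are not reproduced here; the expected numbers only hold for these stand-ins.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical payload types; the real Foo and Bar come from the linked example.
struct FooSketch { public int Value; }
struct BarSketch { public long Value; }

// Both cases overlap at offset 0, mirroring the shape of X above.
[StructLayout(LayoutKind.Explicit)]
struct EnumSketch
{
    [StructLayout(LayoutKind.Sequential)]
    public struct CaseB
    {
        public BarSketch B;
        public int I;
        public int Discriminator;
    }

    [FieldOffset(0)] public FooSketch F;
    [FieldOffset(0)] public CaseB CasePayload;
}

static class LayoutCheck
{
    // Total size of the overlapped layout, in bytes.
    public static int Size() => Marshal.SizeOf<EnumSketch>();

    // Byte offset of the discriminator within the larger case.
    public static int DiscriminatorOffset() =>
        (int)Marshal.OffsetOf<EnumSketch.CaseB>(nameof(EnumSketch.CaseB.Discriminator));
}
```

With these stand-in types the larger case is 16 bytes (an 8-byte payload plus two 4-byte integers), so the whole struct is 16 bytes and the discriminator sits at offset 12.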

Member

> Then the JIT/VM would handle lowering the X struct to a primitive type sequence.

How exactly would it do this? Are you confident that the intermediate results during the ABI handling of enums can always be described with structs in this way, and that running the "primitive type sequencing" algorithm on these structs will result in the right thing? Or are we expecting that we are going to reconstruct the Swift source of truth on the runtime side and trying to give all Swift types an IL representation that the runtime knows how to parse?

Member Author

I am very confident that with the struct layout mechanisms that exist in .NET, we can construct a C# struct type with a matching layout for any frozen enum type from a Swift source of truth, especially since 32-bit targets don't need to support Swift interop, and that we can build the tooling in a way that the VM/JIT's primitive type sequencing algorithm will end up with the same results.

jkoritzinsky marked this conversation as resolved.

##### SIMD Types

We will pass the `System.Runtime.Intrinsics.VectorX<T>` types in SIMD registers as we do with the managed calling convention. We will treat the `Vector2/3/4` types as non-SIMD types (and block their usage directly as parameters in the `CallConvSwift` signature as is the case with other structs).
Member

Suggested change
We will pass the `System.Runtime.Intrinsics.VectorX<T>` types in SIMD registers as we do with the managed calling convention. We will treat the `Vector2/3/4` types as non-SIMD types (and block their usage directly as parameters in the `CallConvSwift` signature as is the case with other structs).
We will pass the `System.Runtime.Intrinsics.VectorX<T>` types in SIMD registers. We will treat the `Vector2/3/4` types as non-SIMD types (and block their usage directly as parameters in the `CallConvSwift` signature as is the case with other structs).

I don't think the managed convention does this on all platforms (today).

Do we allow interop with these types in other interop scenarios? It sounds like it is going to add dotnet/runtime#8300 + dotnet/runtime#9578 as part of the work.

Member Author

I believe we have support on ARM64 due to HFA/HVA support. If I'm wrong, then yes this would add in those two issues as part of this work.

Member

Yes, on ARM64 we support it, but not on x64.

Is the SIMD interop important enough to warrant implementing it for x64? Those two issues on their own are large work items.

Member Author

It looks like macOS x64 is still going to be widely supported when .NET 9 releases, so it depends on if the libraries we want to support are high enough priority. For example, the Accelerate framework has many APIs that take the SIMD types.

@kotlarmilos what are the Apple libraries that we're targeting for .NET 9?

Member

Why would Vector2/3/4 be blocked? They have always been supported for interop and have been treated the equivalent of user-defined structs containing 2, 3, or 4 float fields (which is exactly how they are defined).

Vector64/128/256/512<T> and Vector<T> are all blocked from interop. The former set because Windows doesn't correctly handle SIMD returns today (this isn't vectorcall, but rather missing handling for the default x64 calling convention) and the latter because it doesn't make sense from an interop perspective today.


> Yes, on ARM64 we support it, but not on x64.

This should exist for Unix already as well and only be missing for Windows x64, since that doesn't pass vectors differently (only returns them differently). __vectorcall would be required for Windows x64 HFA/HVA support and is still desirable long term so that we better optimize such perf critical functions; it just hasn't bubbled up in priority yet.

Member

My biggest concern here is that ABI is an extremely complex space and interop is one of those spaces where users want both simplicity and reduced overhead, especially when generating larger binding libraries.

Apple has also notably broken ABI in the past or deviated conventions from the norm on new platforms and so it is entirely possible some new platform comes on and now every single bit of ObjC/Swift interop code is DoA.

I think it is ultimately much better (even if it's not what is done for the initial release due to timing constraints or w/e) that we have this support in the runtime as a detail of the CallConv support and that users are ultimately able to write a delegate* unmanaged[CallConvSwift]<T, U, V> that mirrors the underlying ObjC/Swift signature that would be exposed to C/C++ using the official Swift tooling.

Member

It does not seem hard to make these particular calls aggressively inlined if we think that's beneficial.

These calls are typically going to have a try/catch block in them to convert the .NET exception into a Swift error. You would have to implement inlining of methods with exception handling to make this work...

Member @tannergooding, Jan 23, 2024

That is to say, a user should be able to export Swift bindings to C using official Apple tooling, then use another existing tool such as ClangSharp or CppAst to generate blittable P/Invoke bindings from the C header, and expect it to work.

If we can't achieve that, I expect we will have a lot of downstream pain/headaches from the community, especially as it gets into more complex bindings and libraries.

Member

> These calls are typically going to have a try/catch block in them to convert the .NET exception into a Swift error. You would have to implement inlining of methods with exception handling to make this work...

Don't tempt me :-) (Note that this is actually part of our .NET 9 plan, and I also think it would be much more likely we end up with this support than appetite for improving the UCO Swift case in the future.)

Member Author

Re handling tuples: I think we can still handle tuples at the projection layer since the splitting of a tuple into separate arguments is done at the SIL layer and is very straightforward (it doesn't have nearly the same complexity as the "primitive type sequence" lowering) especially if JIT implementation cost for tuples would be too much.

Additionally, the primitive type sequence lowering happens after the tuple lowering (so each tuple element can be lowered to a sequence of up to 4 primitives), so handling tuples in the projection layer doesn't interfere with the primitive sequence handling.


CoreCLR and NativeAOT currently block the `VectorX<T>` types from P/Invokes as this behavior is currently not well-supported by RyuJIT. Depending on implementation cost and the number of APIs we wish to support, we may want to block the `VectorX<T>` types from `CallConvSwift` initially until we can implement the correct behavior. We can always add support for these types later.

##### Automatic Reference Counting and Lifetime Management

@@ -150,6 +156,15 @@ We plan to interop with Swift's Library Evolution mode, which brings an addition

If possible, Swift tuples should be represented as `ValueTuple`s in .NET. If this is not possible, then they should be represented as types with a `Deconstruct` method similar to `ValueTuple` to allow a tuple-like experience in C#.
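As an illustration of the fallback described above, a projected type can expose a `Deconstruct` method so that C# deconstruction syntax works on it just as it does on `ValueTuple`. The type below (`ProjectedPair`) and its fields are hypothetical and only show the pattern; they do not correspond to any actual projected Swift tuple.

```csharp
using System;

// Hypothetical projection of a two-element Swift tuple that, for whatever
// reason, could not be represented directly as a ValueTuple.
readonly struct ProjectedPair
{
    public int Count { get; }
    public double Scale { get; }

    public ProjectedPair(int count, double scale)
    {
        Count = count;
        Scale = scale;
    }

    // Enables `var (count, scale) = pair;` at call sites.
    public void Deconstruct(out int count, out double scale)
    {
        count = Count;
        scale = Scale;
    }
}

static class TupleSketch
{
    // Consumes the projected type with tuple-like syntax.
    public static double Use(ProjectedPair pair)
    {
        var (count, scale) = pair;
        return count * scale;
    }
}
```

Callers then get the same deconstruction experience whether the projection produced a `ValueTuple` or a custom type.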

##### SIMD types

Swift has its own built-in SIMD types; however, they're named based on the number of elements, not based on the width of the vector type. For example, Swift has `SIMD2<T>`, `SIMD4<T>`, up to `SIMD64<T>`. When the instantiations of these types correspond to an intrinsic vector type, they are treated as that type. Otherwise, they are treated as a struct of vectors. In .NET, our vector types are named based on their vector width, so `Vector128<T>`, `Vector256<T>`, etc.

For instantiated generic types that are within the size of a processor intrinsic vector type, there exists a correspondence between a Swift SIMD type and a .NET SIMD type. For example, `SIMD4<Int32>` corresponds to `Vector128<Int32>`.
However, this correspondence breaks down for SIMD types larger than the largest vector register width (i.e. larger than 512 bits) or for unconstrained generic types like `SIMD4<T>`. These cases, however, should be quite rare. In the "too-large" case, the values are passed into Swift as though the type is a struct of vectors. In the case of unconstrained generic types, the SIMD values are passed indirectly. Both of these cases are suboptimal and we don't know of any public Swift APIs that fall into either of these scenarios.

We recommend that the projection tooling map each Swift SIMD instantiation to the corresponding `VectorX<T>` type in .NET. For cases where there is no corresponding type or where an API takes or returns an unconstrained generic `SIMDX<T>` value, we can map the APIs to regular projected structs for the SIMD types based on the above rules.
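The mapping rule can be sketched as a small lookup from "lane count times element size" to a .NET vector type. This is an illustrative sketch only: `SimdMappingSketch` and `MapToVectorType` are hypothetical names, the inclusion of `Vector64<T>` for 64-bit instantiations is an assumption, and the sketch does not model special cases such as `SIMD3<T>` storage or unconstrained generics.

```csharp
#nullable enable
using System;
using System.Runtime.Intrinsics;

static class SimdMappingSketch
{
    // Returns the open generic .NET vector type whose width matches a Swift
    // SIMD instantiation, or null when no single vector type corresponds
    // (the "struct of vectors" / projected-struct cases described above).
    public static Type? MapToVectorType(int laneCount, int elementSizeInBytes)
    {
        if (laneCount <= 0 || elementSizeInBytes <= 0)
            throw new ArgumentOutOfRangeException(
                laneCount <= 0 ? nameof(laneCount) : nameof(elementSizeInBytes));

        int widthInBits = laneCount * elementSizeInBytes * 8;
        return widthInBits switch
        {
            64 => typeof(Vector64<>),
            128 => typeof(Vector128<>),
            256 => typeof(Vector256<>),
            512 => typeof(Vector512<>),
            _ => null, // no direct correspondence in this sketch
        };
    }
}
```

For instance, `SIMD4<Int32>` (4 lanes of 4 bytes) yields `Vector128<>`, matching the example in the text, while `SIMD64<Int64>` (4096 bits) yields `null` and would fall back to a projected struct.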

#### Projection Tooling Components

The projection tooling should be split into these components: