Platform Libraries #313

wasabii · 2023-04-24T21:10:17Z

wasabii
Apr 24, 2023
Maintainer

Much of the work of updating IKVM between OpenJDK versions is fixing up the C# code that replaces C code from OpenJDK, or the various forks of OpenJDK Java code that rely on .NET APIs instead of C code. Things like NIO, NIO2, java.net, etc. are either replacements for Java classes or replacements for C files, reimplemented in C#. This is a very error-prone process. The specifics of how the C code works matter: what exceptions get thrown when, what error messages look like. And since this is forked code, each time OpenJDK updates its logic, we have to fix ours by hand.

And we haven't even covered all of OpenJDK. Large areas, like AWT, ECC, and various sound APIs, are currently not covered at all for lack of time to reimplement them.

It would be nice to increase the amount of C and Java code we can consume directly from the OpenJDK project. We would reduce the number of forks we maintain and improve the stability of our code.

So let's examine some of the issues with consuming more of the native code from OpenJDK.

a) Sometimes OpenJDK delivers the same Java class with different implementations for different platforms. An example that comes to mind is java.lang.ProcessImpl. There is a Win32 and Unix version of ProcessImpl.
b) Native code sometimes references JVM internal structures. Sometimes these are just #defines. But other times they are calls directly into hotspot to reveal various functionality.
c) Building native code cross platform is hard.

Let's focus on (a) first.

OpenJDK's distribution method is a bit different from IKVM's. OpenJDK can build a single java binary and runtime library for a platform, because it is the platform: you install Java on a platform, and you install it for that platform. IKVM is a bit different. While we do offer platform-specific drops, it is a primary use case that users be able to directly call into Java platform classes from .NET code, distributing IKVM along with the eventual application or library the user is building. It is not necessarily known at build time which platform the code will eventually be running on. In this case we're a library. Much as a JAR file can't know the platform it is running on until it gets there, IKVM can't either. But that JAR file usually runs on a platform with OTHER JAR files (rt.jar) that DO know what platform they are on, because they came with the runtime.

So, we need to bundle an assembly that works for every platform that that assembly might be copied to. But, this isn't possible. We can't bundle two copies of ProcessImpl in one Assembly.

At a minimum, we have to bundle multiple assemblies. One with each version of ProcessImpl.

We can take a naïve approach, and bundle IKVM.Java for each OS we support. But how can this be possible? They can't all be named IKVM.Java and sit in the same bin/ directory of the user's application. Nor can we pick which one to bundle, because then the user's application would only run on that platform.

So, somehow, we need to distribute multiple copies of ProcessImpl, per OS, and have that be transparent to the end application.

.NET Core faces a similar problem in its distribution of the runtime itself (just as Java does). It needs to distribute OS-specific versions of various runtime assemblies, but have user assemblies reference types in these assemblies without reference to the platform-specific version. It accomplishes this by making use of TypeForwardedTo.

TypeForwardedToAttribute lets an assembly declare that a type referenced through it actually lives in a different assembly. When the runtime does binding, it finds the referenced assembly, notices it has a TypeForwardedTo for the requested type, and then recurses into the load operation to load the type from that other assembly instead. .NET Core makes use of this for assemblies like mscorlib, netstandard, and System.Runtime. Many of the types in these assemblies no longer exist in those assemblies at runtime. Instead they are directed to System.Private.CoreLib.

This allows an assembly compiled against mscorlib, looking for say, System.Object, to find it at runtime in System.Private.CoreLib instead.

They take it a step further, and use it to hide System.Private.CoreLib completely. At build time, they distribute a version of System.Runtime that contains System.Object. So when users use the C# compiler, it generates an assembly referencing System.Runtime. But when that assembly is loaded, it finds the type in System.Private.CoreLib instead.
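The lookup described above can be modelled in a few lines. This is only a toy sketch in Java, not how the CLR implements binding: `ToyAssembly` and `ToyBinder` are hypothetical names, and an "assembly" here is just a set of defined type names plus a table of forwarders. It shows a reference compiled against System.Runtime ending up at System.Private.CoreLib.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for an assembly: it either defines a type or forwards it elsewhere. */
final class ToyAssembly {
    final String name;
    final Set<String> defined = new HashSet<>();           // types implemented here
    final Map<String, String> forwarded = new HashMap<>(); // type name -> target assembly name

    ToyAssembly(String name) { this.name = name; }

    ToyAssembly define(String type) { defined.add(type); return this; }

    ToyAssembly forward(String type, String targetAssembly) {
        forwarded.put(type, targetAssembly);
        return this;
    }
}

/** Toy binder that follows forwarders until it reaches a defining assembly. */
final class ToyBinder {
    private final Map<String, ToyAssembly> loaded = new HashMap<>();

    void load(ToyAssembly assembly) { loaded.put(assembly.name, assembly); }

    /** Returns the name of the assembly that actually supplies the type. */
    String resolve(String assemblyName, String type) {
        ToyAssembly assembly = loaded.get(assemblyName);
        if (assembly == null) {
            throw new IllegalStateException("assembly not loaded: " + assemblyName);
        }
        if (assembly.defined.contains(type)) {
            return assembly.name;
        }
        String target = assembly.forwarded.get(type);
        if (target == null) {
            throw new IllegalStateException(type + " not found in " + assemblyName);
        }
        // Recurse into the load for the forwarded-to assembly (no cycle guard in this sketch).
        return resolve(target, type);
    }

    public static void main(String[] args) {
        ToyBinder binder = new ToyBinder();
        binder.load(new ToyAssembly("System.Runtime")
                .forward("System.Object", "System.Private.CoreLib"));
        binder.load(new ToyAssembly("System.Private.CoreLib").define("System.Object"));
        System.out.println(binder.resolve("System.Runtime", "System.Object"));
    }
}
```

The caller only ever names System.Runtime; which assembly answers is decided entirely by what was loaded.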

We can do the same thing. We can distribute multiple copies of IKVM.Java. One can be a "reference assembly", used at build time, which contains the public API surface for Java. This will have things like ProcessImpl in it. But no actual implementation. At runtime, we can distribute a version of IKVM.Java with a TypeForwardedTo attribute that directs the class to a platform specific assembly.

So, for instance, we could distribute the Java classes in an assembly named "IKVM.Java.Platform". We could distribute one of these assemblies for each target platform. The IKVM.Java version we distribute could carry a TypeForwardedTo entry for java.lang.ProcessImpl that points at IKVM.Java.Platform. When ProcessImpl is loaded from code expecting it in IKVM.Java, the runtime would go looking for the type in IKVM.Java.Platform instead. We can then preload the right copy of this assembly into the AppDomain or ALC, based on the current platform, and it would answer for that type.
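As a rough illustration of the "based on the current platform" step, here is a sketch of the selection logic, written in Java for consistency with the classes under discussion (in IKVM this would live in IKVM.Runtime). The class name, the `runtimes/<rid>/` layout, and the set of supported values are all assumptions made for the example; only the RID strings themselves (`win-x64`, `linux-x64`, `osx-arm64`, and so on) follow .NET's naming.

```java
import java.util.Locale;

/** Hypothetical helper: picks which IKVM.Java.Platform copy matches the current machine. */
final class PlatformAssemblySelector {

    /** Maps os.name / os.arch style strings to a .NET-style RID such as "win-x64". */
    static String rid(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);

        String osPart;
        if (os.startsWith("windows")) {
            osPart = "win";
        } else if (os.startsWith("mac") || os.contains("darwin")) {
            osPart = "osx";
        } else if (os.contains("linux")) {
            osPart = "linux";
        } else {
            throw new IllegalArgumentException("unsupported OS: " + osName);
        }

        String archPart;
        switch (arch) {
            case "amd64":
            case "x86_64":
                archPart = "x64";
                break;
            case "x86":
            case "i386":
                archPart = "x86";
                break;
            case "aarch64":
            case "arm64":
                archPart = "arm64";
                break;
            case "arm":
                archPart = "arm";
                break;
            default:
                throw new IllegalArgumentException("unsupported architecture: " + osArch);
        }
        return osPart + "-" + archPart;
    }

    /** Assumed on-disk layout: one platform assembly per RID. */
    static String assemblyPath(String rid) {
        return "runtimes/" + rid + "/IKVM.Java.Platform.dll";
    }

    public static void main(String[] args) {
        String rid = rid(System.getProperty("os.name"), System.getProperty("os.arch"));
        System.out.println(assemblyPath(rid));
    }
}
```

Whatever the real mechanism looks like, the point is that the choice is made once, at startup, and everything that references IKVM.Java stays unaware of it.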

A naïve approach for this would be to distribute a complete IKVM.Java.Platform for each target platform. But the size of that is about 60 MB. So, 60 MB each for Linux, Windows (x86), Windows (x64), Windows (arm), Windows (arm64), Mac (x64), and Mac (arm64): roughly 420 MB across those seven targets. That's a lot of stuff to distribute. If the user was doing a specific RID publish, we could limit it to only the required target assembly. This actually seems like a good approach to start with, as most .NET apps probably target a specific RID.

But a good second step would be to split the assemblies across the libraries. So only platform-specific stuff ends up in .Platform.

This would require us to be able to build two versions of IKVM.Java: a ref-assembly version and a type-forwarding version. ikvmc can't yet build either.

And even if it could, what would it include anyways? The way the CLR does it is by generating a ref assembly from some committed C# source code, where each method body is replaced with 'throw null'. They don't use the compiler for this. They have a tool called GenAPI which can update this C# code from a source assembly. But otherwise they sort of manage it by hand. Is this smart for Java? Might be. The public API surface of Java is very stable.

If we're building multiple IKVM.Java.Platform assemblies.... and hand building ref assemblies, this might be fine. ikvmc can just be run once per-platform, building each output. Yes, multiple runs. And that's a build speed concern. That's a problem on initial build. But what about subsequent builds? Well, since the goal is to largely rely on the OpenJDK code, which is immutable.... rebuilds would rarely be required: nothing is supposed to be changing anyways. Only if the user messed with some of the IKVM specific code, such as that in local/, would a rebuild of them each be required. But we want to cut down on that anyways. Of course, it would need to be rebuilt if anything in IKVM.Runtime changed. But, that's a present day concern. And one we'd be improving on by reducing the amount of IKVM.Runtime specific code at play. Net benefit? I think so.

How then to build the TypeForwardedTo version of the assembly? Couldn't that be auto-generated from the ref assembly? Would the ref assembly contain all the types? Well, it should only include public or internal stuff. But not private stuff. But that's okay, right? Any types inside IKVM.Java.Platform should only reference IKVM.Java.Platform itself, including those private types.

So: proposed plan.

a) Create some sort of code-gen approach for dumping the C# behind a ref-assembly version of IKVM.Java. This can be recursive: we could just take the existing IKVM.Java and parse it back out to C# with signatures only, then require future devs to update it by hand.
b) Create a build process to produce a type-forwarding version of the same. Kind of the same process: take the ref assembly, and just replace every type with a TypeForwardedTo.
c) IKVM.Runtime needs to have the machinery to preload the right IKVM.Java.Platform.
d) Figure out how the Project structure looks, and how this ends up in the Nuget package.
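Step (b) is mostly mechanical, which a small sketch makes concrete. The following is a hypothetical generator, written in Java, that turns a list of type names into C# assembly-level attribute lines. `ForwarderGenerator` is not an existing tool, and how the type list is obtained from the ref assembly is left out. It also assumes each type is accessible to the forwarding assembly so that `typeof(...)` compiles; internal types such as ProcessImpl would need extra handling (or the forwarders would have to be emitted as metadata rather than C#).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical generator: one C# TypeForwardedTo line per type name. */
final class ForwarderGenerator {

    /** Produces the C# assembly-level attribute that forwards a single type. */
    static String forwarderLine(String typeName) {
        return "[assembly: System.Runtime.CompilerServices.TypeForwardedTo(typeof("
                + typeName + "))]";
    }

    /** Produces a C# source body with one forwarder per type, in sorted order. */
    static String generate(List<String> typeNames) {
        return typeNames.stream()
                .sorted()
                .map(ForwarderGenerator::forwarderLine)
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // In a real build this list would come from the ref assembly's type table.
        System.out.println(generate(Arrays.asList("java.lang.ProcessImpl", "java.lang.Object")));
    }
}
```

The generated file would be compiled into the forwarding IKVM.Java with a reference to IKVM.Java.Platform, which is what tells the compiler where each forwarded type lives.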

wasabii · 2023-04-24T21:19:55Z

wasabii
Apr 24, 2023
Maintainer Author

b) Native code sometimes references JVM internal structures. Sometimes these are just #defines. But other times they are calls directly into hotspot to reveal various functionality.

Once (a) is finished, we are pulling in the true OpenJDK implementations of .java files which mostly use JNI to call into native libraries for their implementation. So, we need a way to build that native code. And, we can't rely on the versions of the native code that are built by OpenJDK itself, as that code is built against hotspot, and in many cases directly references hotspot structures.

We would need to implement a C project in our solution for these native libraries, importing the .c and .h files from OpenJDK but substituting out the portions that access Hotspot.

This shouldn't be super difficult from a code perspective. We can have a C project that references some .c and .h files from OpenJDK, but then also has local copies of .c files replacing the functions that had a Hotspot dependency. We would have to reimplement some C code. But nowhere near as much as otherwise. The Hotspot stuff is going to be basic stuff around memory allocation, etc.

This will require some way to have C code in our solution, built for all of our platforms. There is no MSBuild project type that handles this properly. .vcxproj projects get the closest, but they only run on Windows, and even then their cross-platform support is weak: you have to have different projects for Linux and Windows, and there is no support at all for macOS.

At this point I'm leaning towards a custom project type. Something that can open in VS, much like our IKVM.Net.SDK projects. But otherwise doesn't use any existing machinery from the SDKs. And we'd custom implement the invocation of the C compiler.

There's not a lot we need here. The ability to invoke the compiler. The ability to invoke the linker. Some ItemGroups for .c files. Translated to .o files. Translated to an output.
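To give a sense of how little is involved, here is a sketch of the two command lines such a project type would have to produce, one per .c file and one for the final link. It is written in Java purely for illustration (the real thing would be MSBuild tasks), `ClangCommands` is a made-up name, and the flag set is deliberately minimal: `--target`, `-c`, `-o` and `-shared` are standard Clang options, but a real build would add include paths, defines, and platform-specific flags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds (but does not run) clang command lines. */
final class ClangCommands {

    /** foo.c -> foo.o */
    static String objectFor(String source) {
        return source.endsWith(".c")
                ? source.substring(0, source.length() - 2) + ".o"
                : source + ".o";
    }

    /** Compile one source file to an object file for the given target triple. */
    static List<String> compile(String triple, String source) {
        return new ArrayList<>(Arrays.asList(
                "clang", "--target=" + triple, "-c", source, "-o", objectFor(source)));
    }

    /** Link object files into a shared library for the given target triple. */
    static List<String> link(String triple, List<String> objects, String output) {
        List<String> cmd = new ArrayList<>(Arrays.asList("clang", "--target=" + triple, "-shared"));
        cmd.addAll(objects);
        cmd.add("-o");
        cmd.add(output);
        return cmd;
    }

    public static void main(String[] args) {
        String triple = "x86_64-unknown-linux-gnu";
        List<String> sources = Arrays.asList("net.c", "nio.c"); // placeholder file names
        List<String> objects = new ArrayList<>();
        for (String source : sources) {
            System.out.println(String.join(" ", compile(triple, source)));
            objects.add(objectFor(source));
        }
        System.out.println(String.join(" ", link(triple, objects, "libnet.so")));
    }
}
```

The item groups map directly onto this: sources in, one object per source, one output per target.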

If we rely purely on LLVM, our task becomes a bit easier. We can think of the project as a .clangproj file, with a nested build for TargetMachine. We can then use conditions on TargetMachine to alter header locations, compiler options, etc., based on the platform being targeted. The build can run for each TargetMachine much as .NET SDK projects run for each TFM. But we don't need negotiation. Just the ability to summarize the nested project output as native libraries for the consuming projects.

The biggest issue I see here is doing xplat builds. The C code from OpenJDK is going to require Windows headers for Windows, Linux headers for Linux, and OS X headers for Mac. This means the user is going to have to have some fashion of SDK environment available for them all. We can't likely distribute these. So, what's the best play here? Not sure yet.


wasabii · 2023-04-24T21:44:47Z

wasabii
Apr 24, 2023
Maintainer Author

So, to summarize, I think we can pull this off.

1) Infrastructure to build multiple copies of IKVM.Java: IKVM.Java as a ref-assembly, IKVM.Java as a forwarding assembly, and IKVM.Java.Platform for each target.
2) Infrastructure to build C code.
3) Some modifications to IKVM.Runtime to preload IKVM.Java.Platform based on the current runtime.

And I don't think we need to implement ref-assembly support inside ikvmc for this, since we have to handle that external to ikvmc anyways.


wasabii · 2024-03-09T16:25:24Z

wasabii
Mar 9, 2024
Maintainer Author

1 and 2 are done. We now build multiple copies of IKVM.Java. However, right now they're not distributed; they're only used because they also now output the .h files for JNI signatures used by the native projects, which were completed as part of 2. We have a new project type, clangproj, that builds native libraries using Clang. We've started moving a LOT of the OpenJDK code into this, and at this time have deprecated quite a few code paths that we were hand-writing in C#.

It's allowed us to do things like enable MIDI support, and ECC, and a few other things that we've been waiting on for a while.

(3) remains difficult, if not impossible. I have no proper solution yet. For .NET 5+, it can work fine. The .deps.json file can specify specific versions of IKVM.Java for each RID. And .NET respects this. And the nuget packages can deliver IKVM.Java in such a way that the final app will have .deps.json written properly...... but this doesn't work for Framework. At this time I have no good solution for preloading the proper version of IKVM.Java on Framework. The only saving grace I've got here is that Framework is really only used on Windows or WINE, both of which would require the Windows version.

Mono netfx however wouldn't work this way. But Mono netfx support is pretty broken anyways. I always wanted to get Mono back as a supported platform.... but this might be the reason not to.

