Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Using vcpkg in the .NET product #311

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions INDEX.md
Original file line number Diff line number Diff line change
Expand Up @@ -100,4 +100,5 @@ Use update-index to regenerate it:
| | [Ref returns in C# and IL verification.](proposed/verifiable-ref-returns.md) | |
| | [Swift Interop](proposed/swift-interop.md) | [Andy Gocke](https://github.com/agocke), [Jeremy Koritzinsky](https://github.com/jkoritzinsky) |
| | [Target AVX2 in R2R images](proposed/vector-instruction-set-default.md) | [Richard Lander](https://github.com/richlander) |
| | [Using vcpkg in .NET projects and the VMR](proposed/vcpkg.md) | [Jeremy Koritzinsky](https://github.com/jkoritzinsky) |

128 changes: 128 additions & 0 deletions proposed/vcpkg.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,128 @@
# Using vcpkg in .NET projects and the VMR

**Owner** [Jeremy Koritzinsky](https://github.com/jkoritzinsky)

## Scenarios and User Experience

In .NET today, we handle native dependencies and native dependency flow by a few different mechanisms:

- Vendoring dependencies, aka including a copy of the source with our applied patches in the product repository.
- Git submodules pointing to the source repo of the dependency.
- NuGet packages that flow pre-built between our repositories.

For many of our dependencies, these build options work decently well. However, these mechanisms have a few drawbacks:

- Vendoring code increases the size of our repositories and increases build times.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, we have been vendoring fairly small projects. Vendoring does not substantially increase the repo footprint.

Personally, I love the simplicity and flexibility of vendoring for our use cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dotnet/runtime uses vendoring for its native dependencies today, but dotnet/aspnetcore does not (they use submodules), possibly because their dependencies are more optional (Windows-only as the native code is IIS-specific) and they're larger (GoogleTest is pretty chunky).

Our mechanism for vendoring code also requires us to add additional (although I agree small) CMake logic to support our source-build customers opting-out of our copies of libraries and using the system libraries instead.

I believe we could adjust how we vendor our projects to help mitigate the second issue as well as other issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vendoring is not SB friendly in cases where the dependency is already included in a distro. Distro maintainers would prefer our applied patches were upstreamed. This reduces the maintenance overhead for them. From a security perspective there is only one version to maintain.

- Submodules aren't cloned by default and have a worse UX than vendoring code, and cannot support patches.
- NuGet packages are not easy to consume from native CMake projects, so we end up having to wire up MSBuild to consume them and construct paths to pass to CMake.

Additionally, [the VMR design documents on external source](https://github.com/dotnet/arcade/blob/main/Documentation/UnifiedBuild/VMR-Strategy-For-External-Source.md) defines that we shouldn't use submodules directly in our build.

For native dependencies, we could alternatively use Vcpkg, a cross-platform C++ package manager provided by another team at Microsoft.
Vcpkg would make it easier for us to both consume native dependencies in our projects, as many C++ libraries that we may consider using are already available in vcpkg.
Copy link
Member

@jkotas jkotas Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation for this proposal seems to be based on an assumption that we will add a bunch more C++ dependencies to the .NET project. Is that correct? What are those dependencies that you expect to add and why?

Copy link
Member Author

@jkoritzinsky jkoritzinsky Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've spoken before about adding the GSL in the future. In dnmd (which we hope to productize in a few years), we're adding unit tests based on googletest and using microsoft/wil (the "product" code isn't using these dependencies). We've had plans on and off for switching our zlib dependencies to zlib-ng each release, which will require a much more expensive CMake integration than our existing vendoring solutions for zlib and zlib-intel for maximum performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it would be useful to list existing dependencies of .NET project to be migrated to use vcpkg as part of this proposal.

It would also make it easier for us to contribute our own native projects, like the nethost library, back to vcpkg in an official manner.
Using vcpkg or another package manager would also help us avoid some of the issues we've had with vendoring projects. At least once, we've accidentally disabled SDL-required compiler warnings across the product because of how we hook up our vendored dependencies.

## Requirements

- We must be able to version and patch dependencies ourselves.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we need to be able to build with private patches, without disclosing the patches publicly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a point-in-time requirement? Like some security fixes that you want to apply and then publicly disclose later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is primarily for security fixes.

- We must be able to produce private patches that we can consume without public publishing.
- We must be able to consume dependencies from vcpkg in our projects.
- It must be possible to use vcpkg in offline build scenarios.
- It must be possible to use vcpkg in the VMR builds.
- It must be possible to change a dependency to use a distro/system package instead of a vcpkg dependency

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add something about how it should be possible (maybe not trivial, but possible) for distro packages to be able to use the native dependency that's packaged in a distro over the dependency specified/vendored in .NET?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a requirement for that. I've already addressed that this is doable later in the proposal without much work.

### Goals

- Enable .NET product developers to easily update, patch, or otherwise change native dependencies without increasing repo size or accidentally causing product-wide changes for dependency-specific fixes.
- Enable our distro partners to easily change dependencies from external dependencies (ie. vendored today) to distro-provided packages with minimal friction.

### Non-Goals

- Replace all existing native dependency mechanisms.
- Initially, using vcpkg from MSBuild is considered out of scope.
- Using vcpkg in .NET 9 is out of scope.

## Stakeholders and Reviewers

- @jkoritzinsky
- @mmitche
- @jkotas
- @AaronRobinsonMSFT
- @wtgodbe

## Design

### Acquiring vcpkg

Arcade should provide functionality in the scripts in `eng/common` to acquire the vcpkg tool.
vcpkg will be able requested by an entry in the global.json file for a repository.
The Arcade build scripts will define which version of vcpkg is used.

As we will be using our own custom registry, Arcade can do a sparse checkout of the vcpkg repository to reduce the amount of data that needs to be downloaded, excluding the default registry.

To support offline builds, the vcpkg repo and a matching version of the [vcpkg-tool repository](https://github.com/microsoft/vcpkg-tool) will be included as a submodule in the source-build-externals repository.

### Custom registry

We can either create a new repository for the registry or use the source-build-externals repository as a vcpkg registry as well. This registry will, similar to the `dotnet-public` NuGet feed, contain all of the public packages that we want to consume from within our projects. However, unlike the `dotnet-public` NuGet feed, we will need to modify the portfiles for the packages that we want to consume to not use online sources in an offline build. This likely means that for each package, we'll need to create a submodule in source-build-externals or another repository for each package that we want to consume.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about the following:

How often you see the versions of these packages changing?
You mentioned the importance of patching earlier. Where would that be done if the source-build-externals(SBE) repo were used? Would the patching be done in a dotnet fork of the original source and then SBE would reference the fork?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually pull a new version of our dependencies about once a year, mainly to get the upstream version that has the patches that we made locally and sent to the upstream in the prior release.

For patches, we'd either patch in a dotnet fork (which SBE would then reference) or we'd use patch files which vcpkg can apply as part of the package configuration.


Alternatively, we could build an alternative mechanism in the VMR that pre-clones all dependencies when updating the VMR. This option seems more expensive, and as such we should avoid it if possible.

#### Private patches

We can use the `internal/*` branches of the internal mirror of whichever repository we choose to use for the registry to handle private patches.

When we need to create a patch, we would update the `internal/*` branch to point to private forks of the repos we want to patch or to contain the patch files for the patches we want to apply.
When we create the internal branch mirrors, we will commit a change that changes each vcpkg manifest to point at the `internal/*` branch of the registry repository instead of the public branch.

After a release, we would need to ensure that we don't push the reference to the internal registry to the public repos. Since the public repositories wouldn't be able to access the internal registry, our CI lanes should prevent this problem.

### Using vcpkg

All repositories that use vcpkg must use the manifest mode of vcpkg to declare their dependencies. Classic mode is not supported.

### Source-build partners and conditionally system dependencies

For some dependencies, we may want to use a system-installed version of the dependency in source-build scenarios.
For these cases, our source-build partners can remove the dependencies from the vcpkg manifest.
Since the CMake integration with vcpkg uses the standard `find_library` and `find_package` CMake commands, the system dependencies will be found instead of the vcpkg dependencies with no change to the CMake source code.

### Flowing native dependencies with binary caching

vcpkg has a binary caching feature that can use NuGet.
We should be able to utilize this feature for flowing native dependencies that are consumed by native code (such as `dotnet/llvm-project` assets) to avoid having to run the build through MSBuild to locate the dependencies or rebuilding the dependencies in each repository. This is not something that needs to be done immediately, but is something we can consider in the future instead of our existing package flow model.

## Q & A

- Why use vcpkg instead of another package manager?
- Vcpkg is already used by many teams at Microsoft, and is already supported by many of the dependencies that we may want to consume.
- Vcpkg is cross-platform.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this true for all platforms we support today? what about s390x, ppc64le, riscv or other community maintained ports like FreeBSD?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we'd be defining our own registry, we can define custom tuples for the various different platforms we support. I don't see anything in the vcpkg tool repository that wouldn't work on our various target machines, but they also don't document which platforms it works on. This does provide some risk.

Copy link
Member

@akoeplinger akoeplinger Jan 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I'm less concerned about the vcpkg packages themselves since they can be patched (which we'd need to do anyway to support a new platform in case a dependency doesn't support it) but about vcpkg itself.

It looks like it is written in C++ and either downloaded or built as part of the bootstrapping (https://github.com/microsoft/vcpkg/blob/a1a1cbc975abf909a6c8985a6a2b8fe20bbd9bd6/scripts/bootstrap.sh#L196) so that doesn't sound too bad. Though there is a new vcpkg-artifacts which seems to rely on TypeScript/nodejs for the CLI but I'm not sure how that relates to the main vcpkg and what the future plans for that are.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, vcpkg-artifacts is used for pulling down the binary cache packages/artifacts. I believe we can avoid using that in the VMR (which is where we need to be able to do offline builds).

- Vcpkg allows us to define custom target triples so we can add support for our packages for our targets.
- Vcpkg allows CMake consumers to consume dependencies as if they were installed on the machine instead of requiring custom CMake logic to find dependencies. Supporting changing a package between a vcpkg and system dependency would require more work with other package managers.
- What is vcpkg's minimum toolset requirements? Do these conflict with the requirements for .NET?
- Vcpkg requires CMake 3.14 or newer. We require 3.20 or newer, so we should be fine.
- Vcpkg requires a C++ compiler that supports C++17. Although we don't use C++17, all of the compilers we support do support C++17.
Copy link
Member

@huoyaoyuan huoyaoyuan Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds interesting. Is it confirmed that all our target platforms, including non-popular ones like FreeBSD/Tizen/ARMv6, support C++ 17?
I'd like to see the exact compiler versions gathered so that we can confidently use new features.

- Which packages would be migrated to use vcpkg?
- In dotnet/runtime, we would migrate the following dependencies:
- libunwind
- Given that we already have to support a system-provided version of libunwind, this is a good first product dependency to migrate
- In dotnet/runtime, we could migrate the following dependencies:
- Rapidjson
- zlib
- zlib-intel
- brotli
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should llvm and icu be in this list too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add them in a separate section as we are already flowing them through NuGet and we'd want to use the binary caching mechanism in vcpkg to flow them instead of building them in dotnet/runtime.

Copy link
Member

@akoeplinger akoeplinger Jan 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the outputs of llvm are not just used during the dotnet/runtime build but ship to end-users as well (llc and opt are used during AOT compilation as part of e.g. a MAUI app build), I don't think we can remove the nugets for that.

- With vcpkg binary caching, we could flow dependencies from the following repos through vcpkg for non-VMR scenarios:
- dotnet/llvm-project
- dotnet/icu
- > For VMR scenarios, we could either "install" these packages in a folder or consume them as source instead of using vcpkg.
- > We'd still need to ship the existing packages that we ship to end users, but we would remove the internal consumption of those packages.
- If we switch to zlib-ng from zlib and zlib-intel, we would consume zlib-ng through vcpkg.
- In dotnet/aspnetcore we would migrate the following dependencies if we were to support vcpkg + MSBuild:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the IIS dependency mentioned in your other comment (https://github.com/dotnet/designs/pull/311/files#r1446709622) be in this list too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look at how the IIS headers are pulled in and add it if it would make sense.

- GoogleTest
- IISLib (we could pull this from microsoft/IIS.Common with a package we define. A package doesn't currently exist in the vcpkg registry for it.)
- Some libraries that we're looking at productizing in the future have these dependencies, which we'd also use vcpkg for:
- GoogleTest
Copy link
Member

@jkotas jkotas Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GoogleTest and Google Benchmark are test-only dependencies. They do not need to participate in the official or offline builds, with all the requirements. I guess they are the best place to start with.

- Google Benchmark
- microsoft/WIL
- In the future, we may use microsoft/GSL, which we would use vcpkg for.
Loading