[Proposal] Using vcpkg in the .NET product #311

Closed
wants to merge 6 commits into from

jkoritzinsky
Member

Right now we consume C++ dependencies in multiple ways throughout the product. As we work on modernizing the native code in dotnet/runtime, we've talked about taking additional dependencies on native libraries. As we start down this path, we may also start to share dependencies with other repos in the project (such as Google Test, which dotnet/aspnetcore depends on).

This document proposes a mechanism through which we could use the Vcpkg package manager to reference our native dependencies consistently and in a VMR and source-build compatible manner.

This proposal is not for .NET 9. We already have more than enough work for .NET 9 with the VMR. This work would be targeted at .NET 10 at the earliest.


For many of our dependencies, these build options work decently well. However, these mechanisms have a few drawbacks:

- Vendoring code increases the size of our repositories and increases build times.

Member

So far, we have been vendoring fairly small projects. Vendoring does not substantially increase the repo footprint.

Personally, I love the simplicity and flexibility of vendoring for our use cases.

Member Author

dotnet/runtime uses vendoring for its native dependencies today, but dotnet/aspnetcore does not (they use submodules), possibly because their dependencies are more optional (Windows-only as the native code is IIS-specific) and they're larger (GoogleTest is pretty chunky).

Our mechanism for vendoring code also requires us to add additional (although, I agree, small) CMake logic so that our source-build customers can opt out of our copies of libraries and use the system libraries instead.

I believe we could adjust how we vendor our projects to help mitigate the second issue as well as other issues.

Member

Vendoring is not SB friendly in cases where the dependency is already included in a distro. Distro maintainers would prefer our applied patches were upstreamed. This reduces the maintenance overhead for them. From a security perspective there is only one version to maintain.

Additionally, [the VMR design documents on external source](https://github.com/dotnet/arcade/blob/main/Documentation/UnifiedBuild/VMR-Strategy-For-External-Source.md) define that we shouldn't use submodules directly in our build.

For native dependencies, we could alternatively use Vcpkg, a cross-platform C++ package manager provided by another team at Microsoft.
Vcpkg would make it easier for us to consume native dependencies in our projects, as many C++ libraries that we may consider using are already available in vcpkg.

Member

@jkotas Jan 9, 2024

The motivation for this proposal seems to be based on an assumption that we will add a bunch more C++ dependencies to the .NET project. Is that correct? What are those dependencies that you expect to add and why?

Member Author

@jkoritzinsky Jan 9, 2024

We've spoken before about adding the GSL in the future. In dnmd (which we hope to productize in a few years), we're adding unit tests based on googletest and using microsoft/wil (the "product" code isn't using these dependencies). Each release, we've had on-and-off plans to switch our zlib dependencies to zlib-ng. To get maximum performance, that switch will require a much more expensive CMake integration than our existing vendoring solutions for zlib and zlib-intel.

Member

Also, it would be useful to list existing dependencies of .NET project to be migrated to use vcpkg as part of this proposal.


## Requirements

- We must be able to version and patch dependencies ourselves.

Member

Also, we need to be able to build with private patches, without disclosing the patches publicly.

Member

Is that a point-in-time requirement? Like some security fixes that you want to apply and then publicly disclose later?

Member Author

Yes, this is primarily for security fixes.

## Q & A

- Why use vcpkg instead of another package manager?
- Vcpkg is already used by many teams at Microsoft, and is already used by many of the dependencies that we may want to consume.

Member

I understand that vcpkg is useful for applications.
 
What are the open source projects similar to .NET runtime that use vcpkg? For example, are there open source projects that use vcpkg and that are included in mainstream Linux distros? How do they deal with the offline build requirements, etc.?

Member Author

I'm not sure if there are any projects that use vcpkg and are included in mainstream Linux distros. To my knowledge, most projects with non-test dependencies that are included in mainstream Linux distros have their build-time dependencies available as dependencies in the distro (through -dev packages or the source packages) and they reference them as shared libraries at runtime. Testing dependencies are usually vendored in (LLVM vendors GoogleTest and Google Benchmark).

We may be the first in this space.

The offline scenario is why I was thinking of vcpkg over another package manager. Using vcpkg allows us to use the same CMake whether the package is a system-provided dependency or a vcpkg-provided dependency, enabling our Linux distro partners to easily switch dependencies to system-provided if desired.
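
A minimal sketch of the "same CMake either way" point, assuming zlib as the dependency. The project name, target name, and `main.c` are hypothetical and only there to make the fragment complete; `find_package(ZLIB)` and the `ZLIB::ZLIB` imported target come from CMake's own FindZLIB module.

```cmake
cmake_minimum_required(VERSION 3.20)
project(zlib_consumer_sketch LANGUAGES C)  # hypothetical project name

# Resolved from the system's zlib development package by default, or from a
# vcpkg-installed zlib when the build is configured with vcpkg's toolchain file.
find_package(ZLIB REQUIRED)

add_executable(consumer main.c)            # main.c is a hypothetical source file
target_link_libraries(consumer PRIVATE ZLIB::ZLIB)
```

With this shape, the switch between providers happens at configure time rather than in the project's CMake: passing `-DCMAKE_TOOLCHAIN_FILE=<vcpkg-root>/scripts/buildsystems/vcpkg.cmake` selects vcpkg-provided packages, and omitting it falls back to whatever the system provides.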

- zlib-intel
- brotli
- If we switch to zlib-ng from zlib and zlib-intel, we would consume zlib-ng through vcpkg.
- In dotnet/aspnetcore we would migrate the following dependencies if we were to support vcpkg + MSBuild:

Member

Should the IIS dependency mentioned in your other comment (https://github.com/dotnet/designs/pull/311/files#r1446709622) be in this list too?

Member Author

I'll look at how the IIS headers are pulled in and add it if it would make sense.

- Rapidjson
- zlib
- zlib-intel
- brotli

Member

Should llvm and icu be in this list too?

Member Author

I'll add them in a separate section as we are already flowing them through NuGet and we'd want to use the binary caching mechanism in vcpkg to flow them instead of building them in dotnet/runtime.

Member

@akoeplinger Jan 21, 2024

Note that the outputs of llvm are not just used during the dotnet/runtime build but ship to end users as well (llc and opt are used during AOT compilation as part of e.g. a MAUI app build), so I don't think we can remove the nugets for that.

- In dotnet/aspnetcore we would migrate the following dependencies if we were to support vcpkg + MSBuild:
- GoogleTest
- Some libraries that we're looking at productizing in the future have these dependencies, which we'd also use vcpkg for:
- GoogleTest

Member

@jkotas Jan 10, 2024

GoogleTest and Google Benchmark are test-only dependencies. They do not need to participate in the official or offline builds, with all the requirements those bring. I guess they are the best place to start.


### Goals

- Enable developers of .NET to consume native dependencies from vcpkg.

Member

I do not think that "use tech XYZ" should be the goal for eng infrastructure investments.

The goal for infrastructure investment should be to make the eng infrastructure cheaper, simpler, faster, etc. For example, check https://github.com/dotnet/arcade/blob/05493e05d4bbf262f1be1bd517ac95f5bff3a2ef/Documentation/UnifiedBuild/Overview.md#goals .

Member Author

I'll rephrase this point. My main idea here was to show how this engineering improvement could assist us in more easily shipping an official nethost vcpkg package (something our customers want and some members of our team unofficially maintain) and provide some product benefit in addition to the infrastructure benefits.

Member

> could assist us in more easily shipping an official nethost vcpkg package

What makes vcpkg packages official? My understanding is that the vcpkg packages are published by merging a PR to https://github.com/microsoft/vcpkg. Would that step go away somehow?

(In any case, we can reduce the patch set required to publish the package, but that should not require introducing vcpkg official build dependency.)

Member Author

I'm not sure exactly what makes the packages official, but I know that they explicitly mark unofficial packages as unofficial by putting targets generated by them in the unofficial namespace to ensure they won't clash with official packages. I agree that we can make official vcpkg packages without using vcpkg ourselves, but using vcpkg ourselves makes things a little easier.

Member Author

I took a closer look at what makes a package "official". The "official" vs "unofficial" terminology mainly refers to which CMake targets are available. If a target is exported from CMake by a patch added in the portfile, then it is an "unofficial" target and should be namespaced as such. If the target is exported by the CMake scripts in the source repository, then the targets are considered "official".

I'll remove this goal, as it looks like we would just need to upstream the patches that install CMake exports into our build scripts to make them "official".


- Why use vcpkg instead of another package manager?
- Vcpkg is already used by many teams at Microsoft, and is already used by many of the dependencies that we may want to consume.
- Vcpkg is cross-platform.

Member

Is this true for all platforms we support today? What about s390x, ppc64le, riscv, or other community-maintained ports like FreeBSD?

Member Author

As we'd be defining our own registry, we can define custom tuples for the various platforms we support. I don't see anything in the vcpkg tool repository that wouldn't work on our various target machines, but they also don't document which platforms it works on. This does pose some risk.

Member

@akoeplinger Jan 21, 2024

Yeah, I'm less concerned about the vcpkg packages themselves, since they can be patched (which we'd need to do anyway to support a new platform in case a dependency doesn't support it), and more concerned about vcpkg itself.

It looks like it is written in C++ and either downloaded or built as part of the bootstrapping (https://github.com/microsoft/vcpkg/blob/a1a1cbc975abf909a6c8985a6a2b8fe20bbd9bd6/scripts/bootstrap.sh#L196) so that doesn't sound too bad. Though there is a new vcpkg-artifacts which seems to rely on TypeScript/nodejs for the CLI but I'm not sure how that relates to the main vcpkg and what the future plans for that are.

Member Author

From what I can tell, vcpkg-artifacts is used for pulling down the binary cache packages/artifacts. I believe we can avoid using that in the VMR (which is where we need to be able to do offline builds).


### Custom registry

We can either create a new repository for the registry or use the source-build-externals repository as a vcpkg registry as well. This registry will, similar to the `dotnet-public` NuGet feed, contain all of the public packages that we want to consume from within our projects. However, unlike the `dotnet-public` NuGet feed, we will need to modify the portfiles for the packages that we want to consume so that they do not use online sources in an offline build. This likely means that we'll need to create a submodule in source-build-externals or another repository for each package that we want to consume.

Member

Curious about the following:

How often do you see the versions of these packages changing?
You mentioned the importance of patching earlier. Where would that be done if the source-build-externals (SBE) repo were used? Would the patching be done in a dotnet fork of the original source, with SBE then referencing the fork?

Member Author

We usually pull a new version of our dependencies about once a year, mainly to get the upstream version that has the patches that we made locally and sent to the upstream in the prior release.

For patches, we'd either patch in a dotnet fork (which SBE would then reference) or we'd use patch files which vcpkg can apply as part of the package configuration.

@steveisok requested a review from lateralusX January 11, 2024 17:56

- We must be able to consume dependencies from vcpkg in our projects.
- It must be possible to use vcpkg in offline build scenarios.
- It must be possible to use vcpkg in the VMR builds.

Member

Should we add something about how it should be possible (maybe not trivial, but possible) for distro packages to be able to use the native dependency that's packaged in a distro over the dependency specified/vendored in .NET?

Member Author

Added a requirement for that. The proposal already explains, later on, that this is doable without much work.

- Vcpkg allows CMake consumers to consume dependencies as if they were installed on the machine instead of requiring custom CMake logic to find dependencies. Supporting changing a package between a vcpkg and system dependency would require more work with other package managers.
- What are vcpkg's minimum toolset requirements? Do these conflict with the requirements for .NET?
- Vcpkg requires CMake 3.14 or newer. We require 3.20 or newer, so we should be fine.
- Vcpkg requires a C++ compiler that supports C++17. Although we don't use C++17, all of the compilers we support do support C++17.

Member

@huoyaoyuan Jan 22, 2024

This sounds interesting. Is it confirmed that all our target platforms, including less popular ones like FreeBSD/Tizen/ARMv6, support C++17?
I'd like to see the exact compiler versions gathered so that we can confidently use new features.
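
One way to gather that data per platform is a small configure-time probe. This is only a sketch: the project name is hypothetical, and the check reflects what CMake knows about the compiler's language-standard support, not whether the platform's standard library implements every C++17 feature.

```cmake
cmake_minimum_required(VERSION 3.20)
project(cxx17_probe LANGUAGES CXX)  # hypothetical project name

# CMAKE_CXX_COMPILE_FEATURES lists the features CMake knows the selected
# compiler supports; cxx_std_17 is the meta-feature for C++17 mode.
if("cxx_std_17" IN_LIST CMAKE_CXX_COMPILE_FEATURES)
  message(STATUS
    "${CMAKE_CXX_COMPILER_ID} ${CMAKE_CXX_COMPILER_VERSION}: C++17 mode available")
else()
  message(FATAL_ERROR
    "${CMAKE_CXX_COMPILER_ID} ${CMAKE_CXX_COMPILER_VERSION}: no C++17 mode known to CMake")
endif()
```

Running the configure step on each target platform would print the compiler ID and version alongside the result, which is the list of exact compiler versions asked for above.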

@jkoritzinsky
Member Author

After investigating this approach and its costs further, I've come to the following conclusions:

  • Servicing isn't a blocker for vcpkg; even large-scale internal projects have figured out how to do it.
  • Source-Build isn't really compatible with vcpkg at all. We'd effectively have to push all of our dependencies themselves into the source-build world, which isn't practical.
  • Building vcpkg ourselves and using our own isolated package repository (to enable offline builds) takes away the primary improvements vcpkg brings.

Instead, we will consider using CMake's FetchContent module to make integrating dependencies easy. With FetchContent, we can customize the build of the components to our needs, and we can store the components needed for offline builds within the repository. Dependencies that source-build does not require (like test-only dependencies) can still be pulled from the network, and we can migrate them to locally vendored copies with minimal cost.

Our project to use zlib-ng instead of zlib is the first use case of this new model.
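
A minimal sketch of the FetchContent model described above, under stated assumptions: googletest stands in for a test-only dependency, and the repository URL and tag are illustrative choices, not decisions made in this thread. The project name is hypothetical.

```cmake
cmake_minimum_required(VERSION 3.20)
project(fetchcontent_sketch LANGUAGES C CXX)  # hypothetical project name

include(FetchContent)

# Test-only dependency: declared with a network source (illustrative URL/tag).
FetchContent_Declare(
  googletest
  GIT_REPOSITORY https://github.com/google/googletest.git
  GIT_TAG        v1.14.0
)

# Populates the content and adds it to the build via add_subdirectory().
# If FETCHCONTENT_SOURCE_DIR_GOOGLETEST is set, that directory is used as-is
# and no download or update step runs.
FetchContent_MakeAvailable(googletest)
```

Moving a dependency from network-fetched to locally vendored then needs no change to this file: configuring with `-DFETCHCONTENT_SOURCE_DIR_GOOGLETEST=<path-to-in-repo-copy>` points FetchContent at existing sources, and `-DFETCHCONTENT_FULLY_DISCONNECTED=ON` tells it not to attempt any downloads, which is the behavior an offline build would want.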

6 participants