POC of using cargo package for installation#68

Draft
luca-della-vedova wants to merge 1 commit into colcon:cottsay/install-libs from luca-della-vedova:luca/cargo_package_installation

@luca-della-vedova
Member

This PR targets #67 and illustrates an idea I have been toying with: using cargo package to install libraries through colcon-cargo.
It's just a POC to facilitate discussion. Ideally we would at least be smarter about caching, remove duplicated work, and rework the install folder structure.
The proposed implementation in #67 uses cargo package --list to get the list of files that would be installed, then uses that output to copy files from the package's source directory into the install space.
This, however, is not sufficient for packages that are in a cargo workspace, since cargo package actually rewrites the Cargo.toml to, for example, remove relative path dependencies and resolve workspace dependencies (point 1 and some of point 2 here).
As a result, what we actually install in #67 doesn't compile when applied to a more complex package.
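To make the limitation concrete, here is a minimal sketch of a list-and-copy install step. It is not the code from #67; the function name `copy_listed_files` and its parameters are hypothetical, and the listing is assumed to be the captured output of cargo package --list, one relative path per line.

```python
import shutil
from pathlib import Path


def copy_listed_files(listing: str, source_dir: Path, install_dir: Path) -> list[str]:
    """Copy every file named in the listing from source_dir to install_dir.

    Hypothetical helper: entries that do not exist in the source tree
    (files cargo would only generate while packaging) are skipped.
    """
    copied = []
    for line in listing.splitlines():
        rel = line.strip()
        if not rel:
            continue
        src = source_dir / rel
        if not src.is_file():
            continue  # nothing to copy for generated entries
        dst = install_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # byte-for-byte copy, no manifest rewriting
        copied.append(rel)
    return copied
```

Because Cargo.toml is copied verbatim, any workspace inheritance in it survives the copy, which is exactly what breaks once the file no longer sits below its workspace root.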

I came up with a "simple" test case for both #67 and this PR to show the difference.

Setup

Create a workspace and bring in a custom tokio (where I just added a public hello_world function) and a POC crate that tests it (it has a library and an executable that both call the new tokio function).

mkdir -p tokio_ws/src
cd tokio_ws/src
git clone https://github.com/luca-della-vedova/tokio
git clone https://github.com/luca-della-vedova/cargo_custom_tokio_package

Test with #67

Remove dev_depends from here as done in this PR to avoid circular dependency errors

# TODO Install or checkout the colcon-cargo branch in #67
cd ~/tokio_ws
rm -rf build install log
colcon build --packages-up-to tokio
cd install/tokio/share/cargo/registry/tokio-1.48.0
cargo build

Since the tokio Cargo.toml has one of these "workspace inheritances", the build will actually fail, making the installed crate effectively non-functional:

:~/tokio_ws/install/tokio/share/cargo/registry/tokio-1.48.0$ cargo build
error: failed to parse manifest at `~/tokio_ws/install/tokio/share/cargo/registry/tokio-1.48.0/Cargo.toml`

Caused by:
  error inheriting `lints` from workspace root manifest's `workspace.lints`

Caused by:
  failed to find a workspace root

The snippet is preserved when copying the file and cannot be resolved since the file is copied to a different location and its workspace's Cargo.toml is not in its parent folder anymore.

[lints]
workspace = true

Test with this branch

# TODO Install this PR's colcon-cargo version
cd ~/tokio_ws
rm -rf build install log
colcon build --packages-up-to tokio
cd install/tokio/share/tokio/rust/tokio-1.48.0 # I just put it here to follow what we do with other rust packages, happy to bikeshed
cargo build

You will see that the package builds correctly, meaning the workspace migration was done by Cargo.
Inspecting the installed Cargo.toml, we can see that the linter configuration was migrated to a standalone configuration that does not reference the workspace's Cargo.toml:

[lints.rust.unexpected_cfgs]
level = "warn"
priority = 0
check-cfg = [
    "cfg(fuzzing)",
    "cfg(loom)",
    "cfg(mio_unsupported_force_poll_poll)",
    "cfg(tokio_allow_from_blocking_fd)",
    "cfg(tokio_internal_mt_counters)",
    "cfg(tokio_no_parking_lot)",
    "cfg(tokio_no_tuning_tests)",
    "cfg(tokio_unstable)",
    'cfg(target_os, values("cygwin"))',
]

Additionally, if you now build the sample package:

# TODO Install this PR's colcon-cargo version
cd ~/tokio_ws
rm -rf build install log
colcon build --packages-up-to cargo_custom_tokio_package

It will build successfully and running it will work, meaning the hardcoded patch correctly resolved to a tokio with the custom function:

$ install/cargo_custom_tokio_package/bin/cargo_custom_tokio_package 
HELLO THERE

Conclusion

  • Just copying files is not sufficient; cargo package does a lot more for non-toy projects. I demonstrated this with tokio, but most major projects out there (e.g. serde) have workspace dependencies that would make them non-functional.
  • This approach should make packages functional, and it is similar to what we currently do in colcon-ros-cargo / message generation. Patching is hardcoded for now and left as future work (perhaps through pallet patcher?)
  • There are rough edges. Specifically, I wonder whether cargo package always builds a package even if it is unchanged, since I see that install times are quite significant even when no changes are made.

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

@Blast545 Blast545 left a comment

I didn't test the example, but checked the code and the documentation of cargo package and this LGTM

Comment on lines +96 to +97
# REVERT, remove test dependency for false positive circular dep error
'build': depends | build_depends,

Do you know why this causes the circular dep problem?

Member Author

It's an interesting issue.
Cargo itself actually allows circular dependencies for dev dependencies but colcon doesn't, so if we add all dev-dependencies to our build depends, we might run into false positives.
Details on when and why this is allowed are in the cargo book. From my understanding, it's because a test is a standalone target that is built separately from the library, so there isn't strictly a circular dependency.
The documentation advises against this practice but, of course, a lot of packages in the wild use it, including very widespread ones such as tokio or serde.
Note, however, that this is just a hack to be able to test this PR; I don't think we should merge this change. I believe what we really want is to include dev dependencies in our dependency tree, but at the same time disregard dependency cycle errors only when they occur while resolving dev dependencies (and perhaps only cargo dev dependencies). This is quite tricky, however, and we should probably explore it separately.
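As a rough illustration of that idea (not colcon's actual resolver), the sketch below runs a depth-first cycle search over a graph whose edges are tagged by kind, so the same graph can be checked with and without dev edges. The graph layout, the `find_cycle` name, and the package names are all hypothetical.

```python
def find_cycle(graph, kinds):
    """Return one cycle that uses only edges of the given kinds, else None.

    graph maps package -> {kind -> set of dependency names}.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    state = {pkg: WHITE for pkg in graph}
    stack = []

    def visit(pkg):
        state[pkg] = GREY
        stack.append(pkg)
        for kind in sorted(kinds):
            for dep in sorted(graph[pkg].get(kind, ())):
                if dep not in graph:
                    continue  # dependency outside the workspace
                if state[dep] == GREY:  # back edge: dep is on the current path
                    return stack[stack.index(dep):] + [dep]
                if state[dep] == WHITE:
                    cycle = visit(dep)
                    if cycle:
                        return cycle
        stack.pop()
        state[pkg] = BLACK
        return None

    for pkg in sorted(graph):
        if state[pkg] == WHITE:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None


# Hypothetical pair: a library whose tests use a helper crate that depends on it.
graph = {
    "core": {"normal": set(), "dev": {"core-test"}},
    "core-test": {"normal": {"core"}, "dev": set()},
}
print(find_cycle(graph, {"normal"}))         # None
print(find_cycle(graph, {"normal", "dev"}))  # ['core', 'core-test', 'core']
```

A resolver built along these lines could report only cycles that survive with dev edges removed, and treat the rest as ordering hints rather than errors.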

@cottsay
Member

cottsay commented Feb 11, 2026

Sorry for the long silence on this, but your thorough explanation on this has led me down quite a rabbit hole.

In short: you're absolutely right. You've convinced me that "normalizing" the manifest is absolutely necessary when we install it into the workspace. Further, I now believe that it extends beyond "library" crates and applies to all participants in a workspace, even those that only build binaries.

I have also discovered (I believe) that it isn't possible to patch a dependency that refers to a sibling in the package's workspace. At first glance, it might seem okay to forego the patch and let cargo consume the dependency directly from the source directory, but that could result in unexpected behavior when a downstream build in a separate workspace tries to use that same dependency and consumes it from the install space instead. Ideally a developer would always re-build a package (i.e. the upstream dependency) which has been modified, but I could see someone distributing a built workspace that "worked on their machine" without knowing that the dependency package hadn't been properly rebuilt (and reinstalled) from its modified sources.


I'm experimenting with code which unconditionally normalizes the manifest before operating on it. Maybe we can get away with only doing it for workspace packages, but until we can really enumerate everything that happens during cargo's manifest normalization process, I'm inclined to just do it every time.

The only supported way to normalize the manifest, as far as I can tell, is using cargo package. We can pass lots of flags to make this as lightweight as possible (like --no-verify --exclude-lockfile --offline), but it's still a pretty heavy hammer just to process the manifest. Maybe this would be a good feature request for cargo.
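A minimal sketch of what that could look like, assuming the flags listed above plus --manifest-path and --target-dir, and assuming the resulting .crate file is a gzipped tarball with a single top-level `<name>-<version>/` directory. The helper names are hypothetical, and nothing here runs cargo itself.

```python
import tarfile
from pathlib import Path, PurePosixPath


def normalize_command(manifest: Path, target_dir: Path) -> list[str]:
    """Build a `cargo package` invocation that skips as much work as possible."""
    return [
        "cargo", "package",
        "--manifest-path", str(manifest),
        "--target-dir", str(target_dir),
        "--no-verify",         # do not build the packaged crate
        "--exclude-lockfile",  # do not generate or include Cargo.lock
        "--offline",           # never touch the network
    ]


def read_normalized_manifest(crate_file: Path) -> str:
    """Return the text of <name>-<version>/Cargo.toml from a .crate archive."""
    with tarfile.open(crate_file, "r:gz") as archive:
        for member in archive.getmembers():
            parts = PurePosixPath(member.name).parts
            if len(parts) == 2 and parts[1] == "Cargo.toml":
                return archive.extractfile(member).read().decode("utf-8")
    raise FileNotFoundError(f"no top-level Cargo.toml in {crate_file}")
```

The command list would be handed to something like `subprocess.run(..., check=True)`; the archive would then be looked up under the target directory's package folder.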

To use the manifest during our operations (namely build and install), I tried leveraging the --manifest-path option, but it appears that cargo looks for source files based on the location of the manifest and not the working directory. Given that we don't want to overwrite the manifest in the package sources with the normalized one, this means that we need to populate the normalized manifest directory with the source files as well. We could extract all of the crate contents to do this, or we could copy/symlink the source files from the package source directory to avoid the decompression cycle. I'll punt the specifics of this question for later discussion.
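The copy/symlink variant could be sketched roughly as follows. This is an assumption-laden illustration: `stage_normalized_crate`, its parameters, and the idea of passing in the file list are all hypothetical, and it falls back to copying where symlinks are not available.

```python
import os
import shutil
from pathlib import Path


def stage_normalized_crate(source_dir: Path, staging_dir: Path,
                           normalized_manifest: str, files: list[str]) -> None:
    """Populate staging_dir with the normalized manifest plus linked sources."""
    staging_dir.mkdir(parents=True, exist_ok=True)

    manifest = staging_dir / "Cargo.toml"
    if manifest.is_symlink():
        manifest.unlink()  # never write through a stale link into the sources
    manifest.write_text(normalized_manifest, encoding="utf-8")

    for rel in files:
        if rel == "Cargo.toml":
            continue  # the staged manifest is the normalized one
        src = (source_dir / rel).resolve()
        dst = staging_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.is_symlink() or dst.exists():
            dst.unlink()
        try:
            os.symlink(src, dst)
        except OSError:
            shutil.copy2(src, dst)  # e.g. platforms without symlink permission
```

The package's own source directory is left untouched; only the staging directory sees the rewritten Cargo.toml.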

Since we're essentially re-homing the whole crate at this point, the thought occurred to me that maybe we should just build it there and not use --manifest-path at all. However, it appears that while source file discovery is rooted at the manifest location, the discovery of .cargo/config.toml happens based on the working directory. We currently respect any .cargo/config.toml files that the user has in their source tree, as this maintains better parity with standalone cargo behavior.
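To illustrate why the working directory matters, the sketch below mimics the upward search for .cargo/config.toml from a given directory. It is a simplification under stated assumptions: it ignores the legacy extension-less config file and the configuration in the cargo home directory, and `cargo_config_candidates` is a hypothetical name.

```python
from pathlib import Path


def cargo_config_candidates(working_dir: Path) -> list[Path]:
    """List existing .cargo/config.toml files from working_dir up to the root.

    Files closer to working_dir come first.
    """
    current = working_dir.resolve()
    found = []
    for directory in (current, *current.parents):
        candidate = directory / ".cargo" / "config.toml"
        if candidate.is_file():
            found.append(candidate)
    return found
```

Called from a package's source directory, this picks up a config file that lives next to the package. Called from a sibling build directory, it only sees files at or above the common workspace root.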

It's worth noting that making cargo package part of all package builds does impose some new restrictions on the manifests. Notably, all dependencies must now include version constraints and can no longer have only a path to another crate. The dependencies can, however, use version = '*', as that value is prohibited only on crates.io and not by cargo itself. Really, this new restriction only applies to non-library crates, and IMO is not really a problem. Here's an example of a dependency that would no longer be valid:

[dependencies]
local-rust-pure-library = {package = "rust-pure-library", path = "../rust-pure-library"}

My (current) conclusion

Colcon operates on sources on a per-package basis. Effort is made to isolate each package, sometimes to validate proper dependency declaration, and sometimes to ensure that the resulting install space can be used by downstream processes or "child" workspaces. In many ways this is in opposition to much of how the cargo tool consumes workspace members, so we should make an effort to "de-workspace" the packages using cargo's manifest normalization. This should (help) ensure that we aren't crossing the package boundary in cargo in opposition with colcon's efforts, and allows us to patch dependencies that are (or were previously) workspace siblings.

Dragons

  1. There's an argument that cargo workspaces, even outside the context of the cargo tool, violate colcon's package boundaries in their nature as information (namely metadata) from the workspace root affects the workspace member builds. I don't really see a way around this, and even if there was, circumventing workspaces entirely would cause a pretty heavy behavioral deviation from standalone cargo. I'm not even sure it would be possible to just "ignore" workspaces.
  2. There's an argument that .cargo/config.toml discovery along the source directory chain is also a violation of colcon's package boundaries. This point is more interesting to me, and I'd love to hear thoughts from cargo users. While colcon isolates packages from other packages, it makes no effort (by design) to isolate packages from the developer environment. If we moved the working directory into the "build" space, .cargo/config.toml discovery would still happen at the workspace root (at least for a "typical" workspace layout). Probably better to stay focused on what we need and revisit this later...
  3. My proposal is clearly a further deviation from how standalone cargo operates under the hood, and I'm getting more and more concerned about build performance and integration with cargo-native IDEs and tooling. I won't pretend that I have any satisfying answers here. Moving away from consuming the source files directly in the source directories is a huge red flag in this context. My only hope would be that someday cargo might offer new features to better support our scenario.

Sorry, that was a lot of text. I'll try to follow up with some prototypes and examples, and share any new discoveries along the way.
