Skip to content

Conversation

arrdem
Copy link
Collaborator

@arrdem arrdem commented Sep 25, 2025

Design doc

The rules_python strategy of implementing pip using installs as repository actions is fundamentally broken. Repository actions should be used only for downloading resources, build actions must be reserved for action time when toolchains and RBE can be employed.

This PR is a clean-sheet approach to implementing pip sourced dependencies under Bazel based on offloading much of the work rules_python must do in its repository actions to a more complete lockfile similar to what rules_pycross chose to do and for the same reasons.

This PR is very much a work in progress, but aims to make generalized Python crossbuilds and source builds tractable.

Changes are visible to end-users: yes

  • Searched for relevant documentation and updated as needed: yes/no
  • Breaking change (forces users to change their own code or config): yes/no
  • Suggested release notes appear below: yes/no

Test plan

TBD

Copy link

aspect-workflows bot commented Sep 25, 2025

Test

Buildkite build #452 is running...


# Extract the archive
ctx.actions.run(
executable = "tar",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grab the tar toolchain?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I need to drop in bsdtar here.


for minor in MINORS:
selects.config_setting_group(
name = "is_{}{}{}".format(interpreter, major, minor),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally %s is faster than .format because it avoids a function call and has simpler (faster) arg conversion. Also if you have only a single substitution it helps even more because you don't even need to construct a tuple. Up to you where to use it, but at least for the functions in the extension that are called repeatedly I would strongly consider it. (Honestly I pretty much always use it unless I have a complex/multiline template with 4+ args and/or using an arg multiple times)

Copy link
Contributor

@dzbarsky dzbarsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you said you love nits...i left a bunch of nits. feel free to apply or disregard as you see fit!

# We loop up to the second-to-last item to ensure we always have a 'next' stage.
for i, current_stage in enumerate(stages):
selects.config_setting_group(
name = "{}".format(current_stage.name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: name = current_stage.name :)

selects.config_setting_group(
name = "{}".format(current_stage.name),
match_any = [
":{}".format(current_stage.condition),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

":" + current_Stage.condition or ":%s" % current_stage.condition are both faster and less chars :)

"""
selects.config_setting_group(
name = "{}",
match_all = {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be a bit clearer to put the brackets in the expression here, especially since I think the way you did it I think the trailing ] isn't indented correctly? Here's a similar thing I did with build_deps here

# Collect all hubs, ensure we have no dupes
for mod in module_ctx.modules:
for hub in mod.tags.declare_hub:
hub_specs.setdefault(hub.hub_name, {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw you use setdefault like this a bunch, I think you can tweak the pattern slightly to avoid repeated lookups:

hub_specs.setdefault(hub.hub_name, {})[mod.name] = 1

problems = []
for hub_name, modules in hub_specs.items():
if len(modules.keys()) > 1:
problems.append(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could combine this with the previous loop and make hub_specs store hub_name -> mod_name(just check if you already have an entry and add toproblems) instead of of hub_name -> mod_name -> 1`. I guess it would make it harder to handle 3 modules all using the same name, but meh? It would probably lead to simpler usage in the rest of the extension

load("//pip/private/constraints/platform:defs.bzl", "supported_platform")
load(":parse_whl_name.bzl", "parse_whl_name")

def format_arms(d):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load(":parse_whl_name.bzl", "parse_whl_name")

def format_arms(d):
content = [" \"{}\": \"{}\"".format(k, v) for k, v in d.items()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe clearer to use single quotes on the outside and/or repr the k/v to avoid inner ones?

def sort_select_arms(arms):
# {(python, platform, abi): target}
pairs = list(arms.items())
pairs = sorted(pairs, key=select_key, reverse=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you pass arms.items() directly into sorted and skip the copy?

# FIXME: Use bsdtar here
archive = ctx.attr.src[DefaultInfo].files.to_list()[0]
ctx.actions.run(
executable = "/usr/bin/unzip",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does @bazel_tools//tools/zip:zipper help?

return [
# FIXME: Need to generate PyInfo here
DefaultInfo(
files = depset([
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess you could create this depset once and reuse it below

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants