Why is the python package so large? #2688

Closed
@mistercrunch

Description


Hey, digging into my Docker images, I found that the largest thing in there is the playwright bundle, more specifically something located at playwright/driver/node:

$ du -h .venv/lib/python3.10/site-packages/playwright/driver/node
118M    .venv/lib/python3.10/site-packages/playwright/driver/node

From my understanding, the Python bundle comes with a CLI that installs specific bundles when you run it, and those are installed on demand in user space, in places like /Users/{USER}/Library/Caches/ms-playwright/

So what's in driver/node? Is it really required? Why is it so large?
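For anyone who wants to check where the space goes on their own install, here is a minimal sketch that lists the largest files under a directory using only the standard library. The helper name `largest_files` is made up for illustration, and the path in the usage block is the parent of the one from the `du` output above, so adjust it for your environment.

```python
import os


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs under root, largest first."""
    root = os.fspath(root)
    if os.path.isfile(root):
        # root may itself be a single file, e.g. one big binary.
        return [(os.path.getsize(root), root)]
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so only real files are measured.
            if not os.path.islink(path):
                sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)[:limit]


if __name__ == "__main__":
    # Path derived from the `du` output above; adjust for your environment.
    for size, path in largest_files(".venv/lib/python3.10/site-packages/playwright/driver"):
        print(f"{size / 1024 / 1024:8.1f} MiB  {path}")
```

Running it inside the Docker image should show whether a single file or many small ones account for the 118M.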

Activity

mxschmitt (Member) commented on Dec 16, 2024

Playwright language bindings (e.g. Python) talk to the Playwright Driver via a client/server protocol over stdin/stdout. The driver deals with all the browser communication, retries and the logic around it. It is written in Node.js, hence we ship a copy of Node.js in the wheel. We use that bundled Node.js both to run the driver and for browser installation and CLI interaction.
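To illustrate the general shape of "a client talking to a child process over stdin/stdout", here is a toy sketch. It is not Playwright's actual wire format or driver: the child is a small Python script standing in for the Node.js driver, the newline-delimited JSON framing is an assumption made for the example, and the names `TOY_DRIVER`, `call_driver` and `demo` are hypothetical.

```python
import json
import subprocess
import sys

# Toy "driver": reads one JSON message per line on stdin and answers on stdout.
# It stands in for the real Node.js driver and is NOT Playwright's wire format.
TOY_DRIVER = r"""
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    reply = {"id": msg["id"], "result": msg["method"].upper()}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


def call_driver(proc, msg_id, method):
    """Send one request to the child process and wait for its reply."""
    proc.stdin.write(json.dumps({"id": msg_id, "method": method}) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())


def demo():
    """Start the toy driver, make a single call, then shut the child down."""
    proc = subprocess.Popen(
        [sys.executable, "-c", TOY_DRIVER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        return call_driver(proc, 1, "ping")
    finally:
        # Closing stdin ends the child's read loop so it exits cleanly.
        proc.stdin.close()
        proc.wait()


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the client side needs an executable to spawn, which is why the wheel carries a runtime for the driver.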

We once looked into creating a smaller Node.js binary, e.g. by stripping out ICU - we should maybe revisit that.

mistercrunch (Author) commented on Dec 17, 2024

Gotcha. Wondering if that code could be "minimized", compressed, or bundled differently. Hard to tell as an outsider to the project...

mxschmitt (Member) commented on Dec 17, 2024

It's the Node.js binary that is large. Unfortunately, binaries on macOS tend to be larger than on Linux. In the past we explored building Node.js without ICU in order to make it smaller - I can revisit that experiment and get some actual numbers.

mistercrunch (Author) commented on Dec 17, 2024

Oh interesting - curious about the reason why arm builds might be bigger than amd (?)

mxschmitt (Member) commented on Dec 18, 2024

amd is similar - it's more Linux vs. macOS, because of Mach-O vs. ELF

kalekseev commented on Feb 1, 2025

It would be nice to have the ability to install just the Python part of playwright, without Node.js and the driver.

It's already possible to use another Node.js with PLAYWRIGHT_NODEJS_PATH:

return (os.getenv("PLAYWRIGHT_NODEJS_PATH", str(driver_path / "node")), cli_path)

In nix we have to patch _driver.py and setup.py in order to use our own Node.js and driver: https://github.com/NixOS/nixpkgs/blob/4aa0449341ac1f0f95a1db3a76188816657afacd/pkgs/development/python-modules/playwright/driver-location.patch#L17

@mxschmitt would you accept a PR with one more env variable, PLAYWRIGHT_DRIVER_CLI_PATH, to control the location of the driver's cli.js? This way we could at least drop the _driver.py patch.
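A rough sketch of what that could look like, modeled on the `return` line quoted above. Only PLAYWRIGHT_NODEJS_PATH exists according to this thread; PLAYWRIGHT_DRIVER_CLI_PATH is just the proposal, the function name `resolve_driver` is made up, and the default `package/cli.js` location is an assumption rather than something confirmed here.

```python
import os
from pathlib import Path


def resolve_driver(driver_path):
    """Return (node_executable, cli_path) for a given driver directory."""
    driver_path = Path(driver_path)
    # Existing behaviour quoted above: PLAYWRIGHT_NODEJS_PATH overrides the
    # bundled node binary.
    node = os.getenv("PLAYWRIGHT_NODEJS_PATH", str(driver_path / "node"))
    # Hypothetical: PLAYWRIGHT_DRIVER_CLI_PATH is only proposed in this thread.
    # The default cli.js location below is an assumption as well.
    cli = os.getenv(
        "PLAYWRIGHT_DRIVER_CLI_PATH",
        str(driver_path / "package" / "cli.js"),
    )
    return node, cli
```

With neither variable set, the bundled paths are used; with both set, nothing from the bundled driver directory would be referenced.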

mxschmitt (Member) commented on Feb 5, 2025

I'll close this for now since we don't plan any investment in that area. We are not comfortable with Playwright running against an arbitrary PLAYWRIGHT_DRIVER_CLI_PATH location, since driver versions are not compatible with each other.

Why is the python package so large? · Issue #2688 · microsoft/playwright-python