[Bug]: Pyinstaller ends up in Page.goto: Target page, context or browser has been closed #2683

Open
yaniswang opened this issue Dec 9, 2024 · 14 comments

Comments

@yaniswang

yaniswang commented Dec 9, 2024

Version

1.49.0

Steps to reproduce

Code: test.py

from playwright.sync_api import sync_playwright

ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36'

with sync_playwright() as playwright:
    browser = playwright.chromium.launch(headless=True)

    context = browser.new_context(user_agent=ua)
    page = context.new_page()
    page.goto('https://www.google.com/')
    page.screenshot(path='test.png')
    browser.close()

env: Raspberry Pi

  1. python -m venv test
  2. export PLAYWRIGHT_BROWSERS_PATH=0
  3. python -m playwright install
  4. python test.py -> works
  5. pyinstaller -F test.py
  6. ./dist/test

Expected behavior

no error

Actual behavior

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    page.goto('https://www.baidu.com/')
  File "playwright/sync_api/_generated.py", line 9006, in goto
  File "playwright/_impl/_sync_base.py", line 115, in _sync
  File "playwright/_impl/_page.py", line 551, in goto
  File "playwright/_impl/_frame.py", line 145, in goto
  File "playwright/_impl/_connection.py", line 61, in send
  File "playwright/_impl/_connection.py", line 528, in wrap_api_call
playwright._impl._errors.TargetClosedError: Page.goto: Target page, context or browser has been closed
Call log:
  - navigating to "https://www.google.com/", waiting until "load"

[PYI-32348:ERROR] Failed to execute script 'test' due to unhandled exception!

Additional context

If I execute the source code directly, it works, but after packaging with PyInstaller it fails.
The same code packaged under Windows also works.

Environment

- Operating System: [Debian 6.1.31]
- CPU: [arm64]
- Browser: [chromium_headless_shell-1148]
- Python Version: [3.9.2]
- Other info:
@yaniswang yaniswang changed the title from "[Bug]:" to "[Bug]: Page.goto: Target page, context or browser has been closed" Dec 9, 2024
@mxschmitt
Member

mxschmitt commented Dec 10, 2024

I can reproduce on Linux - it was working at some point - most likely a regression in Pyinstaller. Since Pyinstaller is not a priority for us, I'll add the p3-collecting-feedback label for now. I recommend filing it on the Pyinstaller side.

@mxschmitt mxschmitt changed the title from "[Bug]: Page.goto: Target page, context or browser has been closed" to "[Bug]: Pyinstaller ends up in Page.goto: Target page, context or browser has been closed" Dec 10, 2024
@yaniswang
Author

PyInstaller unpacks the files to /tmp/xxxxx, then runs /tmp/xxxx.../node /tmp/xxxx.../cli.js.
Is the node process closed by pyinstaller?
Or is the chromium process closed?

@mxschmitt
Member

Looks like not all files are getting packed, when running with DEBUG=pw:browser ./dist/test:

 pw:browser [pid=4400][err] [1210/015401.302369:ERROR:egl_util.cc(44)] Failed to load GLES library: /tmp/_MEIl7ax9R/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux/libGLESv2.so: /tmp/_MEIl7ax9R/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux/libGLESv2.so: cannot open shared object file: No such file or directory +4ms
  pw:browser [pid=4400][err] [1210/015401.304551:ERROR:viz_main_impl.cc(181)] Exiting GPU process due to errors during initialization +1ms
  pw:browser [pid=4400][err] [1210/015401.323427:ERROR:egl_util.cc(44)] Failed to load GLES library: /tmp/_MEIl7ax9R/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux/libGLESv2.so: /tmp/_MEIl7ax9R/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux/libGLESv2.so: cannot open shared object file: No such file or directory +20ms
  pw:browser [pid=4400][err] [1210/015401.326653:ERROR:viz_main_impl.cc(181)] Exiting GPU process due to errors during initialization +0ms
  pw:browser [pid=4400][err] [1210/015401.338589:ERROR:egl_util.cc(44)] Failed to load GLES library: /tmp/_MEIl7ax9R/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux/libGLESv2.so: /tmp/_MEIl7ax9R/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux/libGLESv2.so: cannot open shared object file: No such file or directory +11ms
  pw:browser [pid=4400][err] [1210/015401.342275:ERROR:viz_main_impl.cc(181)] Exiting GPU process due to errors during initialization +1ms
  pw:browser [pid=4400][err] [1210/015401.389535:ERROR:nss_util.cc(227)] Error initializing NSS with a persistent database (sql:/home/codespace/.pki/nssdb): libsoftokn3.so: cannot open shared object file: No such file or directory +37ms
  pw:browser [pid=4400][err] [1210/015401.389590:ERROR:nss_util.cc(112)] Error initializing NSS without a persistent database: NSS error code: -5925 +1ms
  pw:browser [pid=4400][err] [1210/015401.389605:FATAL:nss_util.cc(114)] nss_error=-5925, os_error=0 +0ms
  pw:browser [pid=4400][err] [1210/015401.505083:ERROR:ssl_client_socket_impl.cc(878)] handshake failed; returned -1, SSL error code 1, net_error -3 +116ms
  pw:browser [pid=4400] <process did exit: exitCode=null, signal=SIGTRAP> +5ms

Looking at

libGLESv2.so: cannot open shared object file: No such file or directory

This prevents Chromium from running
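
To pull the missing libraries out of a long DEBUG=pw:browser log, a small filter can help. This is only a sketch, not part of Playwright or PyInstaller: the name missing_shared_objects is made up for illustration, and it assumes the loader message format seen in the log above.

```python
import re

# Matches the dynamic loader message, e.g.
# "libsoftokn3.so: cannot open shared object file: No such file or directory".
# Assumption: the library name or path contains no whitespace.
_MISSING_SO = re.compile(r"(\S+?): cannot open shared object file")


def missing_shared_objects(log_text):
    """Return base names of libraries the loader failed to open, in order, without duplicates."""
    seen = []
    for match in _MISSING_SO.finditer(log_text):
        # Keep only the file name, since the temporary unpack directory changes per run.
        name = match.group(1).rsplit("/", 1)[-1]
        if name not in seen:
            seen.append(name)
    return seen
```

Feeding it the captured stderr of the frozen executable should give a short list of names to look for in the bundle.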

@yaniswang
Author

yaniswang commented Dec 10, 2024

I tried pyinstaller test.py, then copied the *.so files to ./dist/test/_internal/.

Then I get these errors:

  pw:browser <launched> pid=5492 +17ms
  pw:browser [pid=5492][err] [1210/101732.603381:WARNING:sandbox_linux.cc(430)] InitializeSandbox() called with multiple threads in process gpu-process. +308ms
  pw:browser [pid=5492][err] [1210/101732.608758:WARNING:viz_main_impl.cc(85)] VizNullHypothesis is disabled (not a warning) +1ms
  pw:browser [pid=5492][err] [1210/101732.704051:ERROR:command_buffer_proxy_impl.cc(131)] ContextResult::kTransientFailure: Failed to send GpuControl.CreateCommandBuffer. +81ms
  pw:browser [pid=5492][err] [1210/101732.845452:ERROR:nss_util.cc(227)] Error initializing NSS with a persistent database (sql:/home/pi/.pki/nssdb): libsoftokn3.so: cannot open shared object file: No such file or directory +141ms
  pw:browser [pid=5492][err] [1210/101732.845742:ERROR:nss_util.cc(112)] Error initializing NSS without a persistent database: NSS error code: -5925 +1ms
  pw:browser [pid=5492][err] [1210/101732.845797:FATAL:nss_util.cc(114)] nss_error=-5925, os_error=0 +0ms
  pw:browser [pid=5492][err] [1210/101732.863893:ERROR:ssl_client_socket_impl.cc(878)] handshake failed; returned -1, SSL error code 1, net_error -3 +26ms
  pw:browser [pid=5492] <process did exit: exitCode=null, signal=SIGTRAP> +35ms
  pw:browser [pid=5492] starting temporary directories cleanup +2ms
Traceback (most recent call last):
  File "test.py", line 8, in <module>
    page.goto('https://www.google.com/')
  File "playwright/sync_api/_generated.py", line 9006, in goto
  File "playwright/_impl/_sync_base.py", line 115, in _sync
  File "playwright/_impl/_page.py", line 551, in goto
  File "playwright/_impl/_frame.py", line 145, in goto
  File "playwright/_impl/_connection.py", line 61, in send
  File "playwright/_impl/_connection.py", line 528, in wrap_api_call
playwright._impl._errors.TargetClosedError: Page.goto: Target page, context or browser has been closed
Call log:
  - navigating to "https://www.google.com/", waiting until "load"

It reports /home/pi/.pki/nssdb as missing.

@mxschmitt
Member

Looks like Pyinstaller v5 was working and v6.0.0 broke it. cc @rokm. I guess we should file it upstream.

@rokm

rokm commented Dec 10, 2024

Looks like Pyinstaller v5 was working and v6.0.0 broke it. cc @rokm. I guess we should file it upstream.

Does this mean that if you replace PyInstaller 6.x with 5.13.2 in your tests above, it works? Or does it mean that you recall it working with v5 at some point in time (with whatever version of bundled Chromium was in use back then)?

FWIW, if I try to freeze the above example on my main Fedora 41 system with v6, it seems to work. But that's likely because I have a compatible libGLESv2.so installed on the system.

The collect_data_files helper you are using in your hook is not intended for blanket collection of bundled browsers. It specifically tries to avoid collecting shared libraries - but due to oversight, on linux, it excludes unversioned *.so files but not versioned ones (e.g., *.so.0). So to get libGLESv2.so collected, the hook should use

from PyInstaller.utils.hooks import collect_data_files, collect_dynamic_libs
datas = collect_data_files('playwright')
binaries = collect_dynamic_libs('playwright')

But as far as I can tell, this was the case in v5 as well.

One thing that has changed, though, is that in v6 collected files are automatically re-classified as data vs. binaries, and so even if you "smuggle" in binaries (executables and shared libs) via datas, they undergo binary dependency analysis. This means that we now collect additional dependencies, such as libnss3.so.

However, libnss3.so dynamically loads some of its (plugin-like) components, such as libsoftokn3.so. These are invisible to binary dependency analysis, and are not picked up. This is likely the cause of

pw:browser [pid=5492][err] [1210/101732.845452:ERROR:nss_util.cc(227)] Error initializing NSS with a persistent database (sql:/home/pi/.pki/nssdb): libsoftokn3.so: cannot open shared object file: No such file or directory +141ms

Try to either remove the bundled copy of libnss3.so, or manually ensure that libsoftokn3.so (as well as libfreebl3.so, libfreeblpriv3.so, libnssckbi.so, and libnssdbm3.so) is collected.
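
The second option could look roughly like the helper below. It is a sketch under assumptions, not the actual playwright hook: find_nss_plugins and CANDIDATE_DIRS are hypothetical names, and apart from /lib/x86_64-linux-gnu/nss the directories are guesses at common distribution layouts. It returns (source, destination) tuples in the shape hooks use for binaries.

```python
import os

# Plugin-like NSS components that libnss3.so loads at run time, so binary
# dependency analysis cannot see them.
NSS_PLUGINS = (
    "libsoftokn3.so",
    "libfreebl3.so",
    "libfreeblpriv3.so",
    "libnssckbi.so",
    "libnssdbm3.so",
)

# Assumption: typical Debian/Ubuntu and Fedora locations. Adjust for the
# build system actually in use.
CANDIDATE_DIRS = (
    "/lib/x86_64-linux-gnu/nss",
    "/usr/lib/x86_64-linux-gnu/nss",
    "/usr/lib/aarch64-linux-gnu/nss",
    "/usr/lib64",
)


def find_nss_plugins(search_dirs=CANDIDATE_DIRS, dest="."):
    """Return (source_path, dest_dir) tuples for each plugin found; first hit wins."""
    found = []
    for name in NSS_PLUGINS:
        for directory in search_dirs:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                found.append((candidate, dest))
                break
    return found


# In a hook file, the result could be appended to the existing list:
# binaries += find_nss_plugins()
```

The destination "." places the plugins in the top-level application directory, next to where the collected libnss3.so ends up.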

@mxschmitt
Member

Does this mean that if you replace PyInstaller 6.x with 5.13.2 in your tests above, it works?

Yes, I tried pip install pyinstaller==5.13.2 and with that it was working. The version of Playwright in both cases was v1.49.0.

One thing that has changed, though, is that in v6 collected files are automatically re-classified as data vs. binaries, and so even if you "smuggle" in binaries (executables and shared libs) via datas, they undergo binary dependency analysis.

Can we disable this analysis?

@rokm

rokm commented Dec 10, 2024

Does this mean that if you replace PyInstaller 6.x with 5.13.2 in your tests above, it works?

Yes, I tried pip install pyinstaller==5.13.2 and with that it was working. The version of Playwright in both cases was v1.49.0.

Hmm, in that case, can you check if libGLESv2.so is collected?

If I try to build with 5.13.2 (in onedir mode for easier debugging), I see libvulkan.so.1 collected in dist/<name>/playwright/driver/package/.local-browsers/chromium_headless_shell-1148/chrome-linux, while libEGL.so, libGLESv2.so, and libvk_swiftshader.so are not collected.
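
For comparing what a v5 build and a v6 build collect, listing the shared libraries in each onedir output is enough. A minimal sketch, with collected_shared_libs as a made-up helper name:

```python
from pathlib import Path


def collected_shared_libs(bundle_dir):
    """Return sorted bundle-relative paths of files named *.so or *.so.N under bundle_dir."""
    root = Path(bundle_dir)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.so*")
        # rglob("*.so*") also matches names like "x.sock", so filter again.
        if p.is_file() and (p.name.endswith(".so") or ".so." in p.name)
    )
```

Taking the set difference of the two lists (one per PyInstaller version) shows which libraries only one of the builds picked up.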

One thing that has changed, though, is that in v6 collected files are automatically re-classified as data vs. binaries, and so even if you "smuggle" in binaries (executables and shared libs) via datas, they undergo binary dependency analysis.

Can we disable this analysis?

Not really. And on linux, the line between which shared libraries should be collected from the build system and which not is always rather blurry and situation dependent.

@mxschmitt
Member

Looks like LD_LIBRARY_PATH is set to /workspaces/tmp/dist/test/_internal which breaks the Chromium launch. If I remove dist/test/_internal/libnss3.so it seems to launch correctly.

@yaniswang
Author

I tried pyinstaller==5.13.2 and it works.

@rokm

rokm commented Dec 11, 2024

Looks like LD_LIBRARY_PATH is set to /workspaces/tmp/dist/test/_internal which breaks the Chromium launch.

That's normal, though. We always set LD_LIBRARY_PATH to the top-level application directory - the difference between v5 and v6 is that now all contents are put into the _internal directory. So I don't think LD_LIBRARY_PATH per se is the problem; rather, it's the extra collected/discovered libraries, as you also found out.

As a side note: if you want to launch a process from the frozen application that should use system shared libraries instead of bundled ones, it is up to you to clear LD_LIBRARY_PATH (or restore it from LD_LIBRARY_PATH_ORIG, if available) - see https://pyinstaller.org/en/stable/runtime-information.html#ld-library-path-libpath-considerations.
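
A sketch of that side note, for processes you spawn yourself from the frozen application. The helper name system_library_env is made up; whether applying this to the browser launch resolves the issue here is untested.

```python
import os


def system_library_env(environ=None):
    """Return a copy of the environment with the library search path reset to the pre-launch state."""
    env = dict(os.environ if environ is None else environ)
    original = env.get("LD_LIBRARY_PATH_ORIG")
    if original is not None:
        # The value from before the frozen application started was saved here.
        env["LD_LIBRARY_PATH"] = original
    else:
        # Nothing was saved, so drop the bundle's override entirely.
        env.pop("LD_LIBRARY_PATH", None)
    return env
```

The result can be passed as env= to subprocess.Popen or subprocess.run, so the child resolves shared libraries from the system rather than from the bundle.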

If I remove dist/test/_internal/libnss3.so it seems to launch correctly.

Is dist/test/_internal/libnss3.so a hard-copy of system libnss3.so, or is it a symlink to playwright/driver/package/.local-browsers/firefox-1466/firefox/libnss3.so? Perhaps that's the real problem here - during the binary dependency analysis in v6, we find these extra shared libraries that are bundled with firefox, and symlink them to top-level application directory.

If it is a symlink, does it help if you add

bindepend_symlink_suppression = ['**/playwright/driver/package/.local-browsers/*/*/*.so']

to the hooks? (Unfortunately the implementation is using pathlib.Path.match() for matching, so recursive globs are not supported in the patterns). This should prevent the symlinks for firefox's libs from being created in the top-level application directory. Then libnss3.so might not be collected at all, or (more likely) it will be the system copy.
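
To see what the lack of recursive globs means for such a pattern, here is a small illustration of pathlib's match() on its own. The paths are hypothetical, and the sketch says nothing about how PyInstaller applies the pattern internally. As far as I know, match() compares components from the right and treats "**" like a single-component "*".

```python
from pathlib import PurePosixPath

PATTERN = "**/playwright/driver/package/.local-browsers/*/*/*.so"

# Hypothetical paths, for illustration only.
direct = "dist/playwright/driver/package/.local-browsers/firefox-1466/firefox/libnss3.so"
nested = "dist/playwright/driver/package/.local-browsers/firefox-1466/firefox/sub/libfoo.so"


def is_suppressed(path, pattern=PATTERN):
    """Return True if the path matches the pattern under Path.match() rules."""
    return PurePosixPath(path).match(pattern)


print(is_suppressed(direct))  # True: each "*" lines up with exactly one directory level
print(is_suppressed(nested))  # False: the extra "sub" level shifts the components
```

So a library one level deeper than the pattern anticipates would need its own pattern entry.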

@mxschmitt
Member

Is dist/test/_internal/libnss3.so a hard-copy of system libnss3.so, or is it a symlink to playwright/driver/package/.local-browsers/firefox-1466/firefox/libnss3.so?

These are all copies. I did not see any symlink.

we find these extra shared libraries that are bundled with firefox

So far this issue is about Chromium only - I did a brief attempt with Firefox and WebKit and for them it seems to work correctly.

Some more debugging notes:

Bad as well: LD_LIBRARY_PATH=/workspaces/tmp/dist/test/_internal python test.py

Makes it work: rm dist/test/_internal/libnss3.so

Looks like it tries to locate libsoftokn3.so next to it which is not there. If I do:

cp /lib/x86_64-linux-gnu/nss/libsoftokn3.so dist/test/_internal/

it works. If I compare it with v5, libnss3 was not collected in v5, so it was not an issue.

LD_LIBRARY_PATH=/workspaces/tmp/dist/test/_internal:/lib/x86_64-linux-gnu/nss/ DEBUG=pw:browser python test.py works as well.
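
The manual check above can be scripted. This is a sketch only: missing_nss_plugins is a made-up name, and the plugin list beyond libsoftokn3.so is taken from the earlier suggestion in this thread rather than verified against this build.

```python
import os

# libsoftokn3.so is the one observed missing here; the others are the
# companion plugins suggested earlier in the thread.
NSS_PLUGINS = (
    "libsoftokn3.so",
    "libfreebl3.so",
    "libfreeblpriv3.so",
    "libnssckbi.so",
    "libnssdbm3.so",
)


def missing_nss_plugins(bundle_dir):
    """If libnss3.so is bundled in bundle_dir, return the plugin names absent next to it."""
    if not os.path.isfile(os.path.join(bundle_dir, "libnss3.so")):
        # No bundled libnss3.so, so the system copy and its plugins are used.
        return []
    return [
        name for name in NSS_PLUGINS
        if not os.path.isfile(os.path.join(bundle_dir, name))
    ]
```

Run against dist/test/_internal, a non-empty result would indicate the situation described above: a bundled libnss3.so without its plugins beside it.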

@rokm

rokm commented Dec 12, 2024

Thanks for additional info @mxschmitt !

Is dist/test/_internal/libnss3.so a hard-copy of system libnss3.so, or is it a symlink to playwright/driver/package/.local-browsers/firefox-1466/firefox/libnss3.so?

These are all copies. I did not see any symlink.

I would like to be able to fully reproduce this locally, so I can look at it from different angles. So to clarify:

  • you were testing with all browser backends installed (or at least with both chromium_headless_shell and firefox)? (I find it a bit odd that libnss3.so copy from firefox was not picked up by binary dependency analysis as well...)
  • are you testing with latest PyInstaller v6 (v6.11.1) or v6.0.0 (the first version where behavior changed)?
  • what distribution are you testing on?

we find these extra shared libraries that are bundled with firefox

So far this issue is about Chromium only - I did a brief attempt with Firefox and WebKit and for them it seems to work correctly.

Some more debugging notes:

Bad as well: LD_LIBRARY_PATH=/workspaces/tmp/dist/test/_internal python test.py

Makes it work: rm dist/test/_internal/libnss3.so

Looks like it tries to locate libsoftokn3.so next to it which is not there. If I do:

cp /lib/x86_64-linux-gnu/nss/libsoftokn3.so dist/test/_internal/

it works. If I compare it with v5, libnss3 was not collected in v5, so it was not an issue.

LD_LIBRARY_PATH=/workspaces/tmp/dist/test/_internal:/lib/x86_64-linux-gnu/nss/ DEBUG=pw:browser python test.py works as well.

Yeah, this makes sense, and is likely an issue with all Chrome/Chromium based packages.

I.e., our QtWebEngine hook helper explicitly tries to collect extra nss plugins.

But I see now that this part should probably be moved into binary dependency analysis itself, because libnss3.so might end up being collected as a dependency of other packages' binaries. As seen here, playwright is one such example, but I imagine we would be running into the same problem with, for example, cefpython3 (were it not for the fact that it is unmaintained and does not support recent Python versions).

(Alternatively, we could also consider blocking libnss3.so from being collected via this exclude list, but this would apply globally...)

@mxschmitt
Member

you were testing with all browser backends installed (or at least with both chromium_headless_shell and firefox)?

Initially I used only Chromium, so I was doing python -m playwright install chromium. If you leave chromium out, it will install all 3 browser engines we support.

are you testing with latest PyInstaller v6 (v6.11.1) or v6.0.0 (the first version where behavior changed)?

Actually I tested on 6.0.0; on 6.11.1 it seems to happen as well.

We have more instructions here on how to repro it.

what distribution are you testing on?

Ubuntu 20.04.6.

I.e., our QtWebEngine hook helper explicitly tries to collect extra nss plugins.

Interesting hack!

(Alternatively, we could also consider blocking libnss3.so from being collected via this exclude list, but this would apply globally...)

That's good from my side - but I'm not sure how other projects would benefit from it, or whether it would hurt someone.

3 participants