chore: clean up runtime code for python setup #26042

Merged
merged 6 commits into from
Feb 26, 2025

@jdstrand jdstrand commented Feb 19, 2025

ccd5d22 introduced working but temporary code for setting up the python runtime environment. This cleans that up:

  • chore: clean up runtime code for python setup

    ccd5d22 introduced working but temporary code for setting up the
    python runtime environment. This cleans that up:

    • refactor various find_python() functionality into virtualenv.rs
    • refactor PYTHONHOME calculation to virtualenv.rs:find_python_install()
    • adjust init_pyo3() to temporarily set PYTHONHOME based on
      virtualenv.rs:find_python_install() as this is the only place it is
      needed (indeed, venv activation scripts try to remove it)

    Importantly, virtualenv.rs:find_python_install() tries to find the
    python build standalone runtime based on a few heuristics. This function
    could be improved in the fullness of time to, eg, be configured via a
    build parameter.

    Also, virtualenv.rs:find_python() can be used pre and post venv
    activation. Before entering the venv, it will use find_python_install()
    which is useful for things like setting up the initial venv. After
    entering the venv, it will honor VIRTUAL_ENV (as set by
    virtualenv.rs:initialize_venv()) to find python, which is important for
    installing packages with pip and having them installed into the venv.

  • chore: update README_processing_engine.md for default venv

  • chore: add bug reference for venv migrations with python minor releases

  • chore: add security URLs to README_processing_engine.md

  • chore: find_python_home() returns Option. Thanks Jackson Newhouse

  • fix: manually set sys.prefix, exec_prefix and sys.path

    When activating a venv in the shell, sys.base_prefix and
    sys.base_exec_prefix should be set to the installation location while
    sys.prefix and sys.exec_prefix should be set to the venv dir.
    Unfortunately, when initialize_venv() and init_pyo3() are called, we
    can't use Py_InitializeFromConfig() to set any of these and certain
    platforms are unable to find python-build-standalone. For now, we'll
    temporarily set PYTHONHOME in init_pyo3() to the installation location
    to make python work. By setting it at this point in the code, sys.prefix
    and sys.exec_prefix end up also being set to the installation location,
    which is fine when not under a venv, but is different from when entering
    a venv.

    To address this, in PYTHON_INIT.call_once() and when VIRTUAL_ENV is set
    (which initialize_venv() will have set at this point), manually set
    sys.prefix and sys.exec_prefix to what is in VIRTUAL_ENV.

    Similarly, when activating a venv in the shell, sys.path is appended to
    have the venv's site-packages dir. Previously we were setting PYTHONPATH
    in initialize_venv() which ensures that the venv's site-packages dir is
    in sys.path, but this ends up having the venv's site-packages dir first
    in sys.path. To correct this, don't set PYTHONPATH any more and instead
    adjust PYTHON_INIT.call_once() to append the venv's site-packages dir to
    sys.path when VIRTUAL_ENV is set.

    Finally, when exiting init_pyo3(), unconditionally unset PYTHONHOME when
    VIRTUAL_ENV is set (like activation scripts do) and restore/unset when
    it isn't.

    Prior to these changes, all targets incorrectly had the venv's
    site-packages first in sys.path and OSX and Windows additionally had an
    incorrect sys.prefix and sys.exec_prefix. With these initialization
    changes in place, the runtime environment for the plugins is much closer
    to that of a shell activated venv.
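
As a rough illustration of the sys.prefix/sys.path handling described in the last commit above, here is a minimal Python sketch. The PR does this work from Rust inside PYTHON_INIT.call_once(); the helper names `venv_site_packages()` and `enter_venv()` and the `sys_module` parameter are hypothetical and exist only to make the logic easy to exercise in isolation.

```python
import os
import sys


def venv_site_packages(venv_dir, windows=(os.name == "nt"), version=sys.version_info):
    """Return the venv's site-packages dir (layout as seen in the outputs in this PR)."""
    if windows:
        return os.path.join(venv_dir, "Lib", "site-packages")
    return os.path.join(
        venv_dir, "lib", f"python{version[0]}.{version[1]}", "site-packages"
    )


def enter_venv(environ=os.environ, sys_module=sys):
    """Mimic what a shell-activated venv looks like from inside the interpreter.

    Hypothetical helper: only acts when VIRTUAL_ENV is set.
    """
    venv_dir = environ.get("VIRTUAL_ENV")
    if not venv_dir:
        return False  # not under a venv: leave everything as the installation set it

    # prefix/exec_prefix point at the venv; base_prefix/base_exec_prefix are untouched
    sys_module.prefix = venv_dir
    sys_module.exec_prefix = venv_dir

    # append (not prepend) so the standard library stays ahead of site-packages
    site_packages = venv_site_packages(venv_dir)
    if site_packages not in sys_module.path:
        sys_module.path.append(site_packages)
    return True
```

Note that this sketch only appends the directory; it does not process `.pth` files the way `site.addsitedir()` would.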

Importantly, virtualenv.rs:find_python_install() tries to find the python build standalone runtime based on a few heuristics. This function could be improved in the fullness of time to, eg, be configured via a build parameter.

Also, virtualenv.rs:find_python() can be used pre and post venv activation. Before entering the venv, it will use find_python_install() which is useful for things like setting up the initial venv. After entering the venv, it will honor VIRTUAL_ENV (as set by virtualenv.rs:initialize_venv()) to find python, which is important for installing packages with pip and having them installed into the venv.
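
The pre/post activation behavior of find_python() can be sketched in Python as follows. This is not the Rust code from virtualenv.rs; the function signature, the `python_install_dir` argument (standing in for whatever find_python_install() returns) and the `windows` flag are assumptions made for illustration. The interpreter locations mirror the paths shown in the testing output below.

```python
from pathlib import PurePosixPath, PureWindowsPath


def find_python(python_install_dir, environ, windows=False):
    """Pick the interpreter to run: the venv's once VIRTUAL_ENV is set,
    otherwise the one in the standalone installation (hypothetical sketch)."""
    path_cls = PureWindowsPath if windows else PurePosixPath

    venv_dir = environ.get("VIRTUAL_ENV")
    if venv_dir:
        # post-activation: pip run through this interpreter installs into the venv
        root = path_cls(venv_dir)
        return (root / "Scripts" / "python.exe") if windows else (root / "bin" / "python3")

    # pre-activation: use the installation, e.g. to create the initial venv
    root = path_cls(python_install_dir)
    return (root / "python.exe") if windows else (root / "bin" / "python3")
```

Using the pure path classes keeps the sketch independent of the host OS, so the Windows layout can be checked from Linux and vice versa.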

Testing

This has been tested to work correctly for all of:

  • Darwin arm64
  • Linux amd64
  • Docker amd64
  • Windows amd64

The paths are correctly set up when .venv doesn't exist, when it does and when --virtual-env-location is set (to an existing or non-existing dir).
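
The cases tested above can be pictured with a small Python sketch. It assumes, based on the outputs later in this description, that the default venv lives at `.venv` under the plugin dir and that --virtual-env-location overrides it; `resolve_venv_dir()` and `needs_creation()` are hypothetical names, not functions from the codebase.

```python
from pathlib import Path


def resolve_venv_dir(plugin_dir, virtual_env_location=None):
    """Return the venv dir to use (hypothetical helper).

    An explicit location wins; otherwise fall back to <plugin_dir>/.venv.
    """
    if virtual_env_location:
        return Path(virtual_env_location)
    return Path(plugin_dir) / ".venv"


def needs_creation(venv_dir):
    """True when the venv dir does not exist yet and must be set up first."""
    return not Path(venv_dir).is_dir()
```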

Furthermore, I tested before and after commit 1f8001d0 ("fix: manually set sys.prefix, exec_prefix and sys.path").

Shell activated venv

It is useful to compare what a shell activated venv looks like to know how we compare:

#
# Create an alternative venv with python-build-standalone to demonstrate how things are set up
#

# linux
$ ./tmp/influxdb3-pe/install/python/bin/python3 -m venv ~/tmp/influxdb3-pe/venv-alt
$ source ~/tmp/influxdb3-pe/venv-alt/bin/activate
(venv-alt)$ python3 -c "import sys; [print('%s = %s' % (a, getattr(sys, a))) for a in ['base_exec_prefix', 'base_prefix', 'exec_prefix', 'executable', 'path', 'prefix'] if not a.startswith('_')]"
base_exec_prefix = /home/jamie/tmp/influxdb3-pe/install/python
base_prefix = /home/jamie/tmp/influxdb3-pe/install/python
exec_prefix = /home/jamie/tmp/influxdb3-pe/venv-alt
executable = /home/jamie/tmp/influxdb3-pe/venv-alt/bin/python3
path = ['', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/home/jamie/tmp/influxdb3-pe/venv-alt/lib/python3.11/site-packages']
prefix = /home/jamie/tmp/influxdb3-pe/venv-alt

# osx
$ ./tmp/influxdb3-pe/install/python/bin/python3 -m venv ~/tmp/influxdb3-pe/venv-alt
$ source ~/tmp/influxdb3-pe/venv-alt/bin/activate
(venv-alt)$ python3 -c "import sys; [print('%s = %s' % (a, getattr(sys, a))) for a in ['base_exec_prefix', 'base_prefix', 'exec_prefix', 'executable', 'path', 'prefix'] if not a.startswith('_')]"
base_exec_prefix = /Users/jamie/tmp/influxdb3-pe/install/python
base_prefix = /Users/jamie/tmp/influxdb3-pe/install/python
exec_prefix = /Users/jamie/tmp/influxdb3-pe/venv-alt
executable = /Users/jamie/tmp/influxdb3-pe/venv-alt/bin/python3
path = ['', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/Users/jamie/tmp/influxdb3-pe/venv-alt/lib/python3.11/site-packages']
prefix = /Users/jamie/tmp/influxdb3-pe/venv-alt

# windows
$ influxdb3-pe\install\python\python.exe -m venv Z:\influxdb3-pe\venv-alt
$ Z:\influxdb3-pe\venv-alt\Scripts\activate
(venv-alt)$ python -c "import sys; [print('%s = %s' % (a, getattr(sys, a))) for a in ['base_exec_prefix', 'base_prefix', 'exec_prefix', 'executable', 'path', 'prefix'] if not a.startswith('_')]"
base_exec_prefix = Z:\influxdb3-pe\install\python
base_prefix = Z:\influxdb3-pe\install\python
exec_prefix = Z:\influxdb3-pe\venv-alt
executable = Z:\influxdb3-pe\venv-alt\Scripts\python.exe
path = ['', 'Z:\\influxdb3-pe\\install\\python\\python311.zip', 'Z:\\influxdb3-pe\\install\\python\\DLLs', 'Z:\\influxdb3-pe\\install\\python\\Lib', 'Z:\\influxdb3-pe\\install\\python', 'Z:\\influxdb3-pe\\venv-alt', 'Z:\\influxdb3-pe\\venv-alt\\Lib\\site-packages']
prefix = Z:\influxdb3-pe\venv-alt

On all OSes:

  • base_prefix and base_exec_prefix point to the python-build-standalone runtime dir
  • prefix and exec_prefix point to the venv dir
  • path has the venv's site-packages at the end of the list
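
These three properties can be checked mechanically. The sketch below is an illustration only: `snapshot()` is a more readable equivalent of the one-liner used above, and `matches_shell_venv()` is a hypothetical helper that encodes the three bullet points.

```python
import sys

ATTRS = ("base_exec_prefix", "base_prefix", "exec_prefix", "executable", "path", "prefix")


def snapshot(sys_module=sys):
    """Collect the same attributes the one-liner above prints."""
    return {name: getattr(sys_module, name) for name in ATTRS}


def matches_shell_venv(info, install_dir, venv_dir):
    """Check the three properties listed above (hypothetical helper)."""
    last = info["path"][-1] if info["path"] else ""
    return (
        # base_* point to the python-build-standalone runtime dir
        info["base_prefix"] == install_dir
        and info["base_exec_prefix"] == install_dir
        # prefix/exec_prefix point to the venv dir
        and info["prefix"] == venv_dir
        and info["exec_prefix"] == venv_dir
        # the venv's site-packages is the last sys.path entry
        and last.startswith(venv_dir)
        and last.endswith("site-packages")
    )


if __name__ == "__main__":
    for name, value in snapshot().items():
        print(f"{name} = {value}")
```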

Current main

All targets for current main incorrectly have the venv's site-packages first in sys.path, and OSX and Windows additionally have an incorrect sys.prefix and sys.exec_prefix. This doesn't seem to affect the operations of plugins.

Linux

#
# linux
#
$ ~/tmp/influxdb3-pe/install/influxdb3 serve --node-id=local01 --object-store=file --data-dir ~/tmp/influxdb3-pe/data --plugin-dir ~/tmp/influxdb3-pe/data/plugins

# WRONG: client - show paths (prefix/exec_prefix are venv,
# base_prefix/base_exec_prefix are python install location BUT SYS.PATH HAS
# VENV's SITE-PACKAGES FIRST)
$ ~/tmp/influxdb3-pe/install/influxdb3 test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:49:17Z",
  "log_lines": [
    "INFO: sys.prefix = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.exec_prefix = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.base_prefix = /home/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_exec_prefix = /home/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.executable = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv/bin/python3",
    "INFO: sys.path = ['/home/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload']",
    "INFO: 'requests' location = not found"
  ],
  "database_writes": {},
  "errors": []
}
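
The source of simple.py is not included in this PR. As a guess at how the log lines above could be produced, here is a minimal Python sketch; `module_location()` and `runtime_report()` are hypothetical names, and how the resulting lines are handed to the plugin's logger is deliberately left out.

```python
import importlib.util
import sys

REPORTED = ("prefix", "exec_prefix", "base_prefix", "base_exec_prefix", "executable", "path")


def module_location(name):
    """Return where a top-level module would be imported from, or 'not found'."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return "not found"
    return spec.origin


def runtime_report(module="requests"):
    """Build one line per sys attribute plus the location of `module`."""
    lines = [f"sys.{attr} = {getattr(sys, attr)}" for attr in REPORTED]
    lines.append(f"'{module}' location = {module_location(module)}")
    return lines


if __name__ == "__main__":
    for line in runtime_report():
        print(line)
```

Looking up the module with `find_spec()` avoids actually importing it, so the report works the same whether or not the package is installed.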

OSX

#
# osx
#
# serve
$ ~/tmp/influxdb3-pe/install/influxdb3 serve --node-id my_host --object-store file --data-dir ./tmp/influxdb3-pe/data --plugin-dir ./tmp/influxdb3-pe/data/plugins

# WRONG client - show paths (PREFIX/EXEC_PREFIX ARE PYTHON INSTALL LOCATION,
# base_prefix/base_exec_prefix are python install location BUT SYS.PATH HAS
# VENV's SITE-PACKAGES FIRST)
$ ~/tmp/influxdb3-pe/install/influxdb3 test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:54:38Z",
  "log_lines": [
    "INFO: sys.prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.exec_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_exec_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.executable = /Users/jamie/tmp/influxdb3-pe/install/influxdb3",
    "INFO: sys.path = ['/Users/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/site-packages']",
    "INFO: 'requests' location = not found"
  ],
  "database_writes": {},
  "errors": []
}

Windows

#
# windows
#
# serve
> influxdb3-pe\install\influxdb3.exe serve --node-id=local01 --object-store=file --data-dir Z:\influxdb3-pe\data --plugin-dir Z:\influxdb3-pe\data\plugins

# WRONG - client - show paths (PREFIX/EXEC_PREFIX ARE PYTHON INSTALL LOCATION,
# base_prefix/base_exec_prefix are python install location BUT SYS.PATH HAS
# VENV's SITE-PACKAGES FIRST)
> influxdb3-pe\install\influxdb3.exe test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:57:53Z",
  "log_lines": [
    "INFO: sys.prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.exec_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.base_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.base_exec_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.executable = Z:\\influxdb3-pe\\install\\influxdb3.exe",
    "INFO: sys.path = ['Z:\\\\influxdb3-pe\\\\data\\\\plugins\\\\.venv\\\\Lib\\\\site-packages', 'Z:\\\\influxdb3-pe\\\\install\\\\python311.zip', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\DLLs', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\Lib', 'Z:\\\\influxdb3-pe\\\\install', 'Z:\\\\influxdb3-pe\\\\install\\\\python', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\Lib\\\\site-packages']",
    "INFO: 'requests' location = not found"
  ],
  "database_writes": {},
  "errors": []
}

With this PR

While the current main behavior doesn't seem to be causing problems, it seems like there could be potential for them. With this PR, the runtime environment for the plugins is much closer to that of a shell activated venv.

Linux

#
# linux
#
# server
$ ~/tmp/influxdb3-pe/install/influxdb3 serve --node-id=local01 --object-store=file --data-dir ~/tmp/influxdb3-pe/data --plugin-dir ~/tmp/influxdb3-pe/data/plugins
...

# client - show paths (prefix/exec_prefix are venv,
# base_prefix/base_exec_prefix are python install location, sys.path has
# .venv's site-packages appended)
$ ~/tmp/influxdb3-pe/install/influxdb3 test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:25:11Z",
  "log_lines": [
    "INFO: sys.prefix = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.exec_prefix = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.base_prefix = /home/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_exec_prefix = /home/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.executable = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv/bin/python3",
    "INFO: sys.path = ['/home/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/home/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages']",
    "INFO: 'requests' location = not found"
  ],
  "database_writes": {},
  "errors": []
}

# client - install requests
$ ~/tmp/influxdb3-pe/install/influxdb3 install package requests

# client - show paths (as above, but requests was found in venv)
$ ~/tmp/influxdb3-pe/install/influxdb3 test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:27:57Z",
  "log_lines": [
    "INFO: sys.prefix = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.exec_prefix = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.base_prefix = /home/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_exec_prefix = /home/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.executable = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv/bin/python3",
    "INFO: sys.path = ['/home/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/home/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/home/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages']",
    "INFO: 'requests' location = /home/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages/requests/__init__.py"
  ],
  "database_writes": {},
  "errors": []
}

OSX

# osx
# server
$ ~/tmp/influxdb3-pe/install/influxdb3 serve --node-id my_host --object-store file --data-dir ./tmp/influxdb3-pe/data --plugin-dir ./tmp/influxdb3-pe/data/plugins

# client - show paths (prefix/exec_prefix are venv,
# base_prefix/base_exec_prefix are python install location, sys.path has
# .venv's site-packages appended)
$ ~/tmp/influxdb3-pe/install/influxdb3 test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:33:07Z",
  "log_lines": [
    "INFO: sys.prefix = /Users/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.exec_prefix = /Users/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.base_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_exec_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.executable = /Users/jamie/tmp/influxdb3-pe/install/influxdb3",
    "INFO: sys.path = ['/Users/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/site-packages', '/Users/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages']",
    "INFO: 'requests' location = not found"
  ],
  "database_writes": {},
  "errors": []
}

# client - install requests
$ ~/tmp/influxdb3-pe/install/influxdb3 install package requests

# client - show paths (as above, but requests was found in venv)
$ ~/tmp/influxdb3-pe/install/influxdb3 test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T20:34:54Z",
  "log_lines": [
    "INFO: sys.prefix = /Users/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.exec_prefix = /Users/jamie/tmp/influxdb3-pe/data/plugins/.venv",
    "INFO: sys.base_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.base_exec_prefix = /Users/jamie/tmp/influxdb3-pe/install/python",
    "INFO: sys.executable = /Users/jamie/tmp/influxdb3-pe/install/influxdb3",
    "INFO: sys.path = ['/Users/jamie/tmp/influxdb3-pe/install/python/lib/python311.zip', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/lib-dynload', '/Users/jamie/tmp/influxdb3-pe/install/python/lib/python3.11/site-packages', '/Users/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages']",
    "INFO: 'requests' location = /Users/jamie/tmp/influxdb3-pe/data/plugins/.venv/lib/python3.11/site-packages/requests/__init__.py"
  ],
  "database_writes": {},
  "errors": []
}

Windows

# windows
# server
> influxdb3-pe\install\influxdb3.exe serve --node-id=local01 --object-store=file --data-dir Z:\influxdb3-pe\data --plugin-dir Z:\influxdb3-pe\data\plugins
...

# client - show paths (prefix/exec_prefix are venv,
# base_prefix/base_exec_prefix are python install location, sys.path has
# .venv's site-packages appended)
> influxdb3-pe\install\influxdb3.exe test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T17:34:16Z",
  "log_lines": [
    "INFO: sys.prefix = Z:\\influxdb3-pe\\data\\plugins\\.venv",
    "INFO: sys.exec_prefix = Z:\\influxdb3-pe\\data\\plugins\\.venv",
    "INFO: sys.base_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.base_exec_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.executable = Z:\\influxdb3-pe\\install\\influxdb3.exe",
    "INFO: sys.path = ['Z:\\\\influxdb3-pe\\\\install\\\\python311.zip', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\DLLs', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\Lib', 'Z:\\\\influxdb3-pe\\\\install', 'Z:\\\\influxdb3-pe\\\\install\\\\python', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\Lib\\\\site-packages', 'Z:\\\\influxdb3-pe\\\\data\\\\plugins\\\\.venv\\\\Lib\\\\site-packages']",
    "INFO: 'requests' location = not found"
  ],
  "database_writes": {},
  "errors": []
}

# client - install requests
> influxdb3-pe\install\influxdb3.exe install package requests

# client - show paths (as above, but requests was found in venv)
> influxdb3-pe\install\influxdb3.exe test schedule_plugin -d mydb01 simple.py
{
  "trigger_time": "2025-02-25T17:35:54Z",
  "log_lines": [
    "INFO: sys.prefix = Z:\\influxdb3-pe\\data\\plugins\\.venv",
    "INFO: sys.exec_prefix = Z:\\influxdb3-pe\\data\\plugins\\.venv",
    "INFO: sys.base_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.base_exec_prefix = Z:\\influxdb3-pe\\install\\python",
    "INFO: sys.executable = Z:\\influxdb3-pe\\install\\influxdb3.exe",
    "INFO: sys.path = ['Z:\\\\influxdb3-pe\\\\install\\\\python311.zip', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\DLLs', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\Lib', 'Z:\\\\influxdb3-pe\\\\install', 'Z:\\\\influxdb3-pe\\\\install\\\\python', 'Z:\\\\influxdb3-pe\\\\install\\\\python\\\\Lib\\\\site-packages', 'Z:\\\\influxdb3-pe\\\\data\\\\plugins\\\\.venv\\\\Lib\\\\site-packages']",
    "INFO: 'requests' location = Z:\\influxdb3-pe\\data\\plugins\\.venv\\Lib\\site-packages\\requests\\__init__.py"
  ],
  "database_writes": {},
  "errors": []
}

Closes #26012

@jdstrand

Testing: this has been lightly tested (linux/amd64 and windows/amd64). If @jacksonrnewhouse is happy with the approach, I'll take in feedback and do a full round of testing.

@jacksonrnewhouse - rebased on main and retested on linux/amd64. This is ready to review for the approach. If you're generally happy with it, I'll do full testing.

@jacksonrnewhouse jacksonrnewhouse left a comment

Looks good!

jdstrand commented Feb 21, 2025

@jacksonrnewhouse - thanks for the review! In doing further testing and looking at #26050, I need followup commits (which may end up fixing #26050 in the process).

UPDATE: I'm going to mark this as DRAFT just to be sure that it doesn't get committed before I have those ready.

@jdstrand jdstrand marked this pull request as draft February 21, 2025 19:02
Since you approved, I'll take that to mean that you like the approach. When I have the follow-up commits ready, I'll do full testing and list it in this issue, then undraft.

ccd5d22 introduced working but temporary code for setting up the
python runtime environment. This cleans that up:

* refactor various find_python() functionality into virtualenv.rs
* refactor PYTHONHOME calculation to virtualenv.rs:find_python_install()
* adjust init_pyo3() to temporarily set PYTHONHOME based on
  virtualenv.rs:find_python_install() as this is the only place it is
  needed (indeed, venv activation scripts try to remove it)

Importantly, virtualenv.rs:find_python_install() tries to find the
python build standalone runtime based on a few heuristics. This function
could be improved in the fullness of time to, eg, be configured via a
build parameter.

Also, virtualenv.rs:find_python() can be used pre and post venv
activation. Before entering the venv, it will use find_python_install()
which is useful for things like setting up the initial venv. After
entering the venv, it will honor VIRTUAL_ENV (as set by
virtualenv.rs:initialize_venv()) to find python, which is important for
installing packages with pip and having them installed into the venv.
@jdstrand jdstrand force-pushed the jdstrand/pe-standalone-cleanup branch 3 times, most recently from f86bf2f to f57def5 Compare February 25, 2025 20:12
@jdstrand jdstrand force-pushed the jdstrand/pe-standalone-cleanup branch 2 times, most recently from c57fe65 to f474906 Compare February 25, 2025 21:05
@jdstrand jdstrand marked this pull request as ready for review February 25, 2025 21:53
@jacksonrnewhouse - ok, when doing full testing I noticed a couple of things with sys.prefix and sys.path that I thought I would fix in this PR (ie, address some small issues in the previous setup).

When activating a venv in the shell, sys.base_prefix and
sys.base_exec_prefix should be set to the installation location while
sys.prefix and sys.exec_prefix should be set to the venv dir.
Unfortunately, when initialize_venv() and init_pyo3() are called, we
can't use Py_InitializeFromConfig() to set any of these and certain
platforms are unable to find python-build-standalone. For now, we'll
temporarily set PYTHONHOME in init_pyo3() to the installation location
to make python work. By setting it at this point in the code, sys.prefix
and sys.exec_prefix end up also being set to the installation location,
which is fine when not under a venv, but is different from when entering
a venv.

To address this, in PYTHON_INIT.call_once() and when VIRTUAL_ENV is set
(which initialize_venv() will have set at this point), manually set
sys.prefix and sys.exec_prefix to what is in VIRTUAL_ENV.

Similarly, when activating a venv in the shell, sys.path is appended to
have the venv's site-packages dir. Previously we were setting PYTHONPATH
in initialize_venv() which ensures that the venv's site-packages dir is
in sys.path, but this ends up having the venv's site-packages dir first
in sys.path. To correct this, don't set PYTHONPATH any more and instead
adjust PYTHON_INIT.call_once() to append the venv's site-packages dir to
sys.path when VIRTUAL_ENV is set.

Finally, when exiting init_pyo3(), unconditionally unset PYTHONHOME when
VIRTUAL_ENV is set (like activation scripts do) and restore/unset when
it isn't.
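
The set/restore behavior for PYTHONHOME described in this commit message can be sketched as a Python context manager. The real logic lives in the Rust init_pyo3(); `temporary_pythonhome()` and its `environ` parameter are hypothetical and only illustrate the exit rules stated above.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_pythonhome(python_home, environ=os.environ):
    """Set PYTHONHOME for the duration of interpreter init (hypothetical sketch)."""
    previous = environ.get("PYTHONHOME")
    environ["PYTHONHOME"] = python_home
    try:
        yield
    finally:
        if "VIRTUAL_ENV" in environ or previous is None:
            # under a venv: always unset, like activation scripts do;
            # otherwise unset because there was nothing to restore
            environ.pop("PYTHONHOME", None)
        else:
            # not under a venv: put back whatever the caller had
            environ["PYTHONHOME"] = previous
```

Checking VIRTUAL_ENV on exit rather than on entry matters in this sketch, since the venv may be entered while PYTHONHOME is temporarily set.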

Prior to these changes, all targets incorrectly had the venv's
site-packages first in sys.path and OSX and Windows additionally had an
incorrect sys.prefix and sys.exec_prefix. With these initialization
changes in place, the runtime environment for the plugins is much closer
to that of a shell activated venv.
@jdstrand jdstrand force-pushed the jdstrand/pe-standalone-cleanup branch from f474906 to 1f8001d Compare February 25, 2025 22:01
@jdstrand jdstrand merged commit aa09d76 into main Feb 26, 2025
13 checks passed
Development

Successfully merging this pull request may close these issues.

Clean up runtime code for finding/using standalone python [fixed: and entering a venv]