DuckDB has error in PyO3 module when combined with pyarrow on Windows #4856

Open
mracsko opened this issue Jan 14, 2025 · 0 comments

Bug Description

The following example reproduces an issue with PyO3, DuckDB, and pyarrow on Windows. The issue is related to the import order of a custom PyO3 module with bundled duckdb (pyo3_duckdb_pyarrow) and the pyarrow module.

The problem only occurs on Windows; the same code works on Linux.

Full project to reproduce can be found here: https://github.com/mracsko/pyo3_duckdb_pyarrow

I am not sure whether the issue is strictly PyO3-related, or whether DuckDB, pyarrow, Python, Rust, or some combination of them is causing it.

Steps to Reproduce

Build the provided Windows Container with Docker:

docker build -t reproduce-issue .

It is important to use Windows Containers.

Backtrace

thread '<unnamed>' panicked at C:\Users\ContainerAdministrator\.cargo\registry\src\index.crates.io-6f17d22bba15001f\duckdb-1.1.1\src\config.rs:127:13:
assertion `left == right` failed
  left: 1
 right: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "C:\app\test.py", line 4, in <module>
    pyo3_duckdb_pyarrow.run()
pyo3_runtime.PanicException: assertion `left == right` failed
  left: 1
 right: 0
The command 'cmd /S /C .venv\Scripts\activate.bat && python test.py' returned a non-zero code: 1

Your operating system and version

Windows Container (mcr.microsoft.com/windows/servercore:ltsc2022)

Your Python version (python --version)

3.12.8

Your Rust version (rustc --version)

1.83.0

Your PyO3 version

0.23.4

How did you install python? Did you use a virtualenv?

Used venv, see installation details in the Dockerfile

Additional Info

Detailed Issue description

The issue is related to the import order of the pyo3_duckdb_pyarrow module and pyarrow. If pyarrow is imported before pyo3_duckdb_pyarrow, the following error occurs when calling the run method of the pyo3_duckdb_pyarrow module:

thread '<unnamed>' panicked at C:\Users\ContainerAdministrator\.cargo\registry\src\index.crates.io-6f17d22bba15001f\duckdb-1.1.1\src\config.rs:127:13:
assertion `left == right` failed
  left: 1
 right: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "C:\app\test.py", line 4, in <module>
    pyo3_duckdb_pyarrow.run()
pyo3_runtime.PanicException: assertion `left == right` failed
  left: 1
 right: 0
The command 'cmd /S /C .venv\Scripts\activate.bat && python test.py' returned a non-zero code: 1

The panic location (duckdb-1.1.1, src\config.rs:127) points to the following code in the duckdb crate:

fn set(&mut self, key: &str, value: &str) -> Result<()> {
    if self.config.is_none() {
        let mut config: ffi::duckdb_config = ptr::null_mut();
        let state = unsafe { ffi::duckdb_create_config(&mut config) };
        assert_eq!(state, ffi::DuckDBSuccess);
        self.config = Some(config);
    }
    ...
}

The assert_eq!(state, ffi::DuckDBSuccess); line fails because let state = unsafe { ffi::duckdb_create_config(&mut config) }; returns ffi::DuckDBError.

According to the documentation of duckdb_create_config (see here), this call can only fail due to a malloc failure: ... This will always succeed unless there is a malloc failure. ...

Works

If the module is loaded before pyarrow it works:

import pyo3_duckdb_pyarrow
import pyarrow

pyo3_duckdb_pyarrow.run()

Fails

If the module is loaded after pyarrow it fails:

import pyarrow
import pyo3_duckdb_pyarrow

pyo3_duckdb_pyarrow.run()
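
To compare the two import orders side by side, each one can be run in a fresh interpreter so that neither case affects the other. The following is only a minimal sketch: the helper name run_with_import_order and its parameters are hypothetical and not part of the linked project, and RUST_BACKTRACE=1 is set only because the panic message suggests it.

```python
import os
import subprocess
import sys


def run_with_import_order(modules, call=None, extra_env=None):
    """Import `modules` in the given order in a fresh interpreter.

    Optionally executes the statement `call` afterwards.
    Returns a tuple of (exit code, captured stderr).
    """
    lines = [f"import {name}" for name in modules]
    if call:
        lines.append(call)
    # Start from the current environment and overlay any extra variables.
    env = {**os.environ, **(extra_env or {})}
    result = subprocess.run(
        [sys.executable, "-c", "\n".join(lines)],
        capture_output=True,
        text=True,
        env=env,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    orders = [
        ["pyo3_duckdb_pyarrow", "pyarrow"],  # reported above as working
        ["pyarrow", "pyo3_duckdb_pyarrow"],  # reported above as failing on Windows
    ]
    for order in orders:
        code, err = run_with_import_order(
            order,
            call="pyo3_duckdb_pyarrow.run()",
            extra_env={"RUST_BACKTRACE": "1"},
        )
        print(order, "exit code:", code)
        if code != 0:
            print(err)
```

Running the script inside the activated venv should print one exit code per import order. A non-zero code for the second order only would match the behaviour described in this report.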

Versions

The container installs the following versions:

  • Install Python 3.12.8
  • Latest VS Build Tools (https://aka.ms/vs/17/release/vs_BuildTools.exe):
    • Microsoft.VisualStudio.Component.Windows10SDK.18362
  • Latest Rust with Rustup
  • Python packages:
    • pyarrow==18.1.0
    • maturin==1.8.1
  • Rust dependencies:
    • pyo3: 0.23.4
    • duckdb: 1.1.1

Additional notes

  • The issue cannot be reproduced on my home computer, but it can be reproduced on my work computer and in the provided container.
  • Original issue found with PyO3 0.22.6, but could be reproduced with 0.23.4.
  • Reproduced with Python 3.10.11, 3.12.8 and 3.13.1.
  • Reproduced with MSVC Microsoft.VisualStudio.Component.Windows11SDK.22000 and Microsoft.VisualStudio.Component.Windows10SDK.18362.
  • Reproduced with PyO3 Feature abi3-py311 and abi3-py38.
  • Reproduced with Docker base image mcr.microsoft.com/windows/servercore:ltsc2022 and mcr.microsoft.com/windows:ltsc2019.
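
Since the problem shows up on some machines and not on others, it may help to record the same environment snapshot on each machine and compare the results. This is a rough sketch using only the standard library. The function name describe_environment and the default package list are assumptions made for illustration, and the snapshot does not cover native DLLs loaded by the extension modules.

```python
import importlib.metadata
import platform
import sys


def describe_environment(packages=("pyarrow", "maturin")):
    """Collect interpreter, OS, and installed package versions for comparison."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "executable": sys.executable,
    }
    for name in packages:
        try:
            info[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

The output from a working machine and a failing machine can then be diffed line by line.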

Files

Besides the two test files shown above (run in the Dockerfile as test-works.py and test-fails.py), the following files are required to reproduce the issue.

Cargo.toml

[package]
name = "pyo3_duckdb_pyarrow"
version = "0.1.0"
edition = "2021"
rust-version = "1.83.0"

[lib]
name = "pyo3_duckdb_pyarrow"
crate-type = ["cdylib"]

[dependencies]
pyo3 = { version = "0.23.4", features = ["extension-module", "abi3-py38" ] }
duckdb = { version = "1.1.1", features = ["bundled", "r2d2"] }

pyproject.toml

[build-system]
requires = ["maturin>=1.7,<2.0"]
build-backend = "maturin"

[project]
name = "pyo3_duckdb_pyarrow"
requires-python = ">=3.10"

dynamic = ["version"]

src/lib.rs

use pyo3::pymodule;

#[pymodule]
mod pyo3_duckdb_pyarrow {
    use pyo3::pyfunction;
    use duckdb::DuckdbConnectionManager;

    #[pyfunction]
    fn run() {
        let pool = DuckdbConnectionManager::memory();
        println!("Connection pool created: {}", pool.is_ok());
    }
}

Dockerfile

FROM mcr.microsoft.com/windows/servercore:ltsc2022

#Install Python 3.12.8
ADD https://www.python.org/ftp/python/3.12.8/python-3.12.8-amd64.exe /python-3.12.8.exe
RUN powershell.exe -Command \
    $ErrorActionPreference = 'Stop'; \
	Start-Process c:\python-3.12.8.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -Wait ; \
	Remove-Item -Force python-3.12.8.exe;

SHELL ["cmd", "/S", "/C"]

#Latest VS Build Tools
ADD https://aka.ms/vs/17/release/vs_BuildTools.exe /vs_buildtools.exe
RUN vs_buildtools.exe --quiet --wait --norestart --nocache \
    --installPath C:\BuildTools \
    --add Microsoft.Component.MSBuild \
    --add Microsoft.VisualStudio.Component.Windows10SDK.18362 \
    --add Microsoft.VisualStudio.Component.VC.Tools.x86.x64	\
 || IF "%ERRORLEVEL%"=="3010" EXIT 0

ENTRYPOINT ["C:\\BuildTools\\Common7\\Tools\\VsDevCmd.bat", "&&", "powershell.exe", "-NoLogo", "-ExecutionPolicy", "Bypass"]

# Latest Rust with Rustup
ADD https://win.rustup.rs/x86_64 /rustup-init.exe
RUN start /w rustup-init.exe -y -v && echo "Error level is %ERRORLEVEL%"
RUN del rustup-init.exe

RUN setx /M PATH "C:\Users\ContainerAdministrator\.cargo\bin;%PATH%"

WORKDIR /app

COPY . .

RUN python -m venv .venv
RUN .venv\Scripts\activate.bat && pip install maturin==1.8.1
RUN .venv\Scripts\activate.bat && pip install pyarrow==18.1.0
RUN .venv\Scripts\activate.bat && maturin develop
RUN .venv\Scripts\activate.bat && python test-works.py
RUN .venv\Scripts\activate.bat && python test-fails.py
@mracsko mracsko added the bug label Jan 14, 2025