@falquaddoomi (Collaborator) commented Dec 3, 2025

A recent change to the DuckDB Python library (PR: duckdb/duckdb-python#96) deprecated duckdb.typing and added a mechanism that redirects symbol imports from the deprecated package to the new duckdb.sqltypes module that replaces it. Unfortunately, this mechanism didn't work for us, since we were importing the base duckdb package and referencing attributes such as duckdb.typing.VARCHAR directly.

This PR changes references from duckdb.typing.X to duckdb.sqltypes.X, resolving the issue.

After some discussion and testing (thanks, @d33bs!), it turned out that the backend was erroneously upgrading DuckDB by a full minor version (from 1.3.1 to 1.4.1) when it should have been installing the specific versions from its (previously unused) uv.lock file. The PR now does the following:

  • Adds a "slow" marker for tests, which you can apply to a test via the @pytest.mark.slow decorator.
    • Slow tests are disabled by default; passing -m "slow" when invoking pytest runs only the slow tests.
    • All tests, slow or not, can be run by invoking pytest -m "slow or not slow".
  • Introduces two tests: one for creating the withdrarxiv DuckDB database file from embeddings (marked as slow), and one for searching it.
  • In the backend image build, copies in uv.lock and uses the --frozen flag to ensure that the locked versions are installed, putting us back on DuckDB 1.3.1.
  • Reverts to importing VARCHAR from duckdb.typing, which allows us to use the migration shim introduced in 1.4.1.
    • (Once we move to 1.4.x, we should change the reference anyway, but at least the code won't break if we move to 1.4.1.)
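
As a rough sketch of how the "slow" marker might be used (the file name, test names, test bodies, and the pyproject.toml snippet in the comments are all assumptions for illustration, not the PR's actual code):

```python
# test_withdrarxiv_sketch.py (hypothetical file and test names)
#
# Assumed pytest configuration, e.g. in pyproject.toml, that registers the
# marker and deselects slow tests unless -m is given on the command line:
#
#   [tool.pytest.ini_options]
#   markers = ["slow: long-running tests, deselected by default"]
#   addopts = "-m 'not slow'"
import pytest


@pytest.mark.slow
def test_build_database_from_embeddings():
    """Stand-in for the slow database-creation test."""
    rows = [("paper-1", [0.1, 0.2]), ("paper-2", [0.3, 0.4])]
    assert len(rows) == 2


def test_search_database():
    """Stand-in for the fast search test; selected by a plain `pytest` call."""
    assert "paper-1" in {"paper-1", "paper-2"}
```

Under that assumed configuration, a plain pytest call would select only test_search_database, pytest -m "slow" only the marked test, and pytest -m "slow or not slow" both.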

@falquaddoomi requested a review from d33bs December 3, 2025 23:25
@d33bs (Collaborator) left a comment

Nice catch, LGTM!

Aside, would it make sense to create a simple test which helps ensure this works as expected? This might help catch the error during other automated work.

@falquaddoomi (Collaborator, Author)

> Aside, would it make sense to create a simple test which helps ensure this works as expected? This might help catch the error during other automated work.

That's a great idea; in fact, while adding that test I discovered that the backend was installing packages with uv without the --frozen flag, which caused it to install a newer minor version of DuckDB (1.4.1) that introduced this breaking change. The manugen_ai package itself is installed with the --frozen flag (I believe that was your change, actually) and gets 1.3.1, avoiding, as you said, these kinds of surprises. I'll update this PR to include the --frozen flag in the backend.

I don't see a compelling reason to switch to 1.4.1 yet, so I'm going to change the code to import VARCHAR from duckdb.typing instead, which will allow us to use the migration mechanism introduced in 1.4.1 should we decide to upgrade. Of course, when we do upgrade we should also switch to the non-deprecated sqltypes module, but IMO it can't hurt to future-proof the code a bit by changing how we import VARCHAR.
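
For comparison, here is an alternative that this PR does not use (the PR simply imports VARCHAR from duckdb.typing): a small fallback helper that tries the new module first and the deprecated one second. The helper names are hypothetical, and the sketch assumes duckdb.sqltypes is absent on 1.3.x, as described above.

```python
import importlib


def first_available(attr, module_names):
    """Return `attr` from the first module in `module_names` that provides it."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # module (or its parent package) is not installed
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of {list(module_names)}")


def load_varchar():
    # Prefer the new module (1.4.x); fall back to the deprecated one (1.3.x).
    return first_available("VARCHAR", ["duckdb.sqltypes", "duckdb.typing"])
```

The trade-off is extra indirection in exchange for not depending on the shim; the plain import is simpler and is enough as long as the shim exists.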

Anyway, I'll update the PR with tests and make the changes I described above. Thanks!

…e of the migration shim introduced in 1.4.1. Adds pre-commit upgrades, linting tweaks.
…st marker for slow tests, disabled by default. Marks embedding creation test as slow.
@falquaddoomi changed the title from "Resolves duckdb typing module deprecation" to "Resolves duckdb typing module deprecation, locks backend Python versions" Dec 4, 2025