-
Notifications
You must be signed in to change notification settings - Fork 732
Ducklake support in DuckDB scripts #6035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Deploying windmill with
|
Latest commit: |
eeb78e6
|
Status: | ✅ Deploy successful! |
Preview URL: | https://fbcade08.windmill.pages.dev |
Branch Preview URL: | https://di-ducklake.windmill.pages.dev |
/ai review this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to 1389172 in 2 minutes and 28 seconds. Click for details.
- Reviewed
306
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
5
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. backend/Cargo.toml:247
- Draft comment:
Upgrade to duckdb 1.3.1 looks intentional. Please double-check that the new version is fully backwards‐compatible with existing queries and features. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%
<= threshold50%
The comment is asking the PR author to double-check compatibility, which violates the rule against asking for confirmation or verification of intentions. It also relates to a dependency change, which should not be commented on unless it's a recognized issue.
2. backend/windmill-worker/src/duckdb_executor.rs:597
- Draft comment:
The regex used in transform_attach_ducklake is hard-coded. Validate that it robustly matches all supported ducklake attachment syntaxes and consider documenting the expected format. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The regex pattern is indeed hard-coded and parses a specific format. However, this appears to be an internal implementation detail for parsing a specific SQL syntax. The code handles the parsing results appropriately with error messages. The comment is suggesting documentation but doesn't point out any actual issues with the implementation. The regex could potentially miss some valid ducklake attachment syntaxes, and documentation would help future maintainers understand the expected format. The code already has clear error handling and the format appears to be an internal implementation detail rather than a public API that needs documentation. The regex pattern itself is relatively simple and its purpose is clear from the surrounding code. The comment should be deleted as it suggests documentation for an internal implementation detail without identifying any actual issues. The code handles parsing and errors appropriately.
3. backend/windmill-worker/src/duckdb_executor.rs:605
- Draft comment:
When handling the 'windmill' db_type, defaulting to 'NO_USER'/'NO_PWD' may hide missing credential issues. Consider failing fast or logging a warning if credentials are absent. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a new code block that handles Postgres URL parsing for DuckDB attachments. The concern about masking credential issues is valid - using placeholder values could hide real authentication problems. However, this appears to be an intentional design choice to handle optional credentials in a Postgres URL. The code is properly handling Option types and providing fallback values that would be clear to debug. The comment raises a legitimate security concern about error visibility. However, I may be overlooking whether these default values are actually problematic in practice - perhaps they're expected to work in some valid use cases. Given that this is handling optional fields in a URL parser, and the defaults are clearly marked as missing values rather than empty strings, this appears to be a reasonable implementation that won't hide real issues. The comment should be deleted. While it raises a valid concern, the current implementation appears to be an intentional and reasonable design choice for handling optional credentials in URLs.
4. backend/windmill-worker/src/duckdb_executor.rs:619
- Draft comment:
Enforcing that resource paths start with '$res:' is clear; consider adding a function-level doc-comment describing this requirement for maintainability. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The comment is about code documentation rather than a bug or issue. While documentation is good, the requirement is already very clear from the error message and validation check in the code. The comment doesn't point out any actual problems that need fixing. Adding a doc comment would be nice-to-have but not essential since the code is self-documenting through its validation and error message. The requirement for '$res:' prefix could be considered an important API contract that should be documented at the function level. Future maintainers might benefit from seeing this requirement up front before reading the implementation. While documentation is valuable, the code already clearly communicates this requirement through explicit validation and a descriptive error message. The comment doesn't identify a real issue that needs fixing. This comment should be deleted as it suggests purely informational documentation without identifying any actual problems that need to be fixed. The code is already clear about the requirement.
5. backend/windmill-worker/src/duckdb_executor.rs:633
- Draft comment:
Using query.replacen with a formatted literal to replace the ducklake attachment string might be brittle if spacing or formatting vary. Ensure the input strictly follows the expected syntax. - Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_edJ9eEs9JDwldR2e
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
} | ||
|
||
pub fn parse_postgres_url(url: &str) -> Result<PostgresUrlComponents, Error> { | ||
let parsed_url = url::Url::parse(url).map_err(|_| Error::BadConfig("Invalid PostgreSQL URL".to_string()))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In parse_postgres_url(), consider including more context from the underlying URL parse error to improve debugging, rather than using a generic message.
let parsed_url = url::Url::parse(url).map_err(|_| Error::BadConfig("Invalid PostgreSQL URL".to_string()))?; | |
let parsed_url = url::Url::parse(url).map_err(|e| Error::BadConfig(format!("Invalid PostgreSQL URL: {e}")))?; |
Claude finished @rubenfiszel's task —— View job PR Review: Ducklake Support in DuckDB ScriptsReviewing the code changes for Ducklake integration... Todo List:
Overall AssessmentThis PR adds solid support for Ducklake syntax in DuckDB scripts. The implementation follows existing patterns well and integrates cleanly into the current query processing pipeline. However, there are several areas that need attention before merging. Detailed Review✅ Strengths
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed bb3ca52 in 2 minutes and 21 seconds. Click for details.
- Reviewed
185
lines of code in6
files - Skipped
0
files when reviewing. - Skipped posting
6
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. backend/windmill-api/openapi.yaml:11075
- Draft comment:
New ducklake endpoints added. Ensure that their authentication and documentation clearly state any permission requirements. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
2. backend/windmill-worker/src/duckdb_executor.rs:607
- Draft comment:
Hardcoding 'dbname=ducklake_catalog' in the ATTACH statement seems intentional; ensure that the default port fallback (5432) is acceptable for all deployments. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
3. backend/windmill-worker/src/duckdb_executor.rs:617
- Draft comment:
When processing the resource path in transform_attach_ducklake, consider trimming whitespace before checking that it starts with '$res:'. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
4. frontend/src/lib/components/InstanceSettings.svelte:359
- Draft comment:
New 'Ducklake' tab added to the settings UI. Verify that its user experience and integration are consistent with the rest of the settings. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
5. frontend/src/lib/components/instanceSettings.ts:404
- Draft comment:
A new empty settings category 'Ducklake' is added. Ensure that any future Ducklake-specific settings are populated as needed. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
6. frontend/src/lib/components/settings/DucklakeSettings.svelte:10
- Draft comment:
The DucklakeSettings UI for initializing the ducklake_catalog database looks good. Consider if additional persistent error handling or user guidance is needed for repeated failures. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
Workflow ID: wflow_V5QU1BVoSJojGXGL
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
@@ -245,3 +247,25 @@ async fn list_configs() -> error::JsonResult<String> { | |||
"Config listing available only in the enterprise version".to_string(), | |||
)) | |||
} | |||
|
|||
async fn ducklake_catalog_exists( | |||
_authed: ApiAuthed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ducklake_catalog_exists endpoint accepts an '_authed' parameter but does not enforce any role checks. Consider adding an auth check (or document why it's left open).
_authed: ApiAuthed, | ||
Extension(db): Extension<DB>, | ||
) -> error::Result<()> { | ||
sqlx::query!("CREATE DATABASE ducklake_catalog") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The init_ducklake_catalog_db function executes 'CREATE DATABASE ducklake_catalog' without checking if it exists. Consider graceful handling for repeated initialization.
Ducklake works with these syntaxes :
Everything works great with Postgres and S3
Important points :
AzureBlobStorageFileSystem: DirectoryExists is not implemented!
Important
Add Ducklake support in DuckDB scripts with new syntax, API endpoints, and UI settings.
ATTACH 'ducklake:windmill'
andATTACH 'ducklake:postgres:$res:u/my/db_resource'
.ducklake_catalog_exists
andinit_ducklake_catalog_db
inconfigs.rs
to manage Ducklake catalog.DirectoryExists
not implemented.duckdb
version inCargo.toml
from1.2.2
to1.3.1
.DucklakeSettings.svelte
for managing Ducklake settings in the UI.InstanceSettings.svelte
andinstanceSettings.ts
to include Ducklake settings.parse_postgres_url()
function inlib.rs
to parse PostgreSQL URLs.POWERSHELL_INSTALL_CODE
inbash_executor.rs
for PowerShell module management.This description was created by
for bb3ca52. You can customize this summary. It will automatically update as commits are pushed.