Add a method to add schema information to a BigQueryRelation #1232
base: develop
 * @param schema The schema.
 * @return A new relation with the schema added.
 */
public Relation addSchema(@Nullable Schema schema) {
Please validate that there is no existing schema, unless you expect schema overrides.
What should we ideally be doing in case a schema already exists? Since we're returning a new Relation, I think overriding the schema should be fine.
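To make the two options in this exchange concrete, here is a minimal sketch. It assumes a hypothetical stand-in class, SketchRelation, that models a schema as a map from field name to type name; it is not the real BigQueryRelation or Schema API, and addSchemaStrict is an invented name for the validating variant.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a relation; not the actual BigQueryRelation class.
final class SketchRelation {
  private final String sql;
  // Field name -> type name; null means no schema has been attached yet.
  private final Map<String, String> schema;

  SketchRelation(String sql, Map<String, String> schema) {
    this.sql = sql;
    this.schema = schema == null
        ? null
        : Collections.unmodifiableMap(new LinkedHashMap<>(schema));
  }

  String getSql() {
    return sql;
  }

  Map<String, String> getSchema() {
    return schema;
  }

  // Option A (override): always returns a new relation carrying newSchema.
  // The receiver is immutable, so its own schema is left untouched.
  SketchRelation addSchema(Map<String, String> newSchema) {
    return new SketchRelation(sql, newSchema);
  }

  // Option B (validate): refuses to replace a schema that is already present.
  SketchRelation addSchemaStrict(Map<String, String> newSchema) {
    if (schema != null) {
      throw new IllegalStateException("Relation already has a schema");
    }
    return new SketchRelation(sql, newSchema);
  }
}
```

Under option A the original relation keeps whatever schema it had, so an override cannot corrupt existing state; under option B an accidental second call fails fast instead of silently replacing the schema.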
I don't know how you plan to use this method. Generally it's better to introduce methods when they are needed (even with a mock implementation) than vice versa.
I've mentioned that already in the PR description above. I plan to use it in Wrangler in transform(), just like any other Relation operation. I don't see any other way to add a schema to the Relation.
Wrangler should have nothing to do with schema management. How would Wrangler know the schema of a SQL expression?
Also, it's an unneeded burden on the plugin. Given the known schema of the original relation and the known schema of all expressions, the resulting schema should be automatically constructed by the platform.
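A rough sketch of what platform-side schema derivation could look like, for illustration only. TypedExpression and DerivingRelation are hypothetical names, and the schema is again modeled as a map from field name to type name rather than the real Schema class; the actual platform API may differ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical expression that carries its own result type.
final class TypedExpression {
  final String sql;
  final String type;

  TypedExpression(String sql, String type) {
    this.sql = sql;
    this.type = type;
  }
}

// Hypothetical relation whose schema is maintained by the platform,
// so the plugin never has to supply it explicitly.
final class DerivingRelation {
  private final Map<String, String> schema;

  DerivingRelation(Map<String, String> schema) {
    this.schema = Collections.unmodifiableMap(new LinkedHashMap<>(schema));
  }

  Map<String, String> getSchema() {
    return schema;
  }

  // Adds or replaces a column. The resulting schema follows from the
  // current schema plus the expression's declared type.
  DerivingRelation setColumn(String name, TypedExpression expr) {
    Map<String, String> next = new LinkedHashMap<>(schema);
    next.put(name, expr.type);
    return new DerivingRelation(next);
  }

  // Removes a column; the resulting schema simply omits it.
  DerivingRelation dropColumn(String name) {
    Map<String, String> next = new LinkedHashMap<>(schema);
    next.remove(name);
    return new DerivingRelation(next);
  }
}
```

The design point is that each operation returns a relation whose schema is already known, so no separate addSchema call is needed after a chain of transformations.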
This PR adds a way to add schema information to a relation. This can be called by a plugin that wishes to request validation.
The current intention is that this will be called by Wrangler in the transform() method to supply schema information (which Wrangler already has in the oSchema variable).