Shortcut common type inference cases to fail fast, speed up inference #660

Merged
merged 3 commits into from
Sep 7, 2023

Collaborator

@srowen srowen commented Sep 2, 2023

In schema inference, many different types are tried for each input value. This can get really slow in some cases, especially where the true type is just 'string', since every more specific type is attempted first. This adds several shortcuts to the type inference code that fail fast, before expensive parsing code is run, where it's clear the parsing won't work. It also avoids relying on a thrown exception in one case, for better speed.
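To illustrate the idea, here is a minimal sketch in Python. It is not the code from this PR: the function names (`infer_type`, `_looks_numeric`, `_looks_like_date`) and the set of candidate types are assumptions made up for the example. It only shows the general pattern of a cheap check that rules out a type before the expensive parser is called, and of a plain string test in place of a try/except where one is available.

```python
from datetime import datetime


def _looks_numeric(value: str) -> bool:
    # Cheap pre-check (hypothetical helper): a number has to start with
    # a digit, a sign, or a decimal point.
    return bool(value) and (value[0].isdigit() or value[0] in "+-.")


def _looks_like_date(value: str) -> bool:
    # Cheap pre-check (hypothetical helper): an ISO date needs at least
    # "YYYY-MM-DD", so it starts with a digit and has "-" at index 4.
    return len(value) >= 10 and value[0].isdigit() and value[4] == "-"


def infer_type(value: str) -> str:
    """Return a type name for one input string, failing fast where possible."""
    value = value.strip()
    if not value:
        return "null"
    if value.lower() in ("true", "false"):
        return "boolean"

    if _looks_numeric(value):
        # Integer check without exceptions: strip one sign, then test digits.
        digits = value[1:] if value[0] in "+-" else value
        if digits.isascii() and digits.isdigit():
            return "long"
        try:
            float(value)
            return "double"
        except ValueError:
            pass  # e.g. "2023-09-07" starts with a digit but is not a number

    if _looks_like_date(value):
        try:
            datetime.fromisoformat(value)
            return "timestamp"
        except ValueError:
            pass

    # Plain text reaches this point without invoking any parser at all.
    return "string"
```

With this structure, a value such as `"hello world"` is classified as a string after two character checks, with no parser call and no exception raised and caught along the way.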

@srowen srowen self-assigned this Sep 2, 2023
Collaborator Author

srowen commented Sep 4, 2023

I've got a customer checking out this change too. If I put it in, I'll also need to get this applied to the outstanding patch vs Spark that ports this.

@srowen srowen merged commit 994e357 into databricks:master Sep 7, 2023
3 checks passed
@srowen srowen deleted the InferDateOpt branch September 7, 2023 03:46
@srowen srowen added this to the 0.17.0 milestone Sep 7, 2023