feat: give more memory to the Delta Lake workers #271

Merged
merged 1 commit into from
Aug 29, 2023


@mikix (Contributor) commented Aug 29, 2023

We noticed that with only 2G of driver memory, a Delta Lake table can no longer be merged into once it reaches about 60M rows.

This bumps the driver memory to 4G for extra room. Batch sizes may need to be adjusted downward as a result, since the driver now takes more system memory.

I will notify folks in our Slack about this possible adjustment.
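
To make the trade-off concrete, here is a minimal sketch of the arithmetic behind "the driver takes more system memory". The helper names are hypothetical and not part of this PR; the only real identifier is `spark.driver.memory`, the standard Spark property for the driver heap size. How this project actually passes the setting to Spark is not shown here, and the headroom figure ignores JVM overhead beyond the heap, so treat it as a rough upper bound.

```python
# Hypothetical helpers for illustration only; not code from this PR.

def spark_driver_conf(driver_memory_gb: int) -> dict[str, str]:
    """Return the Spark config entry for a given driver heap size."""
    if driver_memory_gb <= 0:
        raise ValueError("driver memory must be positive")
    # spark.driver.memory is a standard Spark property; "4g" means 4 gibibytes.
    return {"spark.driver.memory": f"{driver_memory_gb}g"}


def headroom_gb(system_memory_gb: int, driver_memory_gb: int) -> int:
    """Rough memory left for everything else once the driver heap is reserved.

    Assumption: JVM overhead beyond the heap is ignored, so the real
    figure is somewhat lower than what this returns.
    """
    return system_memory_gb - driver_memory_gb


# On a 16GB machine, going from a 2G to a 4G driver costs 2GB of headroom.
old = headroom_gb(16, 2)
new = headroom_gb(16, 4)
print(spark_driver_conf(4), old, new)
```

That lost headroom is why a batch size that fit comfortably before may need to come down.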

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added

Comment on lines +35 to +36
We've found `--batch-size=100000` works well for 16GB of memory.
And `--batch-size=500000` works well for 32GB of memory.
Contributor Author


These are what I've been using. 🤷

@mikix mikix merged commit 4211901 into main Aug 29, 2023
2 checks passed
@mikix mikix deleted the mikix/more-driver-mem branch August 29, 2023 18:29