fix: optimize memory for stacks tsv import into rocksdb #634

Merged on Aug 12, 2024 (2 commits)

@rafaelcr (Collaborator) commented Aug 7, 2024

This PR changes the way chainhook imports a Stacks node TSV into rocksdb.

Before, it loaded the entire canonical chainstate (including the full block JSON messages) into a VecDeque in memory and then drained that data into rocksdb. This was a very memory-intensive process which crashed our dev pods every time it ran.

Now, the VecDeque only keeps the line numbers of the TSV rows where the block data exists, so the import can later read blocks from the file one by one and insert them into rocksdb.
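
For readers who want a picture of the two-pass idea, here is a minimal sketch. It is not the code from this PR. It assumes a simplified, hypothetical TSV layout of `<path>\t<payload>`, where block rows have the path `/new_block`. `BlockStore` is a hypothetical in-memory stand-in for the rocksdb handle, and all function names are made up for illustration. It also skips canonical fork selection, which the real import has to handle.

```rust
use std::collections::VecDeque;
use std::io::{self, BufRead};

/// Hypothetical stand-in for the rocksdb handle; it just collects payloads.
#[derive(Default)]
struct BlockStore {
    blocks: Vec<String>,
}

impl BlockStore {
    fn insert_block(&mut self, payload: &str) {
        self.blocks.push(payload.to_string());
    }
}

/// Assumed row format: "<path>\t<payload>", with "/new_block" marking block rows.
fn is_block_row(line: &str) -> bool {
    line.starts_with("/new_block\t")
}

/// Pass 1: keep only the line numbers of block rows, not the payloads.
fn index_block_lines<R: BufRead>(reader: R) -> io::Result<VecDeque<usize>> {
    let mut wanted = VecDeque::new();
    for (n, line) in reader.lines().enumerate() {
        if is_block_row(&line?) {
            wanted.push_back(n);
        }
    }
    Ok(wanted)
}

/// Pass 2: re-read the file and insert one block at a time, so only a
/// single payload is held in memory at any moment.
fn import_blocks<R: BufRead>(
    reader: R,
    mut wanted: VecDeque<usize>,
    store: &mut BlockStore,
) -> io::Result<usize> {
    let mut inserted = 0;
    for (n, line) in reader.lines().enumerate() {
        // Stop early once every indexed line has been consumed.
        let Some(&next) = wanted.front() else { break };
        let line = line?;
        if n == next {
            wanted.pop_front();
            if let Some((_path, payload)) = line.split_once('\t') {
                store.insert_block(payload);
                inserted += 1;
            }
        }
    }
    Ok(inserted)
}
```

Calling `index_block_lines` and then `import_blocks` on two fresh readers over the same file keeps peak memory proportional to the number of block rows (one `usize` each) rather than to the total size of the block JSON.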

@rafaelcr rafaelcr changed the title fix: refresh rocksdb connection after importing TSV blocks fix: optimize memory for stacks tsv import into rocksdb Aug 8, 2024
@rafaelcr rafaelcr marked this pull request as ready for review August 8, 2024 19:25
@rafaelcr rafaelcr requested a review from tippenein August 8, 2024 19:26
@tippenein (Contributor) left a comment

Does this have a noticeable difference in processing speed? Does that even matter?

Code looks good. Just curious if you've run this.

@rafaelcr (Collaborator, Author) commented

Thanks @tippenein. Yep, it does: it runs a bit faster, I assume because it no longer has to deal with the very large data structure that was in place before, but perhaps that's only the case in the dev env I was using to test.

@rafaelcr rafaelcr merged commit dcf545c into develop Aug 12, 2024
10 of 12 checks passed
@rafaelcr rafaelcr deleted the fix/tsv-read branch August 12, 2024 16:39