Fix bug in chunk size calculation #5786

Closed
lutter wants to merge 2 commits into master from lutter/chunk-size

@lutter lutter commented Jan 31, 2025

When inserting entities during subgraph syncing, we split the entities that need to be written into chunks, which we want to be as large as possible for best performance. Since Postgres allows at most 65,535 bind variables per statement, the number of entities we can insert in one chunk is roughly 65k/#columns. The code did not account for the causality_region when calculating the number of columns to be inserted.

Since we had issues with this calculation in the past and it's next to impossible for operators to work around it, this also introduces an environment variable GRAPH_STORE_INSERT_EXTRA_COLS that operators can use to work around this issue should we ever screw up this calculation again.
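To make the arithmetic concrete, here is a minimal sketch of the calculation described above. It is not the code from this PR: `chunk_size`, `parse_extra_cols` and `extra_cols_from_env` are hypothetical helpers, and the sketch assumes that GRAPH_STORE_INSERT_EXTRA_COLS is a non-negative integer that is simply added to the per-row column count.

```rust
use std::env;

/// Upper bound on bind variables in one Postgres statement
/// (the protocol carries the parameter count as a 16-bit integer).
const POSTGRES_MAX_PARAMETERS: usize = u16::MAX as usize; // 65535

/// Hypothetical helper: number of rows that fit into one insert when each
/// row binds `columns + extra_cols` variables.
fn chunk_size(columns: usize, extra_cols: usize) -> usize {
    // Avoid dividing by zero for a degenerate table with no columns.
    let per_row = (columns + extra_cols).max(1);
    POSTGRES_MAX_PARAMETERS / per_row
}

/// Hypothetical helper: interpret the raw value of the environment variable.
/// ASSUMPTION: a missing or unparsable value means "no extra columns".
fn parse_extra_cols(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok()).unwrap_or(0)
}

/// Hypothetical helper: read the operator override from the environment.
fn extra_cols_from_env() -> usize {
    parse_extra_cols(env::var("GRAPH_STORE_INSERT_EXTRA_COLS").ok().as_deref())
}
```

With these helpers, a table that really binds 11 variables per row but is counted as 10 gets a chunk of 6553 rows, i.e. 72,083 bind variables, which exceeds the limit. Counting 11 columns, or setting the extra-columns override to 1, gives 5957 rows and 65,527 bind variables.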

The code assumed one column too few when calculating chunk size, which can cause errors
@lutter lutter requested a review from zorancv January 31, 2025 20:04
@lutter lutter self-assigned this Jan 31, 2025
@zorancv zorancv left a comment

LGTM

@lutter lutter closed this Feb 3, 2025
@lutter lutter deleted the lutter/chunk-size branch February 3, 2025 17:34
lutter commented Feb 3, 2025

This was actually merged at commit 63ea9d7. Not sure why GitHub is showing this as closed.

2 participants