Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADBDEV-6520: Refactor strings into arguments when deleting/inserting into a table #41

Open
wants to merge 75 commits into
base: gpdb
Choose a base branch
from

Conversation

RekGRpth
Copy link
Member

@RekGRpth RekGRpth commented Oct 28, 2024

Refactor strings into arguments when deleting/inserting into a table

diskquota used long strings when deleting/inserting into the
diskquota.table_size table. This resulted in high memory consumption, both for
constructing such long strings and for parsing them. This patch reworks the
logic to use arguments consisting of arrays. As a result, significantly less
memory is spent, since the query itself is very short and there is no need to
waste memory on constructing and parsing it, array arguments are passed as is.

This patch also fixes the error duplicate key value violates unique constraint
"table_size_pkey". In the flush_to_table_size function, it may happen that when
updating the table size, the delete goes to one batch and the insert goes to
another, and the insert is performed earlier and a duplicate error occurs.


It is easier to view the changes with the "Hide whitespace" option enabled.

@RekGRpth RekGRpth changed the title ADBDEV-6520: Refactor strings to arguments in diskquota ADBDEV-6520: Refactor strings into arguments when deleting/inserting into a table Oct 29, 2024
@RekGRpth RekGRpth marked this pull request as ready for review October 29, 2024 04:08
@dkovalev1
Copy link

Are there any evidence that "significantly less memory is spent" and "fixes the error duplicate key value violates unique constraint" ?

@RekGRpth
Copy link
Member Author

Are there any evidence that "significantly less memory is spent" and "fixes the error duplicate key value violates unique constraint" ?

That's obvious! String representations of integer arrays take significantly more memory. And uniqueness violations will no longer happen because deletion will always happen before insertion.

Base automatically changed from ADBDEV-6577 to gpdb November 1, 2024 03:37
@hilltracer hilltracer self-requested a review November 1, 2024 13:58
@hilltracer
Copy link

Please describe in detail how I as reviewer, can check the changes in work:

  • what tests check these changes?
  • how can I compare memory spending before and after patch?
  • how can I get this duplicate key error to verify that the patch fixes it?

@RekGRpth
Copy link
Member Author

RekGRpth commented Nov 5, 2024

what tests check these changes?

There are no such tests.

how can I compare memory spending before and after patch?

This is practically difficult to do because you need a very large cluster with a very large number of active tables.

how can I get this duplicate key error to verify that the patch fixes it?

This is almost impossible because it would require a huge number of active tables.

how I as reviewer, can check the changes in work:

This is more of a theoretical patch.

Copy link

@hilltracer hilltracer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch also fixes the error duplicate key ...

Please describe in more detail what the error is, how "the delete goes to one batch and the insert goes to another", and how the patch fixes this?

src/quotamodel.c Show resolved Hide resolved
src/quotamodel.c Show resolved Hide resolved
src/quotamodel.c Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants