Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The size of the database explodes when PostgreSQL uses the archive log. #2353

Closed
mdc-webdb opened this issue Jan 27, 2025 · 6 comments
Closed
Labels
bug Something isn't working

Comments

@mdc-webdb
Copy link

Whenever the feeds are updated,
the entire database is rewritten instead of just the changes. This leads to an extreme use of space on the database server if the wal archive mode is used. However, this is a prerequisite for incremental database backups.

@mdc-webdb mdc-webdb added the bug Something isn't working label Jan 27, 2025
@mdc-webdb
Copy link
Author

In my case the database itself will be 3GB, but when the updater runs(and find updates), on every run 17GB wal archive logs are generated.

@4D617274696E
Copy link

We are currently investigating a solution for this problem, no ETA yet.

@timopollmeier
Copy link
Member

I've created #2355 to limit the size of certain INSERT queries that were getting very large and there will be another patch release of gvmd once this is merged.

@mdc-webdb
Copy link
Author

Hi @timopollmeier,
I have tested it, but it is making it much worse.
Now with feed update of today, 27G of WAL archive logs are generated for an database with an size of 11GB.

SELECT pg_size_pretty( pg_database_size('gvmd') );
 pg_size_pretty 
----------------
 11 GB

After the feed update

du -h --max-depth=1 /var/lib/pgsql/wal_archive
27G     /var/lib/pgsql/wal_archive

Before:

du -h --max-depth=1 /var/lib/pgsql/wal_archive
30M     /var/lib/pgsql/wal_archive

@bjoernricks
Copy link
Contributor

Hi @mdc-webdb, as we don't use the wal archive mode (at least not at the moment) we don't have a solution for your issue. If you have ideas on how to improve the situation, let us know in this issue. Maybe we can take another look into it again. Until then I am sorry but we wont be able to fix it.

@mdc-webdb
Copy link
Author

Hi @bjoernricks,
I think that before a record is written to the database during the update, it must first be checked whether it already exists with the same content. This is because an SQL update statement is always executed ‘stupidly’, even if what is written is already in the database. This means that a transaction is created each time, which must be logged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants