Remove deleted_at (Postgres) and _deleted_at (Mongo) #696

Closed
jnm opened this issue Mar 28, 2021 · 1 comment · Fixed by #732

@jnm
Member

jnm commented Mar 28, 2021

We are preparing to remove deleted_at (Postgres) and _deleted_at (Mongo) for performance reasons. We will stop setting these attributes in the legacy view and instead delete the submissions outright, as has always been done for DELETE HTTP requests to the API. We will also include a migration that deletes all rows in the Postgres logger_instance table that have a non-null value for deleted_at, and likewise removes any documents in the Mongo instances collection that have _deleted_at set.

django-reversion will not be changed. logger_instancehistory rows will be removed when their corresponding rows in logger_instance are removed. This matches the existing behavior when deleting with the API, but it will apply retroactively to all instances / submissions that were deleted with the legacy view, i.e. instances that have been marked as deleted but still exist in the database.

Originally posted by @jnm in #659 (comment)
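
For illustration only, here is a minimal sketch of the effect described above: rows with a non-null deleted_at are removed, and their history rows go with them. It uses an in-memory SQLite database rather than Postgres, and the two-column schema, the instance_id column name, and the purge_soft_deleted helper are all hypothetical simplifications. SQLite's ON DELETE CASCADE stands in for whatever mechanism actually removes logger_instancehistory rows; this is not the real migration.

```python
import sqlite3


def purge_soft_deleted(conn):
    """Delete every logger_instance row whose deleted_at is non-null.

    Returns the number of logger_instance rows removed (history rows
    removed by the cascade are not counted).
    """
    cur = conn.execute(
        "DELETE FROM logger_instance WHERE deleted_at IS NOT NULL"
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical, heavily simplified schema (assumption, not the real tables)
conn.execute(
    "CREATE TABLE logger_instance (id INTEGER PRIMARY KEY, deleted_at TEXT)"
)
conn.execute(
    "CREATE TABLE logger_instancehistory ("
    " id INTEGER PRIMARY KEY,"
    " instance_id INTEGER NOT NULL"
    "   REFERENCES logger_instance(id) ON DELETE CASCADE)"
)

# Instance 1 is live; instances 2 and 3 were soft-deleted
conn.executemany(
    "INSERT INTO logger_instance VALUES (?, ?)",
    [(1, None), (2, "2021-03-01"), (3, "2021-03-02")],
)
conn.executemany(
    "INSERT INTO logger_instancehistory VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 3)],
)
conn.commit()

removed = purge_soft_deleted(conn)
remaining_instances = [
    row[0] for row in conn.execute("SELECT id FROM logger_instance ORDER BY id")
]
remaining_history = [
    row[0]
    for row in conn.execute(
        "SELECT instance_id FROM logger_instancehistory ORDER BY id"
    )
]
```

Only the live instance and its history row survive, which mirrors the retroactive behavior described for submissions that were deleted through the legacy view.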

ℹ️ Once this is done, we can remove _deleted_at from queries in KPI as well.

jnm self-assigned this Mar 28, 2021
@jnm
Member Author

jnm commented Mar 29, 2021

I'm purging most of these on production now to avoid a long, downtime-inducing migration later:

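import sys
import time

# Assumes a Django shell in which `Instance` (the model behind
# `logger_instance`) has already been imported.
batch_size = 25
failures = []  # pks that could not be deleted even one at a time
# Collect the pks of every soft-deleted instance up front
byebye = list(Instance.objects.exclude(deleted_at=None).values_list('pk', flat=True))
while True:
    pk_batch = byebye[:batch_size]
    if not pk_batch:
        break
    del_candidates = Instance.objects.filter(pk__in=pk_batch)
    # paranoia
    del_candidates = del_candidates.exclude(deleted_at=None)
    try:
        # Fast path: delete the whole batch at once
        del_candidates.delete()
    except Exception:
        # Fall back to deleting one at a time so that a single bad row
        # does not block the rest of the batch
        for dc in del_candidates:
            try:
                dc.delete()
            except Exception:
                sys.stdout.write('!')
                sys.stdout.flush()
                failures.append(dc.pk)
        sys.stdout.write('\n')
    del byebye[:batch_size]
    sys.stdout.write('\r{} remain'.format(len(byebye)))
    sys.stdout.flush()
    # Pause between batches
    time.sleep(1)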

The hacky excepts are to cope with #697
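
The batch-then-fallback pattern in the script above can be tried in isolation, without Django. The sketch below is only an illustration: purge_in_batches, delete_many, delete_one, and the in-memory store are hypothetical names, and the "poisoned" pks simply simulate rows whose deletion raises. It also assumes that a failed batch delete removes nothing, which is true of this toy delete_many but is not something the script above guarantees.

```python
def purge_in_batches(pks, delete_many, delete_one, batch_size=25):
    """Delete pks in batches, retrying one by one when a batch fails.

    Returns the list of pks that could not be deleted at all.
    """
    failures = []
    for start in range(0, len(pks), batch_size):
        batch = pks[start:start + batch_size]
        try:
            # Fast path: the whole batch in one call
            delete_many(batch)
        except Exception:
            # Slow path: isolate the rows that actually fail
            for pk in batch:
                try:
                    delete_one(pk)
                except Exception:
                    failures.append(pk)
    return failures


# Toy in-memory "table" of primary keys (hypothetical stand-in)
store = set(range(1, 11))
# Deleting these pks always raises, simulating problem rows
poisoned = {4, 9}


def delete_one(pk):
    if pk in poisoned:
        raise RuntimeError("cannot delete {}".format(pk))
    store.discard(pk)


def delete_many(batch):
    # All-or-nothing: refuse the whole batch if any row is poisoned
    if poisoned.intersection(batch):
        raise RuntimeError("batch contains an undeletable row")
    store.difference_update(batch)


failures = purge_in_batches(sorted(store), delete_many, delete_one, batch_size=3)
```

Returning the failed pks instead of printing progress markers makes it easier to inspect or retry them afterwards.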
