-
Notifications
You must be signed in to change notification settings - Fork 8
Open
Description
Some of the calls to the database could be fetching extra info from the database that is not necessary when initially filtering just a few trials from the larger database. We could consider using only() to initially only pull relevant columns for the larger filter, then let django do lazy loading of other info needed later (at the cost of extra, but smaller calls to the DB).
Example
for instance, this django code
# find trials with some videos not uploaded
not_uploaded = Video.objects.filter(video='',
updated_at__gte=datetime.now() + timedelta(minutes=-15)).values_list("trial__id", flat=True)
if isMonoQuery == 'False':
uploaded_trials = Trial.objects.filter(updated_at__gte=datetime.now() + timedelta(days=-7)).exclude(
id__in=not_uploaded).exclude(session__isMono=True)
else:
uploaded_trials = Trial.objects.filter(updated_at__gte=datetime.now() + timedelta(days=-7)).exclude(
id__in=not_uploaded).filter(session__isMono=True)
leads to the SQL call
SELECT "mcserver_trial"."id", "mcserver_trial"."session_id", "mcserver_trial"."status",
"mcserver_trial"."name", "mcserver_trial"."meta", "mcserver_trial"."created_at",
"mcserver_trial"."updated_at", "mcserver_trial"."server", "mcserver_trial"."is_docker",
"mcserver_trial"."hostname", "mcserver_trial"."processed_duration",
"mcserver_trial"."processed_count", "mcserver_trial"."git_commit",
"mcserver_trial"."trashed", "mcserver_trial"."trashed_at"
FROM "mcserver_trial"
INNER JOIN "mcserver_session"
ON ("mcserver_trial"."session_id" = "mcserver_session"."id")
WHERE ("mcserver_trial"."updated_at" >= '2025-06-02T00:00:00+00:00'::timestamptz
AND NOT ("mcserver_trial"."id" IN (
SELECT U0."trial_id" FROM "mcserver_video" U0
WHERE (U0."updated_at" >= '2025-06-09T00:00:00+00:00'::timestamptz
AND U0."video" = '')
))
AND NOT ("mcserver_session"."isMono"));
The long SELECT call is likely due to some of the fields needed after a single trial is selected to update these fields.
Steps to try:
- use
only()to reduce the firstSELECTcall (e.g.,id,updated_at,isMono) - let django lazy load some fields (e.g.,
status,server,processed_count) - check the new SQL calls from django to see if it reduces overhead using
EXPLAIN ANALYZE - see if extra calls from the database are worth reducing the larger initial filter
Metadata
Metadata
Assignees
Labels
No labels