Fix incorrect session grouping #904
✅ Deploy Preview for antenna-preview canceled.
ami/main/models.py
Outdated
# Get only newly added images (images without an event)
image_qs = image_qs.filter(event__isnull=True)

images = list(image_qs.order_by("timestamp"))
You likely don't have to evaluate the queryset yet with list(image_qs). You can check whether any images were found with image_qs.exists(), which is efficient for large datasets.
ami/main/models.py
Outdated
event = None
if use_existing:
    # Look for overlap or proximity
    for existing_event in existing_events:
If you are looping over a queryset, you can write for existing_event in events_qs directly, which supposedly avoids loading the whole queryset result into memory (strictly, plain iteration still caches the rows on the queryset; .iterator() is what skips the cache). Sometimes you need to convert to a list so you can index it, like events[3], but often you never need to.
email = os.environ.get("DJANGO_SUPERUSER_EMAIL", "Unknown")
password = os.environ.get("DJANGO_SUPERUSER_PASSWORD", "Unknown")
logger.info(f"Test user credentials: {email} / {password}")
password = os.environ.get("DJANGO_SUPERUSER_PASSWORD", "Unknown")
Was this intentional?
    return created


def create_captures_in_range(
This looks helpful, thanks!
ami/main/models.py
Outdated
if event:
    if use_existing:
        # Adjust times if necessary (merge)
Is this check necessary? I think you just checked whether an existing event has the exact same start & end time. Perhaps you meant to do an OR query, matching an existing event that has either the same start or the same end time as the group?
If there is an existing event with exactly the same start AND end time (for the same deployment), then I don't think we should check for use_existing. Just re-use those without question.
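The lookup order suggested in this comment could be sketched as follows. This is only an illustration on plain tuples, not the PR's code: `pick_event`, the `(start, end)` tuples standing in for events, and the 2-hour `max_gap` default are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Hypothetical stand-in for an Event's (start, end) within one deployment.
Range = Tuple[datetime, datetime]


def pick_event(
    group: Range,
    existing: List[Range],
    use_existing: bool,
    max_gap: timedelta = timedelta(hours=2),  # assumed value, for illustration only
) -> Optional[Range]:
    """Return the existing range to reuse for `group`, or None to create a new one."""
    # 1. An exact start AND end match is always reused, regardless of use_existing.
    for ev in existing:
        if ev == group:
            return ev
    # 2. Otherwise, merge into an overlapping or nearby range only when allowed.
    if use_existing:
        for ev in existing:
            if group[0] <= ev[1] + max_gap and ev[0] <= group[1] + max_gap:
                return ev
    return None
```

Checking the exact match first keeps the `use_existing` flag out of the one case where reuse is unambiguous.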
This is looking good!! One of our oldest issues :)
Will you also add a function for fixing existing events? It can be just a Django admin function in the Session list view (allow selecting multiple sessions). We need something to fix the sessions like this: https://antenna.insectai.org/projects/18/sessions/2579
One approach could be to set the images in the selected sessions so that
I still like the idea of a function that can scan all sessions in the deployment (or project) and "detect" if there are images that shouldn't be there (based on the gap setting). Then we can alert the user that it needs to be regrouped.
Will you make a follow-up ticket for making the max gap setting a Project setting? for #893
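The "detect" scan mentioned in this comment could look roughly like the sketch below. It works on a plain list of capture timestamps for one session; `needs_regrouping` is a hypothetical helper name, and the real check would presumably read timestamps from the session's source images.

```python
from datetime import datetime, timedelta
from typing import List


def needs_regrouping(timestamps: List[datetime], max_gap: timedelta) -> bool:
    """True if any two consecutive captures are further apart than max_gap."""
    ordered = sorted(timestamps)
    return any(
        later - earlier > max_gap
        for earlier, later in zip(ordered, ordered[1:])
    )
```

A session flagged this way contains at least one gap larger than the setting, so it would be split if it were regrouped.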
I pushed a change with my suggested action for removing source images from existing events. I also noticed that Occurrences have a cached field that keeps track of the event as well, so this needs to be updated in our grouping methods. There are other ways to keep occurrences in sync, but they will likely be per-occurrence or per-image update, whereas this can update more at once.
…ryset evaluations
…olnickLab/antenna into fix/group-images-into-sessions
…b/antenna into fix/group-images-into-sessions
…olnickLab/antenna into fix/group-images-into-sessions
        queryset.dissociate_related_objects()
        self.message_user(request, f"Dissociated {queryset.count()} events from captures and occurrences.")

    @admin.action(description="Fix sessions by regrouping images")
I like having a dedicated fix_sessions action. However you can use the function above (queryset.dissociate_related_objects()) to remove images and occurrences from the Event. That's what I designed it for. Something like:
queryset.dissociate_related_objects()
for deployment in deployments:
    group_images_into_events(deployment)
I think use_existing=True works in this case
…nes not assinged to an event
…mages-into-sessions
…mages-into-sessions
@mohamedelabbas1996 I just brought this up-to-date with main and rebased the migration and fixed one test. I noticed there are type errors for event.end since the end time can be None if the event is on-going. We should handle these None cases, or if that adds too much complexity, we can start requiring an end time on the model, and set it to the last capture's timestamp (and perhaps use another method to detect ongoing events, like if the end timestamp is within the time gap of the current real-world time).
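The two options raised here (tolerating a `None` end, or deriving ongoing status from the clock) might be sketched like this. Both helper names are hypothetical, and the fallback order is an assumption rather than what the model currently does.

```python
from datetime import datetime, timedelta
from typing import Optional


def effective_end(
    start: datetime,
    end: Optional[datetime],
    last_capture: Optional[datetime],
) -> datetime:
    """Resolve a usable end time: the stored end, else the last capture, else start."""
    return end or last_capture or start


def is_ongoing(end: datetime, now: datetime, max_gap: timedelta) -> bool:
    """Treat an event as ongoing if its end is within max_gap of the current time."""
    return now - end <= max_gap
```

With an always-present end time, callers comparing ranges never need a `None` branch, and "ongoing" becomes a derived property instead of a missing value.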
Can you confirm how much this will affect existing sessions when we deploy? And after we re-sync a deployment? I think it's pretty safe if this only affects existing sessions on-demand.
When new data is synced for a deployment, will a split happen automatically if an existing session is incorrect? (e.g. 12 hour session with 2 short test sessions within it). Or is it only appends, prepends and new sessions? I'm just trying to gauge how much can change when we deploy this.
Thanks for refreshing my memory! I would like to do an audit of all sessions in the live DB that are over 9 hours, then we can see what will happen to those.
Since we call group_images_into_events with use_existing=True by default during syncs, existing sessions are mostly unaffected. Only newly added images are considered, and they either get grouped into a new event or merged into an overlapping or nearby existing event. In some cases this can also cause two existing sessions to merge, if the new images bridge the gap between them, but no existing sessions are ever split. If there are no new images, the function returns immediately and makes no changes. Incorrectly grouped sessions will not be fixed automatically without calling the
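The "bridging" case described in this reply can be illustrated with a small sketch on `(start, end)` tuples. It is an assumption about the merge rule, not the PR's implementation: `within_gap` and `absorb` are hypothetical helpers, and the real function operates on Event rows.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Range = Tuple[datetime, datetime]  # hypothetical (start, end) of a session


def within_gap(a: Range, b: Range, gap: timedelta) -> bool:
    """True if the two ranges overlap or lie within `gap` of each other."""
    return a[0] <= b[1] + gap and b[0] <= a[1] + gap


def absorb(existing: List[Range], new_group: Range, gap: timedelta) -> List[Range]:
    """Merge new_group with every existing range it touches; never split a range."""
    touched = [ev for ev in existing if within_gap(ev, new_group, gap)]
    untouched = [ev for ev in existing if ev not in touched]
    merged = (
        min([new_group[0]] + [ev[0] for ev in touched]),
        max([new_group[1]] + [ev[1] for ev in touched]),
    )
    return sorted(untouched + [merged])
```

If a new group lands between two sessions and is within the gap of both, the result is a single range spanning all three; a group far from everything is simply added as a new range and the existing ones are returned unchanged.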
Pull Request Overview
This PR fixes incorrect session grouping by refactoring the image grouping logic and removing the problematic group_by field from the Event model. The changes improve how images are organized into monitoring sessions based on timestamp proximity rather than simple date-based grouping.
- Replaces date-based grouping with time-range-based merging logic using a use_existing flag
- Removes the group_by field from Event model and adds new unique constraint on deployment/start/end
- Adds admin actions for fixing sessions and managing event associations
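To make the difference between date-based and proximity-based grouping concrete, here is a minimal sketch of grouping timestamps by a maximum gap. `group_by_gap` is a hypothetical name and the 2-hour gap in the usage lines is an assumed value; the PR's own grouping code is not reproduced here.

```python
from datetime import datetime, timedelta
from typing import List


def group_by_gap(timestamps: List[datetime], max_time_gap: timedelta) -> List[List[datetime]]:
    """Split sorted timestamps into groups wherever the gap exceeds max_time_gap."""
    groups: List[List[datetime]] = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][-1] <= max_time_gap:
            groups[-1].append(ts)  # close enough to the previous capture
        else:
            groups.append([ts])  # gap too large: start a new group
    return groups


# Four captures on the same calendar date, in two bursts about 20 hours apart.
day = datetime(2024, 6, 1)
captures = [
    day.replace(hour=1),
    day.replace(hour=1, minute=10),
    day.replace(hour=21),
    day.replace(hour=21, minute=5),
]
sessions = group_by_gap(captures, timedelta(hours=2))
```

Keyed on the start date alone, all four captures would share one key and collapse into a single session; grouped by gap they form two.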
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| ami/utils/dates.py | Adds utility function for checking time range overlap and proximity |
| ami/tests/fixtures/storage.py | Adds support for custom beginning timestamp in test data generation |
| ami/tests/fixtures/main.py | Adds new test helper functions and fixes duplicate logging statements |
| ami/tasks.py | Updates regroup task to use new grouping logic without returning events |
| ami/main/tests.py | Adds comprehensive test coverage for new grouping behavior scenarios |
| ami/main/models.py | Major refactoring of Event model and grouping logic with new manager/queryset |
| ami/main/migrations/0071_remove_event_unique_event_and_more.py | Database migration removing group_by field and updating constraints |
| ami/main/admin.py | Adds new admin actions for session management |
password = os.environ.get("DJANGO_SUPERUSER_PASSWORD", "Unknown")
logger.info(f"Test user credentials: {email} / {password}")
Copilot
AI
Sep 1, 2025
These lines are duplicated from lines 468-469. Remove the duplicate logging statements.
Suggested change:
- password = os.environ.get("DJANGO_SUPERUSER_PASSWORD", "Unknown")
- logger.info(f"Test user credentials: {email} / {password}")
audit_event_lengths(deployment)

audit_event_lengths(deployment)
Copilot
AI
Sep 1, 2025
The audit_event_lengths(deployment) call is duplicated (also on line 1268). Remove one of these duplicate calls.
Suggested change:
- audit_event_lengths(deployment)
This is useful when the event is being deleted or dissociated from its captures.
It does not delete the event itself, but removes its associations with source images and occurrences.
This was created to reassociate source imag es and occurrences with a new event
Copilot
AI
Sep 1, 2025
There's an extra space in 'imag es'. It should be 'images'.
Suggested change:
- This was created to reassociate source imag es and occurrences with a new event
+ This was created to reassociate source images and occurrences with a new event
    "id",
    "path",
)
search_fields = ("id", "path", "event__start__date")
Copilot
AI
Sep 1, 2025
The search field 'event__start__date' is incorrect for Django lookups. It should be 'event__start__date' for exact date matching or use a different approach like 'event__start' for datetime searching.
Suggested change:
- search_fields = ("id", "path", "event__start__date")
+ search_fields = ("id", "path")
Hi @rhine3! We have this fix for events/sessions that are incorrectly grouped (sessions that span multiple days, or a single session that contains multiple short sessions). I am testing the fix on a recent db snapshot and seeing how it will affect existing deployments & sessions. Below is the output for one deployment in your project. It seems that most changes are about splitting an event into multiple events: one main 7hr event and then some random 1 minute test sessions where the camera was turned on briefly.
This will not be run automatically when we deploy it, but it will be run next time your deployment images are synced. Would this change throw off your existing analysis, or do you welcome it? Thanks in advance for your feedback.
Summary for a single deployment (#220 LEPS-033_Box1)
All deployments in project 84
Realized I replied to you directly but never posted it here - yes, I HUGELY welcome this change! I do want to exclude the 1min camera turn-on events from my analysis.



Summary
This PR fixes the issue of incorrectly grouped sessions (events) by refactoring and improving the image grouping logic when syncing a deployment.
List of Changes
- Refactored group_images_into_events to support a new use_existing flag:
  - If use_existing=True:
  - If use_existing=False:
- Removed the group_by field from the Event model.
- Added two admin actions:
Related Issues
Closes #237
Detailed Description
Previously, session grouping relied on the group_by field, which reused an existing event if a group had the same start date. This caused issues when images taken on the same day, but far apart in time, were incorrectly grouped into a single session, even though timestamp-based grouping split them into multiple groups. Since all groups shared the same start date, they got assigned to the same event due to group_by.
This PR fixes this issue by improving the group_images_into_events function. It introduces a use_existing flag to control the behavior: when use_existing=False, all deployment images are regrouped; when True, only new images (those not yet assigned to an event) are processed. Images are grouped based on their timestamps using a max_time_gap threshold, and then each group is either merged into an existing event (if overlapping or close enough and use_existing=True) or assigned to a new event (or an existing one if its start and end time exactly match the group). The group_by field is removed from the Event model, and an admin action is added to help fix incorrectly grouped sessions. Additionally, cached fields in event related models (e.g. Occurrence) are updated accordingly.
Screenshots
N/A
Deployment Notes
N/A
Checklist