add analytics module, historical data logging, and session statistics#29
add analytics module, historical data logging, and session statistics#29arpit-k16 wants to merge 3 commits intoprattyan:mainfrom
Conversation
…istics - Add analytics.py: summary stats, congestion metrics, session summary, CSV export - Log traffic snapshots to data/traffic_session_*.csv every 5s for analysts - Optional POST to Django API (TRAFFIC_API_URL) when backend is running - Add Session Statistics card to dashboard (mean, min, max, std, congestion) - Django: TrafficSnapshot model, CRUD + GET .../summary/?hours=24 - README: document data/analytics, fix app.py -> main.py, project structure - requirements.txt: clean minimal deps; contributing.md: main.py and analytics Co-authored-by: Cursor <cursoragent@cursor.com>
📝 WalkthroughWalkthroughAdds an analytics module and session CSV logging to the Dash app, optional posting to a new Django API, and extends the Django API with a TrafficSnapshot model, serializer, viewset (with summary action), admin registrations, and a migration. Changes
Sequence DiagramsequenceDiagram
participant Dashboard as Dashboard/Main App
participant Analytics as Analytics Module
participant CSV as CSV File System
participant API as Django API
Dashboard->>Analytics: compute_summary_stats(traffic_history)
Analytics-->>Dashboard: stats
Dashboard->>Analytics: compute_congestion_stats(traffic_history)
Analytics-->>Dashboard: congestion metrics
Dashboard->>Analytics: session_summary(traffic_history, vehicle_types, start_time)
Analytics-->>Dashboard: session summary
Dashboard->>CSV: append snapshot row to /data/traffic_session_YYYYMMDD.csv
CSV-->>Dashboard: file path / error
Dashboard->>API: POST /traffic-snapshots/ (optional snapshot payload)
API-->>Dashboard: 200 or error (ignored)
Dashboard->>Dashboard: render session-stats UI
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@analytics.py`:
- Line 94: Replace the dynamic import call in the elapsed time calculation by
adding a top-level "import time" with the other stdlib imports and changing the
line that sets elapsed_sec (which currently uses __import__("time").time()) to
use time.time() instead (i.e., elapsed_sec = max(0, time.time() - start_time));
update references to the time module only where needed in analytics.py.
In `@main.py`:
- Around line 73-78: Replace the inline dynamic import with a top-level import:
add "import csv" at the top of main.py and change the inline
__import__("csv").DictWriter usage to csv.DictWriter where the writer is created
(the line assigning w using DictWriter for _session_csv_path and row). Ensure
existing behavior (writeheader when file not present and w.writerow(row))
remains unchanged.
- Around line 82-96: The current unconditional POST to TRAFFIC_API_URL blocks
the callback when the API is down; change this to an opt-in flow and log
failures: before building payload/Request, check an explicit opt-in like
os.environ.get("TRAFFIC_API_ENABLED") == "1" (or ensure TRAFFIC_API_URL is
explicitly set and not the default localhost) and skip the HTTP block if not
enabled; preserve payload and req creation only when enabled, and replace the
bare except with a logging call (e.g., logger.exception or logger.error with
exception info) instead of silently passing, referencing TRAFFIC_API_URL,
TRAFFIC_API_ENABLED, payload, urllib.request.Request and urllib.request.urlopen
to locate the code to modify.
In `@README.md`:
- Around line 98-112: The fenced directory-tree code block in README.md is
missing a language specifier (MD040); update the opening fence for the block
that contains the traffic-manage tree to include a language identifier such as
"text" (i.e., change the opening ``` to ```text) so markdownlint stops flagging
it and the block renders with explicit language metadata.
In `@traffic_manage/api/migrations/0002_trafficsnapshot.py`:
- Line 17: Update the migration's timestamp field to match the model's behavior
by replacing auto_now_add=True with default=timezone.now (and keep
db_index=True) so client-supplied timestamps from the Dash app (main.py) are not
ignored; also add "from django.utils import timezone" to the migration imports
and ensure the field remains editable per the model (i.e., do not set
editable=False).
In `@traffic_manage/api/views.py`:
- Around line 37-40: TrafficSnapshotViewSet currently inherits from ModelViewSet
which exposes full CRUD; change it to use ReadOnlyModelViewSet and add an
explicit create mixin (e.g., mixins.CreateModelMixin) if only creation is
needed, or keep ReadOnlyModelViewSet and add appropriate permission_classes
(e.g., IsAuthenticated or a custom permission) to restrict write operations;
update the class definition for TrafficSnapshotViewSet and ensure any allowed
write methods (create) are only available when the permission_classes permit
them.
- Around line 49-53: The code calls request.query_params.get("hours", type=int)
which is invalid in Django; change it to fetch the raw value via
request.query_params.get("hours"), then safely convert to int (e.g., try/except
ValueError) and only apply the timedelta/filter when the parsed integer is > 0;
update the block that assigns hours, computes since with timezone.now() and
timedelta, and applies qs = qs.filter(timestamp__gte=since) accordingly so
malformed or absent values are ignored instead of raising TypeError.
🧹 Nitpick comments (6)
analytics.py (1)
105-126:export_session_csvderives fieldnames from only the first row.If rows have inconsistent keys (e.g., a later dict has an extra key),
DictWriterwill silently drop those values (defaultextrasaction='raise'would actually raise — but only for extra keys; missing keys produce empty cells). Consider collecting the union of all keys or documenting the assumption that all dicts share the same schema.main.py (2)
86-87: Redundant timestamp manipulation.Line 86 filters out
"timestamp"fromrow, then line 87 immediately adds it back. The payload ends up identical torow. Either simplify topayload = dict(row)or remove these two lines.♻️ Proposed fix
- payload = {k: v for k, v in row.items() if k != "timestamp"} - payload["timestamp"] = row["timestamp"] + payload = dict(row)
43-45: Remove the unusedtraffic_snapshotsvariable.The list is allocated on line 44 but is never appended to or read anywhere in the codebase. Remove it unless it's intended for future use.
traffic_manage/api/models.py (1)
27-42: Consider adding a__str__method and validatingcongestion_pctrange.Two minor suggestions:
- All other models lack
__str__too, but sinceTrafficSnapshotwill appear in Django admin (per the admin registration), a__str__would improve usability.congestion_pcthas no constraint preventing values outside 0–100, which could lead to misleading analytics.Proposed improvements
decision = models.CharField(max_length=64, blank=True) + def __str__(self): + return f"Snapshot {self.timestamp:%Y-%m-%d %H:%M:%S} ({self.vehicle_count} vehicles)" + class Meta: ordering = ["-timestamp"]For validation, consider using
MinValueValidator/MaxValueValidatoror aPositiveIntegerFieldwith a check constraint oncongestion_pct.traffic_manage/api/views.py (2)
48-48: Useself.get_queryset()instead ofTrafficSnapshot.objects.all()to respect any future filtering or overrides.Suggested change
- qs = TrafficSnapshot.objects.all() + qs = self.get_queryset()
62-68: Rounding logic is repetitive — consider a helper or dict comprehension.Minor nit; the three nearly identical
if/roundblocks could be condensed.Example
- if agg.get("mean_vehicle_count") is not None: - agg["mean_vehicle_count"] = round(agg["mean_vehicle_count"], 2) - if agg.get("std_vehicle_count") is not None: - agg["std_vehicle_count"] = round(agg["std_vehicle_count"], 2) - if agg.get("mean_congestion") is not None: - agg["mean_congestion"] = round(agg["mean_congestion"], 2) + for key in ("mean_vehicle_count", "std_vehicle_count", "mean_congestion"): + if agg.get(key) is not None: + agg[key] = round(agg[key], 2)
Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@main.py`:
- Around line 759-761: The call to session_summary is using the live shared
deque traffic_history instead of the thread-local snapshot history_list, causing
a race; change the call at the ANALYTICS_AVAILABLE block to pass the
already-snapshotted history_list (not traffic_history) into session_summary.
Locate the call to session_summary and replace the argument traffic_history with
history_list, and if session_summary's parameter type or processing assumes a
deque, adapt session_summary (function definition) to accept the snapshot format
of history_list so no concurrent access to the live deque occurs.
🧹 Nitpick comments (4)
analytics.py (2)
75-77: Magic number15in congestion ratio denominator.The hard-coded
15on line 77 silently caps the congestion ratio for low-traffic scenarios. Consider extracting it as a parameter (likethreshold_high) or a module-level constant with a comment explaining the rationale, so future maintainers understand the cap.
122-126: Fieldnames derived solely from the first row may cause issues with heterogeneous dicts.If any subsequent row contains keys not present in
rows[0],DictWriterwill raiseValueErrorby default. If rows could have varying keys, consider collecting the union of all keys:♻️ Suggested hardening
- keys = list(rows[0].keys()) + keys = list(dict.fromkeys(k for row in rows for k in row))main.py (2)
753-756: Side-effectful I/O inside a Dash callback.
_log_snapshot_to_csvperforms file writes and potentially a blocking HTTP POST (up to 1 s) on the Dash callback thread, which delays the UI response. Consider offloading this to a background thread or adcc.Interval-driven non-blocking mechanism to keep the dashboard responsive.
44-46:traffic_snapshotslist is declared but never used.This list is initialized on line 45 but is never appended to or read anywhere in this file. If it's intended scaffolding for future use, consider adding a
# TODOcomment; otherwise remove it to avoid confusion.
…docs Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@traffic_manage/api/views.py`:
- Around line 37-43: TrafficSnapshotViewSet is read-only but main.py posts to
/api/traffic-snapshots/, causing 405s; update the view to accept creates by
adding CreateModelMixin (e.g., inherit from mixins.CreateModelMixin plus
viewsets.ReadOnlyModelViewSet’s base or switch to viewsets.ModelViewSet) so POST
requests from main.py succeed—modify the TrafficSnapshotViewSet class
declaration to include CreateModelMixin (or use ModelViewSet) and ensure
serializer_class and queryset remain unchanged.
🧹 Nitpick comments (5)
traffic_manage/api/views.py (1)
45-76: Summary action looks correct after prior fixes.The
hoursquery param parsing withtry/exceptis now safe. Aggregation and rounding logic handlesNonevalues properly (e.g., empty querysets or SQLiteStdDevreturningNone).One minor note: the
from datetime import timedeltaon line 58 is an inline import — consider moving it to the top-level imports for consistency.main.py (4)
44-46:traffic_snapshotslist is declared but never used.The
traffic_snapshotslist on line 45 is populated nowhere in the visible code. If it was intended for in-memory snapshot accumulation, it's dead code currently. Either wire it up or remove it to avoid confusion.
49-60:_snapshot_rowreads the globalvehicle_typesdirectly — consider passing it as a parameter.This function implicitly depends on the global
vehicle_typesdict, which is written by the video thread. Making it an explicit parameter improves testability and makes the data flow clearer.
86-97: No URL scheme validation onTRAFFIC_API_URL.
urllib.request.urlopen(line 97) will follow any scheme (file:,ftp:, etc.) provided via the environment variable. While this is admin-controlled, a simple scheme check would harden it.🛡️ Proposed fix
api_url = os.environ.get("TRAFFIC_API_URL") - if api_url: + if api_url and api_url.startswith(("http://", "https://")):
757-759: Side-effect (CSV I/O + network POST) inside a Dash callback.
_log_snapshot_to_csvperforms file I/O and potentially a blocking HTTP POST (0.5s timeout) on every 5th callback invocation. This runs on the Dash server's callback thread and can delay dashboard updates. Consider offloading this to a background thread or queue for better responsiveness.
Changes
data/traffic_session_*.csv; optional POST to Django API whenTRAFFIC_API_URLis setTrafficSnapshotmodel, CRUD +GET /api/traffic-snapshots/summary/?hours=24for aggregated analyticsSummary by CodeRabbit
New Features
Documentation