
Release: develop → main integration #79

Merged
lsh1215 merged 100 commits into main from develop
Feb 8, 2026


lsh1215 (Member) commented Feb 8, 2026

Summary

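  • feat: Support FCM topic notifications for the dashboard
  • refactor: Restructure the backend MSA layout and clean up CI workflows
  • fix: Switch the monitoring stack from DataDog/Terraform to open source (OpenTelemetry, Prometheus, Grafana, Loki)
  • ci: Add Artifact Registry push and deploy-trigger workflows
  • docs: Update the deployment guide and monitoring docs
  • test: Add unit tests and load test scripts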

Changes

  • 139 files changed, 10213 insertions(+), 596 deletions(-)
  • 100 commits from develop branch

Test plan

  • Confirm CI lint/test pass
  • GCE manual deployment verified (6 instances)

lsh1215 and others added 29 commits January 27, 2026 11:34
Apply black/isort auto-formatting and fix remaining issues:
- Remove unused imports (F401) across views, tasks, and tests
- Remove unused variables (F841) in test files
- Break long line (E501) in test_tasks.py
- Add noqa for intentional try/except availability checks
- Add setup.cfg with flake8 per-file-ignores for Django settings
- Simplify CI flake8 command to use setup.cfg config

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Separate ci.yml into 3 independent workflow files that run
simultaneously instead of sequentially:
- lint.yml: flake8, black, isort
- test.yml: pytest + MySQL
- docker-build.yml: main/ocr/alert image builds

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add initial migration files for vehicles, detections, notifications apps
  (tables were never created because migrations were missing)
- Fix test fixtures and assertions to use BigIntegerField ID references
  (vehicle_id, detection_id) instead of ForeignKey-style object assignment
- Add databases="__all__" to all django_db markers for multi-database access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…CM dedup

- Remove unused DetectionCreateSerializer and NotificationCreateSerializer
- Add explicit .using("detections_db") in MQTT subscriber ingestion point
- Remove dangerous GRANT ALL ON *.* super admin privilege from init.sql
- Replace inline Firebase initialization in notification_tasks with core/firebase/fcm module
… cleanup

- Use config_from_object in celery.py to read settings from Django (removes duplication)
- Fix CELERYD_* to CELERY_WORKER_* for namespace compatibility
- Enforce strict MSA boundary in DB router allow_relation (return False for cross-DB)
- Replace inline GCS code in ocr_tasks with core.gcs.client.download_image
- Add explicit using="detections_db" to all Detection .save() calls in ocr_tasks
- DEBUG=False, ALLOWED_HOSTS from env (no wildcard)
- Add 4-database MSA structure matching dev.py
- Add DATABASE_ROUTERS for MSA DB routing
- CORS whitelist from env (no CORS_ORIGIN_ALLOW_ALL)
- All credentials from environment variables (no defaults)
- Cache EasyOCR Reader as worker-level singleton (avoid model reload per task)
- Fix OCR retry: keep status=processing during retries, set failed only on final retry
- Add blocking/non-blocking mode to MQTT subscriber (daemon thread support)
…ata safety

- Simplify VehicleViewSet to standard DRF serializer.save() pattern (I-1)
- Add detected_at datetime string parsing validation in MQTT subscriber (I-5)
- Enhance health check with DB and RabbitMQ connectivity verification (I-6)
- Remove sensitive fcm_token from NotificationSerializer response (I-8)
- Implement DLQ consumer task for dead-lettered message handling (I-9)
- Remove unused `import json` in dlq_tasks.py (F401)
- Add missing newline at end of notifications/serializers.py (W292)
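- Add subscribe_to_topic() and send_to_topic() methods to FCMClient
- In register_fcm, automatically subscribe DASHBOARD tokens to the dashboard_alerts topic
- Support a test mode via the FCM_MOCK environment variable
- ocr_tasks: Move the send_notification call outside the vehicle-matching condition
- notification_tasks: Add a broadcast to the dashboard_alerts topic
- Restructure notifications into a dual path: per-vehicle push plus topic broadcast
- Store fcm_token="topic:dashboard_alerts" in the topic notification history
- Prevent duplicate topic notification sends (idempotency check on autoretry)
- Strengthen null safety (fall back to defaults when ocr_result or location is missing)
- Switch to a retry strategy on Detection.DoesNotExist (handles timing issues)
- Protect the OCR completed status when apply_async fails (try/except wrapping)
- Improve error logging (remove silent pass)
- Prevent the CORS empty-string fallback and remove duplicate settings (prod.py)
- Remove duplicate DB indexes (Detection, Notification, Vehicle)
- Remove duplicate imports in the statistics() method and apply the period_map pattern
- Remove unnecessary __future__ imports (Python 3 only)
- Add missing DLQ task exports (tasks/__init__.py)
- Add comments about QuerySet.update() bypassing auto_now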
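The idempotency check for topic broadcasts can be illustrated with a small sketch. It is not the project's task code: `send_topic_alert_once`, the in-memory `history` set (standing in for the notification table), and the `send_to_topic` callable (standing in for the FCM client method, whose real signature is not shown in this PR) are all assumptions. Only the marker value `topic:dashboard_alerts` comes from the commit message.

```python
# Value stored in fcm_token for topic history rows, per the commit message.
TOPIC_MARKER = "topic:dashboard_alerts"


def send_topic_alert_once(detection_id, history, send_to_topic):
    """Send the dashboard topic broadcast at most once per detection.

    history: set of (detection_id, fcm_token) pairs; a hypothetical
             in-memory substitute for the notification history table.
    send_to_topic: callable taking (topic, payload); a stand-in for the
                   real FCM client call.
    Returns True if a message was sent, False if it was skipped.
    """
    key = (detection_id, TOPIC_MARKER)
    if key in history:
        # A previous attempt already delivered the broadcast, so a retry
        # of the surrounding task must not send it again.
        return False
    send_to_topic("dashboard_alerts", {"detection_id": str(detection_id)})
    # Record only after a successful send so a failed send can be retried.
    history.add(key)
    return True
```

Recording the history row after the send means a crash between the two steps can still produce one duplicate; a unique constraint on the pair would be one way to tighten that, if the schema allows it.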
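Cleanup to remove DataDog monitoring and Terraform IaC ahead of the switch to an open-source
monitoring stack. Delete the datadog-agent service block and related env files, and
replace the Terraform entries in .gitignore with monitoring volume entries.
Remove ddtrace and switch to auto-instrumentation with opentelemetry-instrument.
Add the django-prometheus middleware and the /metrics endpoint.
Insert trace_id/span_id into the log format to support Loki-Jaeger linking.
Add a monitoring stack made up of the OTel Collector, Jaeger, Prometheus, Grafana, Loki, Promtail, cAdvisor,
mysqld-exporter, celery-exporter, and k6 services.
Integrate the RabbitMQ prometheus plugin, auto-provision Grafana datasources,
and support jumping from a trace_id in Loki to the matching Jaeger trace.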
…stack

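Add new MONITORING.md: covers the architecture, a collection of PromQL queries, OTel instrumentation details,
log-trace correlation, and a GCP multi-instance deployment guide.
In PERFORMANCE_TEST.md, replace DataDog/InfluxDB references with Prometheus/Grafana,
and update k6 to send its results via Prometheus remote write.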
Add defaults for otelTraceID/otelSpanID in log formatter to prevent
KeyError in services not wrapped by opentelemetry-instrument (e.g. Flower).
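One way to supply such defaults is a logging filter that fills in the attributes when they are absent. The sketch below is an assumption about the approach, not the project's formatter: `OTelDefaultsFilter`, `LOG_FORMAT`, `build_handler`, `format_line`, and the placeholder value `"0"` are all hypothetical. Only the attribute names `otelTraceID` and `otelSpanID` come from the commit message.

```python
import io
import logging

LOG_FORMAT = (
    "%(levelname)s %(name)s "
    "trace_id=%(otelTraceID)s span_id=%(otelSpanID)s %(message)s"
)


class OTelDefaultsFilter(logging.Filter):
    """Ensure otelTraceID/otelSpanID exist on every record.

    Without these attributes, the %-style format above raises KeyError
    in processes that are not started via opentelemetry-instrument.
    """

    def filter(self, record):
        for attr in ("otelTraceID", "otelSpanID"):
            if not hasattr(record, attr):
                setattr(record, attr, "0")  # assumed placeholder value
        return True  # never drop the record


def build_handler(stream=None):
    """Create a stream handler with the defaults filter and format."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(OTelDefaultsFilter())
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    return handler


def format_line(message, **extra):
    """Log one message through the handler and return the rendered line."""
    stream = io.StringIO()
    logger = logging.getLogger("otel_defaults_demo")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.addHandler(build_handler(stream))
    logger.info(message, extra=extra)
    return stream.getvalue().strip()
```

Attaching the filter to the handler rather than to a single logger means records from any logger that reaches the handler get the defaults before formatting.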
- Add k6 HTTP load test (load-test.js) for REST API endpoints
- Add MQTT load test (mqtt-load-test.py) for real IoT pipeline simulation
- Fix Loki 429 errors by increasing stream/ingestion limits
- Delete obsolete PERFORMANCE_TEST.md (referenced non-existent files)
- Delete unused rabbitmq.conf and enabled_plugins (plugins enabled via command)
- Fix docker-compose.monitoring.yml:
  - Remove dead MQTT env vars from k6 service
  - Fix celery-exporter: remove wrong depends_on, add restart policy
  - Add restart policy to mysqld-exporter for cross-file dependency
  - Move cAdvisor to linux profile (macOS incompatible)
- Add comprehensive PERFORMANCE_TEST_GUIDE.md with actual working procedures
- Fix OTEL_TRACES_SAMPLER value (parentbased_tracealways → parentbased_always_on)
- Fix k6 script paths in MONITORING.md (/scripts/tests/*.js → /scripts/load-test.js)
- Fix container names in PERFORMANCE_TEST_GUIDE.md (speedcam-ocr-worker → speedcam-ocr)
- Remove non-existent python Docker service reference
- Remove stale Make requirement from DEPLOYMENT.md
- Add uid to Jaeger datasource for Loki→Jaeger trace linking
- Fix promtail regex to match uppercase hex trace IDs
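The uppercase-hex fix in the last item can be illustrated in Python. The real promtail expression is not shown in this PR, so the `trace_id=<32 hex chars>` log layout, both patterns, and `extract_trace_id` are assumptions made for the example. Promtail uses Go's RE2 syntax, which accepts the same character classes and `(?P<name>...)` groups used here.

```python
import re

# Assumed log layout: "trace_id=<32 hex characters>".
LOWER_ONLY = re.compile(r"trace_id=(?P<trace_id>[0-9a-f]{32})\b")
ANY_CASE = re.compile(r"trace_id=(?P<trace_id>[0-9a-fA-F]{32})\b")


def extract_trace_id(line, pattern=ANY_CASE):
    """Return the trace ID captured from a log line, or None if absent."""
    match = pattern.search(line)
    return match.group("trace_id") if match else None


# Example line with an uppercase trace ID (value is illustrative only).
SAMPLE = "INFO trace_id=4BF92F3577B34DA6A3CE929D0E0E4736 span_id=00F067AA0BA902B7 ok"
```

A lowercase-only class silently yields no label for lines like `SAMPLE`, which breaks the Loki to Jaeger link without producing any error.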
@lsh1215 lsh1215 merged commit d780306 into main Feb 8, 2026
5 checks passed

2 participants