Add model latency endpoint #4599
Conversation
Docker Image Sizes
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Pull Request Overview
This PR adds model latency tracking capabilities by implementing a MetricsCollector that uses shared memory for process-safe storage of inference latency measurements, along with a new API endpoint to retrieve pipeline metrics.
- Implements a MetricsCollector using shared memory to track model inference latencies across processes
- Adds pipeline metrics API endpoint that returns latency statistics (avg, min, max, p95, latest) over configurable time windows
- Integrates latency collection into the inference workflow by recording start/end times and storing measurements
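The actual implementation lives in backend/app/services/metrics_collector.py; the sketch below only illustrates the general idea of a process-safe, shared-memory circular buffer. The class name matches the PR, but the memory layout, struct formats, and method names (`record`, `measurements`) are assumptions for illustration, and locking and cleanup concerns are omitted.

```python
import struct
import time
from multiprocessing import shared_memory


class MetricsCollector:
    """Ring buffer of (timestamp, latency) pairs kept in shared memory.

    Hypothetical layout: a 4-byte write index followed by MAX_ENTRIES
    fixed-size records, so any process can attach to the segment by name.
    """

    MAX_ENTRIES = 1024
    RECORD = struct.Struct("dd")  # (unix timestamp, latency in seconds)
    HEADER = struct.Struct("I")   # next write position

    def __init__(self, name: str = "model_latency_metrics") -> None:
        size = self.HEADER.size + self.MAX_ENTRIES * self.RECORD.size
        try:
            self._shm = shared_memory.SharedMemory(name=name, create=True, size=size)
            self.HEADER.pack_into(self._shm.buf, 0, 0)  # initialise write index
        except FileExistsError:
            # Another process already created the segment; attach to it.
            self._shm = shared_memory.SharedMemory(name=name)

    def record(self, latency_s: float) -> None:
        """Store one measurement, overwriting the oldest entry when full."""
        (index,) = self.HEADER.unpack_from(self._shm.buf, 0)
        offset = self.HEADER.size + (index % self.MAX_ENTRIES) * self.RECORD.size
        self.RECORD.pack_into(self._shm.buf, offset, time.time(), latency_s)
        self.HEADER.pack_into(self._shm.buf, 0, index + 1)

    def measurements(self, window_s: float) -> list[float]:
        """Return latencies recorded within the last window_s seconds."""
        cutoff = time.time() - window_s
        result = []
        for i in range(self.MAX_ENTRIES):
            offset = self.HEADER.size + i * self.RECORD.size
            ts, latency = self.RECORD.unpack_from(self._shm.buf, offset)
            if ts >= cutoff:  # unwritten slots read as ts == 0.0 and are skipped
                result.append(latency)
        return result
```

On the inference side, the integration presumably amounts to timestamping before and after the model call and passing the elapsed time to `record()`.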
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backend/app/services/metrics_collector.py | Core MetricsCollector implementation with shared memory and circular buffer |
| backend/app/workers/inference.py | Integration of latency measurement recording in inference workflow |
| backend/app/services/pipeline_service.py | Pipeline metrics calculation and percentile computation logic |
| backend/app/api/endpoints/pipelines.py | New GET endpoint for retrieving pipeline metrics with validation |
| backend/app/schemas/metrics.py | Pydantic models for metrics API response structure |
| backend/app/services/model_service.py | Enhanced LoadedModel to include model ID for metrics tracking |
| backend/app/schemas/model_activation.py | Added active_model_id field to ModelActivationState |
| backend/tests/unit/services/test_metrics_collector.py | Comprehensive unit tests for MetricsCollector functionality |
| backend/tests/unit/services/test_pipeline_service.py | Unit tests for pipeline metrics calculation |
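As a rough illustration of the statistics the endpoint returns (avg, min, max, p95, latest), an aggregation over the measurements inside the requested time window could look like the sketch below. The function name, dictionary keys, and the nearest-rank percentile method are assumptions, not necessarily what pipeline_service.py implements.

```python
from typing import Optional


def compute_latency_stats(latencies: list[float]) -> Optional[dict[str, float]]:
    """Aggregate raw latency measurements into summary statistics."""
    if not latencies:
        return None  # no measurements in the requested window
    ordered = sorted(latencies)
    # Nearest-rank p95: the value below which roughly 95% of measurements fall.
    p95_index = max(0, round(0.95 * len(ordered)) - 1)
    return {
        "avg": sum(ordered) / len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "p95": ordered[p95_index],
        "latest": latencies[-1],  # most recently recorded measurement
    }
```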
…nsions into aurelien/4537-model-latency-endpoint
itallix left a comment:
Great progress! I've added a few suggestions for further improvement.
…nsions into aurelien/4537-model-latency-endpoint # Conflicts: # backend/app/workers/inference.py
itallix left a comment:
LGTM!
Summary
Added a MetricsCollector which collects and retrieves the model's inference latencies. It uses shared memory so that access to the recorded measurements is process-safe.
The latencies are stored in memory in a circular buffer, and only the most recent 1024 entries are kept.
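For reference, a hypothetical shape for the response models in backend/app/schemas/metrics.py, inferred only from the statistics listed above; the actual class and field names in the PR may differ.

```python
from typing import Optional

from pydantic import BaseModel


class LatencyStats(BaseModel):
    """Hypothetical latency block returned by the metrics endpoint."""
    avg: float
    min: float
    max: float
    p95: float
    latest: float


class PipelineMetrics(BaseModel):
    """Hypothetical top-level response for GET /api/pipelines/{id}/metrics."""
    pipeline_id: str
    window_seconds: int
    inference_latency: Optional[LatencyStats] = None  # None when no measurements yet
```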
How to test
http://geti-tune.localhost/api/pipelines/ace3f1da-fdd9-4048-a95e-a647ed969442/metrics
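A quick way to exercise the endpoint from Python, assuming the `requests` package is available and the stack is running locally at the URL above:

```python
import requests

# Pipeline ID taken from the test URL above.
response = requests.get(
    "http://geti-tune.localhost/api/pipelines/ace3f1da-fdd9-4048-a95e-a647ed969442/metrics"
)
print(response.json())
```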
Checklist
License
Feel free to contact the maintainers if that's a concern.