@itallix (Contributor) commented Jan 30, 2026

Summary

  • Implemented staged datasets endpoints
  • Service works only with zipped archives
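
As an illustration of the zip-only constraint above, here is a minimal sketch of how an upload could be checked before staging. The helper name `is_zip_archive` is hypothetical and not taken from this PR; the actual service may validate archives differently.

```python
import io
import zipfile


def is_zip_archive(data: bytes) -> bool:
    """Return True if the uploaded bytes look like a ZIP archive.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    # zipfile.is_zipfile accepts a file-like object as well as a path.
    return zipfile.is_zipfile(io.BytesIO(data))
```

An endpoint built on a check like this could reject non-zip uploads early, before any file is written to the staging directory.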

Checklist

  • The PR title and description are clear and descriptive
  • I have manually tested the changes
  • All changes are covered by automated tests
  • All related issues are linked to this PR (if applicable)
  • Documentation has been updated (if applicable)

Copilot AI review requested due to automatic review settings January 30, 2026 12:58
@itallix itallix requested review from a team, jpggvilaca and leoll2 as code owners January 30, 2026 12:58
@itallix itallix linked an issue Jan 30, 2026 that may be closed by this pull request
@github-actions github-actions bot added the TEST, DOC, and Geti Tune Backend labels Jan 30, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR implements a complete set of REST API endpoints for managing staged datasets, enabling users to upload, list, retrieve, download, and delete dataset archives in a staging area.

Changes:

  • Added new API endpoints for staged dataset operations (upload, list, get, download, delete)
  • Implemented StagedDatasetService for dataset file management and metadata inference
  • Added comprehensive unit and integration test coverage for the new endpoints and service

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `application/docs/api.md` | Added API documentation for staged datasets endpoints |
| `application/backend/tests/unit/routers/test_dataset_ie.py` | Added unit tests for staged dataset API endpoints |
| `application/backend/tests/unit/routers/conftest.py` | Added type hint to test client fixture |
| `application/backend/tests/integration/services/test_staged_dataset_service.py` | Added integration tests for staged dataset service |
| `application/backend/app/settings.py` | Added staged_datasets_dir configuration setting |
| `application/backend/app/services/staged_dataset_service.py` | Implemented service for managing staged dataset files and metadata |
| `application/backend/app/services/__init__.py` | Exported StagedDatasetService from services module |
| `application/backend/app/models/dataset.py` | Added DatasetFormat enum and StagedDataset model |
| `application/backend/app/models/__init__.py` | Exported dataset models from models module |
| `application/backend/app/api/schemas/dataset.py` | Added model validator to populate StagedDatasetView from StagedDataset |
| `application/backend/app/api/routers/dataset_ie.py` | Implemented REST API endpoints for staged dataset operations |
| `application/backend/app/api/io_utils.py` | Added file_iterator utility for streaming file responses |
| `application/backend/app/api/dependencies.py` | Added dependency injection for StagedDatasetService |
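
The summary describes `file_iterator` only as a utility for streaming file responses. A minimal sketch of that pattern follows, assuming a generator that reads the file in fixed-size chunks; the signature and the 64 KiB default are assumptions, not taken from the PR.

```python
from collections.abc import Iterator
from pathlib import Path


def file_iterator(path: Path, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file's contents in chunks of at most chunk_size bytes.

    Illustrative sketch only; the PR's actual signature may differ.
    """
    with path.open("rb") as f:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := f.read(chunk_size):
            yield chunk
```

A generator like this keeps memory use bounded for large archives and can typically be handed to a framework's streaming response type (for example Starlette's `StreamingResponse`, if the backend uses it).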

github-actions bot commented Jan 30, 2026

📊 Test coverage report

| Metric | Coverage |
| --- | --- |
| Lines | 34.6% |
| Functions | 73.8% |
| Branches | 88.3% |
| Statements | 34.6% |

github-actions bot commented Jan 30, 2026

Docker Image Sizes

CPU

| Image | Size |
| --- | --- |
| geti-tune-cpu:pr-5370 | 2.97G |
| geti-tune-cpu:sha-bd985e4 | 2.97G |

GPU

| Image | Size |
| --- | --- |
| geti-tune-gpu:pr-5370 | 10.95G |
| geti-tune-gpu:sha-bd985e4 | 10.95G |

XPU

| Image | Size |
| --- | --- |
| geti-tune-xpu:pr-5370 | 9.76G |
| geti-tune-xpu:sha-bd985e4 | 9.76G |

@leoll2 (Contributor) left a comment

A few small comments, LGTM overall

@itallix itallix requested a review from leoll2 February 9, 2026 14:50
@itallix itallix added this pull request to the merge queue Feb 9, 2026
@itallix itallix removed this pull request from the merge queue due to a manual request Feb 10, 2026
@itallix itallix added this pull request to the merge queue Feb 10, 2026
Merged via the queue into develop with commit b21bdaa Feb 10, 2026
32 checks passed
@itallix itallix deleted the vitalii/5267-staged-datasets branch February 10, 2026 09:02
Labels

DOC (Improvements or additions to documentation), Geti Tune Backend (Issues related to Geti Tune backend), TEST (Any changes in tests)

Development

Successfully merging this pull request may close these issues.

Implement endpoints to manage staged datasets

2 participants