@deanq deanq commented Jan 14, 2026

Prerequisite: runpod/tetra-rp#140

Summary

Implements dual-mode runtime support enabling worker-tetra Docker images to serve both Live Serverless and Flash Deployed Apps through a unified handler architecture.

Please see docs/Runtime_Execution_Paths.md

Implementation

Deployment Mode Detection

  • Flash Detection: Checks for FLASH_IS_MOTHERSHIP, FLASH_MOTHERSHIP_ID, or FLASH_RESOURCE_NAME environment variables
  • Live Serverless: Has RUNPOD_* vars but NO FLASH_* vars → skips artifact unpacking
  • Local Development: No RUNPOD_* vars → skips artifact unpacking
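The three-way detection above can be sketched as a small classifier. This is a minimal illustration, not the code in src/unpack_volume.py: the function names and mode labels are hypothetical, and it assumes that the mere presence of a FLASH_* variable (whatever its value) marks a Flash deployment.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the PR description.
FLASH_ENV_VARS = ("FLASH_IS_MOTHERSHIP", "FLASH_MOTHERSHIP_ID", "FLASH_RESOURCE_NAME")


def detect_deployment_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Classify the runtime as 'local', 'live_serverless', or 'flash' (hypothetical labels)."""
    env = os.environ if env is None else env
    has_runpod = any(key.startswith("RUNPOD_") for key in env)
    # Assumption: presence alone counts, regardless of the variable's value.
    has_flash = any(name in env for name in FLASH_ENV_VARS)
    if not has_runpod:
        return "local"  # no RUNPOD_* vars: skip unpacking
    return "flash" if has_flash else "live_serverless"


def should_unpack(env: Optional[Mapping[str, str]] = None) -> bool:
    """Only Flash deployments unpack artifacts."""
    return detect_deployment_mode(env) == "flash"
```

Passing the environment as a mapping keeps the check easy to test without mutating os.environ.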

Handler Integration

  • Both handler.py and lb_handler.py call maybe_unpack() at module-level initialization
  • Thread-safe extraction with double-checked locking pattern
  • Idempotent: multiple calls only extract once
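For readers unfamiliar with double-checked locking, the sketch below shows the general pattern behind a thread-safe, idempotent unpack step. The class name and the extraction callable are hypothetical stand-ins; the real maybe_unpack() may be structured differently.

```python
import threading
from typing import Callable


class OnceUnpacker:
    """Runs an extraction callable at most once, even under concurrent calls."""

    def __init__(self, extract: Callable[[], None]) -> None:
        self._extract = extract
        self._lock = threading.Lock()
        self._done = False

    def maybe_unpack(self) -> bool:
        """Return True only for the call that actually performed the extraction."""
        if self._done:  # first check: cheap, taken without the lock
            return False
        with self._lock:
            if self._done:  # second check: another thread may have finished first
                return False
            self._extract()
            self._done = True
            return True


# Eight threads race to unpack; the stand-in extraction runs exactly once.
calls = []
unpacker = OnceUnpacker(lambda: calls.append("extracted"))
workers = [threading.Thread(target=unpacker.maybe_unpack) for _ in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

The unlocked first check keeps repeated calls cheap once extraction has completed, which matters when the call sits on a hot path such as handler initialization.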

Request Routing

  • Live Serverless: Request includes function_code or class_code → Dynamic execution
  • Flash Deployed: Request omits code fields → Manifest-based routing to pre-deployed code
  • Single RemoteExecutor handles both modes automatically
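The routing rule above reduces to a check on two request fields. A minimal sketch, assuming the request is available as a plain mapping (the real executor works with a FunctionRequest object, and the function name and return labels here are hypothetical):

```python
from typing import Any, Mapping


def select_execution_path(request: Mapping[str, Any]) -> str:
    """Return 'live_serverless' when the request carries code, else 'flash'."""
    if request.get("function_code") or request.get("class_code"):
        return "live_serverless"  # dynamic execution of the supplied code
    return "flash"  # manifest-based routing to pre-deployed code
```

Because the decision depends only on the request payload, a single worker can serve both kinds of request without any protocol change.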

Flash Function Execution

  • O(1) function lookup via flash_manifest.json registry
  • Dynamic import from /app directory
  • Handles both sync and async functions
  • Same cloudpickle serialization as Live Serverless
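The lookup, import, and sync/async handling described above might look roughly like the following. The manifest layout (a function_registry mapping from function name to module path) is an assumption for illustration only, and standard-library modules stand in for code that would normally be imported from /app.

```python
import asyncio
import importlib
import inspect
from typing import Any, Mapping


async def execute_registered(
    manifest: Mapping[str, Any], function_name: str, *args: Any, **kwargs: Any
) -> Any:
    """Look up a function in the registry, import its module, and call it."""
    # Assumed schema: {"function_registry": {"<function name>": "<module path>"}}
    registry = manifest.get("function_registry", {})
    if function_name not in registry:  # dict membership: O(1) on average
        raise KeyError(f"{function_name!r} not found in function_registry")
    module = importlib.import_module(registry[function_name])
    func = getattr(module, function_name)
    if inspect.iscoroutinefunction(func):
        return await func(*args, **kwargs)  # async function: await it
    return func(*args, **kwargs)  # sync function: call it directly


# Stand-in manifest: math.sqrt is sync, asyncio.sleep is async.
demo_manifest = {"function_registry": {"sqrt": "math", "sleep": "asyncio"}}
sync_result = asyncio.run(execute_registered(demo_manifest, "sqrt", 16))
async_result = asyncio.run(execute_registered(demo_manifest, "sleep", 0, "done"))
```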

Changes

Core Implementation

  • src/unpack_volume.py: Updated _should_unpack_from_volume() to distinguish Flash from Live Serverless
  • src/handler.py: Added maybe_unpack() call at module-level
  • src/lb_handler.py: Added maybe_unpack() call at module-level
  • src/remote_executor.py: Added Flash detection logic and _execute_flash_function() method

Testing

  • Unit Tests: Added 15+ tests for Flash detection and execution logic
  • Integration Tests: Added 7 tests for dual-mode coexistence
  • Coverage: All existing 474+ tests pass, coverage maintained

Documentation

  • docs/Runtime_Execution_Paths.md: New documentation for dual-mode architecture

Backward Compatibility

Fully backward compatible. All existing Live Serverless requests continue to work unchanged:

  • Requests with function_code use existing dynamic execution path
  • No protocol changes required
  • Local development unaffected

Verification

Deployment Modes Tested

  • ✅ Local dev (no unpacking)
  • ✅ Live Serverless (no unpacking, RUNPOD_* only)
  • ✅ Flash Mothership (unpacking, FLASH_IS_MOTHERSHIP=true)
  • ✅ Flash Child (unpacking, FLASH_MOTHERSHIP_ID set)

Test Results

  • All 474+ existing tests pass
  • 7 new integration tests pass
  • 15+ new unit tests pass
  • Code formatting, linting, and type checks pass

Test Plan

Manual Testing:

# In Flash deployment (FLASH_* env vars set)
ls /app/  # Should see extracted artifacts
python -c "import sys; print('/app' in sys.path)"  # Should be True

# Test both modes
curl -X POST /execute -d '{"function_code": "def f(): return 1"}'  # Live Serverless
curl -X POST /execute -d '{"function_name": "my_flash_func"}'      # Flash Deployed

CI/CD:

  • All automated tests run on PR
  • Docker images build successfully
  • Handler tests validate both execution modes

deanq added 11 commits January 13, 2026 22:26
Update deployment detection logic in unpack_volume.py to properly
distinguish between Flash Deployed Apps and Live Serverless mode.

Changes:
- Check for Flash-specific environment variables (FLASH_IS_MOTHERSHIP,
  FLASH_MOTHERSHIP_ID, FLASH_RESOURCE_NAME)
- Add debug logging for deployment mode detection
- Update tests to cover all deployment scenarios

Environment variable matrix:
- Local Dev: No RUNPOD_* → skip unpacking
- Live Serverless: Has RUNPOD_* but no FLASH_* → skip unpacking
- Flash Deployed: Has RUNPOD_* AND any FLASH_* → unpack artifacts

This prevents Live Serverless deployments from incorrectly attempting
to unpack Flash artifacts, which would cause errors.

Add maybe_unpack() call at handler initialization to extract Flash
deployment artifacts from shadow volumes.

The unpacking is:
- Conditional: Only runs for Flash deployments (detected via FLASH_* env vars)
- Thread-safe: Uses double-checked locking
- Idempotent: Multiple calls only extract once
- Non-breaking: No-op for Live Serverless and local development

Part of AE-1348: Enable dual-mode runtime for Live Serverless
and Flash Deployed Apps.

Add maybe_unpack() call at Load Balancer handler initialization to
extract Flash deployment artifacts from shadow volumes.

The unpacking is:
- Conditional: Only runs for Flash deployments (detected via FLASH_* env vars)
- Thread-safe: Uses double-checked locking
- Idempotent: Multiple calls only extract once
- Non-breaking: No-op for Live Serverless and local development

Part of AE-1348: Enable dual-mode runtime for Live Serverless
and Flash Deployed Apps.

Implement dual-mode runtime support for both Live Serverless and Flash Deployed Apps.

Changes:
- Add Flash detection in ExecuteFunction() that checks for presence of function_code or class_code
- Flash mode: neither function_code nor class_code present → route to Flash path
- Live Serverless mode: either field present → route to existing execution path
- Implement _execute_flash_function() for Flash execution:
  - Load flash_manifest.json from /app directory
  - Look up function in function_registry (O(1) lookup)
  - Import function from module path
  - Handle both async and sync functions
  - Deserialize args/kwargs using SerializationUtils
  - Return serialized results via FunctionResponse
- Implement _load_flash_manifest() helper for manifest loading
- Add comprehensive error handling and logging
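The deserialize, execute, and serialize steps listed above can be sketched as a round trip. This is an illustration only: it uses the standard-library pickle module as a stand-in for cloudpickle, assumes payloads travel as base64 text, and the helper names are hypothetical rather than the SerializationUtils API.

```python
import base64
import pickle
from typing import Any, Callable, Mapping, Sequence


def serialize(value: Any) -> str:
    """Pickle a value and encode it as base64 text (stand-in for cloudpickle)."""
    return base64.b64encode(pickle.dumps(value)).decode("ascii")


def deserialize(payload: str) -> Any:
    """Inverse of serialize()."""
    return pickle.loads(base64.b64decode(payload))


def run_with_serialized_args(
    func: Callable[..., Any], args: Sequence[str], kwargs: Mapping[str, str]
) -> str:
    """Decode args/kwargs, call the function, and return the encoded result."""
    positional = [deserialize(item) for item in args]
    keyword = {key: deserialize(item) for key, item in kwargs.items()}
    return serialize(func(*positional, **keyword))


# Round trip: divmod(17, 5) evaluated from encoded arguments.
encoded = run_with_serialized_args(divmod, [serialize(17), serialize(5)], {})
decoded = deserialize(encoded)
```

Unpickling executes arbitrary code, so a scheme like this is only appropriate for payloads from a trusted caller.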

This enables worker-tetra images to serve both deployment modes without code changes.

Tasks: 3-4 from AE-1348 implementation plan

Add comprehensive unit tests for Flash vs Live Serverless routing logic.

Changes:
- Add 4 new tests for Flash detection routing:
  - test_flash_detection_routes_to_flash_path_function
  - test_flash_detection_routes_to_flash_path_class
  - test_live_serverless_detection_with_function_code
  - test_live_serverless_detection_with_class_code
- Update existing validation test to reflect optional code fields
  - function_code and class_code are optional for Flash deployments
  - Only function_name/class_name remain required

Task: 5 from AE-1348 implementation plan

Add comprehensive tests for Flash function execution and manifest loading.

Changes:
- Add 7 new tests for Flash execution methods:
  - test_flash_execution_success: successful function execution
  - test_flash_execution_function_not_in_registry: missing function error
  - test_flash_execution_function_not_in_resource: registry/resource mismatch error
  - test_flash_execution_async_function: async function handling
  - test_flash_execution_handles_exception: exception handling
  - test_load_flash_manifest_success: manifest loading
  - test_load_flash_manifest_not_found: missing manifest error
- Add mock_open import for manifest file mocking
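As an illustration of how mock_open can drive the manifest-loading tests, here is a minimal sketch. The load_manifest helper is a hypothetical stand-in for _load_flash_manifest(), and it assumes a missing manifest surfaces as FileNotFoundError; the actual tests in this PR may assert different behavior.

```python
import json
from typing import Any
from unittest.mock import mock_open, patch


def load_manifest(path: str = "/app/flash_manifest.json") -> dict[str, Any]:
    """Hypothetical loader: read and parse the manifest JSON."""
    with open(path) as handle:
        return json.load(handle)


def test_load_manifest_success() -> None:
    fake = json.dumps({"function_registry": {"my_flash_func": "my_module"}})
    # mock_open serves the fake JSON without touching the filesystem.
    with patch("builtins.open", mock_open(read_data=fake)):
        manifest = load_manifest()
    assert manifest["function_registry"]["my_flash_func"] == "my_module"


def test_load_manifest_not_found() -> None:
    with patch("builtins.open", side_effect=FileNotFoundError):
        try:
            load_manifest()
        except FileNotFoundError:
            return  # expected: the missing file is reported to the caller
    raise AssertionError("expected FileNotFoundError")
```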

Task: 6 from AE-1348 implementation plan

Add proper type annotations to satisfy mypy type checker.

Changes:
- Add assertion for non-None function_name before getattr
- Update _load_flash_manifest return type to dict[str, Any]
- Add explicit type annotation for manifest variable
- Apply ruff formatting to remote_executor.py and unpack_volume.py

All quality checks pass: format, lint, type, tests (198/198), coverage (80%)

Add comprehensive integration tests for Flash and Live Serverless coexistence.

Changes:
- Add test_flash_integration.py with 7 integration tests:
  - test_dual_mode_coexistence: same executor handles both modes
  - test_flash_execution_end_to_end_with_manifest: manifest-based routing
  - test_flash_execution_with_async_function: async function handling
  - test_flash_manifest_missing_function: error handling
  - test_flash_manifest_file_not_found: missing manifest error
  - test_flash_function_import_failure: import error handling
  - test_live_serverless_backward_compatibility: existing behavior preserved

Task: 7 from AE-1348 implementation plan

Document unified handler architecture for Flash and Live Serverless coexistence.

Changes:
- Add unified handler flow diagram with mode detection branching
- Document deployment mode detection with environment variables table
- Add request format examples comparing both modes
- Document shared cloudpickle serialization protocol
- List all key files involved in dual-mode support

Task: 8 from AE-1348 implementation plan

Updates tetra-rp to commit f34f046, which makes function_code and class_code optional
for Flash deployment requests.

This fix resolves CI/CD test failures where FunctionRequest validation was incorrectly
requiring code fields even for Flash deployments where code is pre-deployed.
@deanq deanq requested a review from Copilot January 14, 2026 09:59

Copilot AI left a comment

Pull request overview

This PR implements dual-mode runtime support enabling worker-tetra Docker images to serve both Live Serverless and Flash Deployed Apps through a unified handler architecture. The implementation adds automatic deployment mode detection based on environment variables and routes requests to the appropriate execution path without requiring protocol changes.

Changes:

  • Enhanced deployment mode detection to distinguish Flash deployments from Live Serverless based on FLASH_* environment variables
  • Added Flash function execution path in RemoteExecutor with manifest-based routing
  • Integrated maybe_unpack() calls in both handlers to support Flash artifact extraction
  • Made function_code and class_code optional in FunctionRequest to support Flash deployments
  • Added comprehensive test coverage for dual-mode coexistence and Flash execution

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

File | Description
tetra-rp | Updated subproject commit reference
tests/unit/test_unpack_volume.py | Updated tests to validate Flash-specific environment variable detection and distinguish Flash from Live Serverless
tests/unit/test_remote_executor.py | Added tests for Flash detection logic, routing, and execution including manifest loading and error handling
tests/unit/test_remote_execution.py | Made function_code optional in FunctionRequest validation to support Flash deployments
tests/integration/test_flash_integration.py | Added integration tests for dual-mode coexistence, Flash execution paths, and backward compatibility
src/unpack_volume.py | Enhanced _should_unpack_from_volume() to detect Flash deployments via FLASH_* environment variables
src/remote_executor.py | Added Flash detection and execution logic with _execute_flash_function() and _load_flash_manifest() methods
src/lb_handler.py | Added module-level maybe_unpack() call for Flash artifact extraction
src/handler.py | Added module-level maybe_unpack() call for Flash artifact extraction
docs/Runtime_Execution_Paths.md | Added comprehensive documentation for dual-mode architecture and execution flows

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
AE-1744: Update tetra-rp submodule to track main branch for latest SDK features
@deanq deanq merged commit fd568c2 into main Jan 16, 2026
18 checks passed
@deanq deanq deleted the deanq/ae-1348-boot-app-serve-role branch January 16, 2026 01:13