feat(load-balancer): implement Live Load Balancer runtime Docker infrastructure #45

deanq · 2026-01-04T07:08:07Z

Prerequisite: runpod/tetra-rp#131

Summary

Implements complete Docker infrastructure and testing support for the remote code execution capabilities of LiveLoadBalancerSLSResource. This enables local development and testing of Load Balancer-based serverless functions through an HTTP /execute endpoint before production deployment.

Key Components

FastAPI Handler (src/lb_handler.py): HTTP-based /execute endpoint for synchronous function execution with /health check
Dockerfile-lb: Docker image based on PyTorch CUDA runtime with FastAPI and uvicorn support
Testing Infrastructure (src/test-lb-handler.sh): Comprehensive test suite mirroring test-handler.sh pattern
CI/CD Pipeline: Automated GitHub Actions jobs for building and pushing Load Balancer images across test, main, and production stages
Smoketest Support: New make targets (smoketest-macos-lb-build, smoketest-macos-lb) for local Docker validation
Documentation: Complete architectural guides on Load Balancer runtime and CI/CD pipeline

Technical Details

Direct HTTP POST to /execute endpoint with FunctionRequest JSON payloads
Supports both direct and RunPod-wrapped request formats for compatibility
Full dependency installation support (Python packages via uv, system packages via apt/nala)
Port 8000 exposure for HTTP communication
CMD placeholder strategy allowing RunPod to override with specific handlers at runtime
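
The two request formats listed above can be normalized before execution. The following is a minimal sketch, not the handler's actual code: it assumes the RunPod-wrapped format nests the payload under an `input` key, and the `unwrap_request` helper and the `function_name`/`args`/`kwargs` field names are hypothetical.

```python
from typing import Any, Dict


def unwrap_request(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the FunctionRequest fields from either request format.

    Assumption: a RunPod-wrapped request looks like {"input": {...}},
    while a direct request carries the fields at the top level.
    """
    if not isinstance(payload, dict):
        raise TypeError("request body must be a JSON object")
    inner = payload.get("input")
    # Treat the payload as wrapped only when "input" holds a JSON object.
    if isinstance(inner, dict):
        return inner
    return payload


# Hypothetical field names, for illustration only.
direct = {"function_name": "add", "args": [1, 2], "kwargs": {}}
wrapped = {"input": direct}

# Both formats normalize to the same dictionary.
assert unwrap_request(direct) == unwrap_request(wrapped)
```

Normalizing once at the edge keeps the execution path identical for both callers, so a single set of handler tests can cover either format.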

Test Results

✅ All 14 handler tests passing locally
✅ 13/14 tests passing in Docker smoketest (1 minor numpy import environment issue)
✅ All quality checks passing (format, lint, typecheck, coverage)

Test Plan

Local testing with make test-lb-handler
Docker smoketest with make smoketest-macos-lb
Quality gates passing with make quality-check
CI/CD jobs validate image builds
Production deployment verification

Related Issues

Implements infrastructure for runpod/tetra-rp#131

…astructure

Add complete Docker infrastructure and testing support for LiveLoadBalancerSLSResource remote code execution:

- **FastAPI Handler** (src/lb_handler.py): HTTP-based /execute endpoint for synchronous function execution
- **Dockerfile-lb**: Docker image based on PyTorch CUDA runtime with FastAPI/uvicorn support
- **Testing**: Comprehensive test suite with test-lb-handler.sh script mirroring test-handler.sh pattern
- **CI/CD Pipeline**: Automated GitHub Actions jobs for building and pushing Load Balancer images (test, main, prod stages)
- **Smoketest Support**: New make targets (smoketest-macos-lb-build, smoketest-macos-lb) for local Docker validation
- **Documentation**: Complete guides on Load Balancer runtime and CI/CD pipeline architecture

Key Features:

- Direct HTTP POST to /execute endpoint with FunctionRequest JSON payloads
- Support for both direct and RunPod-wrapped request formats
- Full dependency installation (Python and system packages)
- Port 8000 exposure for HTTP communication
- All 14 handler tests passing locally; 13/14 passing in Docker container

This enables local development and testing of Load Balancer-based serverless functions before production deployment.

Implement CPU-only version of Load Balancer runtime (Dockerfile-lb-cpu), mirroring the pattern of Dockerfile vs Dockerfile-cpu for queue-based workers.

Changes:

- Create Dockerfile-lb-cpu with python:3.12-slim base (no GPU/CUDA dependencies)
- Add build-lb-cpu and smoketest-macos-lb-cpu make targets
- Add docker-test-lb-cpu, docker-main-lb-cpu, docker-prod-lb-cpu CI/CD jobs
- Update Docker_Build_Pipeline.md documentation to include CPU Load Balancer images
- Support builds for development (:main tag) and releases (semantic versioning)

This provides lightweight CPU-only option alongside GPU Load Balancer for deployments without GPU requirements.

…ndler

The uvicorn CMD in both Dockerfile-lb and Dockerfile-lb-cpu was referencing 'handler:app' but the actual module is 'lb_handler.py'. This caused ASGI app loading errors when workers were deployed.

Change both Dockerfiles to use 'lb_handler:app' so uvicorn can properly find and load the FastAPI app.
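
The failure mode is easier to see with a simplified view of how a `module:attribute` import string gets resolved. The sketch below uses only the standard library and is an approximation for illustration, not uvicorn's own loader; `resolve_import_string` is a hypothetical helper.

```python
import importlib
from typing import Any


def resolve_import_string(import_string: str) -> Any:
    """Resolve "module:attribute" to the named object."""
    module_name, sep, attr_name = import_string.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError('expected "<module>:<attribute>"')
    # Raises ModuleNotFoundError when no such module is importable.
    module = importlib.import_module(module_name)
    # Raises AttributeError when the module lacks the named object.
    return getattr(module, attr_name)


# A correct module and attribute pair resolves cleanly (stdlib example).
dumps = resolve_import_string("json:dumps")

# A wrong module name, or a module without the expected attribute, fails
# at load time. Either case surfaces as an app loading error.
for bad in ("module_that_does_not_exist_xyz:app", "json:app"):
    try:
        resolve_import_string(bad)
    except (ModuleNotFoundError, AttributeError) as exc:
        print(f"cannot load {bad!r}: {type(exc).__name__}")
```

The same reasoning applies to 'handler:app' versus 'lb_handler:app': the part before the colon has to name the module that actually defines the app object.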

Remove unnecessary /health endpoint and update docstring. Keep /ping (required by RunPod) and /execute (for local dev). All 14 handler tests passing.

Update both Dockerfile-lb and Dockerfile-lb-cpu to expose and listen on port 80 instead of 8000, per RunPod Load Balancer specification.

Changes:

- EXPOSE 80 instead of 8000
- CMD uses --port 80 instead of --port 8000
- Update inline comments to reflect correct port

Update the if __name__ == '__main__' uvicorn server to use port 80, consistent with Dockerfile and RunPod Load Balancer specification.
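
For local checks against a server listening on port 80, a request to /execute can be built with the standard library. This is a sketch under assumptions: the `localhost` host, the payload field names, and the `build_execute_request` helper are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict

PORT = 80  # port the handler listens on after this change


def build_execute_request(payload: Dict[str, Any], host: str = "localhost",
                          port: int = PORT) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to the /execute endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload fields, for illustration only.
request = build_execute_request({"function_name": "add", "args": [1, 2]})
print(request.get_method(), request.full_url)
```

With a server running, the request could be sent with `urllib.request.urlopen(request)`. Keeping the port in one constant makes a change like the 8000 to 80 switch a one-line edit on the client side.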

Copilot

Pull request overview

This PR implements the complete Docker infrastructure and testing support for Load Balancer-based serverless function execution with HTTP endpoints. It enables local development and testing of functions via a /execute endpoint before production deployment.

Key changes:

FastAPI-based HTTP handler with /ping health check and /execute endpoints for remote function execution
New Dockerfiles (Dockerfile-lb, Dockerfile-lb-cpu) supporting both GPU and CPU Load Balancer runtimes
Comprehensive CI/CD pipeline with automated testing, building, and pushing of Load Balancer images across test/main/production stages

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tetra-rp | Updated submodule reference to branch with Load Balancer resource implementation |
| src/lb_handler.py | New FastAPI handler providing `/ping` and `/execute` endpoints for HTTP-based function execution |
| src/test-lb-handler.sh | Test script for validating Load Balancer handler with comprehensive endpoint testing |
| src/test-handler.sh | Updated to use uv-managed Python for consistency |
| src/tests/test_lb_simple_function.json | Test fixture for simple addition function execution |
| pyproject.toml | Added FastAPI and uvicorn dependencies for HTTP handler support |
| Dockerfile-lb | GPU-enabled Load Balancer Docker image with PyTorch CUDA runtime |
| Dockerfile-lb-cpu | CPU-only Load Balancer Docker image |
| Makefile | New build/test targets for Load Balancer images and smoketests |
| .github/workflows/ci.yml | CI/CD jobs for testing and deploying Load Balancer images |
| docs/Load_Balancer_Docker_Infrastructure.md | Architecture documentation for Load Balancer runtime |
| docs/Docker_Build_Pipeline.md | CI/CD pipeline documentation |

…script and documentation

This fixes the port inconsistency identified in PR #45 review comments. The Load Balancer uses port 80 by default (RunPod standard), but the test script and documentation incorrectly referenced port 8000.

Changes:

- Updated test-lb-handler.sh to use PORT=80
- Updated all documentation examples and references to use port 80 instead of 8000
- Aligns test infrastructure with actual Dockerfile configuration

Resolves all 4 review comments on PR #45.

…ction branch

The --python flag specifies which Python interpreter to install packages to. Without it, uv pip install --system was returning success but not actually installing packages to the system site-packages. This caused import errors when trying to use installed dependencies. Fixes test_dependencies.json failures in CI/CD.

…/runpod-workers/worker-tetra into deanq/ae-1102-live-load-balancer

…tial package detection

Implement differential installation to avoid unnecessary package reinstalls. The _filter_installed_packages method checks if packages are already installed before attempting installation, preventing redundant operations like reinstalling PyTorch in pytorch-based Docker images.

Changes:

- Filter out already-installed packages before installation
- Simplify Docker installation to use standard uv pip and pip commands
- Add package import check to detect existing installations
- Handle package name variations (e.g., package-name vs package_name)

Fixes test_dependencies.json failures in CI/CD.

…/runpod-workers/worker-tetra into deanq/ae-1102-live-load-balancer

The differential installation filter was interfering with package installation in Docker. Reverting to main's proven working approach which uses:

- uv pip install --system (with accelerate_downloads)
- pip install (without acceleration)

This ensures numpy and other dependencies install correctly in the Docker runtime.

The previous version used 'uv run python3 handler.py' which created an isolated environment that didn't have access to system-installed packages. In Docker, packages are installed via 'uv pip install --system' to the system site-packages, so using 'uv run' prevented the tests from finding dependencies like numpy.

This fix detects the environment and:

- Uses 'python handler.py' in Docker (where python is available with system packages)
- Uses 'uv run python3 handler.py' locally (where uv manages dependencies)

Fixes: Test failures in docker-test CI/CD job
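
The environment-dependent command selection can be sketched as below. The actual change lives in a shell test script and its detection method is not shown here, so this Python version is only an illustration: checking for `/.dockerenv` is an assumed detection strategy, and both helper names are hypothetical.

```python
import os
from typing import List


def running_in_docker(marker: str = "/.dockerenv") -> bool:
    """Assumption: Docker places this marker file inside containers."""
    return os.path.exists(marker)


def handler_command(in_docker: bool) -> List[str]:
    """Pick the interpreter invocation that can see the dependencies."""
    if in_docker:
        # System site-packages hold the dependencies installed in the image.
        return ["python", "handler.py"]
    # Locally, uv manages the environment that holds the dependencies.
    return ["uv", "run", "python3", "handler.py"]


print(handler_command(running_in_docker()))
```

Separating detection from command selection keeps the selection logic testable without needing a container.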

deanq added 6 commits January 3, 2026 23:05

refactor(lb-handler): simplify to single /ping endpoint

60458d6

fix: use port 80 in development server

6f095d7

deanq requested a review from Copilot January 4, 2026 22:43

Copilot AI reviewed Jan 4, 2026

Dockerfile-lb (resolved)

Dockerfile-lb-cpu (resolved)

docs/Load_Balancer_Docker_Infrastructure.md (outdated, resolved)

docs/Load_Balancer_Docker_Infrastructure.md (outdated, resolved)

deanq added 9 commits January 4, 2026 17:39

fix(load-balancer): update diagram port reference from 8000 to 80

078bd00

chore: update tetra-rp submodule to deanq/ae-1196-absolute-drift-dete…

7be6f36

…ction branch

Merge branch 'deanq/ae-1102-live-load-balancer' of https://github.com…

9dbb9a2

…/runpod-workers/worker-tetra into deanq/ae-1102-live-load-balancer

Merge branch 'deanq/ae-1102-live-load-balancer' of https://github.com…

a227a9f

…/runpod-workers/worker-tetra into deanq/ae-1102-live-load-balancer

deanq mentioned this pull request Jan 5, 2026

feat: complete @remote support for LoadBalancer endpoints runpod/tetra-rp#131

Merged

jhcipar approved these changes Jan 8, 2026

deanq merged commit 7cfe1b7 into main Jan 8, 2026
18 checks passed

deanq deleted the deanq/ae-1102-live-load-balancer branch January 8, 2026 01:55

runpod-workers-release-please-bot bot mentioned this pull request Jan 8, 2026

chore(main): release 0.7.3 #47

Merged

deanq mentioned this pull request Jan 14, 2026

feat: AE-1018 Add support for new serverless runtime #26

Closed
