@deanq deanq commented Jan 4, 2026

Prerequisite: runpod/tetra-rp#131

Summary

Implements the Docker infrastructure and testing support for LiveLoadBalancerSLSResource remote code execution. This enables local development and testing of Load Balancer-based serverless functions through an HTTP /execute endpoint before production deployment.

Key Components

  • FastAPI Handler (src/lb_handler.py): HTTP-based /execute endpoint for synchronous function execution with /health check
  • Dockerfile-lb: Docker image based on PyTorch CUDA runtime with FastAPI and uvicorn support
  • Testing Infrastructure (src/test-lb-handler.sh): Comprehensive test suite mirroring test-handler.sh pattern
  • CI/CD Pipeline: Automated GitHub Actions jobs for building and pushing Load Balancer images across test, main, and production stages
  • Smoketest Support: New make targets (smoketest-macos-lb-build, smoketest-macos-lb) for local Docker validation
  • Documentation: Complete architectural guides on Load Balancer runtime and CI/CD pipeline

Technical Details

  • Direct HTTP POST to /execute endpoint with FunctionRequest JSON payloads
  • Supports both direct and RunPod-wrapped request formats for compatibility
  • Full dependency installation support (Python packages via uv, system packages via apt/nala)
  • Port 8000 exposure for HTTP communication
  • CMD placeholder strategy allowing RunPod to override with specific handlers at runtime
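
The "direct and RunPod-wrapped" bullet above can be illustrated with a small normalization step. This is a minimal sketch, not the code in src/lb_handler.py: it assumes the wrapped format nests the request under an "input" key, and the field names in the sample payload (function_name, args) are hypothetical placeholders rather than the real FunctionRequest schema.

```python
from typing import Any


def unwrap_request(body: dict[str, Any]) -> dict[str, Any]:
    """Return the function request whether or not it arrived wrapped.

    Assumption: a wrapped request looks like {"input": {...}}. Anything
    else is treated as a direct request and returned unchanged.
    """
    inner = body.get("input")
    if isinstance(inner, dict):
        return inner
    return body


# Hypothetical payload; the real FunctionRequest fields may differ.
direct = {"function_name": "add", "args": [2, 3]}
wrapped = {"input": direct}

# Both shapes normalize to the same request.
assert unwrap_request(direct) == unwrap_request(wrapped) == direct
```

Normalizing once at the edge of the endpoint would let the rest of the handler work with a single request shape.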

Test Results

  • ✅ All 14 handler tests passing locally
  • ✅ 13/14 tests passing in Docker smoketest (1 minor numpy import environment issue)
  • ✅ All quality checks passing (format, lint, typecheck, coverage)

Test Plan

  • Local testing with make test-lb-handler
  • Docker smoketest with make smoketest-macos-lb
  • Quality gates passing with make quality-check
  • CI/CD jobs validate image builds
  • Production deployment verification

Related Issues

Implements infrastructure for runpod/tetra-rp#131

deanq added 6 commits January 3, 2026 23:05
…astructure

Add complete Docker infrastructure and testing support for LiveLoadBalancerSLSResource remote code execution:

- **FastAPI Handler** (src/lb_handler.py): HTTP-based /execute endpoint for synchronous function execution
- **Dockerfile-lb**: Docker image based on PyTorch CUDA runtime with FastAPI/uvicorn support
- **Testing**: Comprehensive test suite with test-lb-handler.sh script mirroring test-handler.sh pattern
- **CI/CD Pipeline**: Automated GitHub Actions jobs for building and pushing Load Balancer images (test, main, prod stages)
- **Smoketest Support**: New make targets (smoketest-macos-lb-build, smoketest-macos-lb) for local Docker validation
- **Documentation**: Complete guides on Load Balancer runtime and CI/CD pipeline architecture

Key Features:
- Direct HTTP POST to /execute endpoint with FunctionRequest JSON payloads
- Support for both direct and RunPod-wrapped request formats
- Full dependency installation (Python and system packages)
- Port 8000 exposure for HTTP communication
- All 14 handler tests passing locally; 13/14 passing in Docker container

This enables local development and testing of Load Balancer-based serverless functions before production deployment.

Implement a CPU-only version of the Load Balancer runtime (Dockerfile-lb-cpu), mirroring the pattern of Dockerfile vs Dockerfile-cpu for queue-based workers.

Changes:
- Create Dockerfile-lb-cpu with python:3.12-slim base (no GPU/CUDA dependencies)
- Add build-lb-cpu and smoketest-macos-lb-cpu make targets
- Add docker-test-lb-cpu, docker-main-lb-cpu, docker-prod-lb-cpu CI/CD jobs
- Update Docker_Build_Pipeline.md documentation to include CPU Load Balancer images
- Support builds for development (:main tag) and releases (semantic versioning)

This provides a lightweight CPU-only option alongside the GPU Load Balancer image for deployments without GPU requirements.
…ndler

The uvicorn CMD in both Dockerfile-lb and Dockerfile-lb-cpu was referencing 'handler:app' but the actual module is 'lb_handler.py'. This caused ASGI app loading errors when workers were deployed.

Change both Dockerfiles to use 'lb_handler:app' so uvicorn can properly find and load the FastAPI app.
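
For context on why the string matters: uvicorn receives the application as a "module:attribute" import string, so both halves must name something that exists. The sketch below shows that style of lookup with the standard library. It is an illustration, not uvicorn's actual loader, and resolve_import_string is a hypothetical helper name.

```python
import importlib
import json
from typing import Any


def resolve_import_string(target: str) -> Any:
    """Resolve "module:attribute" to the object it names."""
    module_name, _, attr_name = target.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {target!r}")
    # Raises ModuleNotFoundError when the module cannot be imported.
    module = importlib.import_module(module_name)
    # Raises AttributeError when the module has no such attribute.
    return getattr(module, attr_name)


# Standard-library stand-in so the example runs without the handler module.
assert resolve_import_string("json:dumps") is json.dumps
```

Under this model, an import string fails to load whenever the module or the attribute after the colon does not exist, which matches the symptom described in the commit message.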
Remove unnecessary /health endpoint and update docstring.
Keep /ping (required by RunPod) and /execute (for local dev).

All 14 handler tests passing.

Update both Dockerfile-lb and Dockerfile-lb-cpu to expose and listen on port 80
instead of 8000, per RunPod Load Balancer specification.

Changes:
- EXPOSE 80 instead of 8000
- CMD uses --port 80 instead of --port 8000
- Update inline comments to reflect correct port

Update the if __name__ == '__main__' uvicorn server to use port 80,
consistent with Dockerfile and RunPod Load Balancer specification.

@deanq deanq requested a review from Copilot January 4, 2026 22:43

Copilot AI left a comment

Pull request overview

This PR implements the complete Docker infrastructure and testing support for Load Balancer-based serverless function execution with HTTP endpoints. It enables local development and testing of functions via a /execute endpoint before production deployment.

Key changes:

  • FastAPI-based HTTP handler with /ping health check and /execute endpoints for remote function execution
  • New Dockerfiles (Dockerfile-lb, Dockerfile-lb-cpu) supporting both GPU and CPU Load Balancer runtimes
  • Comprehensive CI/CD pipeline with automated testing, building, and pushing of Load Balancer images across test/main/production stages

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tetra-rp | Updated submodule reference to branch with Load Balancer resource implementation |
| src/lb_handler.py | New FastAPI handler providing /ping and /execute endpoints for HTTP-based function execution |
| src/test-lb-handler.sh | Test script for validating Load Balancer handler with comprehensive endpoint testing |
| src/test-handler.sh | Updated to use uv-managed Python for consistency |
| src/tests/test_lb_simple_function.json | Test fixture for simple addition function execution |
| pyproject.toml | Added FastAPI and uvicorn dependencies for HTTP handler support |
| Dockerfile-lb | GPU-enabled Load Balancer Docker image with PyTorch CUDA runtime |
| Dockerfile-lb-cpu | CPU-only Load Balancer Docker image |
| Makefile | New build/test targets for Load Balancer images and smoketests |
| .github/workflows/ci.yml | CI/CD jobs for testing and deploying Load Balancer images |
| docs/Load_Balancer_Docker_Infrastructure.md | Architecture documentation for Load Balancer runtime |
| docs/Docker_Build_Pipeline.md | CI/CD pipeline documentation |

deanq added 9 commits January 4, 2026 17:39
…script and documentation

This fixes the port inconsistency identified in PR #45 review comments. The Load Balancer
uses port 80 by default (RunPod standard), but the test script and documentation incorrectly
referenced port 8000.

Changes:
- Updated test-lb-handler.sh to use PORT=80
- Updated all documentation examples and references to use port 80 instead of 8000
- Aligns test infrastructure with actual Dockerfile configuration

Resolves all 4 review comments on PR #45.

The --python flag specifies which Python interpreter to install packages into.
Without it, uv pip install --system was returning success but not actually
installing packages to the system site-packages. This caused import errors
when trying to use installed dependencies.

Fixes test_dependencies.json failures in CI/CD.
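
A rough sketch of the idea in this commit message: build the install command so that the target interpreter is named explicitly. The helper name build_uv_install_command is hypothetical, and combining --system with --python is an assumption based on the wording above; the exact arguments used in the repository may differ.

```python
import sys


def build_uv_install_command(packages: list[str], python: str = sys.executable) -> list[str]:
    """Build an argv list that installs packages into a specific interpreter.

    Assumption: --python is passed alongside --system, as the commit
    message suggests. Defaults to the interpreter running this code.
    """
    if not packages:
        raise ValueError("no packages to install")
    return ["uv", "pip", "install", "--system", "--python", python, *packages]


command = build_uv_install_command(["numpy"], python="/usr/bin/python3")
assert command == ["uv", "pip", "install", "--system", "--python", "/usr/bin/python3", "numpy"]
```

Passing the list to subprocess.run(command, check=True) would execute it. Defaulting to sys.executable ties the install target to the interpreter that will later import the packages.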
…tial package detection

Implement differential installation to avoid unnecessary package reinstalls.
The _filter_installed_packages method checks if packages are already installed
before attempting installation, preventing redundant operations like reinstalling
PyTorch in pytorch-based Docker images.

Changes:
- Filter out already-installed packages before installation
- Simplify Docker installation to use standard uv pip and pip commands
- Add package import check to detect existing installations
- Handle package name variations (e.g., package-name vs package_name)
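
The filtering step listed above could look roughly like the following. This is a sketch under stated assumptions, not the repository's _filter_installed_packages: it treats a package as installed when a module with a matching name is importable, and the helper names are hypothetical.

```python
import importlib.util
import re


def import_name_candidates(requirement: str) -> list[str]:
    """Guess importable module names for a requirement such as 'package-name>=1.0'."""
    # Keep only the leading name: drop version specifiers, extras, and markers.
    name = re.split(r"[<>=!~\[;\s]", requirement.strip(), maxsplit=1)[0]
    candidates = [name, name.replace("-", "_"), name.lower().replace("-", "_")]
    # Preserve order while dropping duplicates and empty strings.
    return [c for c in dict.fromkeys(candidates) if c]


def filter_installed_packages(requirements: list[str]) -> list[str]:
    """Return only the requirements that do not appear to be importable yet."""
    missing = []
    for requirement in requirements:
        found = False
        for candidate in import_name_candidates(requirement):
            try:
                found = importlib.util.find_spec(candidate) is not None
            except (ImportError, ValueError):
                found = False
            if found:
                break
        if not found:
            missing.append(requirement)
    return missing


# 'json' ships with Python, so only the made-up package remains.
assert filter_installed_packages(["json", "surely-not-a-real-package-xyz123"]) == [
    "surely-not-a-real-package-xyz123"
]
```

An import check like this is only a heuristic: a distribution name can differ from its import name (scikit-learn imports as sklearn), and the check ignores version constraints, so it can misreport what is installed.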

Fixes test_dependencies.json failures in CI/CD.

The differential installation filter was interfering with package installation
in Docker. Reverting to main's proven working approach which uses:
- uv pip install --system (with accelerate_downloads)
- pip install (without acceleration)

This ensures numpy and other dependencies install correctly in the Docker runtime.

The previous version used 'uv run python3 handler.py' which created an
isolated environment that didn't have access to system-installed packages.
In Docker, packages are installed via 'uv pip install --system' to the system
site-packages, so using 'uv run' prevented the tests from finding dependencies
like numpy.

This fix detects the environment and:
- Uses 'python handler.py' in Docker (where python is available with system packages)
- Uses 'uv run python3 handler.py' locally (where uv manages dependencies)
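
The two branches above amount to a small decision, sketched here in Python for illustration (the actual change lives in a shell test script). The commit message does not say how the environment is detected; checking for /.dockerenv is an assumption, and both function names are hypothetical.

```python
import os


def running_in_docker() -> bool:
    """Heuristic (assumption): Docker creates /.dockerenv at the container root."""
    return os.path.exists("/.dockerenv")


def handler_command(in_docker: bool) -> list[str]:
    """Pick the command used to run the handler for the current environment."""
    if in_docker:
        # System interpreter: sees packages installed with 'uv pip install --system'.
        return ["python", "handler.py"]
    # Local development: uv manages the environment and its dependencies.
    return ["uv", "run", "python3", "handler.py"]


assert handler_command(True) == ["python", "handler.py"]
assert handler_command(False) == ["uv", "run", "python3", "handler.py"]
```

Keeping detection separate from command selection makes the selection testable without a container.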

Fixes: Test failures in docker-test CI/CD job

@deanq deanq merged commit 7cfe1b7 into main Jan 8, 2026
18 checks passed
@deanq deanq deleted the deanq/ae-1102-live-load-balancer branch January 8, 2026 01:55

3 participants