REST API Streaming with Lambda Custom Runtime + Bedrock

A compact proof of concept (POC) that streams HTTP responses through an API Gateway REST API, using a custom Python Lambda runtime packaged as a container image. It supports:

  • Response streaming (chunked HTTP) from API Gateway
  • Optional chaining to a second Lambda via Function URL
  • Bedrock streaming via converse_stream
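
For readers unfamiliar with the term, "chunked HTTP" means each piece of the body is framed with its length in hexadecimal, and a zero-length chunk ends the body. The sketch below shows only that framing; encode_chunk and encode_stream are illustrative names, not functions from this repository, and an HTTP library normally does this work for you.

```python
from typing import Iterable, Iterator

CRLF = b"\r\n"


def encode_chunk(data: bytes) -> bytes:
    """Frame one piece of the body as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return f"{len(data):x}".encode("ascii") + CRLF + data + CRLF


def encode_stream(pieces: Iterable[bytes]) -> Iterator[bytes]:
    """Yield framed chunks, then the zero-length chunk that ends the body."""
    for piece in pieces:
        if piece:  # an empty chunk would terminate the body early, so skip it
            yield encode_chunk(piece)
    yield b"0" + CRLF + CRLF


if __name__ == "__main__":
    for frame in encode_stream([b"Hello, ", b"world"]):
        print(frame)
```

Because each chunk carries its own length, the sender does not need to know the total response size up front, which is what makes token-by-token output possible.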

Architecture (current behavior)

  • API Gateway (REST) → Lambda 1 (custom runtime)
  • Lambda 1 streams its response back to API Gateway through the Lambda Runtime API
  • Optional: Lambda 1 → Lambda 2 (Function URL) → streams back
  • Bedrock calls use bedrock-runtime.converse_stream
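
The two moving parts in this flow can be sketched roughly as follows. This is a hedged illustration and not the repository's code. response_url and STREAMING_HEADERS show where a custom runtime posts a streamed response and the headers the Lambda Runtime API expects for streaming. iter_text pulls text deltas out of converse_stream events. All three names are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

RUNTIME_API_VERSION = "2018-06-01"

# Headers that tell the Lambda Runtime API the response body is streamed.
STREAMING_HEADERS = {
    "Lambda-Runtime-Function-Response-Mode": "streaming",
    "Transfer-Encoding": "chunked",
}


def response_url(runtime_api: str, request_id: str) -> str:
    """URL for posting the response to one invocation.

    runtime_api is the host:port from the AWS_LAMBDA_RUNTIME_API env var;
    request_id comes from the Lambda-Runtime-Aws-Request-Id header returned
    by the preceding GET .../runtime/invocation/next call.
    """
    return (
        f"http://{runtime_api}/{RUNTIME_API_VERSION}"
        f"/runtime/invocation/{request_id}/response"
    )


def iter_text(events: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield text deltas from converse_stream events, ignoring other event types."""
    for event in events:
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        text = delta.get("text")
        if text:
            yield text
```

With boto3, the events iterable would be the "stream" entry of the dictionary returned by the bedrock-runtime client's converse_stream call. Each yielded string can then be written to the Runtime API response as one chunk.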

Prerequisites

  • AWS CLI configured
  • Docker running
  • Bedrock access in the region you use
  • Inference profiles for models that require them (e.g., Amazon Nova)

Deploy

chmod +x deploy_streaming.sh
./deploy_streaming.sh

The script will:

  1. Build and push the container image to ECR
  2. Deploy the CloudFormation stack (API Gateway + 2 Lambdas)
  3. Print the endpoint and API key

Required Env Vars

The script exits if these are missing:

export BEDROCK_PROFILE_FIRST="arn-or-id"
export BEDROCK_PROFILE_SECOND="arn-or-id"

Optional overrides:

export AWS_REGION=us-east-1
export BEDROCK_REGION=us-east-1
export BEDROCK_MODEL_FIRST="amazon.nova-micro-v1:0"
export BEDROCK_MODEL_SECOND="amazon.nova-micro-v1:0"
export IMAGE_TAG=auto   # auto = timestamp
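
The required/optional split above can be expressed as a small config loader. This is a Python sketch of the same rules, not the deploy script itself (which is a shell script). load_config is a hypothetical name. The defaults are assumed to be the values shown in the example above, and the timestamp format for IMAGE_TAG=auto is an assumption because the script's format is not shown.

```python
import time
from typing import Dict, Mapping

REQUIRED = ("BEDROCK_PROFILE_FIRST", "BEDROCK_PROFILE_SECOND")

# Assumed defaults: the values shown in the "Optional overrides" example.
DEFAULTS = {
    "AWS_REGION": "us-east-1",
    "BEDROCK_REGION": "us-east-1",
    "BEDROCK_MODEL_FIRST": "amazon.nova-micro-v1:0",
    "BEDROCK_MODEL_SECOND": "amazon.nova-micro-v1:0",
    "IMAGE_TAG": "auto",
}


def load_config(env: Mapping[str, str]) -> Dict[str, str]:
    """Return the effective settings, exiting if a required variable is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise SystemExit("Missing required env vars: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        config[name] = env.get(name) or default
    if config["IMAGE_TAG"] == "auto":
        # Assumed format; the script's actual timestamp format is not shown.
        config["IMAGE_TAG"] = time.strftime("%Y%m%d%H%M%S")
    return config
```

Calling load_config(os.environ) mirrors the script's behavior of stopping early, before any build or deploy work starts, when a profile variable is unset.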

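Test

In the commands below, BASE_URL and API_KEY_VALUE stand for the endpoint and API key printed by the deploy script.
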
curl -N -H "x-api-key: $API_KEY_VALUE" \
  "$BASE_URL/stream?prompt=Hello"

Route through the second Lambda:

curl -N -H "x-api-key: $API_KEY_VALUE" \
  "$BASE_URL/stream?use_second=true&prompt=Hello"

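The same requests can be made from Python using only the standard library. This is a minimal sketch: build_stream_url and stream_response are hypothetical helper names, and the /stream path, query parameters, and x-api-key header are taken from the curl commands above.

```python
import urllib.parse
import urllib.request
from typing import Iterator


def build_stream_url(base_url: str, prompt: str, use_second: bool = False) -> str:
    """Build the /stream URL with a URL-encoded prompt."""
    params = {}
    if use_second:
        params["use_second"] = "true"
    params["prompt"] = prompt
    return f"{base_url.rstrip('/')}/stream?{urllib.parse.urlencode(params)}"


def stream_response(url: str, api_key: str, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield response bytes as they arrive instead of waiting for the full body."""
    request = urllib.request.Request(url, headers={"x-api-key": api_key})
    with urllib.request.urlopen(request) as response:
        while True:
            # read1 returns whatever is available, up to chunk_size bytes.
            chunk = response.read1(chunk_size)
            if not chunk:
                break
            yield chunk
```

Iterating over stream_response(build_stream_url(base_url, "Hello"), api_key) and printing each chunk with flush=True gives roughly the same incremental output as curl -N.
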
Notes

  • If Bedrock returns a throttling or quota error, the request is rejected before any streaming starts.
  • If you see "Internal server error", check the Lambda logs in CloudWatch.
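
One way to get the "rejected before streaming starts" behavior in the first note is to open the upstream stream before writing anything to the client. The sketch below illustrates that pattern and is not the repository's implementation. open_or_reject, error_code, the set of error codes, and the 429 status are all assumptions. The error code is read from the response attribute, which is how botocore's ClientError exposes it.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Assumed set of error codes treated as throttling/quota failures.
THROTTLE_CODES = {"ThrottlingException", "ServiceQuotaExceededException"}


def error_code(exc: Exception) -> str:
    """Read the AWS error code the way botocore's ClientError exposes it."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def open_or_reject(open_stream: Callable[[], Iterable[str]]) -> Tuple[int, Iterator[str]]:
    """Open the upstream stream before any bytes are sent to the client.

    Returns (status, body). A throttling error raised while opening becomes a
    429 with a one-item body; any other error propagates to the caller.
    """
    try:
        events = open_stream()
    except Exception as exc:
        if error_code(exc) in THROTTLE_CODES:
            return 429, iter([f"Bedrock rejected the request: {error_code(exc)}"])
        raise
    return 200, iter(events)
```

Because the status is decided before the first chunk is written, the client gets a clean error response rather than a stream that stops partway through.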