Conversation

@leggetter (Collaborator) commented Aug 24, 2025

Overview

This PR adds Redis cluster mode support to Outpost. All Redis interactions are now centralized behind a single abstraction that supports both single-node and cluster deployments.

Technical Changes

Redis Client Abstraction

  • Centralized Redis client factory in internal/redis/redis.go
  • Automatic cluster vs regular client selection based on configuration
  • Single interface for all Redis operations across services

Key Format Changes

Keys now use hash tags ({tenant}) so that all of a tenant's data maps to the same Redis cluster hash slot, which makes atomic multi-key transactions possible in cluster mode.

Before:

tenant:user123
tenant:user123:destinations
tenant:user123:destination:abc

After:

{user123}:tenant
{user123}:destinations
{user123}:destination:abc
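The slot routing behind this change can be sketched in a few lines of Go (illustrative, stdlib-only; not code from this PR): Redis Cluster hashes either the whole key or, when a non-empty {...} hash tag is present, only the tag content, using CRC16-XMODEM mod 16384.

```go
package main

import (
	"fmt"
	"strings"
)

// crc16 implements CRC16-XMODEM (poly 0x1021, init 0x0000), the checksum
// Redis Cluster uses for key-to-slot mapping.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = crc<<1 ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// clusterSlot returns the cluster slot for a key. If the key contains a
// non-empty {...} hash tag, only the tag content is hashed, which is what
// forces all of a tenant's keys onto one slot.
func clusterSlot(key string) uint16 {
	if open := strings.IndexByte(key, '{'); open >= 0 {
		if end := strings.IndexByte(key[open+1:], '}'); end > 0 {
			key = key[open+1 : open+1+end]
		}
	}
	return crc16([]byte(key)) % 16384
}

func main() {
	for _, k := range []string{"{user123}:tenant", "{user123}:destinations", "{user123}:destination:abc"} {
		fmt.Printf("%-26s -> slot %d\n", k, clusterSlot(k))
	}
}
```

Running this shows all three hash-tagged keys landing on one slot, while the legacy tenant:user123:* keys would each be hashed whole and generally scatter across slots.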

Transaction Support

EntityStore operations restored to use transactions:

  • UpsertDestination: Atomic destination + summary updates
  • DeleteDestination: Atomic deletion with summary cleanup
  • DeleteTenant: Atomic multi-destination deletion

RSMQ Compatibility

  • Scheduler service maintains separate Redis client for RSMQ library compatibility
  • Uses legacy Redis package (github.com/go-redis/redis) for RSMQ operations
  • Automatic cluster/single-node client selection for RSMQ

Key Benefits

  • Universal Deployment: Same codebase works with single-node Redis, Redis clusters, and Azure Managed Redis
  • Atomic Operations: Multi-key transactions work in both deployment modes through hash slot co-location
  • Zero Code Changes: Services automatically use appropriate client based on configuration
  • Production Ready: Includes TLS support and cluster-aware error handling

Redis Deployment Support

| Redis Type | Before | After |
| --- | --- | --- |
| Single-Node Redis | ✅ Full support | ✅ Full support |
| Redis Cluster | ❌ CROSSSLOT errors | ✅ Full support |
| Azure Managed Redis | ⚠️ Basic connectivity | ✅ Full support |

Files Changed

Core Changes

  • internal/models/entity.go - Hash-tagged keys and cluster transactions
  • internal/redis/redis.go - Cluster client selection and management
  • internal/scheduler/scheduler.go - RSMQ compatibility with cluster mode

Service Updates

16 service files updated to use centralized Redis abstraction:

  • internal/api/
  • internal/billing/
  • internal/connection/
  • internal/forwarder/
  • internal/logs/
  • internal/metrics/
  • internal/notification/
  • internal/ratelimit/
  • internal/request/
  • internal/transform/

Migration Tools

  • scripts/migrate-redis-keys.sh - Migrates legacy to hash-tagged keys
  • scripts/test-hash-slots.sh - Validates cluster hash slot distribution
  • scripts/test-transactions.sh - Tests cluster transaction functionality
  • REDIS_MIGRATION.md - Migration documentation

Configuration

Environment Variables

Redis cluster mode is controlled via environment variables:

# Standard Redis configuration
REDIS_HOST="redis"
REDIS_PORT="6379"
REDIS_PASSWORD="password"
REDIS_DATABASE="0"
REDIS_TLS_ENABLED="false"         # Enable TLS encryption (required for Azure Managed Redis)
REDIS_CLUSTER_ENABLED="false"     # Enable cluster mode (required for Azure Managed Redis)

Example Configurations

Single-Node Redis (Development/Small Scale):

REDIS_CLUSTER_ENABLED="false"
REDIS_TLS_ENABLED="false"

Azure Managed Redis Enterprise (Production):

REDIS_CLUSTER_ENABLED="true"
REDIS_TLS_ENABLED="true"
REDIS_HOST="your-azure-redis.redis.cache.windows.net"
REDIS_PORT="10000"

Cluster Mode Configuration

When REDIS_CLUSTER_ENABLED=true:

  • Uses redis.NewClusterClient() for cluster operations
  • Supports MOVED redirects and cluster topology discovery
  • Enables hash slot distribution for multi-key operations

When REDIS_CLUSTER_ENABLED=false:

  • Uses redis.NewClient() for single-node operations
  • Standard Redis client behavior

Implementation Details

Client Selection Logic

func New(ctx context.Context, config *RedisConfig) (redis.Cmdable, error) {
    if config.ClusterEnabled {
        return redis.NewClusterClient(clusterOptions), nil // cluster-aware client
    }
    return redis.NewClient(options), nil // single-node client
}

Hash Slot Compatibility

  • All tenant operations use same hash tag: {tenantID}:*
  • Ensures co-location in Redis cluster slots
  • Enables MULTI/EXEC transactions across tenant data

Test Status

✅ All Redis-related tests passing:

  • Unit tests updated for hash-tagged key format
  • Integration tests cover both single-node and cluster scenarios
  • RSMQ functionality verified with both client types

Note: Some PostgreSQL integration tests may fail due to test environment setup (unrelated to Redis changes)

Migration Guide

For New Deployments

  • Set REDIS_CLUSTER_ENABLED=true for cluster deployments
  • No migration needed - uses hash-tagged keys from start

For Existing Deployments

⚠️ Breaking Change: Key format has changed for cluster compatibility

Required Steps:

  1. Before deployment: Run scripts/migrate-redis-keys.sh to convert legacy keys
  2. Deploy updated code: New code only reads hash-tagged format ({tenant}:*)
  3. Validate: Use provided scripts to verify cluster functionality

Rollback: Legacy keys are preserved during migration for safety
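The rename itself is a mechanical rewrite. A sketch of the key transformation (illustrative only; the real logic lives in scripts/migrate-redis-keys.sh):

```shell
#!/bin/sh
# Sketch of the key rewrite the migration performs. Converts legacy
# "tenant:<id>[...]" keys to the hash-tagged "{<id>}..." form.
to_hash_tagged() {
    key="$1"
    tail="${key#tenant:}"   # drop the "tenant:" prefix
    id="${tail%%:*}"        # the tenant ID is the next segment
    if [ "$tail" = "$id" ]; then
        # "tenant:<id>" -> "{<id>}:tenant"
        echo "{$id}:tenant"
    else
        # "tenant:<id>:<rest>" -> "{<id>}:<rest>"
        echo "{$id}:${tail#*:}"
    fi
}

# In the real migration, each legacy key found via SCAN would be copied
# to its hash-tagged name (the legacy key is preserved for rollback).
to_hash_tagged "tenant:user123"                  # {user123}:tenant
to_hash_tagged "tenant:user123:destinations"     # {user123}:destinations
to_hash_tagged "tenant:user123:destination:abc"  # {user123}:destination:abc
```

Because legacy keys are preserved for rollback, the migration copies values to the new names rather than renaming destructively.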


@Copilot Copilot AI (Contributor) left a comment
Pull Request Overview

This PR adds comprehensive Redis cluster support to Outpost, enabling deployment on modern Redis cluster services like Azure Managed Redis while maintaining backward compatibility with single-node Redis. The migration strategy uses hash-tagged keys to ensure same-tenant operations occur on the same Redis cluster slot, restoring transaction support for atomic operations.

  • Centralizes Redis client management with automatic cluster vs single-node selection
  • Migrates key format from tenant:* to {tenant}:* for cluster compatibility
  • Restores atomic transactions for destination and tenant operations

Reviewed Changes

Copilot reviewed 40 out of 40 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| internal/redis/redis.go | Centralized Redis client factory with cluster support |
| internal/models/entity.go | Hash-tagged keys and restored transaction operations |
| internal/scheduler/scheduler.go | RSMQ compatibility with cluster mode |
| scripts/migrate-redis-keys.sh | Key migration script for deployment |
| scripts/test-*.sh | Validation tools for cluster functionality |
| internal/rsmq/rsmq.go | Cluster client compatibility and improved logging |
| Service files (16 files) | Updated to use centralized Redis abstraction |
| Configuration files | Added TLS and cluster enable flags |
Comments suppressed due to low confidence (1)

internal/models/entity.go:1

  • Executing commands individually instead of using transactions reduces atomicity and may impact performance. Consider using cluster-compatible transactions with same-hash-tag keys instead.
package models


Comment on lines +18 to +20
# Build redis-cli command
if [ "$REDIS_PORT" = "10000" ] || [ "$REDIS_PORT" = "6380" ]; then
# Azure Managed Redis uses TLS on port 10000
Copilot AI commented Aug 24, 2025

[nitpick] Hard-coded port numbers for Azure Managed Redis detection should be replaced with a more explicit configuration check or documented constants to improve maintainability.

Suggested change:

```diff
-# Build redis-cli command
-if [ "$REDIS_PORT" = "10000" ] || [ "$REDIS_PORT" = "6380" ]; then
-# Azure Managed Redis uses TLS on port 10000
+# Azure Managed Redis TLS ports (documented constants for maintainability)
+AZURE_REDIS_TLS_PORT1=10000 # Standard Azure Managed Redis TLS port
+AZURE_REDIS_TLS_PORT2=6380  # Alternate Azure Managed Redis TLS port
+# Build redis-cli command
+if [ "$REDIS_PORT" = "$AZURE_REDIS_TLS_PORT1" ] || [ "$REDIS_PORT" = "$AZURE_REDIS_TLS_PORT2" ]; then
+# Azure Managed Redis uses TLS on these ports
```


Comment on lines +23 to +24
# Build redis-cli command
if [ "$REDIS_PORT" = "10000" ] || [ "$REDIS_PORT" = "6380" ]; then
Copilot AI commented Aug 24, 2025

[nitpick] The same hard-coded port detection logic appears in multiple scripts. Consider extracting this into a shared function or configuration variable.

Suggested change:

```diff
-# Build redis-cli command
-if [ "$REDIS_PORT" = "10000" ] || [ "$REDIS_PORT" = "6380" ]; then
+# Ports that require TLS
+TLS_PORTS="10000 6380"
+
+# Function to check if a port requires TLS
+is_tls_port() {
+    for port in $TLS_PORTS; do
+        if [ "$1" = "$port" ]; then
+            return 0
+        fi
+    done
+    return 1
+}
+
+# Build redis-cli command
+if is_tls_port "$REDIS_PORT"; then
```


alexluong's comment was marked as outdated.

@alexluong (Collaborator) commented:

Can you elaborate on the migration plan? Is downtime expected during the migration, given there's no continuous support or auto-update logic?

@alexluong (Collaborator) left a comment

Initial round of review: I focused more on the core logic (redis, rsmq, config) than on the rest. Looks good at first glance. I need to look more into the rsmq changes, I think.

I'll also get more hands-on tomorrow, i.e. running the test suite and maybe refactoring a few smaller things, but overall a really nice PR, and I hope you enjoyed working with Golang and the codebase.

Definitely a lot more work than I originally thought going from Redis -> Redis Cluster.

Comment on lines +12 to +14
if config.ClusterEnabled {
t.Error("ClusterEnabled should default to false for backward compatibility")
}
alexluong commented:

small "nit", can we use the testify's assert or require package for this? That's sort of the general convention we use. Not a big deal tho, this is perfectly fine.

// TestRedisClusterConfigDefaults ensures backward compatibility by testing
// that REDIS_CLUSTER_ENABLED defaults to false and doesn't break existing deployments
func TestRedisClusterConfigDefaults(t *testing.T) {
t.Run("ClusterEnabledDefaultsToFalse", func(t *testing.T) {
alexluong commented:

this seems like a similar test as BackwardCompatibilityWithoutClusterField?

t.Error("Configuration without ClusterEnabled should default to false")
}
})
} No newline at end of file
alexluong commented:

this seems like a formatter issue because if we run gofmt there should be a trailing empty line like prettier, not 100% sure tho, just a note. Maybe we should set up a CI check or something for that.

if config.TLSEnabled {
options.TLSConfig = &tls.Config{
MinVersion: tls.VersionTLS12,
InsecureSkipVerify: true, // Azure Managed Redis uses self-signed certificates
alexluong commented:

so we would only support TLS for Azure for now? That's probably a fair decision but just want to highlight it.

@@ -1,10 +1,12 @@
package redis

import (
alexluong commented:

the code for this package is a bit repetitive. I thought it was fine before because it was pretty straightforward but given the increased logic, it's probably good to refactor into 1 singular source of truth

"strings"
"time"

"github.com/go-redis/redis"
alexluong commented:

hmm i'm not 100% sure why we used the "old redis" client? will need to look more into this. Maybe there was some issue or it's just a mistake and should be upgraded.

leggetter (Author) commented:

From the description:

RSMQ Compatibility

  • Scheduler service maintains separate Redis client for RSMQ library compatibility
  • Uses legacy Redis package (github.com/go-redis/redis) for RSMQ operations
  • Automatic cluster/single-node client selection for RSMQ

I believe it may be possible to migrate this. But it felt like this was already quite a big change.

if !redisConfig.ClusterEnabled {
logFields = append(logFields, zap.Int("database", redisConfig.Database))
}
config.logger.Info("Redis client initialized successfully", logFields...)
alexluong commented:

not 100% sure if this should be INFO, maybe DEBUG?

Comment on lines 38 to 48
```diff
 func redisTenantID(tenantID string) string {
-	return fmt.Sprintf("tenant:%s", tenantID)
+	return fmt.Sprintf("{%s}:tenant", tenantID)
 }

 func redisTenantDestinationSummaryKey(tenantID string) string {
-	return fmt.Sprintf("tenant:%s:destinations", tenantID)
+	return fmt.Sprintf("{%s}:destinations", tenantID)
 }

 func redisDestinationID(destinationID, tenantID string) string {
-	return fmt.Sprintf("tenant:%s:destination:%s", tenantID, destinationID)
+	return fmt.Sprintf("{%s}:destination:%s", tenantID, destinationID)
 }
```
alexluong commented:

i wonder if we should group everything under tenant first before we have the hash so it acts as a folder structure. There are other usage of Redis (like rsmq) and they will be in its own "folder" too.

leggetter (Author) commented:

We have to change the current formatting as it's not cluster safe. But we could change the ordering and maintain cluster safety.

If we'd prefer the cluster-safe approach with the "folder" structure, we can update this, and we'll also need to update the migration script. That is, unless you'd prefer to have the migration handled dynamically as per #465 (comment).

More info:

Current Structure Analysis

Current Implementation:

func redisTenantID(tenantID string) string {
    return fmt.Sprintf("{%s}:tenant", tenantID)        // {tenant123}:tenant
}

func redisTenantDestinationSummaryKey(tenantID string) string {
    return fmt.Sprintf("{%s}:destinations", tenantID)  // {tenant123}:destinations
}

func redisDestinationID(destinationID, tenantID string) string {
    return fmt.Sprintf("{%s}:destination:%s", tenantID, destinationID)  // {tenant123}:destination:dest456
}

Why Current Structure IS Cluster-Safe:

  • Hash tags {tenant123} ensure all keys hit the same cluster slot
  • All tenant operations are atomic and transactionally consistent
  • Multi-key operations work correctly in Redis cluster mode

Alternative Cluster-Safe Structures

Option 1: Namespace-First with Hash Tags

func redisTenantID(tenantID string) string {
    return fmt.Sprintf("tenant:{%s}", tenantID)                    // tenant:{tenant123}
}

func redisTenantDestinationSummaryKey(tenantID string) string {
    return fmt.Sprintf("tenant:{%s}:destinations", tenantID)       // tenant:{tenant123}:destinations
}

func redisDestinationID(destinationID, tenantID string) string {
    return fmt.Sprintf("tenant:{%s}:destination:%s", tenantID, destinationID)  // tenant:{tenant123}:destination:dest456
}

Option 2: Mixed Structure with Hash Tags

func redisTenantID(tenantID string) string {
    return fmt.Sprintf("tenant:{%s}:profile", tenantID)            // tenant:{tenant123}:profile
}

func redisTenantDestinationSummaryKey(tenantID string) string {
    return fmt.Sprintf("tenant:{%s}:destinations", tenantID)       // tenant:{tenant123}:destinations
}

func redisDestinationID(destinationID, tenantID string) string {
    return fmt.Sprintf("tenant:{%s}:destination:%s", tenantID, destinationID)  // tenant:{tenant123}:destination:dest456
}

Key Insight: Hash Tag Placement

The critical requirement for cluster safety is that the hash tag {tenantID} appears in ALL related keys:

| Structure | Cluster Safe? | Reason |
| --- | --- | --- |
| {tenant123}:tenant | YES | Hash tag ensures same slot |
| tenant:{tenant123} | YES | Hash tag ensures same slot |
| tenant:{tenant123}:destinations | YES | Hash tag ensures same slot |
| tenant:tenant123:destinations | NO | No hash tag, so keys hash to different slots |

Recommended Approach

Namespace-first structure with hash tags provides the best of both worlds:

// Better organization while maintaining cluster safety
"tenant:{tenant123}"                    // Clear namespace, cluster-safe
"tenant:{tenant123}:destinations"       // Hierarchical, cluster-safe  
"tenant:{tenant123}:destination:dest456" // Consistent, cluster-safe

Benefits:

  • Cluster-safe: Hash tags preserve atomicity
  • Better organization: Clear namespace hierarchy
  • Consistent: Matches patterns like alert:*, outpost:*
  • Future-proof: Easier to extend and maintain

The ordering can be changed to improve readability and organization as long as the hash tags {tenantID} are preserved for cluster safety.

Historical Context

Previous Structure (Pre-Hash Tags) - NOT Cluster-Safe

func redisTenantID(tenantID string) string {
    return fmt.Sprintf("tenant:%s", tenantID)           // tenant:tenant123
}

func redisTenantDestinationSummaryKey(tenantID string) string {
    return fmt.Sprintf("tenant:%s:destinations", tenantID)  // tenant:tenant123:destinations
}

func redisDestinationID(destinationID, tenantID string) string {
    return fmt.Sprintf("tenant:%s:destination:%s", tenantID, destinationID)  // tenant:tenant123:destination:dest456
}

This structure was NOT cluster-safe because:

  1. Different Hash Slots: Each key would hash to different Redis cluster slots
  2. Transaction Failures: Multi-key operations would fail with CROSSSLOT errors
  3. Broken Atomicity: Operations like tenant deletion would fail in cluster mode

Conclusion

The migration to hash tags was essential for Redis cluster compatibility. The current implementation is already cluster-safe, but the ordering can be improved for better organization while maintaining all technical guarantees.

@leggetter (Author) commented:

Can you elaborate on the migration plan? Is downtime expected during the migration, given there's no continuous support or auto-update logic?

To be safe, there would have to be downtime. Given the BETA status, this is okay.

I did have code within Outpost to perform a check on the structure of the key and dynamically run the migration. We could add that back if you feel strongly about it.

…tion

Add comprehensive support for Redis Enterprise clustering (Azure Managed Redis,
AWS ElastiCache cluster mode, GCP Memorystore cluster, etc.) with explicit
REDIS_CLUSTER_ENABLED configuration option.

## Key Changes

### Redis Cluster Support
- Add REDIS_CLUSTER_ENABLED configuration option for explicit cluster mode control
- Create RedisClient interface abstraction supporting both regular and cluster clients
- Update RSMQ to work with cluster-compatible client interface
- Fix RSMQ Lua scripts for cluster compatibility (ARGV vs KEYS for timestamps)
- Eliminate EXECABORT and CROSSSLOT errors when using Redis Enterprise clusters

### Configuration & Infrastructure
- Add ClusterEnabled field to RedisConfig struct in config and redis packages
- Update scheduler to create appropriate client type based on cluster configuration
- Maintain backward compatibility - defaults to false for existing deployments
- Support both redis.Client and redis.ClusterClient through interface abstraction

### Operational Tools & Documentation
- Add redis-debug diagnostic tool for testing Redis connectivity and cluster mode
- Create comprehensive Redis troubleshooting guide (docs/troubleshooting-redis.md)
- Update README with generic "Redis Cluster" configuration (not Azure-specific)
- Enhance .env.example with cluster configuration options
- Add Makefile target for redis-debug tool (make redis/debug)
- Update Azure examples with cluster-specific guidance

### Testing & Quality
- Add interface compatibility tests for both Redis client types
- Add Redis cluster configuration tests with backward compatibility validation
- Standardize logging by removing inappropriate fmt.Printf from internal packages
- Fix TypeScript moduleResolution configuration in examples

### Documentation Updates
- Update examples/azure/README.md with cluster configuration details
- Add troubleshooting documentation with diagnostic examples
- Update Zudoku config to include troubleshooting guide in navigation
- Enhance Azure deployment scripts with cluster-aware diagnostics

## Technical Details

**Root Cause Resolution:**
- EXECABORT errors: Fixed by using actual redis.ClusterClient instead of regular client
- CROSSSLOT errors: Fixed by updating RSMQ Lua scripts to use ARGV for non-key parameters
- Interface mismatches: Resolved through RedisClient interface abstraction

**Backward Compatibility:**
- Existing Redis deployments continue working unchanged
- Configuration defaults maintain current behavior (REDIS_CLUSTER_ENABLED=false)
- All existing tests pass with new interface abstraction

**Cluster Support:**
- Azure Managed Redis (Redis Enterprise)
- AWS ElastiCache for Redis (cluster mode)
- Google Cloud Memorystore for Redis (cluster mode)
- Self-hosted Redis Enterprise clusters

Resolves Redis clustering issues for enterprise managed Redis services while
maintaining full backward compatibility with single-node Redis deployments.
…tion

This commit centralizes Redis abstraction and enables cluster transactions through hash-tagged keys.

## Key Changes

### Redis Abstraction
- Centralized Redis client factory in `internal/redis/redis.go`
- Single entry point `redis.New()` returning `redis.Cmdable` interface
- Explicit cluster vs regular client selection based on configuration
- OpenTelemetry instrumentation support for both client types
- Separate scheduler client for RSMQ compatibility

### Hash-Tagged Key Format
- Updated Redis key patterns to use hash tags (`{tenantID}:*`)
- Restored atomic transactions for same-tenant operations
- EntityStore operations now use `TxPipelined` for consistency
- Hash slot validation ensures keys land on same cluster node

### Service Updates
- Updated all services to use centralized Redis abstraction
- Removed 600+ lines of legacy wrapper code
- Simplified Redis client initialization across codebase
- Enhanced error handling and connection management

### Migration & Validation
- Added `scripts/migrate-redis-keys.sh` for legacy key migration
- Added `scripts/test-hash-slots.sh` for cluster validation
- Added `scripts/test-transactions.sh` for transaction testing
- Comprehensive migration documentation in `REDIS_MIGRATION.md`

## Benefits
- ✅ Supports both single-node and cluster Redis deployments
- ✅ Restores atomic transactions for data consistency
- ✅ Eliminates CROSSSLOT errors in cluster mode
- ✅ Centralized Redis configuration and management
- ✅ Zero-downtime migration path for existing deployments

## Compatibility
- Works with both Redis cluster and single-node instances
- Hash tags are ignored by single-node Redis (backward compatible)
- Migration script required for existing deployments

Resolves cluster transaction issues and provides production-ready Azure Redis support.
@alexluong (Collaborator) commented:

Sharing a few tools / scripts / demo of what I put together to assist with the testing of this task.

First, I added a small cmd/seed script. It will make requests to Outpost (localhost:3333) to set up a bunch of tenants and destinations.

go run ./cmd/seed

Then, I introduced a Redis migration flow like so. Please check out the README here. I explained the thought process in a bit more detail. I also recorded a short video demo if you'd like to check that out.

REDIS_PASSWORD=password go run ./cmd/migrateredis -migration 001_hash_tags plan
REDIS_PASSWORD=password go run ./cmd/migrateredis -migration 001_hash_tags apply
REDIS_PASSWORD=password go run ./cmd/migrateredis -migration 001_hash_tags verify
REDIS_PASSWORD=password go run ./cmd/migrateredis -migration 001_hash_tags cleanup

A small idea behind this migrateredis tool is that it will store some migration state in Redis itself.

Note

Using this, we can consider introducing a step on startup to ensure that the Redis schema version matches the release version to avoid running Outpost without a proper migration.
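That startup guard could look something like this minimal sketch (hypothetical names; the real migrateredis state layout may differ): read the schema version the migration tool recorded in Redis and refuse to boot on a mismatch.

```go
package main

import (
	"errors"
	"fmt"
)

// expectedSchemaVersion is the migration this build requires (name
// borrowed from the 001_hash_tags migration above; illustrative).
const expectedSchemaVersion = "001_hash_tags"

// checkSchemaVersion compares the version stored in Redis (e.g. under a
// key the migration tool owns) against what the binary expects.
func checkSchemaVersion(stored string) error {
	if stored == "" {
		return errors.New("no redis schema version found; run migrateredis apply first")
	}
	if stored != expectedSchemaVersion {
		return fmt.Errorf("redis schema %q does not match expected %q", stored, expectedSchemaVersion)
	}
	return nil
}

func main() {
	// On startup this value would come from a GET against Redis.
	if err := checkSchemaVersion("000_legacy"); err != nil {
		fmt.Println("refusing to start:", err)
	}
}
```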

Lastly, I also renamed the key convention a bit, mainly to keep the tenant:{id}:* prefix. The reason is that the prefix helps group the data, so when visualizing it in a GUI tool it will be a lot clearer how we're using Redis. I elaborated on this idea further in this video here.

@alexluong (Collaborator) commented:

Another thing was the e2e test setup for Redis Cluster. It was quite a bit harder than I originally expected. What I ended up doing was this workflow:

# start other test infra containers
make up/test

# start Redis Cluster Docker container + an empty container as "test runner"
make up/test/rediscluster

# ssh into the test-runner container and run the RedisCluster basic test suite
make test/e2e/rediscluster

# take down the Redis Cluster & test-runner containers
make down/test/rediscluster

There are 2 main challenges:

  1. starting a local Redis cluster; ended up using a fork of a popular Redis Cluster dev container
  2. connecting to the local Redis cluster, specifically because of port mapping & some quirks around it; ended up implementing a dev-specific config

In general, this works out okay and should suit our current workflow. We can consider investing more time to make this flow better later if necessary. Happy to explain or share more if you have any other questions.

@alexluong (Collaborator) commented:

Ran the Azure example; needed to make a few updates, but everything worked as expected. I think we should be good to go.

The only thing left is to finalize the migration step. I'm good with your script, or we can recommend the migrateredis tool; either is fine from my POV.

log (I already cleaned up the resources, so no worries about exposed credentials):

➜ azure git:(chore/azure-managed-redis) ✗ ./dependencies.sh
🔧 Using OUTPOST_AZURE_ID: bab1ea
🔑 Generating new PostgreSQL password...
🔎 Checking if resource group 'outpost-azure-bab1ea' exists...
📦 Creating resource group...
🔎 Checking if Microsoft.DBforPostgreSQL is registered...
✅ PostgreSQL provider registered
🔎 Checking if PostgreSQL server 'outpost-pg-bab1ea' exists...
🐘 Creating PostgreSQL Flexible Server...
Checking the existence of the resource group 'outpost-azure-bab1ea'...
Resource group 'outpost-azure-bab1ea' exists ? : True
Creating PostgreSQL Server 'outpost-pg-bab1ea' in group 'outpost-azure-bab1ea'...
Your server 'outpost-pg-bab1ea' is using sku 'Standard_B1ms' (Paid Tier). Please refer to https://aka.ms/postgres-pricing for pricing details
(MissingSubscriptionRegistration) The subscription is not registered to use namespace 'Microsoft.DBforPostgreSQL'. See https://aka.ms/rps-not-found for how to register subscriptions.
Code: MissingSubscriptionRegistration
Message: The subscription is not registered to use namespace 'Microsoft.DBforPostgreSQL'. See https://aka.ms/rps-not-found for how to register subscriptions.
Exception Details: (MissingSubscriptionRegistration) The subscription is not registered to use namespace 'Microsoft.DBforPostgreSQL'. See https://aka.ms/rps-not-found for how to register subscriptions.
Code: MissingSubscriptionRegistration
Message: The subscription is not registered to use namespace 'Microsoft.DBforPostgreSQL'. See https://aka.ms/rps-not-found for how to register subscriptions.
Target: Microsoft.DBforPostgreSQL

➜ azure git:(chore/azure-managed-redis) ✗ ./dependencies.sh
🔧 Using OUTPOST_AZURE_ID: bab1ea
🔑 Generating new PostgreSQL password...
🔎 Checking if resource group 'outpost-azure-bab1ea' exists...
✅ Resource group exists
🔎 Checking if Microsoft.DBforPostgreSQL is registered...
Current state: NotRegistered
📥 Registering Microsoft.DBforPostgreSQL...
⏳ Waiting for registration to complete (this may take a few minutes)...

✅ PostgreSQL provider registered
🔎 Checking if PostgreSQL server 'outpost-pg-bab1ea' exists...
🐘 Creating PostgreSQL Flexible Server...
Checking the existence of the resource group 'outpost-azure-bab1ea'...
Resource group 'outpost-azure-bab1ea' exists ? : True
Creating PostgreSQL Server 'outpost-pg-bab1ea' in group 'outpost-azure-bab1ea'...
Your server 'outpost-pg-bab1ea' is using sku 'Standard_B1ms' (Paid Tier). Please refer to https://aka.ms/postgres-pricing for pricing details
Configuring server firewall rule to accept connections from '0.0.0.0' to '255.255.255.255'...
Make a note of your password. If you forget, you would have to reset your password with "az postgres flexible-server update -n outpost-pg-bab1ea -g outpost-azure-bab1ea -p ".
Try using 'az postgres flexible-server connect' command to test out connection.
{
"connectionString": "postgresql://outpostadmin:QjvNurofgJG6So11rnucK9vz@outpost-pg-bab1ea.postgres.database.azure.com/postgres?sslmode=require",
"databaseName": "postgres",
"firewallName": "AllowAll_2025-9-22_19-3-34",
"host": "outpost-pg-bab1ea.postgres.database.azure.com",
"id": "/subscriptions/aa821a91-8c97-4e38-995d-7025fbeab1ec/resourceGroups/outpost-azure-bab1ea/providers/Microsoft.DBforPostgreSQL/flexibleServers/outpost-pg-bab1ea",
"location": "West Europe",
"password": "QjvNurofgJG6So11rnucK9vz",
"resourceGroup": "outpost-azure-bab1ea",
"skuname": "Standard_B1ms",
"username": "outpostadmin",
"version": "17"
}
🔎 Checking if database 'outpost' exists...
📦 Creating database 'outpost'...
Creating database with utf8 charset and en_US.utf8 collation
{
"charset": "UTF8",
"collation": "en_US.utf8",
"id": "/subscriptions/aa821a91-8c97-4e38-995d-7025fbeab1ec/resourceGroups/outpost-azure-bab1ea/providers/Microsoft.DBforPostgreSQL/flexibleServers/outpost-pg-bab1ea/databases/outpost",
"name": "outpost",
"resourceGroup": "outpost-azure-bab1ea",
"systemData": null,
"type": "Microsoft.DBforPostgreSQL/flexibleServers/databases"
}
🔎 Checking if Microsoft.Cache is registered...
🔧 Installing Azure CLI Redis Enterprise extension...
📥 Installing redisenterprise extension...
🔎 Checking if Azure Managed Redis cluster 'outpost-redis-bab1ea' exists...
🔴 Creating Azure Managed Redis cluster...
Resource provider 'Microsoft.Cache' used by this operation is not registered. We are registering for you.
Registration succeeded.
{
"accessKeysAuthentication": "Enabled",
"clientProtocol": "Encrypted",
"clusteringPolicy": "OSSCluster",
"deferUpgrade": "NotDeferred",
"evictionPolicy": "VolatileLRU",
"id": "/subscriptions/aa821a91-8c97-4e38-995d-7025fbeab1ec/resourceGroups/outpost-azure-bab1ea/providers/Microsoft.Cache/redisEnterprise/outpost-redis-bab1ea/databases/default",
"name": "default",
"port": 10000,
"provisioningState": "Succeeded",
"redisVersion": "7.4",
"resourceGroup": "outpost-azure-bab1ea",
"resourceState": "Running",
"type": "Microsoft.Cache/redisEnterprise/databases"
}
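The database comes up with `"clusteringPolicy": "OSSCluster"`, which is exactly why the PR's hash-tag key format matters: in OSS Cluster mode, multi-key transactions only work when every key hashes to the same slot. A self-contained Go sketch of the slot calculation (CRC-16/XMODEM modulo 16384, with the `{...}` hash-tag rule from the Redis Cluster spec) shows the new key format co-locating all of a tenant's keys:

```go
package main

import (
	"fmt"
	"strings"
)

// crc16 implements CRC-16/XMODEM (poly 0x1021, init 0), the checksum
// Redis Cluster uses to map keys to slots.
func crc16(data []byte) uint16 {
	var crc uint16
	for _, b := range data {
		crc ^= uint16(b) << 8
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				crc = crc<<1 ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

// slot applies the hash-tag rule: if the key contains a non-empty
// {...} section, only that section is hashed, so every key sharing
// a tag lands in the same slot.
func slot(key string) uint16 {
	if open := strings.IndexByte(key, '{'); open >= 0 {
		rest := key[open+1:]
		if end := strings.IndexByte(rest, '}'); end > 0 {
			key = rest[:end]
		}
	}
	return crc16([]byte(key)) % 16384
}

func main() {
	for _, k := range []string{
		"{user123}:tenant",
		"{user123}:destinations",
		"{user123}:destination:abc",
	} {
		fmt.Printf("%-26s -> slot %d\n", k, slot(k))
	}
}
```

All three keys hash only the `user123` tag, so they land in one slot; the old `tenant:user123...` keys hashed their full names and could scatter across slots, producing the CROSSSLOT errors the PR description mentions.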
⏳ Waiting for cluster to be ready...
🔎 Checking if Microsoft.ServiceBus is registered...
📡 Checking if Service Bus namespace 'outpost-sb-bab1ea' exists...
📡 Creating Service Bus namespace...
⏳ Waiting for namespace to be ready...
📨 Creating topic 'outpost-delivery'...
🔔 Creating subscription 'outpost-delivery-sub' for topic 'outpost-delivery'...
⏳ Pausing for 5 seconds before creating next topic...
📨 Creating topic 'outpost-log'...
🔔 Creating subscription 'outpost-log-sub' for topic 'outpost-log'...
👤 Creating or updating service principal for Outpost access...
WARNING: Option '--sdk-auth' has been deprecated and will be removed in a future release.
WARNING: The output includes credentials that you must protect. Be sure that you do not include these credentials in your code or check the credentials into your source control. For more information, see https://aka.ms/azadsp-cli
🔐 Assigning Service Bus roles...
-> Assigning 'Azure Service Bus Data Owner'...
-> Waiting for 'Azure Service Bus Data Owner' to propagate...
-> ✅ 'Azure Service Bus Data Owner' confirmed.
✅ Done. Values written to .env.outpost

➜ azure git:(chore/azure-managed-redis) ✗ ./diagnostics.sh --local --webhook-url https://hkdk.events/kdy4kufhdo23gy
📄 Loading environment variables from .env.outpost...
📄 Loading environment variables from .env.runtime...
🔍 Validating required environment variables...
✅ All required env vars are set.
🔧 Using webhook URL from command line flag.
🌐 Testing network connectivity...
🌐 Testing outpost-pg-bab1ea.postgres.database.azure.com:5432 ... Connection to outpost-pg-bab1ea.postgres.database.azure.com port 5432 [tcp/postgresql] succeeded!
✅ Reachable
🌐 Testing outpost-redis-bab1ea.westeurope.redisenterprise.cache.azure.net:10000 ... Connection to outpost-redis-bab1ea.westeurope.redisenterprise.cache.azure.net port 10000 [tcp/ndmp] succeeded!
✅ Reachable
🌐 Testing outpost-sb-bab1ea.servicebus.windows.net:443 ... Connection to outpost-sb-bab1ea.servicebus.windows.net port 443 [tcp/https] succeeded!
✅ Reachable
🔐 Testing Azure Service Bus permissions...
(Getting Service Principal Object ID...)
(Checking for role: 'Azure Service Bus Data Owner' at Namespace scope...)
-> ✅ Service principal has the required 'Azure Service Bus Data Owner' role at the Namespace scope.
(Checking for role: 'Azure Service Bus Data Sender' at Namespace scope...)
-> No 'Azure Service Bus Data Sender' role found at Namespace scope.
(Checking for role: 'Azure Service Bus Data Sender' at Topic scope...)
-> No 'Azure Service Bus Data Sender' role found at Topic scope.
-> ✅ Permissions are sufficient for publishing.
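The script concludes "sufficient" even though both 'Data Sender' lookups came back empty, because 'Azure Service Bus Data Owner' is a superset of 'Data Sender'. A hypothetical Go sketch of that decision (the role names are the real Azure built-in roles; the function and its inputs are illustrative, not the script's actual code):

```go
package main

import "fmt"

// canPublish mirrors the diagnostics script's fallback order: an
// "Azure Service Bus Data Owner" assignment at namespace scope is
// sufficient on its own; otherwise "Azure Service Bus Data Sender"
// must be present at either namespace or topic scope.
func canPublish(namespaceRoles, topicRoles []string) bool {
	has := func(roles []string, want string) bool {
		for _, r := range roles {
			if r == want {
				return true
			}
		}
		return false
	}
	if has(namespaceRoles, "Azure Service Bus Data Owner") {
		return true
	}
	return has(namespaceRoles, "Azure Service Bus Data Sender") ||
		has(topicRoles, "Azure Service Bus Data Sender")
}

func main() {
	// The run above: Data Owner at namespace scope, nothing else.
	fmt.Println(canPublish([]string{"Azure Service Bus Data Owner"}, nil)) // true
}
```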

🩺 Running LOCAL Deployment Tests...

🐘 Testing PostgreSQL login...
✅ PostgreSQL login successful
🧪 Testing Azure Managed Redis connection on port 10000...
-> Testing with TLS encryption (skipping cert verification for Azure Managed Redis)...
PONG
✅ Azure Managed Redis responded to ping with TLS on port 10000
🚀 Testing Outpost API at http://localhost:3333...
(Creating tenant: diagnostics-tenant-x...)
-> ✅ Tenant created.
(Creating webhook destination...)
-> ✅ Webhook destination created.
(Publishing test event...)
-> ✅ Event published.
(Getting Outpost portal URL...)
-> ✅ View event details at: http://localhost:3333?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3NTg2MzE3MjIsImlhdCI6MTc1ODU0NTMyMiwiaXNzIjoib3V0cG9zdCIsInN1YiI6ImRpYWdub3N0aWNzLXRlbmFudC14In0.340ZcS4b2S3WnytTVc88uW_qg8M9jF14beARuRO1Tx0
(Testing destination deletion...)
-> ✅ Webhook destination deleted.
⏱️ Checking system time sync...
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
499bcf3c8ead: Pull complete
Digest: sha256:d82f458899c9696cb26a7c02d5568f81c8c8223f8661bb2a7988b269c8b9051e
Status: Downloaded newer image for busybox:latest
Mon Sep 22 12:48:51 UTC 2025
Mon Sep 22 19:48:51 +07 2025
🔍 Diagnostics complete.
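The portal URL printed above embeds a standard JWT (HS256) whose claims scope the link to the diagnostics tenant. A Go sketch decoding the payload of that token (decode only; verifying the signature would require the Outpost API's secret):

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// decodeClaims extracts a JWT's claims without verifying its
// signature: the payload segment is just base64url-encoded JSON.
func decodeClaims(token string) (map[string]any, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, fmt.Errorf("expected 3 segments, got %d", len(parts))
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, err
	}
	var claims map[string]any
	if err := json.Unmarshal(payload, &claims); err != nil {
		return nil, err
	}
	return claims, nil
}

func main() {
	// The token from the portal URL in the diagnostics run above.
	token := "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9." +
		"eyJleHAiOjE3NTg2MzE3MjIsImlhdCI6MTc1ODU0NTMyMiwiaXNzIjoib3V0cG9zdCIsInN1YiI6ImRpYWdub3N0aWNzLXRlbmFudC14In0." +
		"340ZcS4b2S3WnytTVc88uW_qg8M9jF14beARuRO1Tx0"
	claims, err := decodeClaims(token)
	if err != nil {
		panic(err)
	}
	ttl := claims["exp"].(float64) - claims["iat"].(float64)
	fmt.Printf("iss=%s sub=%s ttl=%.0fs\n", claims["iss"], claims["sub"], ttl)
	// Prints: iss=outpost sub=diagnostics-tenant-x ttl=86400s
}
```

The claims show the link is scoped to `sub=diagnostics-tenant-x` and expires 24 hours after issuance (exp - iat = 86400s).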

@alexluong marked this pull request as ready for review September 29, 2025 19:10
* refactor: outpost cli

* refactor: migrate redis cli

* feat: migrateredis status & current

* chore: migration verify & cleanup

* fix: migration check logic

* refactor: cli & migrator init

* refactor: cli flags

* refactor: redis config loader

* chore: startup cmd

* feat: outpost-migrate-redis init

* refactor: migrator

* fix: local dev entrypoint.sh

* chore: dev log

* fix: lock logic

* chore: simplify migration commands

* chore: rename `outpost migrate redis` to `outpost migrate`

* feat: text & json logger

* chore: rename

* build: goreleaser & production docker image

* chore: log level

* chore: outpost bin default behavior

* chore: startup & logging updates

* fix: cleanup 001_hash_tags bug

* chore: confirm text

* chore: make build/goreleaser

* docs: migration.mdx

* fix: goreleaser & tags

* docs: update migration.mdx
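The `001_hash_tags` migration named in the commits rewrites existing keys from the old flat format to the hash-tagged one described in the PR body. A hypothetical sketch of the pure rename step (the function name and non-tenant-key handling are assumptions; the real migrator also has to scan, copy, verify, and clean up, per the `migration verify & cleanup` commit):

```go
package main

import (
	"fmt"
	"strings"
)

// migrateKey maps the pre-cluster key format to the hash-tagged one:
//
//	tenant:user123                 -> {user123}:tenant
//	tenant:user123:destinations    -> {user123}:destinations
//	tenant:user123:destination:abc -> {user123}:destination:abc
func migrateKey(old string) string {
	parts := strings.SplitN(old, ":", 3)
	if len(parts) < 2 || parts[0] != "tenant" {
		return old // not tenant-scoped; leave untouched
	}
	if len(parts) == 2 {
		return "{" + parts[1] + "}:tenant"
	}
	return "{" + parts[1] + "}:" + parts[2]
}

func main() {
	for _, k := range []string{
		"tenant:user123",
		"tenant:user123:destinations",
		"tenant:user123:destination:abc",
	} {
		fmt.Printf("%-30s -> %s\n", k, migrateKey(k))
	}
}
```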