
feat: Add Microsoft Azure integration #3074

Open
Tamiru-Alemnew wants to merge 12 commits into superplanehq:main from Tamiru-Alemnew:feat/azure-integration

Conversation

@Tamiru-Alemnew

Implements #2921

Summary

This PR adds the Microsoft Azure integration to SuperPlane, including one trigger (onVirtualMachineCreated) and one action (createVirtualMachine).
It uses Azure Workload Identity Federation (OIDC) for authentication, supports implicit NIC creation from a subnet, and returns rich connection details (Public IP, Private IP, Admin Username) after VM creation.

Backend Implementation

  • Added a new Azure integration under pkg/integrations/azure and registered it in the main server.
  • Implemented OIDC-based auth using azidentity.NewClientAssertionCredential and AZURE_FEDERATED_TOKEN_FILE, with a seamless fallback to az login for local development (no client secrets are stored).
  • Implemented the createVirtualMachine action:
    • Supports core VM options: Resource Group, Region, VM Name, Image, Size.
    • Supports disk configuration (e.g. Premium_LRS, StandardSSD_LRS).
    • Implements implicit NIC creation: users provide a Virtual Network and Subnet, and the integration automatically creates the NIC (and optional Public IP, Standard SKU) and attaches it.
    • Supports Cloud-init / Custom Data, Base64-encoding it before sending to Azure.
    • Uses Azure SDK LROs (BeginCreateOrUpdate + PollUntilDone) to wait for the VM to be ready.
    • After creation, looks up the primary NIC and (optionally) Public IP to return vmId, privateIp, publicIp, and adminUsername.
  • Implemented the onVirtualMachineCreated trigger:
    • Listens to Azure Event Grid Microsoft.Resources.ResourceWriteSuccess events.
    • Handles the Event Grid subscription validation handshake (Microsoft.EventGrid.SubscriptionValidationEvent).
    • Normalizes the incoming event into the trigger payload used by workflows.
  • Refactored magic strings into constants for resource providers, SKUs, and IP allocation methods, and cleaned up comments / dead code.
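The subscription validation handshake mentioned above can be sketched as follows. This is a minimal illustration, not the PR's actual handler: the function name `validationResponse` and the struct names are hypothetical, and it assumes the default Event Grid event schema, where the webhook receives a JSON array of events and must echo the `validationCode` back in a `validationResponse` field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event type Azure Event Grid sends when a webhook subscription is created.
const validationEventType = "Microsoft.EventGrid.SubscriptionValidationEvent"

// eventGridEvent holds only the fields this sketch needs (hypothetical type).
type eventGridEvent struct {
	EventType string          `json:"eventType"`
	Data      json.RawMessage `json:"data"`
}

type validationData struct {
	ValidationCode string `json:"validationCode"`
}

// validationResponse inspects a webhook body. If it contains a validation
// event, it returns the JSON body to send back and true. For ordinary
// events it returns false so the caller can continue normal processing.
func validationResponse(body []byte) ([]byte, bool, error) {
	var events []eventGridEvent
	if err := json.Unmarshal(body, &events); err != nil {
		return nil, false, fmt.Errorf("decode Event Grid payload: %w", err)
	}
	for _, ev := range events {
		if ev.EventType != validationEventType {
			continue
		}
		var data validationData
		if err := json.Unmarshal(ev.Data, &data); err != nil {
			return nil, false, fmt.Errorf("decode validation data: %w", err)
		}
		out, err := json.Marshal(map[string]string{"validationResponse": data.ValidationCode})
		if err != nil {
			return nil, false, err
		}
		return out, true, nil
	}
	return nil, false, nil
}

func main() {
	body := []byte(`[{"eventType":"Microsoft.EventGrid.SubscriptionValidationEvent","data":{"validationCode":"abc"}}]`)
	resp, ok, err := validationResponse(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ok, string(resp))
}
```

The caller would write the returned bytes with a 200 status; whether the framework allows a custom response body is a separate question for the integration.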

Frontend Implementation

  • Added Azure mappers under web_src/src/pages/workflowv2/mappers/azure to wire the createVirtualMachine action and onVirtualMachineCreated trigger into the workflow builder.
  • Implemented dynamic dropdowns for:
    • Resource Groups
    • VM Sizes (per region)
    • Virtual Networks (per resource group)
    • Subnets (per virtual network)
  • Updated integration display name mapping so "azure" renders as Azure in the UI.
  • Added an Azure icon and wired it into the integration and building-block sidebars.

Docs

  • Implemented integration docs in pkg/integrations/azure/README.md and generated the user-facing doc docs/components/Microsoft Azure.mdx via make gen.components.docs.
  • Documented:
    • Dual-mode authentication (Production OIDC via federated credential, Local Dev via az login).
    • How to configure the onVirtualMachineCreated trigger using Azure Event Grid.
    • How to use the createVirtualMachine action, including:
      • Required basics (RG, Region, Image, Size).
      • Networking via implicit NIC creation from a VNet + Subnet.
      • Advanced options (disk types, Public IP, Cloud-init).
    • Prerequisites (App Registration, Federated Credential, required RBAC roles such as Contributor on the target Resource Group).
    • Common troubleshooting steps for OIDC issuer / token issues and local tunnel URLs.

Tests

  • Added / updated unit tests in pkg/integrations/azure to cover:
    • Request validation (including implicit NIC creation rules).
    • VM creation behavior and rich output mapping.
    • Event Grid webhook handling and subscription validation.
  • Ensured make test and linting pass locally.

Notes

  • The createVirtualMachine action currently targets the Azure Compute + Network APIs via the official Go SDK; support for additional advanced VM options can be added in future PRs if needed.
  • Local development assumes a publicly reachable base URL (e.g. via localhost.run) so Azure can reach the webhook endpoints.

Demo

Signed-off-by: Tamiru Alemnew <tamirualemnew33@gmail.com>
@AleksandarCole added the pr:stage-1/3 (Needs to pass basic review) and wfh labels on Feb 12, 2026
…ctions

- Add webhook secret verification in HandleWebhook
- Refactor trigger to use helper functions from webhook_events.go
- Fix Go formatting issues
- Note: Event Grid validation response body limitation documented (framework limitation)

Signed-off-by: Tamiru Alemnew <tamirualemnew33@gmail.com>
// For now, if a SAS token is present, we'll allow it (this is a basic implementation)
ctx.Logger.Debug("Azure Event Grid SAS token found in header")
return nil
}

SAS token header bypasses webhook secret authentication

High Severity

When a webhook secret is configured, the authenticateWebhook method allows any request that includes an aeg-sas-token header, regardless of the header's value. An attacker can bypass the secret check entirely by sending any arbitrary value in that header. The SAS token is never validated — the code just checks for its presence and returns nil.


// Check for custom secret header
secretHeader := ctx.Headers.Get("X-Webhook-Secret")
if secretHeader != "" {
if secretHeader != string(secret) {

Webhook secret compared without constant-time equality

Medium Severity

The webhook secret comparisons use plain != string comparison instead of crypto/subtle.ConstantTimeCompare, making them vulnerable to timing side-channel attacks. Other integrations in this codebase (gitlab, render, slack) consistently use constant-time comparison for secret validation.
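A minimal sketch of the suggested comparison follows. The helper name `secretsMatch` is hypothetical. Hashing both values first is an extra step assumed here, not something the review asks for: `subtle.ConstantTimeCompare` returns immediately when the input lengths differ, so comparing fixed-length digests avoids revealing the secret's length.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// secretsMatch (hypothetical helper) compares two secrets in constant time.
// Both sides are hashed so the compared slices always have equal length.
func secretsMatch(provided, expected string) bool {
	p := sha256.Sum256([]byte(provided))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(p[:], e[:]) == 1
}

func main() {
	fmt.Println(secretsMatch("s3cret", "s3cret"))
	fmt.Println(secretsMatch("guess", "s3cret"))
}
```

Callers should still reject an empty configured secret before calling a helper like this, since two empty strings compare as equal.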



@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


if config.ResourceGroup != "" && resourceGroup != config.ResourceGroup {
ctx.Logger.Debugf("Skipping VM event for resource group %s (filter: %s)", resourceGroup, config.ResourceGroup)
return nil
}

Case-sensitive resource group comparison breaks Azure filtering

Medium Severity

The resource group filter uses a case-sensitive comparison (resourceGroup != config.ResourceGroup), but Azure resource group names are case-insensitive. Azure ARM may return different casing in resource IDs than what the user originally specified, causing valid events to be silently filtered out. This comparison needs strings.EqualFold instead.
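The case-insensitive filter could look roughly like this. Both helper names are hypothetical, and the resource ID layout (`/subscriptions/.../resourceGroups/<name>/providers/...`) is the standard ARM form; how the PR actually extracts the resource group from the event is not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceGroupFromID (hypothetical helper) returns the segment that follows
// "resourceGroups" in an ARM resource ID, or "" if there is none.
func resourceGroupFromID(id string) string {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	for i := 0; i+1 < len(parts); i++ {
		if strings.EqualFold(parts[i], "resourceGroups") {
			return parts[i+1]
		}
	}
	return ""
}

// matchesResourceGroup (hypothetical helper) treats an empty filter as
// "match everything" and otherwise compares names without regard to case.
func matchesResourceGroup(filter, actual string) bool {
	if filter == "" {
		return true
	}
	return strings.EqualFold(filter, actual)
}

func main() {
	id := "/subscriptions/0000/resourceGroups/MY-RG/providers/Microsoft.Compute/virtualMachines/vm1"
	rg := resourceGroupFromID(id)
	fmt.Println(rg, matchesResourceGroup("my-rg", rg))
}
```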


@AleksandarCole added the pr:stage-2/3 (Needs to pass functional review) label and removed the pr:stage-1/3 (Needs to pass basic review) label on Feb 13, 2026
@shiroyasha
Collaborator

@Tamiru-Alemnew hey, great start!

I see that you included the SDK into the application. Is this necessary?
We didn't use the SDK for AWS afaik. These are quite big packages to have just for one integration.

Looping in @lucaspin as well.

@Tamiru-Alemnew
Author

Hi @shiroyasha @lucaspin,

Great question.

I was intentional about pulling in the Azure SDK modules (azidentity, armcompute, armnetwork) rather than writing raw HTTP clients.

While it's true we keep the AWS integration lightweight (mostly using the SDK Core + Signer), Azure presents three specific challenges that make the official SDK the safer choice here:

1. Authentication Complexity (OIDC to ARM)
For Azure, we are using Workload Identity Federation. The environment provides an OIDC token via AZURE_FEDERATED_TOKEN_FILE, and the Azure SDK’s azidentity module securely exchanges that assertion for an ARM access token.

Re-implementing this handshake manually—including securely reading the token file, performing the token exchange, caching, and automatic refreshing—would be significant custom code and a potential security risk. The SDK handles this standard flow out of the box.

2. Long-Running Operations (LROs)
Provisioning a VM in Azure is an asynchronous ARM operation. The initial request returns 201 Created along with an Azure-AsyncOperation header whose URL must be polled until the operation completes.

  • Without SDK: We would have to write a custom poller with backoff logic to track the operation status.
  • With SDK: We get BeginCreateOrUpdate, which includes a native, robust poller (PollUntilDone). This guarantees our "Create VM" action doesn't hang or return prematurely.

3. Complex Typed Resources
Azure ARM payloads (for VirtualMachine, NetworkInterface, PublicIP) are deeply nested and versioned (e.g., 2023-03-01). The SDK provides strongly typed structs that catch configuration errors at compile time. This is critical for keeping the "Create VM" action stable as we map the complex User Inputs (Subnets, Disks, Cloud-init) to the API.

Mitigation
To minimize bloat, I have scoped the imports strictly to the modules we need (azidentity and specific arm* packages), rather than importing the entire Azure SDK bundle.

I believe this approach strikes the right balance between binary size and maintainability for the initial integration.

Signed-off-by: Tamiru Alemnew <tamirualemnew33@gmail.com>
