opentelekomcloud_networking_router_route_v2 does not handle disappeared routes well #2652

Open
pvbouwel opened this issue Sep 17, 2024 · 15 comments
@pvbouwel

pvbouwel commented Sep 17, 2024

Terraform provider version

1.35.2

Affected Resource(s)

opentelekomcloud_networking_router_route_v2

Terraform Configuration Files

resource "opentelekomcloud_networking_router_route_v2" "router_vpn_routes" {
  depends_on       = [opentelekomcloud_compute_instance_v2.compute_node[0]]
  for_each         = var.vpn_router_routes
  router_id        = var.router_id
  destination_cidr = each.value.destination_cidr
  next_hop         = var.vpngateway_ip
}
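
The variable declarations were not part of the report. A minimal sketch of what they might look like is shown here; the variable names come from the resource above, but the types and descriptions are assumptions inferred from how the values are used (`each.value.destination_cidr` implies a map of objects).

```hcl
# Hypothetical declarations: names match the references in the resource above,
# but the types and descriptions are assumptions, not taken from the report.
variable "router_id" {
  type        = string
  description = "ID of the router that receives the static routes"
}

variable "vpngateway_ip" {
  type        = string
  description = "IP address of the ECS instance running the software VPN"
}

variable "vpn_router_routes" {
  type = map(object({
    destination_cidr = string
  }))
  description = "Static routes to send through the VPN gateway, keyed by an arbitrary name"
}
```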

Debug Output/Panic Output

N/A

Steps to Reproduce

  1. terraform apply for your start setup
  2. Change the image of the compute instance forcing the replacement of the compute instance
  3. terraform apply
  4. terraform apply

Expected Behavior

On step 3 the compute instance should have been replaced and the routes that were defined should exist.

Actual Behavior

On step 3 the compute instance is replaced but all the old routes have disappeared.
On step 4, when applying again, Terraform notices that the routes are no longer in place, but rather than creating new routes it wants to replace them:

# module.vpngateway[0].opentelekomcloud_networking_router_route_v2.router_vpn_routes["key"] must be replaced
-/+ resource "opentelekomcloud_networking_router_route_v2" "router_vpn_routes" {
      + destination_cidr = "1.2.3.0/27" # forces replacement
      ~ id               = "f2150689-9c2e-4c5c-b03f-e95a63d03746-route-1.2.3.0/27-10.1.2.10" -> (known after apply)
      + next_hop         = "10.1.2.10" # forces replacement
      ~ region           = "eu-nl" -> (known after apply)
        # (1 unchanged attribute hidden)
    }

10.1.2.10 would be the IP-address associated with the ECS server

As a result, the apply fails with one error per route:

│ Error: route did not exist already

Important Factoids

The compute instance is just an ECS with source/destination check disabled, running a software VPN to set up a VPN tunnel.

References

I could not find any related GitHub issues.

@anton-sidelnikov
Member

Hi @pvbouwel, opentelekomcloud_networking_router_route_v2 is a deprecated resource. Please use opentelekomcloud_vpc_route_v2 instead, and update Terraform to the latest version.

@anton-sidelnikov anton-sidelnikov self-assigned this Sep 18, 2024
@pvbouwel
Author

@anton-sidelnikov https://registry.terraform.io/providers/opentelekomcloud/opentelekomcloud/latest/docs/resources/vpc_route_v2 (v1.36.18) does not allow this type of route.

opentelekomcloud_vpc_route_v2 only allows routes of type peering.

So if I use something like:

resource "opentelekomcloud_vpc_route_v2" "router_vpn_routes" {
  depends_on       = [opentelekomcloud_compute_instance_v2.compute_node[0]]
  type             = "peering"
  vpc_id           = "my-vpc-uuid"
  destination      = "192.168.254.254/32"
  nexthop          = opentelekomcloud_compute_instance_v2.compute_node[0].network[0].port
}

Then it fails with Error: error creating OpenTelekomCloud VPC route: Resource not found: [POST https://vpc.eu-nl.otc.t-systems.com/v2.0/vpc/routes], error message: {"NeutronError": {"detail": "", "message": "No VPC peering exist with id 6ebac2e9-56f3-4959-96f5-4d4a4dea55f3", "type": "VPCPeeringNotExist"}}. The error message suggests that only VPC peerings are considered as nexthop, which is not what I want in my use case.

In my use case I have to be able to add a static route to a certain interface/network port/IP in order to direct traffic into my VPN tunnel. And this interface is in a local account so no VPC Peering setup applies.

@anton-sidelnikov
Member

anton-sidelnikov commented Sep 18, 2024

@pvbouwel yes, sorry, my mistake, wrong resource. Please try opentelekomcloud_vpc_route_table_v1:
https://registry.terraform.io/providers/opentelekomcloud/opentelekomcloud/latest/docs/resources/vpc_route_table_v1

@pvbouwel
Author

@anton-sidelnikov That does come with a similar issue. If I change my ECS the plan shows it should:

  1. Replace the ECS
  2. Update the Route table in-place

Which is what I'd want but it fails with:
│ Error: error updating OpenTelekomCloud VPC route table: Bad request with: [PUT https://vpc.eu-nl.otc.t-systems.com/v1//routetables/], error message: {"code":"VPC.2800","message":"The same route is included in the route list."}

I also noticed that if you create a route table and later just add a subnet to the Terraform route table resource, without changing any routes, you get the same error.

So it seems to suffer from the same bug. Note that even if it gets resolved, there would be a regression in user experience, because with the new resource you must know all your routes up front. That hinders breaking up Terraform code into clean modules: I cannot put my VPN logic in a module and run it after my base networking module. IMHO a route should be a separate Terraform resource, and Terraform should manage the idempotency.
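
To illustrate the constraint: with opentelekomcloud_vpc_route_table_v1 every route is a nested block of the table itself, so whichever module owns the table has to receive all routes as input. A rough sketch, assuming hypothetical inputs `var.vpc_id` and `var.static_routes` (neither appears in this thread) and omitting other table arguments such as `subnets`:

```hcl
# Sketch only: var.vpc_id and var.static_routes are hypothetical inputs.
variable "vpc_id" {
  type = string
}

variable "static_routes" {
  type = list(object({
    destination = string
    type        = string
    nexthop     = string
  }))
}

resource "opentelekomcloud_vpc_route_table_v1" "table" {
  name   = "example-table"
  vpc_id = var.vpc_id

  # Every route must be known to the module that owns the table;
  # another module cannot attach a route to it afterwards.
  dynamic "route" {
    for_each = var.static_routes
    content {
      destination = route.value.destination
      type        = route.value.type
      nexthop     = route.value.nexthop
    }
  }
}
```

With a standalone route resource, by contrast, each module could declare its own routes against a shared router or table ID.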

Slightly off topic: how can we see what is deprecated and what isn't? Reading https://registry.terraform.io/providers/opentelekomcloud/opentelekomcloud/latest/docs/resources/networking_router_route_v2 you wouldn't guess it is deprecated. And migrating to the VPC resources is an expensive operation (it requires careful planning of Terraform state management when you don't have the luxury of downtime).

@pvbouwel
Author

For reproducing:

locals {
  test_network_name = "test-network-pvbouwel"
  test_subnet = "test-subnet-pvbouwel-1"
  test_router = "test-router-pvbouwel"
}

resource "opentelekomcloud_vpc_v1" "testnetwork" {
  name = local.test_network_name
  cidr = "10.0.0.0/16"
  description = "For https://github.com/opentelekomcloud/terraform-provider-opentelekomcloud/issues/2652"
}


resource "opentelekomcloud_vpc_subnet_v1" "subnet_1-1" {
  name       = local.test_subnet
  cidr       = "10.0.1.0/24"
  gateway_ip = "10.0.1.1"
  vpc_id     = opentelekomcloud_vpc_v1.testnetwork.id
}

resource "opentelekomcloud_vpc_route_table_v1" "table_1" {
  name        = local.test_router
  vpc_id      = opentelekomcloud_vpc_v1.testnetwork.id
  description = "created by terraform with routes"

  route {
    destination = "192.168.254.254/32"
    type        = "eni"
    nexthop     = opentelekomcloud_compute_instance_v2.compute_node[0].network[0].port
    description = "Example into VPN tunnel"
  }

  subnets = [
    opentelekomcloud_vpc_subnet_v1.subnet_1-1.id,
  ]
}

resource "opentelekomcloud_compute_instance_v2" "compute_node" {
  count           = 1

  name            = "pvbouwel-test"
  flavor_id       = "<choose>"
  image_name      = "<choose an image in your account>" # Changing this and running terraform apply again causes issues
  security_groups = ["<your-sec-group>"]
  key_pair        = "<your-keypair>"

  network {
    name = opentelekomcloud_vpc_v1.testnetwork.id
    fixed_ip_v4 = "10.0.1.30"
  }

  depends_on = [ 
    opentelekomcloud_vpc_subnet_v1.subnet_1-1
  ]
}

@anton-sidelnikov
Member

Hi @pvbouwel, yes, sorry. I will add a deprecation message to the docs; someone forgot to do that. I will investigate the issue today. It is strange behaviour, maybe an API issue. I will keep you informed of progress.

@anton-sidelnikov
Member

anton-sidelnikov commented Sep 20, 2024

Hi @pvbouwel, I found an issue in the API that leads to this forced recreation of the route, and I need to create an internal issue for it. The point is that when you create a route with type ENI, it appears in the API as ECS, which leads to state inconsistency. And in your scenario it is impossible to use a route of type ECS from the beginning, which is weird behaviour.
https://jira.tsi-dev.otc-service.com/browse/BM-5965 - internal ticket

@anton-sidelnikov anton-sidelnikov added the otc-issue Blocked by OTC issues label Sep 20, 2024
otc-zuul bot pushed a commit that referenced this issue Oct 1, 2024
…2669)

[Network] Doc deprecation message to router_route_v2 and refactoring

Summary of the Pull Request
The resource is not present in the docportal, which means that it is deprecated.
It still works, though, so it was refactored and a test case with instance recreation was added.
PR Checklist

 Refers to: #2652
 Tests added/passed.
 Documentation updated.
 Schema updated.
 Release notes added.

Acceptance Steps Performed
=== RUN   TestAccNetworkingV2RouterRoute_basic
=== PAUSE TestAccNetworkingV2RouterRoute_basic
=== CONT  TestAccNetworkingV2RouterRoute_basic
--- PASS: TestAccNetworkingV2RouterRoute_basic (233.55s)
PASS

=== RUN   TestAccNetworkingV2RouterRoute_ecs
=== PAUSE TestAccNetworkingV2RouterRoute_ecs
=== CONT  TestAccNetworkingV2RouterRoute_ecs
--- PASS: TestAccNetworkingV2RouterRoute_ecs (340.75s)
PASS

Debugger finished with the exit code 0

Reviewed-by: Artem Lifshits
Reviewed-by: Muneeb H. Jan <muneebhafeezjan@gmail.com>
@anton-sidelnikov
Member

anton-sidelnikov commented Oct 17, 2024

Hi @pvbouwel, could you check the resource opentelekomcloud_networking_router_route_v2 again? While we are waiting for the fixes for the new service, I provided some fixes for the old one. It seems to work for now, but I cannot say when it will be decommissioned.

@pvbouwel
Author

@anton-sidelnikov I am not sure I understand your question. What should I check for https://registry.terraform.io/providers/opentelekomcloud/opentelekomcloud/latest/docs/resources/vpc_route_v2 ? That one does work for VPC-peerings but that is the only supported type according to the terraform docs.

And the docs of the backend API do not mention the possible values; they only state that "peering" is the default: https://docs.otc.t-systems.com/ansible-collection-cloud/vpc_route_module.html

@anton-sidelnikov
Member

anton-sidelnikov commented Oct 17, 2024

@pvbouwel Sorry again, the same copy-paste problem. I meant to say: opentelekomcloud_networking_router_route_v2

@pvbouwel
Author

pvbouwel commented Nov 4, 2024

@anton-sidelnikov it took a while before I could test, but with the latest version I could pull ("1.36.23") it seems that refresh is broken and deletion is not really idempotent yet.

So I used:

locals {
  test_network_name = "test-network-pvbouwel"
  test_subnet = "test-subnet-pvbouwel-1"
  test_router = "test-router-pvbouwel"
}

resource "opentelekomcloud_networking_network_v2" "testnetwork" {
  name = local.test_network_name
  admin_state_up = true
}

resource "opentelekomcloud_networking_router_v2" "router" {
  name             = "${local.test_network_name}-router"
  admin_state_up   = true
}

resource "opentelekomcloud_networking_subnet_v2" "private_network_subnet" {
  name            = local.test_subnet
  network_id      = opentelekomcloud_networking_network_v2.testnetwork.id
  cidr            = "10.0.1.0/24"
  ip_version      = 4
}
resource "opentelekomcloud_networking_router_interface_v2" "private_network_router_interface" {
  router_id = opentelekomcloud_networking_router_v2.router.id
  subnet_id = opentelekomcloud_networking_subnet_v2.private_network_subnet.id
}

resource "opentelekomcloud_networking_router_route_v2" "router_vpn_routes" {
  depends_on       = [opentelekomcloud_compute_instance_v2.compute_node[0]]
  for_each         = toset(["192.168.0.1/32", "192.168.0.2/32"])
  router_id        = opentelekomcloud_networking_router_v2.router.id
  destination_cidr = each.value
  next_hop         = "10.0.1.30"
}


resource "opentelekomcloud_compute_instance_v2" "compute_node" {
  count           = 1

  name            = "pvbouwel-test"
  flavor_id       = "s3.large.8"
  image_name      = "my-image"
  security_groups = ["my-secgroup"]
  key_pair        = "my-keypair"

  metadata = {
    ssh_user   = "my-user"
  }

  network {
    name        = local.test_network_name
    fixed_ip_v4 = "10.0.1.30"
  }

  depends_on = [ 
    opentelekomcloud_networking_subnet_v2.private_network_subnet
  ]
}

If I then terraform taint 'opentelekomcloud_compute_instance_v2.compute_node[0]' and do a terraform apply the instance gets replaced and the routes are missing.

If I then re-run terraform apply it says "No changes. Your infrastructure matches the configuration." even though the static routes are missing.

If I taint a static route with terraform taint 'opentelekomcloud_networking_router_route_v2.router_vpn_routes["192.168.0.1/32"]', destruction fails:

opentelekomcloud_networking_router_route_v2.router_vpn_routes["192.168.0.1/32"]: Destroying... [id=61e4b20c-c5cf-4ce8-aedb-90686a7cfd25-route-192.168.0.1/32-10.0.1.30]
╷
│ Error: Can't find route to 192.168.0.1/32 via 10.0.1.30 on OpenTelekomCloud Neutron Router 61e4b20c-c5cf-4ce8-aedb-90686a7cfd25

It makes sense that the route cannot be found, but destruction should then just recognize that there is no work to do (because we expect Terraform to handle idempotency via the providers) and proceed to create the new route.
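
One possible way to avoid depending on refresh to notice the vanished routes is to tie route replacement to instance replacement with Terraform's `replace_triggered_by` lifecycle argument (available since Terraform 1.2). This was not suggested or tested in this thread; the sketch below reuses the resource from the configuration above and only adds the `lifecycle` block. The intent is that, because the routes depend on the instance, they are destroyed before the old instance and recreated after the new one.

```hcl
resource "opentelekomcloud_networking_router_route_v2" "router_vpn_routes" {
  depends_on       = [opentelekomcloud_compute_instance_v2.compute_node[0]]
  for_each         = toset(["192.168.0.1/32", "192.168.0.2/32"])
  router_id        = opentelekomcloud_networking_router_v2.router.id
  destination_cidr = each.value
  next_hop         = "10.0.1.30"

  lifecycle {
    # Assumption: planning the routes for replacement whenever the instance
    # ID changes keeps state and API in sync. Untested with this provider.
    replace_triggered_by = [
      opentelekomcloud_compute_instance_v2.compute_node[0].id
    ]
  }
}
```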

@muneeb-jan
Member

Hi @pvbouwel

Could you try like this?

locals {
  test_network_name = "test-network-pvbouwel"
  test_subnet       = "test-subnet-pvbouwel-1"
  test_router       = "test-router-pvbouwel"
}

resource "opentelekomcloud_networking_network_v2" "testnetwork" {
  name           = local.test_network_name
  admin_state_up = true
}

resource "opentelekomcloud_networking_router_v2" "router" {
  name           = "${local.test_network_name}-router"
  admin_state_up = true
}

resource "opentelekomcloud_networking_subnet_v2" "private_network_subnet" {
  name       = local.test_subnet
  network_id = opentelekomcloud_networking_network_v2.testnetwork.id
  cidr       = "10.0.1.0/24"
  ip_version = 4
}
resource "opentelekomcloud_networking_router_interface_v2" "private_network_router_interface" {
  router_id = opentelekomcloud_networking_router_v2.router.id
  subnet_id = opentelekomcloud_networking_subnet_v2.private_network_subnet.id
}

resource "opentelekomcloud_networking_port_v2" "instance_port_1" {
  name       = "my_port"
  network_id = opentelekomcloud_networking_network_v2.testnetwork.id
  fixed_ip {
    subnet_id  = opentelekomcloud_networking_subnet_v2.private_network_subnet.id
    ip_address = "10.0.1.30"
  }
}


resource "opentelekomcloud_networking_router_route_v2" "router_vpn_routes" {
  for_each         = toset(["192.168.0.1/32", "192.168.0.2/32"])
  router_id        = opentelekomcloud_networking_router_v2.router.id
  destination_cidr = each.value
  next_hop         = "10.0.1.30"

  depends_on = [opentelekomcloud_compute_instance_v2.compute_node]
}

resource "opentelekomcloud_compute_instance_v2" "compute_node" {
  name            = "pvbouwel-test"
  flavor_id       = "s3.large.8"
  image_name      = "image"
  security_groups = ["default"]
  key_pair        = "your-keypair"

  metadata = {
    ssh_user = "my-user"
  }

  network {
    port = opentelekomcloud_networking_port_v2.instance_port_1.id
  }

  depends_on = [
    opentelekomcloud_networking_subnet_v2.private_network_subnet
  ]
}

@anton-sidelnikov
Member

@pvbouwel the main idea above is to attach the instance by port rather than by network name. It seems that the port behaves differently and triggers recreation of the routes.

@pvbouwel
Author

pvbouwel commented Nov 8, 2024

Thanks @muneeb-jan, this indeed makes it possible to manipulate the ECS without getting routing issues, since the port keeps existing.

@anton-sidelnikov to me this is an acceptable workaround for opentelekomcloud_networking_router_route_v2 and our current setup.

Just curious, as you mentioned the deprecation of opentelekomcloud_networking_router_route_v2: the strategic solution is to migrate to VPC. If I understand correctly there won't be a need for the intermediate port, but it requires an internal fix ( https://jira.tsi-dev.otc-service.com/browse/BM-5965 ). Is that correct?

Is there planned work to close the gap between opentelekomcloud_vpc_route_v2 and opentelekomcloud_networking_router_route_v2? opentelekomcloud_vpc_route_v2 only supports routes of type peering, and it seems other routes can only be specified in opentelekomcloud_vpc_route_table_v1 for VPC. That means you need to know all routes (except peering ones) at route table creation time, which causes more dependencies between Terraform modules.

For example, if you have a base module that creates your VPC and router, and you want an optional module that configures a NAT gateway, then the base module needs to know about the NAT gateway module and be the one calling it, because it needs to retrieve the NAT gateway IP to configure the route. With the old resources you don't have that limitation, because the NAT module just needs to know the network it arrives into and can manage its own route resource (with the workaround port + route resource).
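
The decoupled layout that the old resources allow could look roughly like this. It is a sketch only: the module paths, the module names, the `router_id` output of the base module, and the placeholder next-hop IP are all hypothetical.

```hcl
# --- Root module (hypothetical paths and names) ---
module "base_network" {
  source = "./modules/base-network" # assumed to expose a router_id output
}

module "nat_gateway" {
  source    = "./modules/nat-gateway"
  router_id = module.base_network.router_id
}

# --- Inside ./modules/nat-gateway ---
variable "router_id" {
  type = string
}

# The optional module owns its route; the base module never has to know
# about the NAT gateway or its IP address.
resource "opentelekomcloud_networking_router_route_v2" "nat_default" {
  router_id        = var.router_id
  destination_cidr = "0.0.0.0/0"
  next_hop         = "10.0.1.40" # placeholder IP of the NAT instance's port
}
```

The dependency points only one way here, from the optional module to the base module, which is the property that gets lost when all routes have to be declared inside the route table resource.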

@anton-sidelnikov
Member

anton-sidelnikov commented Nov 8, 2024

Hi @pvbouwel,
Regarding "requires an internal fix ( https://jira.tsi-dev.otc-service.com/browse/BM-5965 ). Is that correct?": yes, that’s correct.

As for opentelekomcloud_vpc_route_v2, I haven’t heard of any plans to improve this resource; it’s likely they may even deprecate it. I believe the main goal is to fix everything within the API for opentelekomcloud_vpc_route_table_v1.

Regarding the last sentence, we’ll likely discuss potential solutions for this.
