feat(agent): ipv6 address-family enablement + route filtering#404
feat(agent): ipv6 address-family enablement + route filtering#404chet wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
Still chipping away at [IPv6 support](NVIDIA#84). Work thus far has included: - Moving to `IpNetwork` and `IpAddress` throughout ([NVIDIA#192](NVIDIA#192)). - Accepting IPv6 site prefixes and network segments. ([NVIDIA#204](NVIDIA#204)). - Making the IP allocator family-aware ([NVIDIA#217](NVIDIA#217)). - Making the prefix allocator family-aware ([NVIDIA#237](NVIDIA#237)). - Removing some more API guards and enhancing the `IdentifyAddressFamily` trait ([NVIDIA#324](NVIDIA#324)). - Adding `AAAA` record support to DNS ([NVIDIA#332](NVIDIA#332)). - Backend DHCP plumbing updates ([NVIDIA#335](NVIDIA#335)). - Adding a new `ResourcePoolType::Ipv6` type ([NVIDIA#344](NVIDIA#344)). - Adding a new `ResourcePoolType::Ipv6Prefix` type ([NVIDIA#345](NVIDIA#345)). - Widening agent config types from `Ipv4Addr` to `IpAddr` ([NVIDIA#360](NVIDIA#360)). - Removing IPv4 filter from DPU interface address planning ([NVIDIA#372](NVIDIA#372)). - Making `ServicesAddresses` + name resolution dual-stack ([NVIDIA#385](NVIDIA#385)). *This* PR enables IPv6 route exchange support on the DPUs, with filtering plumbing, to make it such that IPv6 routes are subject to the same `prefix-list` and `route-map` controls as IPv4. Before this change, every BGP `address-family` block only configured `ipv4-unicast`. The DPU couldn't: - Exchange IPv6 routes with hosts. - Advertise IPv6 prefixes into the EVPN overlay. - Redistribute IPv6 connected routes. ..but now that the parts of the agent stack below (including interface planning, `ServiceAddresses`, and our internal DNS resolution) all support IPv6, we can change this. `nvue` configuration has a few options that build on eachother: ``` prefix-list: Defines which prefixes are allowed/denied. route-map: References prefix-lists and adds match conditions for ipv4 or ipv6, setting attributes on matched routes. peer-group policy: References route-maps (or prefix-lists directly) under ipv4-unicast or ipv6-unicast address-family. This is where the filtering is actually applied to sessions. ``` Each "layer" is `address-family` aware. A `route-map` rule with `type: ipv4` only matches IPv4 routes, and IPv6 routes skip it entirely, so we need: - Separate IPv6 `prefix-lists`. - Separate IPv6 `route-maps` (with `type: ipv6`). - Separate `ipv6-unicast` policy references pointing to them. To make this work with the templating, we now have a `split_prefixes_by_family()` function that parses the `anycast_site_prefixes` and `traffic_intercept_public_prefixes`, and buckets them into IPv4 or IPv6. Unparseable prefixes (since they just come in as `String`) fall back to IPv4, mainly for backwards compatibility: previously we passed all IPv4 prefixes through as `String` without *any* parsing, and/but I had to introduce it here to separate them. We then separate `anycast_site_prefixes` into two groups: - `AnycastSitePrefixes` (the existing IPv4 field) - `AnycastSitePrefixesIpv6` ..and do the same for `traffic_intercept_public_prefixes` with: - `TrafficInterceptPublicPrefixes` (the existing IPv4 field) - `TrafficInterceptPublicPrefixesIpv6` Added unit tests to cover the splitting behavior -- mixed, v4 only, v6 only, empty, and the "unparseable" v4 To accompany these new sorted groupings, we have the following IPv6 `prefix-lists`: - `DPU_FROM_INSTANCE_PREFIX_LIST_IPV6`, which is the same as the IPv4 list, but iterates over `AnycastSitePrefixesIpv6`. - `DPU_FROM_TRAFFIC_INTERCEPT_PEER_PREFIX_LIST_IPV6, which again, same as IPv4, but iterates over `TrafficInterceptPublicPrefixesIpv6`. ...and the following IPv6 `route-maps`: - `dpu_from_instance_ipv6`, which mirrors `dpu_from_instance`, but with `type: ipv6` match rules. There are two permit rules (one for BYOIP community tagged routes w/ tag 65100, one for non-BYOIP w/ tag 65101, and then `deny-all`). - `dpu_from_traffic_intercept_gw_peer_ipv6`, same pattern for traffic intercept inbound. ...and then updated `peer-group` sections with `ipv6-unicast`: - `tenant`: - inbound: `dpu_from_instance_ipv6` - outbound `dpu_to_instance` - `traffic_intercept_gw_peer`: - inbound: `dpu_from_traffic_intercept_gw_peer_ipv6` - outbound: `dpu_to_traffic_intercept_gw_peer` - `underlay`: - outbound: `leak_to_underlay` (reused) Updated all templates to include the new `prefix-lists`, `route-maps` and `peer-group` policy blocks. We could add `type: ipv6` rules to the existing `route-maps` instead of creating new ones, but it seemed nicer to leave the existing IPv4 stuff untouched, and to also just keep things separate in general. We can combine if it seems like a better approach. Signed-off-by: Chet Nichols III <chetn@nvidia.com>
| }); | ||
| ipv6_idx += 1; | ||
| } | ||
| Err(_) => { |
There was a problem hiding this comment.
NVUE would hard fail if the address were invalid. Since you're now doing the work here explicitly, I would actually warn and drop or just hard fail. Warn and drop doesn't feel bad at all.
|
|
||
| #[test] | ||
| fn test_split_prefixes_by_family() { | ||
| let prefixes = vec![ |
There was a problem hiding this comment.
Can't forget our friend embedded ipv4: ::ffff:192.0.2.33
Is it v4? is it v6? Neither? Both? 🥲
Definitely keep separate. |
| /// into IPv4 and IPv6 buckets. Each bucket gets sequential indices | ||
| /// starting at `start_index`. | ||
| /// | ||
| /// Previously we just passed through IPv4 addresses as strings |
There was a problem hiding this comment.
site_fabric_prefixes can contain both v4 and v6, so it needs the same treatment.
The traffic_intercept_public_prefixes and anycast_site_prefixes
prefix lists are filtered in the API config, so they should only contain IPv4 by the time it gets here (but of course +1 for defense-in-depth), but then it seems like it would be good to just allow v4 and v6 in the both of those lists so they behave like site_fabric_prefixes.
deny_prefixes is also kind of a funny one:
I wondered if we'd end up with stuff like site_fabric_prefixes and site_fabric_prefixes_v6 in the api config, but the splitting here is nice because it gives us an excuse to parse the addresses again and keeps the API config from having blah and blah_v6 all over the place.
kensimon
left a comment
There was a problem hiding this comment.
Haven't looked at the actual nvue stuff yet, just a dumb nit
| fn split_prefixes_by_family(prefixes: &[String], start_index: usize) -> (Vec<Prefix>, Vec<Prefix>) { | ||
| let mut ipv4 = Vec::new(); | ||
| let mut ipv6 = Vec::new(); | ||
| let mut ipv4_idx = start_index; | ||
| let mut ipv6_idx = start_index; | ||
| for s in prefixes { | ||
| match s.parse::<IpNet>() { | ||
| Ok(IpNet::V4(_)) => { | ||
| ipv4.push(Prefix { | ||
| Index: format!("{ipv4_idx}"), | ||
| Prefix: s.clone(), | ||
| }); | ||
| ipv4_idx += 1; | ||
| } | ||
| Ok(IpNet::V6(_)) => { | ||
| ipv6.push(Prefix { | ||
| Index: format!("{ipv6_idx}"), | ||
| Prefix: s.clone(), | ||
| }); | ||
| ipv6_idx += 1; | ||
| } | ||
| Err(_) => { | ||
| // ...see comment above around how we used to | ||
| // not parse at all, and why I'm just treating | ||
| // these as IPv4 addresses to maintain the same | ||
| // behavior. | ||
| ipv4.push(Prefix { | ||
| Index: format!("{ipv4_idx}"), | ||
| Prefix: s.clone(), | ||
| }); | ||
| ipv4_idx += 1; | ||
| } | ||
| } | ||
| } | ||
| (ipv4, ipv6) | ||
| } |
There was a problem hiding this comment.
Nit: Using partition might make this a bit more readable
| fn split_prefixes_by_family(prefixes: &[String], start_index: usize) -> (Vec<Prefix>, Vec<Prefix>) { | |
| let mut ipv4 = Vec::new(); | |
| let mut ipv6 = Vec::new(); | |
| let mut ipv4_idx = start_index; | |
| let mut ipv6_idx = start_index; | |
| for s in prefixes { | |
| match s.parse::<IpNet>() { | |
| Ok(IpNet::V4(_)) => { | |
| ipv4.push(Prefix { | |
| Index: format!("{ipv4_idx}"), | |
| Prefix: s.clone(), | |
| }); | |
| ipv4_idx += 1; | |
| } | |
| Ok(IpNet::V6(_)) => { | |
| ipv6.push(Prefix { | |
| Index: format!("{ipv6_idx}"), | |
| Prefix: s.clone(), | |
| }); | |
| ipv6_idx += 1; | |
| } | |
| Err(_) => { | |
| // ...see comment above around how we used to | |
| // not parse at all, and why I'm just treating | |
| // these as IPv4 addresses to maintain the same | |
| // behavior. | |
| ipv4.push(Prefix { | |
| Index: format!("{ipv4_idx}"), | |
| Prefix: s.clone(), | |
| }); | |
| ipv4_idx += 1; | |
| } | |
| } | |
| } | |
| (ipv4, ipv6) | |
| } | |
| fn split_prefixes_by_family( | |
| prefixes: Vec<String>, | |
| start_index: usize, | |
| ) -> (Vec<Prefix>, Vec<Prefix>) { | |
| let (ipv4_prefixes, ipv6_prefixes): (Vec<_>, Vec<_>) = prefixes.into_iter().partition(|s| { | |
| // treat parse errors as IPv4 | |
| // ...see comment above around how we used to | |
| // not parse at all, and why I'm just treating | |
| // these as IPv4 addresses to maintain the same | |
| // behavior. | |
| matches!(s.parse::<IpNet>(), Ok(IpNet::V4(_)) | Err(_)) | |
| }); | |
| let ipv4 = ipv4_prefixes | |
| .into_iter() | |
| .enumerate() | |
| .map(|(idx, s)| Prefix { | |
| Index: format!("{}", idx + start_index), | |
| Prefix: s, | |
| }) | |
| .collect::<Vec<_>>(); | |
| let ipv6 = ipv6_prefixes | |
| .into_iter() | |
| .enumerate() | |
| .map(|(idx, s)| Prefix { | |
| Index: format!("{}", idx + start_index), | |
| Prefix: s, | |
| }) | |
| .collect::<Vec<_>>(); | |
| (ipv4, ipv6) | |
| } | |
Description
Still chipping away at IPv6 support. This definitely need's @bcavnvidia's eyes.
Work thus far has included:
IpNetworkandIpAddressthroughout (#192).IdentifyAddressFamilytrait (#324).AAAArecord support to DNS (#332).ResourcePoolType::Ipv6type (#344).ResourcePoolType::Ipv6Prefixtype (#345).Ipv4AddrtoIpAddr(#360).ServicesAddresses+ name resolution dual-stack (#385).This PR enables IPv6 route exchange support on the DPUs, with filtering plumbing, to make it such that IPv6 routes are subject to the same
prefix-listandroute-mapcontrols as IPv4.Before this change, every BGP
address-familyblock only configuredipv4-unicast. The DPU couldn't:..but now that the parts of the agent stack below (including interface planning,
ServiceAddresses, and our internal DNS resolution) all support IPv6, we can change this.nvueconfiguration has a few options that build on eachother:Each "layer" is
address-familyaware. Aroute-maprule withtype: ipv4only matches IPv4 routes, and IPv6 routes skip it entirely, so we need:prefix-lists.route-maps(withtype: ipv6).ipv6-unicastpolicy references pointing to them.To make this work with the templating, we now have a
split_prefixes_by_family()function that parses theanycast_site_prefixesandtraffic_intercept_public_prefixes, and buckets them into IPv4 or IPv6. Unparseable prefixes (since they just come in asString) fall back to IPv4, mainly for backwards compatibility: previously we passed all IPv4 prefixes through asStringwithout any parsing, and/but I had to introduce it here to separate them.We then separate
anycast_site_prefixesinto two groups:AnycastSitePrefixes(the existing IPv4 field)AnycastSitePrefixesIpv6..and do the same for
traffic_intercept_public_prefixeswith:TrafficInterceptPublicPrefixes(the existing IPv4 field)TrafficInterceptPublicPrefixesIpv6Added unit tests to cover the splitting behavior -- mixed, v4 only, v6 only, empty, and the "unparseable" v4
To accompany these new sorted groupings, we have the following IPv6
prefix-lists:DPU_FROM_INSTANCE_PREFIX_LIST_IPV6, which is the same as the IPv4 list, but iterates overAnycastSitePrefixesIpv6.DPU_FROM_TRAFFIC_INTERCEPT_PEER_PREFIX_LIST_IPV6, which again, same as IPv4, but iterates overTrafficInterceptPublicPrefixesIpv6`....and the following IPv6
route-maps:dpu_from_instance_ipv6, which mirrorsdpu_from_instance, but withtype: ipv6match rules. There are two permit rules (one for BYOIP community tagged routes w/ tag 65100, one for non-BYOIP w/ tag 65101, and thendeny-all).dpu_from_traffic_intercept_gw_peer_ipv6, same pattern for traffic intercept inbound....and then updated
peer-groupsections withipv6-unicast:tenant:dpu_from_instance_ipv6dpu_to_instance(notypeso we just use for both)traffic_intercept_gw_peer:dpu_from_traffic_intercept_gw_peer_ipv6dpu_to_traffic_intercept_gw_peer(notypeso we just use for both)underlay:leak_to_underlayUpdated all templates to include the new
prefix-lists,route-mapsandpeer-grouppolicy blocks.We could add
type: ipv6rules to the existingroute-mapsinstead of creating new ones, but it seemed nicer to leave the existing IPv4 stuff untouched, and to also just keep things separate in general. We can combine if it seems like a better approach.Signed-off-by: Chet Nichols III chetn@nvidia.com
Type of Change
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes