While preparing the labs for “Audio over IP Networks for Events - An Opinionated Guide, Part 4: BGP as advanced routing protocol for when you need a little bit more spice”, I stumbled across a curious bug in Arista EOS related to OSPF unnumbered, ECMP and iBGP.
In short, when using OSPF unnumbered with ECMP, Arista EOS (specifically cEOS64-4.35.0F) fails to install some iBGP routes into the FIB for no obvious reason. The bug also seems to be present in 4.34.3M, and other people have reported similar issues on physical hardware platforms going back to at least 4.28.
It’s a bit tricky to pin down because the bug doesn’t always present itself identically: I can trigger it consistently, just not always with the exact same results.
This is the topology:

Example topology for the OSPF unnumbered + iBGP + eBGP lab
Really nothing special, right?
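For context, each router follows the usual OSPF-unnumbered pattern - roughly like the sketch below (reconstructed from memory, not the lab's actual config; interface numbers, addresses and the AS number are illustrative, and the exact unnumbered syntax may differ between EOS versions):

```
interface Loopback0
   ip address 10.7.0.1/32
!
interface Ethernet1
   no switchport
   ip address unnumbered Loopback0
   ip ospf network point-to-point
!
router ospf 1
   router-id 10.7.0.1
   network 10.0.0.0/8 area 0.0.0.0
!
router bgp 64496
   router-id 10.7.0.1
   neighbor 10.5.0.1 remote-as 64496
   neighbor 10.5.0.1 update-source Loopback0
```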
The issue
Let’s look at stage-center:
stage-center(config)#show ip route
VRF: default
WARNING: Some of the routes are not programmed in
kernel, and they are marked with '%'.
Source Codes:
C - connected, S - static, K - kernel,
O - OSPF, O IA - OSPF inter area, O E1 - OSPF external type 1,
O E2 - OSPF external type 2, O N1 - OSPF NSSA external type 1,
O N2 - OSPF NSSA external type2, O3 - OSPFv3,
O3 IA - OSPFv3 inter area, O3 E1 - OSPFv3 external type 1,
O3 E2 - OSPFv3 external type 2,
O3 N1 - OSPFv3 NSSA external type 1,
O3 N2 - OSPFv3 NSSA external type2, B - Other BGP Routes,
B I - iBGP, B E - eBGP, R - RIP, I L1 - IS-IS level 1,
I L2 - IS-IS level 2, A B - BGP Aggregate,
A O - OSPF Summary, NG - Nexthop Group Static Route,
V - VXLAN Control Service, M - Martian,
DH - DHCP client installed default route,
DP - Dynamic Policy Route, L - VRF Leaked,
G - gRIBI, RC - Route Cache Route,
CL - CBF Leaked Route
Gateway of last resort is not set
[...]
B I 10.3.10.0/24 [200/0]
via 10.3.0.1, Ethernet7
via 10.3.0.1, Ethernet8
[...]
B I% 10.8.10.0/24 [200/0]
via 10.1.0.1, Ethernet1
via 10.1.0.1, Ethernet2
via 10.1.0.1, Ethernet3
via 10.1.0.1, Ethernet4
via 10.1.0.1, Ethernet5
via 10.1.0.1, Ethernet6
via 10.4.0.1, Ethernet9
via 10.4.0.1, Ethernet10
B I 10.9.0.1/32 [200/0]
via 10.1.0.1, Ethernet1
via 10.1.0.1, Ethernet2
via 10.1.0.1, Ethernet3
via 10.1.0.1, Ethernet4
via 10.1.0.1, Ethernet5
via 10.1.0.1, Ethernet6
B I 10.9.10.0/24 [200/0]
via 10.1.0.1, Ethernet1
via 10.1.0.1, Ethernet2
via 10.1.0.1, Ethernet3
via 10.1.0.1, Ethernet4
via 10.1.0.1, Ethernet5
via 10.1.0.1, Ethernet6
Uhm - why is 10.8.10.0/24 not installed into the kernel?
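The % in the source-code column is EOS telling us the route exists in the RIB but has not been programmed into the kernel FIB. When staring at many of these outputs, a tiny script helps to pull out the flagged prefixes - a minimal sketch of my own (it just greps the text; it is not an Arista tool):

```python
import re

def unprogrammed_routes(show_ip_route: str) -> list[str]:
    """Return prefixes flagged with '%' in `show ip route` output,
    i.e. routes present in the RIB but not programmed into the FIB."""
    # Flagged route lines look like: "B I% 10.8.10.0/24 [200/0]"
    return re.findall(r"%\s+(\d+(?:\.\d+){3}/\d+)", show_ip_route)
```

Over the truncated stage-center output above, this returns just `['10.8.10.0/24']`.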
The nexthops are there:
O 10.1.0.1/32 [110/18]
directly connected, Ethernet1
directly connected, Ethernet2
directly connected, Ethernet3
directly connected, Ethernet4
directly connected, Ethernet5
directly connected, Ethernet6
O 10.4.0.1/32 [110/18]
directly connected, Ethernet9
directly connected, Ethernet10
Alright, let’s look at delay-row2-left:
delay-row2-left>show ip route
VRF: default
WARNING: Some of the routes are not programmed in
kernel, and they are marked with '%'.
[...]
B I% 10.3.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
[...]
B I% 10.8.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.9.0.1/32 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.9.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
Interesting - here none of the iBGP routes is installed. The next-hop is also there:
delay-row2-left>show ip route 10.5.0.1/32
VRF: default
WARNING: Some of the routes are not programmed in
kernel, and they are marked with '%'.
[...]
O 10.5.0.1/32 [110/18]
directly connected, Ethernet1
directly connected, Ethernet2
delay-row2-left#ping 10.5.0.1 source Loopback 1
PING 10.5.0.1 (10.5.0.1) from 10.7.0.1 : 72(100) bytes of data.
80 bytes from 10.5.0.1: icmp_seq=1 ttl=64 time=0.047 ms
80 bytes from 10.5.0.1: icmp_seq=2 ttl=64 time=0.036 ms
80 bytes from 10.5.0.1: icmp_seq=3 ttl=64 time=0.010 ms
80 bytes from 10.5.0.1: icmp_seq=4 ttl=64 time=0.009 ms
80 bytes from 10.5.0.1: icmp_seq=5 ttl=64 time=0.012 ms
--- 10.5.0.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.009/0.022/0.047/0.015 ms, ipg/ewma 0.039/0.034 ms
Let’s dig a bit deeper.
delay-row2-left#show ip bgp 10.3.10.0/24 detail
BGP routing table information for VRF default
Router identifier 10.7.0.1, local AS number 64496
BGP routing table entry for 10.3.10.0/24
Paths: 1 available
Local
10.3.0.1 from 10.3.0.1 (10.3.0.1)
Origin IGP, metric 0, localpref 100, IGP metric 26, weight 0, tag 0
Received 00:12:29 ago, valid, internal, best
Rx SAFI: Unicast
Not advertised to any peer.
That looks good. The iBGP route is there, valid and best.
delay-row2-left#show ip bgp neighbors 10.3.0.1 received-routes
BGP routing table information for VRF default
Router identifier 10.7.0.1, local AS number 64496
Route status codes: s - suppressed contributor, * - valid, > - active, E - ECMP head, e - ECMP
S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast, q - Pending FIB install
% - Pending best path selection
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI Origin Validation codes: V - valid, I - invalid, U - unknown
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric AIGP LocPref Weight Path
* > 10.3.10.0/24 10.3.0.1 - - 100 - i
Yep, that also looks correct.
delay-row2-left#show rib route ip 10.3.10.0/24 debug
VRF: default
Codes: C - Connected, S - Static, P - Route Input, G - Gribi
B - BGP, O - Ospf, O3 - Ospf3, I - Isis, R - Rip, VL - VRF Leak
> - Best Route, * - Unresolved Next hop
EM - Exact match of the SR-TE Policy
NM - Null endpoint match of the SR-TE Policy
AM - Any endpoint match of the SR-TE Policy
L - Part of a recursive route resolution loop
A - Next hop not resolved in ARP/ND
NF - Not in FEC
>B 10.3.10.0/24 [set ID 3, 200 pref/0 MED] updated 00:14:06 ago
via [config: VRF ID 0, ipv4, ID 19] 10.3.0.1 [110 pref/26 metric] type ipv4
via [status: ipv4, ID 7] 10.5.0.1, Ethernet1
via [status: ipv4, ID 8] 10.5.0.1, Ethernet2
The RIB entry also looks good; the next-hops are resolved. The BGP next-hop for 10.3.10.0/24 is 10.3.0.1, which the IGP in turn resolves to 10.5.0.1.
Let’s do a wireshark capture, just to be sure…
delay-row2-left(config)#clear bgp *
! Peerings for all neighbors were hard reset
delay-row2-left(config)#show ip route bgp
VRF: default
WARNING: Some of the routes are not programmed in
kernel, and they are marked with '%'.
[...]
B I% 10.3.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.8.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.9.0.1/32 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.9.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
Nope, that wasn’t it. The BGP capture also looks completely normal:

Wireshark capture of BGP update for 10.3.10.0/24
It gets even more cursed: even where the route is installed, it seems to be only partially programmed.
Let’s go to stage-left, where the route to 10.8.10.0/24 is installed:
stage-left#show ip route 10.8.10.0/24
VRF: default
WARNING: Some of the routes are not programmed in
kernel, and they are marked with '%'.
[...]
B I 10.8.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
via 10.2.0.1, Ethernet3
via 10.2.0.1, Ethernet4
stage-left#ping 10.8.10.10 source Loopback 1
PING 10.8.10.10 (10.8.10.10) from 10.3.0.1 : 72(100) bytes of data.
From 10.2.0.1 icmp_seq=1 Destination Net Unreachable
From 10.6.0.1 icmp_seq=1 Destination Net Unreachable
80 bytes from 10.8.10.10: icmp_seq=1 ttl=60 time=3.35 ms
80 bytes from 10.8.10.10: icmp_seq=2 ttl=60 time=3.51 ms
80 bytes from 10.8.10.10: icmp_seq=3 ttl=60 time=2.36 ms
--- 10.8.10.10 ping statistics ---
3 packets transmitted, 3 received, +2 errors, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 2.360/3.072/3.507/0.507 ms, pipe 2, ipg/ewma 1.693/3.243 ms
So… some probes get answered, while others draw Destination Net Unreachable. Looks like ECMP shenanigans.
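That mix of answers and unreachables is exactly what per-flow ECMP hashing produces when only some of the hashed-to paths are programmed correctly: each hop hashes the packet's 5-tuple and picks one of its next-hops. A toy illustration (real hardware uses its own vendor-specific hash, not CRC32):

```python
import zlib

def ecmp_pick(src: str, dst: str, sport: int, dport: int, proto: int,
              next_hops: list[str]) -> str:
    """Pick a next-hop per flow by hashing the 5-tuple (toy stand-in hash)."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]
```

With per-flow hashing, a given 5-tuple keeps landing on the same next-hop, so whether traffic gets through depends on which path a flow happens to hash onto.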
The workaround
A really great engineer and DisNOG community member offered their help, and we dove into a nice late-night debugging session. We double-checked everything, including the show tech-support extended tfa output, but couldn’t find anything obviously wrong. Based on a small oddity in that output, said engineer suggested testing maximum-paths 8 in the OSPF configuration. That didn’t change anything… but then I had a brainwave: “Let’s try maximum-paths 1”.
Lo and behold - all routes got installed correctly!
delay-row2-left(config)#show ip route bgp
VRF: default
WARNING: Some of the routes are not programmed in
kernel, and they are marked with '%'.
[...]
B I% 10.3.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.8.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.9.0.1/32 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
B I% 10.9.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
via 10.5.0.1, Ethernet2
delay-row2-left(config)#router ospf 1
delay-row2-left(config-router-ospf)#maximum-paths 1
delay-row2-left(config-router-ospf)#end
delay-row2-left#show ip route bgp
VRF: default
[...]
B I 10.3.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
B I 10.8.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
B I 10.9.0.1/32 [200/0]
via 10.5.0.1, Ethernet1
B I 10.9.10.0/24 [200/0]
via 10.5.0.1, Ethernet1
Look at that!
So, the workaround for this bug is to set maximum-paths 1 in the OSPF configuration. This prevents the bug from occurring and lets all iBGP routes be programmed into the kernel FIB - at the cost, of course, of giving up OSPF ECMP.
TAC Case incoming…