This article is the third in my blog series “Audio over IP Networks for Events - An Opinionated Guide”. In the second article, I provided an introduction to Layer 3 itself and to Layer 3 network design principles. I recommend you read it before continuing with this article. You can find the second article here.

I assume that the reader has read and understood the previous articles in this series and knows how Layer 3 works.

I will try to guide you through the concepts in a way that lays the foundation for a deeper understanding, but this is by no means a complete guide to networking. Given the starting point laid out above, my goal is that you don’t have to go back and forth, googling every second word, to understand the concepts I am trying to explain. I have tried to find a middle ground for the level of detail - however, if you feel that I am skipping too much, please let me know and I will try to improve the article.

Disclaimer:
As the title warns, this is an opinionated guide that reflects my personal opinions and field experience.
I make no claims to absolute truth, and it is well within the realms of possibility that some statements I make are just plain wrong or reflect a lack of exposure to scenarios that would shift my thinking.
I welcome all questions, suggestions and feedback (even if it’s a rant about how you think I’ve completely missed the mark).

Basic Terminology - Routing Protocols

Before diving into the specifics of OSPF, let’s clarify what routing protocols actually do. This has already been covered briefly in the previous article.

The gist is that routing protocols dynamically discover, build and maintain paths and distribute reachability information throughout a network.
This means they automatically figure out how to get packets from any point A to any point B in your network, adapt when things change, and maintain this knowledge continuously without manual intervention (well, except when they don’t).

What is OSPF?

I will try to keep this as brief, practice-oriented and simple as possible. If you want details, you should read one of the dozens of great articles about OSPF on the internet!

OSPF (Open Shortest Path First) is the bread-and-butter of routing protocols and one of the most widely used Interior Gateway Protocols (IGP) in the world. Every proper network administrator knows how to use it.

OSPF is a link-state routing protocol. This means that OSPF maintains a complete map of the network topology and uses this information to calculate the shortest path to each destination.

OSPF identifies each router with an arbitrary unique Router ID (RID), which is a 32-bit number usually represented in the same format as an IPv4 address (e.g. 1.2.3.4). This does NOT mean that the router ID has to be an actual IP address assigned to an interface on the router - it is just a unique identifier for the router within the OSPF domain. If not set explicitly, the RID is typically derived from the highest logical IP address (loopback interface IP address) on the router, but it is best practice to configure it manually.
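
Setting the RID explicitly is typically a one-liner under the routing process. A minimal sketch in Arista EOS syntax (the same syntax we will use in the lab later), using the example RID from above:

router ospf 1
   router-id 1.2.3.4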

Why do we like OSPF?

  • Battle-tested protocol
    • First spec of OSPFv2 published in 1991 (RFC 1247), still used today with several refinements. First OSPFv1 drafts date back to 1989 (RFC 1131).
    • Hundreds of thousands of deployments worldwide
    • Mature implementations, decades of real-world deployment and refinement
  • Wide vendor support
    • Less gatekeeping (licensing / platform restrictions) by vendors compared to “big” routing protocols like BGP
  • Simple to configure
    • Far simpler than most people think nowadays, especially for small to medium networks
  • Rapid convergence
    • In case of a link failure, OSPF convergence times can range from a few milliseconds to a few hundred milliseconds
    • Depends on the network size, configuration, hardware and tricks used such as BFD, FRR (Fast Reroute), LFA (Loop-Free Alternates), etc.
  • Scales well
    • Works well from small to large networks
    • Small to medium networks are easy to configure with today’s control planes (single area), huge networks can require multi-area designs

OSPF Core Concepts

If you feel like heading straight into the configuration, please resist the temptation and read through these core concepts first. They will help you understand what is going on under the hood and why certain configurations are necessary.

OSPF is a link-state routing protocol. This means that each OSPF router has a complete picture of the network topology. Based on this complete picture, OSPF can make optimal routing decisions considering link speed / costs, administrative policies and other factors.

This complete picture is a graph (Graphs on Wikipedia) that contains all the links between OSPF routers and their states (hence the name link-state protocol). Fundamentally, OSPF uses Dijkstra’s algorithm to calculate the shortest path from itself to every other router in the network based on this graph. A shortest path is inherently loop-free. In practice, additional path selection rules can apply - for example, when you have multiple areas, OSPF prefers intra-area routes over inter-area routes.

This differentiates OSPF from distance-vector protocols like RIP, where each router only knows the distance (cost) to each destination but not the complete path, and from path-vector routing protocols like BGP, where each router knows the complete path to each destination but not the complete network topology.

This complete picture or graph is more formally called the Link State Database (LSDB). The LSDB is built and maintained through the exchange of Link State Advertisements (LSAs) between OSPF routers. In simple terms, each router sends information about its links and their states to all other routers in the OSPF area.

Within a single area, all OSPF routers have an identical copy of the LSDB. This is crucial for OSPF to function correctly, as it ensures that all routers have the same view of the network topology.

OSPF Areas

OSPF supports the concept of areas, where an area is a logical partition of the OSPF network. Areas help to reduce the size of the LSDB and limit the scope of routing updates, which can improve scalability and performance.

Bear in mind that much of the information about areas is outdated

Many recommendations stem from the early days of OSPF, when control planes had less computing power than your smart lightbulb.

Nowadays, the vast majority of OSPF deployments are perfectly fine running in a single area (Area 0, the backbone area). As a rule of thumb: if the number of routers in your OSPF area fits in three digits, you will be fine.

The chances that you will ever need multiple areas in a temporary event network are virtually zero. If you read about funny constructs like totally stubby NSSA (totally stubby not-so-stubby areas), don’t worry about it. You will most likely never need it.

Therefore, we will assume that everything is in Area 0 and not waste any more time on areas.

OSPF Neighbour Relationships, Adjacencies and Adjacency States

OSPF routers need to exchange information with each other. To facilitate this exchange of routing information, OSPF-capable routers form neighbour relationships and adjacencies.

There are several different operation modes that define how OSPF routers communicate, discover each other, form neighbour relationships and form adjacencies. The most common two modes are:

  • Point-to-Point: Two routers are directly connected to each other
  • Broadcast: Multiple routers share a broadcast domain, e.g. multiple OSPF routers plugged into a dumb L2 switch in the same VLAN

More modes, such as Non-Broadcast Multi-Access (NBMA), Point-to-Multipoint, Point-to-Multipoint Non-Broadcast or indirect connections exist but are less relevant for our use case.

Adjacency States

It is important to differentiate between the different adjacency states OSPF routers can have. When establishing connectivity, OSPF routers go through several states, such as Init, 2-Way, ExStart, Exchange, Full, etc…

OSPF routers automatically establish neighbour relationships (an established neighbour relationship is also called 2-Way) with other OSPF routers when they have compatible settings, such as the same subnet and mask, a matching area ID, and a few others. An OSPF neighbour relationship only means that two OSPF routers can communicate with each other and exchange Hello packets - not necessarily that they exchange routing information.

An OSPF adjacency is a more “advanced” state of the neighbour relationship, where two OSPF routers have exchanged their LSDBs and are fully synchronized. The formation of an adjacency starts with ExStart, followed by Exchange, Loading and finally Full. Only when two OSPF routers are in the Full state have they fully exchanged their LSDBs and can they make optimal routing decisions based on the complete network topology.

Not all neighbours become fully adjacent - for example, in a broadcast network, only the Designated Router (DR) and Backup Designated Router (BDR) form adjacencies with all other routers, while the remaining routers stay in the 2-Way state with each other.

Information Exchange with Multicast and Unicast

OSPF uses a mix of unicast and multicast to exchange information between routers. Multicast helps facilitate automatic neighbour discovery in compatible networks and the efficient exchange of routing information.

OSPF reserves the multicast addresses 224.0.0.5 (IPv4) and ff02::5 (IPv6) as AllSPFRouters (all SPF/link state routers) and 224.0.0.6 (IPv4) and ff02::6 (IPv6) as AllDRouters (all Designated Routers).
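
If you are curious, you can watch this traffic on the wire. Here is a minimal sketch for a Linux host attached to an OSPF-enabled segment (the interface name eth0 is an assumption; OSPF rides directly on IP as protocol 89):

# OSPF is IP protocol 89; Hello packets should appear with destination 224.0.0.5
sudo tcpdump -ni eth0 'ip proto 89'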

Hello Protocol and DR / BDR election

OSPF routers use a mechanism called the Hello protocol to discover and maintain neighbour relationships. They send Hello packets at regular intervals to announce their presence and listen for Hello packets from other routers.

For Point-to-Point networks / connections, OSPF routers automatically form full adjacencies with each other (because there are only two routers on the link). DR/BDR election is not used on point-to-point links.

For Broadcast or NBMA (Non-Broadcast Multi-Access) networks / connections, OSPF uses a system of Designated Routers (DR) and Backup Designated Routers (BDR) for more efficient exchange of routing information. Full adjacencies are only formed between the DR/BDR and the other routers, while non-DR / non-BDR routers stay in the 2-Way state (neighbour relationship) with each other.

The DR / BDR act as a hub for exchanging routing information, reducing the amount of routing traffic on the network. The routers form a leader / follower relationship with the DR / BDR.

The DR / BDR are elected using the Hello Protocol. The election process considers factors such as router priority (a configurable value) and router ID.
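
If you ever need to influence (or opt out of) the election, most platforms let you set the priority per interface. A minimal sketch in Arista EOS syntax, using a hypothetical broadcast-type interface Vlan10: a priority of 0 makes the router ineligible to become DR/BDR, while higher values are preferred in the election.

interface Vlan10
   ! 0 = never become DR/BDR on this segment; the default is 1, higher wins
   ip ospf priority 0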

Dangers of Broadcast Networks / Connections

OSPF over Broadcast networks / connections is generally considered more complex and less efficient than Point-to-Point connections and should be avoided if possible.

OSPF over Broadcast networks has some subtle pitfalls. One example is the potential for asymmetrical routing due to speed mismatches between different devices on the same broadcast domain.

In an OSPF Broadcast network, devices are typically not directly connected (because then it would be a Point-to-Point network). Instead, they are connected through a shared medium, such as a dumb L2 switch.

Now, if one router negotiates a speed of 10 Gbit/s with this dumb L2 switch and another device negotiates a speed of 1 Gbit/s, we have a speed mismatch. The routers have different views of the link speed and thus the link cost, which can lead to asymmetrical routing.

Asymmetrical routing describes a situation where traffic flows in one direction via one path and returns via another, which can break certain protocols or create troubleshooting nightmares.

Avoid this scenario! Use point-to-point connections whenever possible.
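
On most platforms you can (and should) make this explicit. A minimal sketch in Arista EOS syntax - the same pattern appears again in the lab configuration later in this article:

interface Ethernet1
   no switchport
   ip ospf network point-to-point
   ip ospf area 0.0.0.0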

OSPF Path Cost and Reference bandwidth

OSPF uses a metric called cost to determine the best path to a destination. The cost is typically calculated based on the bandwidth of the links, with lower costs indicating higher-bandwidth links. The cost can also be overridden by administrative policies.

In order to calculate the cost, OSPF uses a reference bandwidth, a configurable value that should be at least as high as the fastest link in the network.

Link Cost is calculated as cost = reference bandwidth / link bandwidth

With the reference bandwidth set to 800 Gbit/s, a 100 Gbit/s link would have a cost of 800 / 100 = 8, a 10 Gbit/s link would have a cost of 80, and a 1 Gbit/s link would have a cost of 800.

The default reference bandwidth is 100 Mbit/s, which means that a 100 Mbit/s link has a cost of 1, a 10 Mbit/s link has a cost of 10, and a 1 Gbit/s link has a cost of… 1! Just like 10 Gbit/s, 25 Gbit/s, 40 Gbit/s, 100 Gbit/s, etc., because the cost can never drop below 1. This default is clearly not suitable for modern networks.

In modern networks that have no links slower than 1 Gbit/s, I personally recommend setting a reference bandwidth of 800 Gbit/s. 800 Gbit/s is a nice number, because it is divisible by many common link speeds, such as 1 Gbit/s, 10 Gbit/s, 25 Gbit/s, 40 Gbit/s, 100 Gbit/s, 200 Gbit/s and even 400 Gbit/s. A nice side effect of this is that it is also suitable (as in: results in clean integer link costs) for common link aggregation configurations, such as 4x 40 Gbit/s (160 Gbit/s), 4x 100 Gbit/s (400 Gbit/s), 2x 40 Gbit/s (80 Gbit/s) or 2x 100 Gbit/s (200 Gbit/s).
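
On Arista EOS the reference bandwidth is configured under the OSPF process in Mbit/s, so 800 Gbit/s becomes 800000. This exact line shows up again in the lab configuration later:

router ospf 1
   auto-cost reference-bandwidth 800000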

Example topology

Initially I wanted to use the full blown example address plan below to illustrate the concepts. However, it turned out that this would make the article way too long and complex. Therefore, I will use a much simpler example topology for the rest of the article.

Full blown example address plan
10.0.0.0/8 -> Internal network for event
├── 10.1.0.0/16 -> Front of House
│   ├── 10.1.0.0/24 -> FOH Router IDs
│   │   ├── 10.1.0.1/32 -> FOH Core 1 Loopback
│   │   ├── 10.1.0.2/32 -> FOH Core 2 Loopback
│   │   └── 10.1.0.3/32 -> FOH Access Loopback
│   ├── 10.1.10.0/24 -> Audio @ FOH
│   │   ├── 10.1.10.1/24 -> Audio @ FOH VLAN Interface 
│   │   ├── 10.1.10.10/24 -> Main Mix Surface 1
│   │   └── 10.1.10.11/24 -> Main Mix Surface 2
│   ├── 10.1.20.0/24 -> Lighting @ FOH
│   │   ├── 10.1.20.1/24 -> Lighting @ FOH VLAN Interface
│   │   ├── 10.1.20.10/24 -> Lighting Surface 1
│   │   └── 10.1.20.11/24 -> Lighting Surface 2
│   └── 10.1.30.0/24 -> Video @ FOH
│       ├── 10.1.30.1/24 -> Video @ FOH VLAN Interface
│       ├── 10.1.30.10/24 -> Video Mix Surface 1
│       └── 10.1.30.11/24 -> Video Mix Surface 2
├── 10.2.0.0/16 -> Stage Center
│   ├── 10.2.0.0/24 -> Stage Center Router IDs
│   │   ├── 10.2.0.1/32 -> Stage Center Core 1 Loopback
│   │   ├── 10.2.0.2/32 -> Stage Center Core 2 Loopback
│   │   ├── 10.2.0.3/32 -> Stage Center Audio Access Loopback
│   │   └── 10.2.0.4/32 -> Stage Center Lighting Access Loopback
│   ├── 10.2.10.0/24 -> Audio @ Stage Center / Front Fills
│   │   ├── 10.2.10.1/24 -> Audio @ Stage Center / Front Fills VLAN Interface
│   │   ├── 10.2.10.10/24 -> Front Fill 1
│   │   └── ...
│   ├── 10.2.11.0/24 -> Audio @ Stage Center / Subwoofers
│   │   ├── 10.2.11.1/24 -> Audio @ Stage Center / Subwoofers VLAN Interface
│   │   ├── 10.2.11.10/24 -> Subwoofer 1
│   │   └── ...
│   ├── 10.2.20.0/24 -> Lighting @ Stage Center
│   │   ├── 10.2.20.1/24 -> Lighting @ Stage Center VLAN Interface
│   │   ├── 10.2.20.10/24 -> Lighting Processing Unit 1
│   │   └── 10.2.20.11/24 -> Lighting Processing Unit 2
│   └── 10.2.30.0/24 -> Video @ Stage Center
│       ├── 10.2.30.1/24 -> Video @ Stage Center VLAN Interface
│       ├── 10.2.30.10/24 -> Cam 1
│       ├── 10.2.30.11/24 -> Cam 2
│       └── 10.2.30.12/24 -> Cam 3
├── 10.3.0.0/16 -> Stage Left
│   ├── 10.3.0.0/24 -> Stage Left Router IDs
│   │   ├── 10.3.0.1/32 -> Stage Left Core
│   │   ├── 10.3.0.2/32 -> Stage Left Mixer + Stageboxes Access
│   │   ├── 10.3.0.3/32 -> Stage Left Main Hang Access
│   │   └── 10.3.0.4/32 -> Stage Left Outfill Access
│   ├── 10.3.10.0/24 -> Audio @ Stage Left / Mixer + Stageboxes
│   │   ├── 10.3.10.1/24 -> Audio @ Stage Left / Mixer + Stageboxes VLAN Interface
│   │   ├── 10.3.10.10/24 -> Stagebox 1
│   │   ├── 10.3.10.11/24 -> Stagebox 2
│   │   ├── 10.3.10.12/24 -> Stagebox 3
│   │   ├── 10.3.10.13/24 -> Stagebox 4
│   │   ├── 10.3.10.20/24 -> Monitor Console 1
│   │   └── 10.3.10.21/24 -> Monitor Console 2
│   ├── 10.3.11.0/24 -> Audio @ Stage Left / Main Hang
│   │   ├── 10.3.11.1/24 -> Audio @ Stage Left / Main Hang VLAN Interface
│   │   ├── 10.3.11.10/24 -> Mainhang Left Speaker 1
│   │   └── ...
│   ├── 10.3.12.0/24 -> Audio @ Stage Left / Outfill
│   │   ├── 10.3.12.1/24 -> Audio @ Stage Left / Outfill VLAN Interface
│   │   ├── 10.3.12.10/24 -> Outfill Left Speaker 1
│   │   └── ...
│   ├── 10.3.30.0/24 -> Video @ Stage Left / Cams
│   │   ├── 10.3.30.1/24 -> Video @ Stage Left / Cams VLAN Interface
│   │   └── 10.3.30.10/24 -> Cam 4
│   └── 10.3.31.0/24 -> Video @ Stage Left / LED Walls
│       ├── 10.3.31.1/24 -> Video @ Stage Left / LED Walls VLAN Interface
│       ├── 10.3.31.10/24 -> LED Controller Left 1
│       └── 10.3.31.11/24 -> LED Controller Left 2
├── 10.4.0.0/16 -> Stage Right
│   ├── 10.4.0.0/24 -> Stage Right Router IDs
│   │   ├── 10.4.0.1/32 -> Stage Right Core
│   │   ├── 10.4.0.2/32 -> Stage Right Main Hang Access
│   │   └── 10.4.0.3/32 -> Stage Right Outfill Access
│   ├── 10.4.10.0/24 -> Audio @ Stage Right / Main Hang
│   │   ├── 10.4.10.1/24 -> Audio @ Stage Right / Main Hang VLAN Interface
│   │   ├── 10.4.10.10/24 -> Mainhang Right Speaker 1
│   │   └── ...
│   ├── 10.4.11.0/24 -> Audio @ Stage Right / Outfill
│   │   ├── 10.4.11.1/24 -> Audio @ Stage Right / Outfill VLAN Interface
│   │   ├── 10.4.11.10/24 -> Outfill Right Speaker 1
│   │   └── ...
│   ├── 10.4.30.0/24 -> Video @ Stage Right / Cams
│   │   ├── 10.4.30.1/24 -> Video @ Stage Right / Cams VLAN Interface
│   │   └── 10.4.30.10/24 -> Cam 5
│   └── 10.4.31.0/24 -> Video @ Stage Right / LED Walls
│       ├── 10.4.31.1/24 -> Video @ Stage Right / LED Walls VLAN Interface
│       ├── 10.4.31.10/24 -> LED Controller Right 1
│       └── 10.4.31.11/24 -> LED Controller Right 2
├── 10.5.0.0/16 -> Delay Row 1 Left
│   ├── 10.5.0.0/24 -> Delay Row 1 Left Router IDs
│   │   ├── 10.5.0.1/32 -> Delay Row 1 Left Core
│   │   └── 10.5.0.2/32 -> Delay Row 1 Left Access
│   ├── 10.5.10.0/24 -> Audio @ Delay Row 1 Left / Speaker Hang
│   │   ├── 10.5.10.1/24 -> Audio @ Delay Row 1 Left / Speaker Hang VLAN Interface
│   │   ├── 10.5.10.10/24 -> Delay Row 1 Left Speaker 1
│   │   ├── 10.5.10.11/24 -> Delay Row 1 Left Speaker 2
│   │   └── ...
│   ├── 10.5.20.0/24 -> Lighting @ Delay Row 1 Left
│   │   ├── 10.5.20.1/24 -> Lighting @ Delay Row 1 Left VLAN Interface
│   │   └── 10.5.20.10/24 -> Lighting Processing Unit Delay Row 1 Left
│   ├── 10.5.30.0/24 -> Video @ Delay Row 1 Left / Cams
│   │   ├── 10.5.30.1/24 -> Video @ Delay Row 1 Left / Cams VLAN Interface
│   │   └── 10.5.30.10/24 -> Cam Delay Row 1 Left
│   └── 10.5.31.0/24 -> Video @ Delay Row 1 Left / LED Walls
│       ├── 10.5.31.1/24 -> Video @ Delay Row 1 Left / LED Walls VLAN Interface
│       └── 10.5.31.10/24 -> LED Controller Delay Row 1 Left
├── 10.6.0.0/16 -> Delay Row 1 Right
│   ├── 10.6.0.0/24 -> Delay Row 1 Right Router IDs
│   │   ├── 10.6.0.1/32 -> Delay Row 1 Right Core
│   │   └── 10.6.0.2/32 -> Delay Row 1 Right Access
│   ├── 10.6.10.0/24 -> Audio @ Delay Row 1 Right / Speaker Hang
│   │   ├── 10.6.10.1/24 -> Audio @ Delay Row 1 Right / Speaker Hang VLAN Interface
│   │   ├── 10.6.10.10/24 -> Delay Row 1 Right Speaker 1
│   │   ├── 10.6.10.11/24 -> Delay Row 1 Right Speaker 2
│   │   └── ...
│   ├── 10.6.20.0/24 -> Lighting @ Delay Row 1 Right
│   │   ├── 10.6.20.1/24 -> Lighting @ Delay Row 1 Right VLAN Interface
│   │   └── 10.6.20.10/24 -> Lighting Processing Unit Delay Row 1 Right
│   ├── 10.6.30.0/24 -> Video @ Delay Row 1 Right / Cams
│   │   ├── 10.6.30.1/24 -> Video @ Delay Row 1 Right / Cams VLAN Interface
│   │   └── 10.6.30.10/24 -> Cam Delay Row 1 Right
│   └── 10.6.31.0/24 -> Video @ Delay Row 1 Right / LED Walls
│       ├── 10.6.31.1/24 -> Video @ Delay Row 1 Right / LED Walls VLAN Interface
│       └── 10.6.31.10/24 -> LED Controller Delay Row 1 Right
├── 10.7.0.0/16 -> Delay Row 2 Left
│   ├── 10.7.0.0/24 -> Delay Row 2 Left Router IDs
│   │   ├── 10.7.0.1/32 -> Delay Row 2 Left Core
│   │   └── 10.7.0.2/32 -> Delay Row 2 Left Access
│   ├── 10.7.10.0/24 -> Audio @ Delay Row 2 Left / Speaker Hang
│   │   ├── 10.7.10.1/24 -> Audio @ Delay Row 2 Left / Speaker Hang VLAN Interface
│   │   ├── 10.7.10.10/24 -> Delay Row 2 Left Speaker 1
│   │   ├── 10.7.10.11/24 -> Delay Row 2 Left Speaker 2
│   │   └── ...
│   ├── 10.7.20.0/24 -> Lighting @ Delay Row 2 Left
│   │   ├── 10.7.20.1/24 -> Lighting @ Delay Row 2 Left VLAN Interface
│   │   └── 10.7.20.10/24 -> Lighting Processing Unit Delay Row 2 Left
│   ├── 10.7.30.0/24 -> Video @ Delay Row 2 Left / Cams
│   │   ├── 10.7.30.1/24 -> Video @ Delay Row 2 Left / Cams VLAN Interface
│   │   └── 10.7.30.10/24 -> Cam Delay Row 2 Left
│   └── 10.7.31.0/24 -> Video @ Delay Row 2 Left / LED Walls
│       ├── 10.7.31.1/24 -> Video @ Delay Row 2 Left / LED Walls VLAN Interface
│       └── 10.7.31.10/24 -> LED Controller Delay Row 2 Left
├── 10.8.0.0/16 -> Delay Row 2 Right
│   ├── 10.8.0.0/24 -> Delay Row 2 Right Router IDs
│   │   ├── 10.8.0.1/32 -> Delay Row 2 Right Core
│   │   └── 10.8.0.2/32 -> Delay Row 2 Right Access
│   ├── 10.8.10.0/24 -> Audio @ Delay Row 2 Right / Speaker Hang
│   │   ├── 10.8.10.1/24 -> Audio @ Delay Row 2 Right / Speaker Hang VLAN Interface
│   │   ├── 10.8.10.10/24 -> Delay Row 2 Right Speaker 1
│   │   ├── 10.8.10.11/24 -> Delay Row 2 Right Speaker 2
│   │   └── ...
│   ├── 10.8.20.0/24 -> Lighting @ Delay Row 2 Right
│   │   ├── 10.8.20.1/24 -> Lighting @ Delay Row 2 Right VLAN Interface
│   │   └── 10.8.20.10/24 -> Lighting Processing Unit Delay Row 2 Right
│   ├── 10.8.30.0/24 -> Video @ Delay Row 2 Right / Cams
│   │   ├── 10.8.30.1/24 -> Video @ Delay Row 2 Right / Cams VLAN Interface
│   │   └── 10.8.30.10/24 -> Cam Delay Row 2 Right
│   └── 10.8.31.0/24 -> Video @ Delay Row 2 Right / LED Walls
│       ├── 10.8.31.1/24 -> Video @ Delay Row 2 Right / LED Walls VLAN Interface
│       └── 10.8.31.10/24 -> LED Controller Delay Row 2 Right
└── 10.9.0.0/16 -> Broadcast Truck

10.1.0.1/32 -> FOH Router Loopback
10.2.0.1/32 -> Stage Center Router Loopback
10.3.0.1/32 -> Stage Left Router Loopback
    10.3.10.1/24 -> Audio @ Stage Left VLAN Interface
    10.3.10.10/24 -> Stagebox 1
10.4.0.1/32 -> Stage Right Router Loopback
10.5.0.1/32 -> Delay Row 1 Left Router Loopback
10.6.0.1/32 -> Delay Row 1 Right Router Loopback
10.7.0.1/32 -> Delay Row 2 Left Router Loopback
10.8.0.1/32 -> Delay Row 2 Right Router Loopback
    10.8.10.1/24 -> Audio @ Delay Row 2 Right VLAN Interface
    10.8.10.10/24 -> Delay Row 2 Right Speaker 1
10.9.0.1/32 -> Broadcast Truck Router Loopback

Besides the /32 addresses for the router loopbacks (which double as router IDs), we have two /24 subnets containing fictitious audio devices (a stagebox at Stage Left and a speaker at Delay Row 2 Right) that we will use to test connectivity.

Example Topology

[Figure: Example topology diagram]

Setting up the topology in Containerlab

I will use Containerlab to set up the example topology. Containerlab is a great tool to quickly spin up virtual network topologies using containerized network operating systems. It is perfect for testing and learning about networking concepts without needing physical hardware.

If you want to reproduce this lab, I recommend using Linux and VSCode with the excellent Containerlab VSCode Plugin. The lab topology can be downloaded from this link.

I am not going to go into the details of the Containerlab setup here, as this is not the focus of the article. If you want to learn more about Containerlab, please refer to the Containerlab documentation.

The lab is set up so that the startup configurations are automatically applied.

This lab uses Arista cEOS 4.35.0F as routers and containerlab’s network-multitool for the two end devices (stagebox and speaker). I decided to use Arista cEOS because Arista is one of the best vendors on the market (I rate them as S-Tier alongside Nokia) and because cEOS should feel most familiar, as it has the classic Cisco-style CLI (Nokia’s CLI is a bit different). Arista cEOS is available as a free download after registration on the Arista website.

The lab makes use of Management VRFs in the cEOS instances so that we can keep our main and management routing tables clean and separate, making it easier to focus on the OSPF configuration.

TL;DR:

  • Arista cEOS 4.35.0F as routers
  • network-multitool as end devices
  • Management VRFs on cEOS for management access
  • OSPF unnumbered, point-to-point links between routers
  • BFD (Bidirectional Forwarding Detection) enabled on all OSPF links for rapid failure detection

BFD

BFD (Bidirectional Forwarding Detection) is a lightweight protocol that detects link failures in milliseconds, much faster than OSPF’s native detection mechanisms. It achieves this by sending periodic control packets between the two endpoints of a link, allowing for rapid failure detection and improved convergence times. We will just accept that BFD exists and is a good thing to have.
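
In the lab, enabling BFD for all OSPF interfaces is a single line under the OSPF process (you will see it again in the full configuration below). As a sketch, including a verification command that should work on Arista EOS:

router ospf 1
   bfd default
! established BFD sessions can then be inspected with: show bfd peers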

Router Configuration Explained

All routers have a similar configuration. I will explain the configuration of the stage-left router in detail, as it contains all the necessary concepts. The other routers are configured similarly, with only minor differences in IP addresses and interface names.

We will go through the configuration from top to bottom.

General Config

no aaa root
!
username admin privilege 15 role network-admin secret sha512 $6$wqEKa7EJbfGTYQQL$p7uKiJAo0C93YDgSdzgMAWjbk7KlJ5B7xycGnh7J9iWIHWB/NJ4m92X4rybhR0H/N/YrPZNusiBPbK3VZq7zF1
!
management api http-commands
   no shutdown
   !
   vrf MGMT
      no shutdown
!
no service interface inactive port-id allocation disabled
!
transceiver qsfp default-mode 4x10G

You can ignore all of this - it is default cEOS boilerplate configuration or related to containerlab itself.


service routing protocols model multi-agent

This is a default setting, but a very important one. Arista has two versions of its routing protocol implementation: ribd is the legacy one and considered deprecated, while multi-agent is the modern implementation that should be used for all new deployments.


hostname stage-left

This sets the hostname, so we can easily identify the router and make sure we don’t accidentally configure the wrong device.


spanning-tree mode mstp
!
system l1
   unsupported speed action error
   unsupported error-correction action error

Those are default settings, not relevant for this article.


vrf instance MGMT
!
management api gnmi
   transport grpc default
      vrf MGMT
!
management api netconf
   transport ssh default
      vrf MGMT

This creates the management VRF (separate routing table for a separated management network that containerlab automatically sets up for us), and enables gNMI and NETCONF access to the router within this VRF.

Interface Configuration

interface Ethernet1
   speed 100g-4
   no switchport
   ip address unnumbered Loopback1
   ip ospf network point-to-point
   ip ospf area 0.0.0.0

This config is similar for all uplink interfaces (interfaces to other routers); for stage-left, that is Ethernet1 to Ethernet4.

Let’s go through this:

  • interface Ethernet1: We are configuring interface Ethernet1
  • speed 100g-4: Containerlab currently does not support interface speed as a first-class property, so we manually force the speed to a bog standard 100 Gbit/s link (composed of 4 lanes - your standard QSFP28 port). This is important for OSPF cost calculation later on.
  • no switchport: This is a crucial part - it means we are configuring a Layer 3 interface (sometimes also called a router port, routed port or route-only interface). This interface is removed from any normal switching activity (e.g. it does not forward broadcast traffic).
  • ip address unnumbered Loopback1: We are configuring this interface as an unnumbered interface, borrowing its IP address from Loopback1. This means that the interface does not have its own IP address, but uses the IP address of the loopback interface instead. This is common practice in OSPF deployments, as it reduces the number of IP addresses needed and simplifies configuration. Otherwise we would need to assign a unique /30 or /31 subnet to each point-to-point link between routers, which quickly eats up IP address space and is tedious to manage (see the sketch after this list).
  • ip ospf network point-to-point: We are telling OSPF that this is a point-to-point link. This is important, as it changes the way OSPF behaves on this interface. This point was discussed in the OSPF Core Concepts section earlier.
  • ip ospf area 0.0.0.0: We are assigning this interface to OSPF area 0 (The area ID is a 32-bit number that is often expressed in the familiar IPv4 dotted notation, so 0.0.0.0 is the same as 0). OSPF Area 0 is the backbone area and must always exist (okay, technically speaking it’s a bit more complicated, but let’s not go into this). As discussed earlier, we are keeping everything in Area 0 for simplicity.
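
For comparison, here is a sketch of what the same uplink would look like without unnumbered interfaces, using a hypothetical /31 transfer network per link (the address is made up purely for illustration):

interface Ethernet1
   speed 100g-4
   no switchport
   ! one unique /31 per point-to-point link instead of borrowing from Loopback1
   ip address 10.3.255.0/31
   ip ospf network point-to-point
   ip ospf area 0.0.0.0

Multiply this by every point-to-point link in the network and you can see why unnumbered interfaces are attractive.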

interface Ethernet5

This is our downlink interface to the stagebox. No further configuration necessary.


interface Loopback1
   ip address 10.3.0.1/32

Here we assign the Loopback interface its IP address. Loopback interfaces are virtual interfaces that are always up as long as the router is up. They provide a stable and unique IP address for the router. This address is really assigned to the router itself and not a specific physical interface. Loopback interfaces are useful for OSPF router IDs and unnumbered interfaces.

It can be difficult to wrap your head around the concept of loopback interfaces at first, but they are very useful in practice.


interface Management0
   vrf MGMT
   ip address 172.20.20.4/24
   ipv6 address 3fff:172:20:20::4/64

Here we set up the management interface within the management VRF and assign the addresses from the containerlab management network.


interface Vlan1
   ip address 10.3.10.1/24

Even though we are building an L3 fabric, we still have VLANs for the downlink / client devices. This is just a normal VLAN interface configuration. This interface will be used by the stagebox as its gateway to reach other devices in the network.
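
On the stagebox side (the network-multitool Linux container), the matching configuration boils down to plain iproute2 commands. A sketch of roughly what needs to happen there, however it is applied in the lab (the interface name and addresses match what you will see in the traceroute output later):

# give the stagebox its address in the Audio @ Stage Left subnet
ip addr add 10.3.10.10/24 dev eth1
# use the Vlan1 interface on stage-left as the default gateway
ip route add default via 10.3.10.1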

General routing configuration

ip routing
no ip routing vrf MGMT

ip routing is crucial - otherwise our router won’t route. no ip routing vrf MGMT is a default setting: the management VRF is only for management access, and we don’t want our router to be abused as a gateway there.


ip route vrf MGMT 0.0.0.0/0 172.20.20.1
!
ipv6 route vrf MGMT ::/0 3fff:172:20:20::1

This sets the default route for the management VRF, so that we can reach the router from our host machine.


router multicast
   ipv4
      software-forwarding kernel
   !
   ipv6
      software-forwarding kernel

This is default boilerplate configuration for multicast routing, not relevant for this article.

OSPF Configuration

router ospf 1
   router-id 10.3.0.1
   auto-cost reference-bandwidth 800000
   bfd default
   network 10.3.0.1/32 area 0.0.0.0
   network 10.3.10.0/24 area 0.0.0.0
   max-lsa 12000

Let’s go through this line by line:

  • router ospf 1: We are configuring OSPF instance 1. Arista supports multiple OSPF instances / processes, but it is good practice to use process 1 as your default.
  • router-id 10.3.0.1: This sets the OSPF router ID to the IP address of the loopback interface. While it could also be inferred automatically from the loopback interface if not configured explicitly, it is recommended to set it explicitly to avoid some tricky issues.
  • auto-cost reference-bandwidth 800000: This sets the reference bandwidth for OSPF cost calculations to 800 Gbit/s. This was discussed in the OSPF Path Cost and Reference bandwidth section earlier.
  • bfd default: This enables Bidirectional Forwarding Detection (BFD) for all OSPF interfaces. BFD is a fast failure detection protocol.
  • network 10.3.0.1/32 area 0.0.0.0: This advertises the loopback interface into OSPF area 0. If you have a large number of connected networks, you should look into redistributing connected routes instead of listing each one.
  • network 10.3.10.0/24 area 0.0.0.0: This advertises the VLAN interface into OSPF area 0.
  • max-lsa 12000: This is a default setting. It specifies the maximum number of LSAs allowed in the LSDB. This is a safeguard against excessive LSA flooding that could overwhelm the router’s resources.

Inspecting our network

Now that we have configured OSPF on all routers, let’s inspect the network and see if everything is working as expected.

Connectivity between stagebox and speaker

First, let’s check if the stagebox can reach the speaker. For this, we open a Shell into the stagebox container and use ping to test connectivity.

[*]─[stagebox1]─[~]
└──> ping 10.8.10.10
PING 10.8.10.10 (10.8.10.10) 56(84) bytes of data.
64 bytes from 10.8.10.10: icmp_seq=1 ttl=59 time=6.61 ms
64 bytes from 10.8.10.10: icmp_seq=2 ttl=59 time=3.83 ms
64 bytes from 10.8.10.10: icmp_seq=3 ttl=59 time=2.74 ms
64 bytes from 10.8.10.10: icmp_seq=4 ttl=59 time=3.03 ms
^C
--- 10.8.10.10 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 2.740/4.052/6.609/1.529 ms

Excellent. We have unicast connectivity between the stagebox and the speaker.

Seeing ECMP in action

With our default configuration, OSPF automatically makes use of ECMP (Equal-Cost Multi-Path) routing. This means that if there are multiple paths to a destination with the same cost, OSPF will use all of them for load balancing.

Because we do not use link aggregation for the bundled links between the routers, every parallel link is treated as its own path. All links between routers are 100G and thus have a cost of 8, so all of these paths are equal cost. Even though six links exist between stage-center and foh, they are treated as six separate paths with the same cost.
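
Before tracing anything, you can already look up the equal-cost next hops for the speaker’s subnet on stage-left (output omitted here - we will look at the full routing table in a moment):

stage-left>show ip route 10.8.10.10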

One possible path for stagebox -> speaker is:

  • stagebox -> stage-left -> stage-center -> foh -> delay-row-1-right -> delay-row-2-right -> speaker

But another path of the same cost is:

  • stagebox -> stage-left -> delay-row-1-left -> foh -> delay-row-1-right -> delay-row-2-right -> speaker

We will use tcptraceroute on the stagebox to see which path is taken. Running it multiple times should show different paths being used.

Unfortunately, cEOS only seems to use fields up to Layer 3, but not the Layer 4 ports (source / destination ports), for ECMP hashing. Therefore, we will use a little trick and run tcptraceroute twice, once to the speaker IP and once to the delay-row-2-right router IP, so that the destination address (and thus the hash result) differs between the two runs.

[*]─[stagebox1]─[~]
└──> sudo tcptraceroute 10.8.10.10
Selected device eth1, address 10.3.10.10, port 56961 for outgoing packets
Tracing the path to 10.8.10.10 on TCP port 80 (http), 30 hops max
 1  10.3.10.1  0.567 ms  1.187 ms  0.761 ms
 2  10.2.0.1  1.227 ms  0.734 ms  0.725 ms
 3  10.1.0.1  1.218 ms  1.714 ms  1.662 ms
 4  10.6.0.1  2.790 ms  2.826 ms  1.951 ms
 5  10.8.0.1  2.556 ms  2.318 ms  2.583 ms
 6  10.8.10.10 [open]  3.854 ms  3.028 ms  2.837 ms

[*]─[stagebox1]─[~]
└──> sudo tcptraceroute 10.8.0.1
Selected device eth1, address 10.3.10.10, port 47761 for outgoing packets
Tracing the path to 10.8.0.1 on TCP port 80 (http), 30 hops max
 1  10.3.10.1  0.671 ms  1.142 ms  1.595 ms
 2  10.5.0.1  0.954 ms  0.749 ms  0.759 ms
 3  10.1.0.1  1.662 ms  1.903 ms  1.283 ms
 4  10.6.0.1  1.876 ms  1.687 ms  1.648 ms
 5  10.8.0.1 [closed]  2.505 ms  3.521 ms  3.103 ms

As we can see, the first path goes through stage-center (10.2.0.1) and foh, while the second path goes through delay-row-1-left (10.5.0.1) and foh. This shows that ECMP is working as expected.

Whether this is desirable, and what traffic engineering can and should be done in real-world deployments, is a topic for a different article.

Routes seen on stage-left router

If we do a show ip route on the stage-left router, we can see all the routes learned via OSPF.

stage-left>show ip route

VRF: default
Source Codes:
       C - connected, S - static, K - kernel,
       O - OSPF, O IA - OSPF inter area, O E1 - OSPF external type 1,
       O E2 - OSPF external type 2, O N1 - OSPF NSSA external type 1,
       O N2 - OSPF NSSA external type2, O3 - OSPFv3,
       O3 IA - OSPFv3 inter area, O3 E1 - OSPFv3 external type 1,
       O3 E2 - OSPFv3 external type 2,
       O3 N1 - OSPFv3 NSSA external type 1,
       O3 N2 - OSPFv3 NSSA external type2, B - Other BGP Routes,
       B I - iBGP, B E - eBGP, R - RIP, I L1 - IS-IS level 1,
       I L2 - IS-IS level 2, A B - BGP Aggregate,
       A O - OSPF Summary, NG - Nexthop Group Static Route,
       V - VXLAN Control Service, M - Martian,
       DH - DHCP client installed default route,
       DP - Dynamic Policy Route, L - VRF Leaked,
       G  - gRIBI, RC - Route Cache Route,
       CL - CBF Leaked Route

Gateway of last resort is not set

 O        10.1.0.1/32 [110/26]
           via 10.5.0.1, Ethernet1
           via 10.5.0.1, Ethernet2
           via 10.2.0.1, Ethernet3
           via 10.2.0.1, Ethernet4
 O        10.2.0.1/32 [110/18]
           directly connected, Ethernet3
           directly connected, Ethernet4
 C        10.3.0.1/32
           directly connected, Loopback1
 C        10.3.10.0/24
           directly connected, Vlan1
 O        10.4.0.1/32 [110/26]
           via 10.2.0.1, Ethernet3
           via 10.2.0.1, Ethernet4
 O        10.5.0.1/32 [110/18]
           directly connected, Ethernet1
           directly connected, Ethernet2
 O        10.6.0.1/32 [110/34]
           via 10.5.0.1, Ethernet1
           via 10.5.0.1, Ethernet2
           via 10.2.0.1, Ethernet3
           via 10.2.0.1, Ethernet4
 O        10.7.0.1/32 [110/26]
           via 10.5.0.1, Ethernet1
           via 10.5.0.1, Ethernet2
 O        10.8.0.1/32 [110/42]
           via 10.5.0.1, Ethernet1
           via 10.5.0.1, Ethernet2
           via 10.2.0.1, Ethernet3
           via 10.2.0.1, Ethernet4
 O        10.8.10.0/24 [110/42]
           via 10.5.0.1, Ethernet1
           via 10.5.0.1, Ethernet2
           via 10.2.0.1, Ethernet3
           via 10.2.0.1, Ethernet4
 O        10.9.0.1/32 [110/34]
           via 10.5.0.1, Ethernet1
           via 10.5.0.1, Ethernet2
           via 10.2.0.1, Ethernet3
           via 10.2.0.1, Ethernet4

We can see the loopback addresses (10.X.0.1/32) of the other routers with their various ECMP next hops, as well as the VLAN interface subnets, both local and remote.

If we were to turn off ECMP (by adding maximum-paths 1 under the OSPF configuration), we would only see a single nexthop for each route:

 O        10.6.0.1/32 [110/34]
           via 10.2.0.1, Ethernet3
 O        10.7.0.1/32 [110/26]
           via 10.5.0.1, Ethernet1
 O        10.8.0.1/32 [110/42]
           via 10.2.0.1, Ethernet3
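
For reference, this is the one-line change (as a sketch) that disables ECMP on stage-left; remove it again afterwards to restore the multi-path behaviour:

router ospf 1
   maximum-paths 1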

OSPF Neighbors

We can also check the OSPF neighbors on the stage-left router:

stage-left>show ip ospf neighbor
Neighbor ID     Instance VRF      Pri State                  Dead Time   Address         Interface
10.5.0.1        1        default  0   FULL                   00:00:32    10.5.0.1        Ethernet1
10.5.0.1        1        default  0   FULL                   00:00:33    10.5.0.1        Ethernet2
10.2.0.1        1        default  0   FULL                   00:00:37    10.2.0.1        Ethernet4
10.2.0.1        1        default  0   FULL                   00:00:31    10.2.0.1        Ethernet3

We can see all our OSPF neighbors, their router IDs, the interface they are connected to and their state. All neighbors are in FULL state, which means that the OSPF adjacency is fully established and routes can be exchanged.

OSPF Database

Finally, we can check the OSPF database on the stage-left router:

stage-left>show ip ospf database detail 

            OSPF Router with ID(10.3.0.1) (Instance ID 1) (VRF default)


                 Router Link States (Area 0.0.0.0)

  LS Age: 1768
  Options: (E DC)
  LS Type: Router Links
  Link State ID: 10.6.0.1
  Advertising Router: 10.6.0.1
  LS Seq Number: 0x80000008
  Checksum: 0x9068
  Length: 108
  Number of Links: 7

    Link connected to: a Stub Network
     (Link ID) Network/subnet number: 10.6.0.1
     (Link Data) Network Mask: 255.255.255.255
      Number of TOS metrics: 0
       TOS 0 Metrics: 10


    Link connected to: a Point-to-point Network
     (Link ID) Neighboring Router ID: 10.1.0.1
     (Link Data)  0.0.0.1
      Number of TOS metrics: 0
       TOS 0 Metrics: 8


    Link connected to: a Point-to-point Network
     (Link ID) Neighboring Router ID: 10.1.0.1
     (Link Data)  0.0.0.2
      Number of TOS metrics: 0
       TOS 0 Metrics: 8


    Link connected to: a Point-to-point Network
     (Link ID) Neighboring Router ID: 10.4.0.1
     (Link Data)  0.0.0.4
      Number of TOS metrics: 0
       TOS 0 Metrics: 8


    Link connected to: a Point-to-point Network
     (Link ID) Neighboring Router ID: 10.4.0.1
     (Link Data)  0.0.0.3
      Number of TOS metrics: 0
       TOS 0 Metrics: 8


    Link connected to: a Point-to-point Network
     (Link ID) Neighboring Router ID: 10.8.0.1
     (Link Data)  0.0.0.5
      Number of TOS metrics: 0
       TOS 0 Metrics: 8


    Link connected to: a Point-to-point Network
     (Link ID) Neighboring Router ID: 10.8.0.1
     (Link Data)  0.0.0.6
      Number of TOS metrics: 0
       TOS 0 Metrics: 8

…and so on. The detail flag really lives up to its name.

The output starts with the Router LSA of 10.6.0.1, which is our delay-row-1-right.

We can see the loopback interface of delay-row-1-right as a stub network, then the point-to-point links to foh (Neighboring Router ID: 10.1.0.1), the links to stage-right (Neighboring Router ID: 10.4.0.1), and finally the links to delay-row-2-right (Neighboring Router ID: 10.8.0.1).

Remember that this is on stage-left! So stage-left knows about all of delay-row-1-right’s links.

This proves what we discussed earlier: Each OSPF router has a complete view of the network topology in its area, not just its direct neighbors!

Conclusion

In this article, we explored the configuration of OSPF in a fictitious mid-scale event network topology. We discussed the core concepts of OSPF, including areas, router IDs, unnumbered interfaces, and ECMP routing.

We then set up the topology using Containerlab and configured OSPF on all routers. Finally, we inspected the network to verify connectivity and OSPF operation.

We have shown that OSPF is a powerful and flexible routing protocol, yet comparatively easy to configure and manage.

Assets / Documents