Providing Redundancy and Load Balancing in Designs


Background.  Providing redundancy means ensuring that adequate backup equipment is available if some part of the network fails or degrades.  If the primary unit fails, network traffic shifts to the backup device in a process called failover.  The goal is to provide fault tolerance.

Load balancing also improves fault tolerance, since the network is configured so that when particular thresholds are exceeded, traffic shifts to alternative paths or devices.  In load balancing the alternative equipment is not necessarily redundant and is unlikely to sit idle in a backup mode.  Load balancing can be achieved in a variety of ways.  One of the main ways is to tune the routing metrics in router configuration files within particular routing domains.

In order to make use of both of these approaches to improving network performance, the network must be designed with multiple routes between different devices.  This is called topology meshing.

Topology Meshing.  In a mesh, network devices, usually routers or switches, are directly connected to one another.  The following diagram represents six routers that are fully meshed, that is, every router has a direct connection to every other router.



Doing this can get quite expensive, since n nodes require n(n - 1)/2 connections, with n - 1 at each node.  The six fully meshed routers above, for example, need 6 x 5 / 2 = 15 links.
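To show how quickly the cost of a full mesh grows, here is a minimal Python sketch of the link count n(n - 1)/2.  The function name full_mesh_links is only an illustrative choice, not something from a networking library.

```python
def full_mesh_links(n: int) -> int:
    """Number of point-to-point links needed to fully mesh n devices."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each of the n devices links to the other n - 1; every link is shared by two devices.
    return n * (n - 1) // 2


for n in (4, 6, 10, 20):
    print(f"{n:>2} routers: {full_mesh_links(n):>3} links, {n - 1} interfaces per router")
```

Doubling the number of routers roughly quadruples the number of links, which is why full meshes are usually reserved for small groups of devices.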

It is much more common to select particular direct connections for specific reasons and develop a partially meshed topology, as shown in the next diagram.



Even if each router were connected only to its two nearest neighbors in a ring, there would still be some alternative routes.  However, the more direct links that exist, the smaller the diameter of the collection of devices, that is, the largest number of hops between any two of them.
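The effect of extra links on diameter can be checked with a short breadth-first search.  The sketch below is only an illustration under simple assumptions: devices are numbered 0 to n - 1, every link counts as one hop, the topology is connected, and the helper names (diameter, ring, full_mesh, add_link) are hypothetical.

```python
from collections import deque


def diameter(adjacency):
    """Largest shortest-path hop count between any two nodes (graph assumed connected)."""
    def hops_from(start):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return dist

    return max(max(hops_from(node).values()) for node in adjacency)


def ring(n):
    """Each device is linked only to its two nearest neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def full_mesh(n):
    """Every device is linked to every other device."""
    return {i: set(range(n)) - {i} for i in range(n)}


def add_link(adjacency, a, b):
    """Add one bidirectional link between devices a and b."""
    adjacency[a].add(b)
    adjacency[b].add(a)


partial = ring(6)
for a, b in ((0, 3), (1, 4), (2, 5)):  # three extra cross links
    add_link(partial, a, b)

print("ring of 6:        ", diameter(ring(6)))
print("partial mesh of 6:", diameter(partial))
print("full mesh of 6:   ", diameter(full_mesh(6)))
```

With six devices the ring has a diameter of 3 hops, the ring plus three cross links has a diameter of 2, and the full mesh has a diameter of 1.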

Since we will be focusing on hierarchical models, meshes are most likely to be implemented within a hierarchy, in something like the following.



To enhance load balancing, one should also try to keep bandwidth consistent within a given layer of a hierarchical topology.  Routing protocols also tend to converge much faster if multiple "equal-cost" paths exist to a destination network.
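As a rough illustration of what "equal-cost" means, the following Python sketch runs a shortest-path search over a small hypothetical topology and returns every lowest-cost path.  The router names, the metric values, and the function equal_cost_paths are all assumptions made for the example; real routing protocols compute their metrics in protocol-specific ways.

```python
import heapq


def equal_cost_paths(links, source, target):
    """Return (cost, paths) covering every lowest-cost path from source to target.

    links maps (a, b) -> metric for bidirectional links; metrics must be positive.
    """
    graph = {}
    for (a, b), metric in links.items():
        graph.setdefault(a, {})[b] = metric
        graph.setdefault(b, {})[a] = metric

    best = {source: 0}    # lowest known cost to each node
    preds = {source: []}  # previous hops that achieve that cost
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node]:
            continue  # stale heap entry
        for neighbor, metric in graph[node].items():
            new_cost = cost + metric
            if neighbor not in best or new_cost < best[neighbor]:
                best[neighbor] = new_cost
                preds[neighbor] = [node]
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == best[neighbor]:
                preds[neighbor].append(node)  # a tie: another equal-cost route

    if target not in best:
        return None, []

    def unwind(node):
        if node == source:
            return [[source]]
        return [path + [node] for prev in preds[node] for path in unwind(prev)]

    return best[target], sorted(unwind(target))


# Consistent metrics within the layer: two equal-cost paths from A to D.
balanced = {("A", "B"): 10, ("A", "C"): 10, ("B", "D"): 10, ("C", "D"): 10}
print(equal_cost_paths(balanced, "A", "D"))

# One slower link (higher metric): only a single best path remains.
uneven = dict(balanced)
uneven[("A", "C")] = 20
print(equal_cost_paths(uneven, "A", "D"))
```

When the metrics are consistent, traffic from A to D can be shared across both paths; once one link is given a higher metric, the tie disappears and the second path serves only as a backup.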

Upside/Downside.  Implementing these options and configuring the routers and/or switches appropriately has both advantages and disadvantages.

On the upside

  • Performance is improved by increasing the number of mesh links, since traffic needs only a single hop wherever a direct link exists.
  • Availability is improved because if one device goes down, one or more alternative routes are available.
  • The possibilities for load balancing are improved because alternative paths can be used for normal operations.

On the downside

  • Every router or switch interface that is used for meshing cannot be used to connect to a LAN segment, which increases the need for more routers or switches and raises overall expenses.
  • Since devices, particularly routers, constantly advertise their services to one another, the more devices that are directly linked, the greater the overhead traffic being broadcast.
  • Sometimes having so many devices interlinked can cause broadcast storms when a particular device goes down.
  • Additional connections result in additional complexity.  When troubleshooting requires investigating all paths, there are many more paths to examine and consider.

A general rule of thumb is to keep broadcast traffic to less than 20% of the bandwidth on each link.  For these reasons, few internetworks are fully meshed.  The more common practice is to fully mesh only the backbone portions.
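The 20% rule of thumb is simple arithmetic, and a small Python sketch makes it easy to apply to measured figures.  The function names and the sample traffic numbers below are hypothetical; only the 20% limit comes from the text above.

```python
def broadcast_share(broadcast_bps: float, link_bps: float) -> float:
    """Fraction of a link's bandwidth consumed by broadcast traffic."""
    if link_bps <= 0:
        raise ValueError("link bandwidth must be positive")
    return broadcast_bps / link_bps


def within_rule_of_thumb(broadcast_bps: float, link_bps: float, limit: float = 0.20) -> bool:
    """True if broadcast traffic stays below the limit (20% by default)."""
    return broadcast_share(broadcast_bps, link_bps) < limit


# Hypothetical measurements on a 100 Mbps link.
link = 100_000_000
for broadcast in (5_000_000, 25_000_000):
    share = broadcast_share(broadcast, link)
    print(f"{share:.0%} broadcast: within rule of thumb = {within_rule_of_thumb(broadcast, link)}")
```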