Tuesday, March 30, 2010

The Spanning Tree Algorithm for Beginners


A spanning tree is a subset of a network that reaches all nodes in the network and has no loops. The "nodes" are the red circles in the picture to the left. The spanning tree is the blue path through the network. Note that the path spans (reaches) all the nodes, but there's only one path to each node from any other node. There are no loops. It's a tree. Granted, this example makes it look like kind of a funny tree, but use your imagination.

The spanning tree algorithm converts a mesh network into a tree-shaped network. To understand a mesh network, think of the Los Angeles freeway system, with its numerous interconnected roads. To understand a tree, think of a real tree with a root and branches, but no interconnected, looped branches. Or think of an org chart (an upside down tree) or a file system on your computer that has directories, subdirectories, and files. Going back to the Los Angeles freeway system, to understand the spanning tree algorithm, think of yourself sitting at home in, let’s say Redondo Beach, trying to get to San Bernardino.

In the computer field, a mesh network has the advantages of backup links and multiple paths, in case a path goes down or gets as crowded as the 101 freeway during rush hour. But a mesh also has the disadvantage that data could reach the recipient more than once, which would be bad. So we need an algorithm that lets us build networks with redundancy but ensures that data travels a single path to arrive just once at the recipient. Some spanning tree algorithms also ensure that the data takes the shortest path, though that’s a particular type of spanning tree algorithm, a so-called minimum spanning tree algorithm.

It’s not accurate to say that Radia Perlman’s spanning tree algorithm is used on the Internet, which is a misconception that I see so often that it inspired me to write this blog post. The first few words of this beautifully-written blog, for example, are misleading. Dr. Perlman’s algorithm isn’t used to get data to Google, YouTube, etc., (well, unless you count once the data gets into Google or YouTube’s internal network.) Her algorithm is used inside enterprise networks to build a path to a recipient across a mesh network of interconnected Ethernet switches. Enterprise networks include the internal networks at corporations, universities, government agencies, ISPs, etc. (Note, that it’s used inside ISPs, not ISP-to-ISP, though).

Spanning tree algorithms are used in many aspects of computer science and were not invented by Radia Perlman. The first algorithm for finding a minimum spanning tree was developed by Czech scientist Otakar Borůvka in 1926 (see Borůvka's algorithm). Its purpose was an efficient electrical coverage of Moravia, a region in Central Europe. Spanning tree algorithms are used by routers too, not just switches, but not on the routers that connect the Internet (which run Border Gateway Protocol, which uses a distance-vector algorithm, not link-state and spanning tree). Oh dear, now I’m getting into jargon.

Here are some good resources regarding the spanning tree algorithm:
  • Wikipedia’s article on minimum spanning trees is good.
  • Wikipedia’s article on the particular spanning tree protocol that Radia Perlman invented for use on internal switched (bridged) networks is good too.
  • Cisco's article on the rapid spanning tree protocol used by switches is a must-read for those of us in the networking field.
  • I wrote about Radia Perlman last year for Ada Lovelace Day.
  • Finally, be sure to read Dr. Perlman’s famous poem about spanning trees at the bottom of this terrific interview with her.

No comments:

Post a Comment