Minimum Cost Spanning Trees

The cost of a spanning tree is the sum of the costs (weights) of the edges in the spanning tree.

A minimum-cost spanning tree is a spanning tree of least cost.

Review (again):

V= nodes, E= edges

Graph G=(V,E)

Subgraph G'=(V',E'): V' V, E' E (note: graphs can be disconnected)

If G' contains no cycles then G'=T is a tree.

T= (V',E') is a spanning tree of G if V'= V

The right two are spanning trees of the graph on the left

Cost on edges C: E -> R+

(i.e., is a function from the edges to the real numbers)

Spanning tree of minimum cost

Application: circuit equations, broadcasting

n nodes spanning tree must contain n-1 edges

Greedy Strategy

Methods we will be considering both use the greedy design strategy

At each stage include edge of minimum cost which does not introduce a cycle.

Edges included so far form a tree (Prim) (page 218-220)

Edges included so far form a forest (Kruskal) (page 220-225)

Kruskal's method: first sort by cost

(1,2)-- 10

(3,6)-- 15

(4,6)-- 20

(2,6)-- 25

(1,4)-- 30

(3,5)-- 35 ...

Before
After
Step 1
Step 2
Step 3
Step 4

Use Union and Find to convert a forest of n isolated nodes to a spanning tree

note: link between nodes in Union-Find data structure not necessarily same as edges in spanning tree. We want no cycles in this construction.

Time:

e= |E|, n= |V|

Sorting: |E| log |E|
Text notes that "not essential to sort...so long as the next edge can be determined easily" - suggest a heap (O(log |E|) for next to consider and O(|E|) to build the heap.

The rest of the algorithm then takes 2|E| Finds and (N-1) Unions. ( time)

almost 0(e + n)

O(n + eloge) but if each node n has even one edge, bounded mostly by the sort then -- 0(|E| log |E|)

Theorem: Procedure Kruskal produces a minimum cost spanning tree.

Proof: (page 225 also) Similar to the knapsack proof using transformation and contradiction

Proof:

Let T be tree constructed by algorithm
Let T' be a min cost spanning tree

If T=T' than Done ... we will transform to make it so

visual
Claim: c(e') > c(e)

Consider T'' = T' U {e} - {e'}

c(T'') < c(T')

by repeated transformation, T' can be converted to T without increasing cost.
REALLY DONE. Hence T is a minimum cost spanning tree.

Prim: Edges included so far form a tree

Before
After

Time

O(n2)

Outer loop n-1 (basically n to get all vertices) times - inside loop ("over all edges") O(n) time)

but ways to implement to get 0(n2)

Prim 0(n2)

Again, note that if # of e is close to n2 -- almost complete graph

Kruskal O(e log e)

e<< n2 or e ~= n - a sparse graph

note: if a lot of edges Kruscal's looks like n2 log n2

This, or even, n2 log n is worse than just n2

If # edges << n2 (i.e. sparse)

use Kruskal 0(e log e)

otherwise use Prim's 0(n2)