The cost of a spanning tree is the sum of the costs (weights) of the edges in the spanning tree.
A minimum-cost spanning tree is a spanning tree of least cost.
Review (again):
V= nodes, E= edges
Graph G=(V,E)
Subgraph G'=(V',E'): V'
If G' contains no cycles then G'=T is a tree.
T= (V',E') is a spanning tree of G if V'= V
The right two are spanning trees of the graph on the left
Cost on edges C: E -> R+
(i.e., is a function from the edges to the real numbers)
(3,6)-- 15
(4,6)-- 20
(2,6)-- 25
(1,4)-- 30
(3,5)-- 35 ...
Use Union and Find to convert a forest of n isolated nodes to a spanning tree
note: link between nodes in Union-Find data structure not necessarily same as
edges in spanning tree. We want no cycles in this construction.
Time:
e= |E|, n= |V|
Sorting: |E| log |E|
The rest of the algorithm then takes 2|E| Finds and (N-1) Unions. ( time)
almost 0(e + n)
O(n + eloge) but if each node n has even one edge, bounded mostly by the sort then -- 0(|E| log |E|)
Theorem: Procedure Kruskal produces a minimum cost spanning tree.
Proof: (page 225 also) Similar to the knapsack proof using transformation and contradiction
Proof: Let T be tree constructed by algorithm
If T=T' than Done ... we will transform to make it so
Let e' be an edge on this cycle such that e'
then, at some stage in the construction of T, Kruskal must have considered e' for inclusion (i.e., T<e U e' or not)
e' must have been rejected because it caused a cycle with edges in T<e. Contradiction. Done .
Consider T'' = T' U {e} - {e'}
c(T'') < c(T')
Outer loop n-1 (basically n to get all vertices) times - inside loop ("over all edges") O(n) time)
(1,2)-- 10
Before
After
Step 1
Step 2
Step 3
Step 4
Text notes that "not essential to sort...so long as the next edge can be determined easily" - suggest a heap (O(log |E|) for next to consider and O(|E|) to build the heap.
Let T' be a min cost spanning tree
Let e be an edge of min. cost in T - T'. (ones Kruscal used but they didn't)
visual
Let T<e denote the edges in T of cost < c(e) (anybody smaller not used?)
If all edges in T<e contained in T' then T' U {e} creates a unique cycle.
Claim: c(e') > c(e)Proof: suppose c(e') < c (e)
but T<e
therefore T' contains a cycle
Before
After
Time
O(n2)
inside the loop --
could check all edges => bounded by e
but ways to implement to get 0(n2)
but e, if graph is complete, could be up n2 => time for algorithm of O(n3)