Kruskal's algorithm is an algorithm in graph theory that finds a minimum spanning tree for a connected weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).

It works as follows:

  • create a forest F (a set of trees), where each vertex in the graph is a separate tree
  • create a set S containing all the edges in the graph
  • while S is nonempty
    • remove an edge with minimum weight from S
    • if that edge connects two different trees, then add it to the forest, combining two trees into a single tree
    • otherwise discard that edge
At the termination of the algorithm, the forest has only one component and forms a minimum spanning tree of the graph.

With the use of a suitable data structure, Kruskal's algorithm can be shown to run in O (m log m) time, where m is the number of edges in the graph.

Proof

Let P be a connected, weighted graph and let Y be the subgraph of P produced by the algorithm. Y cannot have a cycle, since the last edge added to that cycle would have been within one subtree and not between two different trees. Y cannot be disconnected, since the first encountered edge that joins two components of Y would have been added by the algorithm. Thus, Y is a spanning tree of P.

Let Y1 be a minimum spanning tree for P which has the greatest number of edges in common with Y. If Y1=Y then Y is a minimum spanning tree. Otherwise, let e be the first edge considered by the algorithm that is in Y but not in Y1. Let C1 and C2 be the components of F which e lies between at the stage when e is considered. Since Y1 is a tree, Y1+e has a cycle and there is some different edge f of that cycle that also lies between C1 and C2. Then Y2=Y1+e-f is also a spanning tree. Since e is considered by the algorithm before f, the weight of e is at most equal to the weight of f, and since Y1 is a minimum spanning tree the weights of these two edges must in fact be equal. Therefore, Y2 is a minimum spanning tree having more edges in common with Y than Y1 does, contradicting our assumption about Y1. This proves that Y must be a minimum spanning tree.

Other algorithms for this problem include Prim's algorithm, and Boruvka's algorithm.