This handout documents some of the material covered in the sorting lectures.

- There are basically two types of sorting algorithms:
  - comparison-based sorting, and
  - address-calculation-based sorting
- Radix sorting is an example of an address-calculation-based method. We do not cover these methods.
- Examples of comparison-based algorithms are:
  - O(n^{2}) algorithms
    - bubblesort
    - insertion sort
    - selection sort
  - O(n lg(n)) algorithms
    - merge sort
    - quick sort (average behavior)
    - heap sort
- In the worst case, Quicksort is an O(n^{2}) algorithm.
- The minimum number of comparisons required, on average, to sort n items, using a comparison-based sorting method, is proportional to n lg(n).
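As a concrete illustration of one of the O(n^{2}) comparison-based algorithms listed above, here is a minimal selection sort sketch (Python is used here for illustration only; the handout itself does not prescribe a language):

```python
def selection_sort(a):
    """Sort list a in place; uses O(n^2) key comparisons."""
    n = len(a)
    for i in range(n - 1):
        # Find the index of the smallest remaining element.
        smallest = i
        for j in range(i + 1, n):
            if a[j] < a[smallest]:   # one key comparison
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a
```

Every decision the algorithm makes is a comparison between two keys, which is what makes it "comparison-based" in the sense defined above.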

**Definition**: A *comparison tree* (sometimes called a
*decision tree*) is a binary tree in which, at each internal node, a
comparison is made between two keys and in which each leaf represents a
sorted arrangement of keys. The number of leaves in a comparison tree must
be n!, where n is the number of items to be sorted. This is the number of
permutations of the n items. Every permutation must be represented in the
comparison tree, and every leaf represents one of the permutations.

The following figure is a comparison tree that sorts 3 items. Each node in the tree asks one question about the relative order of a, b and c. The answer to the question determines which branch below the node is taken. Each node is also labeled with the set of possible permutations of a, b and c that is consistent with the questions that have been answered so far.

Fig 1: Comparison Tree for 3 Items

The "worst-case" number of comparisons in the tree is the length of the longest path. For the three-item tree above, the longest path has length 3. In general, this can be expressed as the ceiling of lg(n!), since a binary tree with n! leaves must have a path of at least that length. The "average" number of comparisons is just the sum of the path lengths divided by the number of leaves, or

(2 + 3 + 3 + 3 + 3 + 2) / 6 ≈ 2.67

It can be shown that as n increases, the average number of comparisons grows proportionately to n lg(n). Thus, the very best average performance of any sorting algorithm based on comparisons is O(n lg(n)).
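The numbers above can be checked directly. A short sketch (the function name is just for illustration):

```python
import math

def comparison_lower_bound(n):
    """Worst-case lower bound for comparison sorting: ceil(lg(n!)),
    since the comparison tree must have n! leaves."""
    return math.ceil(math.log2(math.factorial(n)))

# For n = 3 the tree has 3! = 6 leaves, and ceil(lg 6) = 3,
# matching the longest path of length 3 in Fig 1.
print(comparison_lower_bound(3))        # → 3

# Average comparisons for the tree in Fig 1: path lengths 2,3,3,3,3,2.
avg = (2 + 3 + 3 + 3 + 3 + 2) / 6
print(round(avg, 2))                    # → 2.67
```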

**Question**: Since Merge Sort is an O(n lg(n)) algorithm and
selection sort is an O(n^{2}) algorithm, why would one ever
choose the "slower" selection sort over the "faster" Merge Sort?

**Answer**: Selection sort can be faster than Merge Sort when n is not
large. It is a simpler algorithm, so it will likely have a lower constant of
proportionality than Merge Sort. The following figure shows an example.

Fig 2: Comparing the functions 10n^{2} and 30n lg(n) for small values of n.
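For constants like those in Fig 2, the crossover point can be found numerically. A sketch (the constants 10 and 30 are just the illustrative ones from the figure, not measured values):

```python
import math

def crossover(c_quad=10, c_nlogn=30):
    """Smallest n at which c_quad*n^2 exceeds c_nlogn*n*lg(n)."""
    n = 2
    while c_quad * n * n <= c_nlogn * n * math.log2(n):
        n += 1
    return n

# Below this n, the 10n^2 curve lies under the 30n lg(n) curve.
print(crossover())   # → 10
```

So for these particular constants, the "slower" quadratic algorithm wins for n up to about 10; with different constants the crossover moves, but the shape of the argument is the same.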

**Question**: Since Merge Sort and Quicksort are each O(n lg(n))
algorithms, why choose one over the other?

**Answer**: Quicksort runs faster on average, even though both have
the same growth behavior with increasing n.

**Question**: Well, then, why ever use Merge Sort?

**Answer**: The average performance of Quicksort is O(n lg(n)),
but there are worst cases which produce n^{2} performance. Merge Sort
performance is the same for average and worst cases. If you don't want to
take the chance that your data may give the worst case for Quicksort, you
might want to choose Merge Sort (or some other O(n lg(n)) algorithm).
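A minimal merge-sort sketch (again in Python, for illustration). Because the recursion always splits the input exactly in half, the O(n lg(n)) bound holds for every input, not just on average:

```python
def merge_sort(a):
    """Return a sorted copy of a; O(n lg n) comparisons in every case."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Merge the two sorted halves.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```

Quicksort, by contrast, can split very unevenly (e.g., a naive first-element pivot on already-sorted input), which is where its O(n^{2}) worst case comes from.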

Modified by *Richard Chang* Thu Jan 22 2:56:48 EST 1998.