Data Structures and Algorithms
Recursive Sorting Chris Brooks Department of Computer Science University of San Francisco
Department of Computer Science — University of San Francisco – p.1/45
12-0:
Recursive Sorting Algorithms
Basic sorting algorithms all run in Θ(n^2) time. We can do better by sorting sublists and combining the results.
12-1:
Merge Sort – Recursive Sorting
• Base Case:
  • A list of length 0 or length 1 is already sorted
• Recursive Case:
  • Split the list in half
  • Recursively sort the two halves
  • Merge the sorted halves together
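A minimal Python sketch of the base case and recursive case above. The name merge_sort is an illustrative choice, and the merge step is delegated to the standard library's heapq.merge as a stand-in for a hand-written merge.

```python
from heapq import merge  # standard-library merge of already-sorted iterables


def merge_sort(items):
    """Return a new sorted list (hypothetical helper, not from the slides)."""
    # Base case: a list of length 0 or 1 is already sorted.
    if len(items) <= 1:
        return list(items)
    # Recursive case: split in half, sort each half, merge the results.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    return list(merge(left, right))
```

For example, merge_sort([5, 1, 8, 2, 6, 4, 3, 7]) returns [1, 2, 3, 4, 5, 6, 7, 8].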
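12-2:
Example
• Input: 5 1 8 2 6 4 3 7
• Split: 5 1 8 2 | 6 4 3 7
• Split: 5 1 | 8 2 | 6 4 | 3 7
• Split: 5 | 1 | 8 | 2 | 6 | 4 | 3 | 7
• Merge: 1 5 | 2 8 | 4 6 | 3 7
• Merge: 1 2 5 8 | 3 4 6 7
• Merge: 1 2 3 4 5 6 7 8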
12-3:
Merging
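• Merge lists into a new list, T
• Maintain three pointers (indices) i, j, and n
  • i is the index into the left-hand list
  • j is the index into the right-hand list
  • n is the index into the destination list T
• If A[i] < A[j]
  • T[n] = A[i], increment n and i
• else
  • T[n] = A[j], increment n and j
• When either list is exhausted, copy the remaining elements of the other list into T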
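The three-index merge can be sketched in Python as follows. This is an illustration under assumptions: the two sorted halves are passed as separate lists named left and right (the slide keeps both in one array A), and the function name merge is hypothetical.

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list T (illustrative sketch)."""
    t = [None] * (len(left) + len(right))  # destination list T
    i = j = n = 0
    # Copy the smaller front element until one list runs out.
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            t[n] = left[i]
            i += 1
        else:
            t[n] = right[j]
            j += 1
        n += 1
    # Copy whatever remains; at most one of these loops does any work.
    while i < len(left):
        t[n] = left[i]
        i += 1
        n += 1
    while j < len(right):
        t[n] = right[j]
        j += 1
        n += 1
    return t
```

Using <= instead of < in the comparison would take ties from the left list, which keeps the sort stable.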
12-9:
Θ() for Merge Sort
T(0) = c1, for some constant c1
T(1) = c2, for some constant c2
T(n) = c3 n + 2T(n/2), for some constant c3

T(n) = c3 n + 2T(n/2)
     = c3 n + 2(c3 (n/2) + 2T(n/4)) = 2 c3 n + 4T(n/4)
     = 2 c3 n + 4(c3 (n/4) + 2T(n/8)) = 3 c3 n + 8T(n/8)
     = 3 c3 n + 8(c3 (n/8) + 2T(n/16)) = 4 c3 n + 16T(n/16)
     = 5 c3 n + 32T(n/32)
     ...
     = k c3 n + 2^k T(n/2^k)
12-10:
Θ() for Merge Sort
T(0) = c1
T(1) = c2
T(n) = k c3 n + 2^k T(n/2^k)

Pick a value for k such that n/2^k = 1:
  n/2^k = 1
  n = 2^k
  lg n = k

T(n) = (lg n) c3 n + 2^(lg n) T(n/2^(lg n))
     = c3 n lg n + n T(n/n)
     = c3 n lg n + n T(1)
     = c3 n lg n + c2 n
     ∈ O(n lg n)
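As a sanity check, the recurrence can be evaluated directly and compared against the closed form c3 n lg n + c2 n. This is a sketch only: the constants C2 and C3 are arbitrary illustrative values, not taken from the slides, and n is assumed to be a power of two.

```python
C2, C3 = 3, 5  # arbitrary illustrative constants (assumption)


def t(n):
    """Evaluate T(n) = c3*n + 2*T(n/2) with T(1) = c2, n a power of two."""
    if n == 1:
        return C2
    return C3 * n + 2 * t(n // 2)


def closed_form(n):
    """Closed form c3*n*lg(n) + c2*n; lg n is exact for powers of two."""
    lg_n = n.bit_length() - 1
    return C3 * n * lg_n + C2 * n


# The two agree for every power of two tried here.
for n in (1, 2, 8, 1024):
    assert t(n) == closed_form(n)
```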
12-16:
Θ() for Merge Sort
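Recursion tree:
• Level 0: 1 call doing c*n work (level total c*n)
• Level 1: 2 calls doing c*(n/2) work each (level total c*n)
• Level 2: 4 calls doing c*(n/4) work each (level total c*n)
• ...
• lg n levels, each with level total c*n
Total time = c*n lg n
Θ(n lg n)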
12-17:
Divide & Conquer
Merge Sort:
• Divide the list into two parts
  • No work required: just calculate the midpoint
• Recursively sort the two parts
• Combine the sorted lists into one list
  • Some work required: need to merge the lists
12-18:
Divide & Conquer
Quick Sort:
• Divide the list into two parts
  • Some work required: small elements go in the left sublist, large elements in the right sublist
• Recursively sort the two parts
• Combine the sorted lists into one list
  • No work required!
12-19:
Quick Sort
• Pick a pivot element
• Reorder the list:
  • All elements < pivot
  • Pivot element
  • All elements > pivot
• Recursively sort the elements < pivot
• Recursively sort the elements > pivot
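The steps above can be sketched as a short, non-in-place Python function. Two assumptions are made for illustration: the first element is used as the pivot, and elements equal to the pivot are grouped with the larger elements so that duplicates are not lost. The name quick_sort is hypothetical.

```python
def quick_sort(items):
    """Return a new sorted list (illustrative, not in-place)."""
    if len(items) <= 1:
        return list(items)
    pivot = items[0]  # assumption: first element as the pivot
    rest = items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]  # ties go right (assumption)
    # No merge step: the sorted parts simply sit on either side of the pivot.
    return quick_sort(smaller) + [pivot] + quick_sort(larger)
```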
12-20:
Example
• Example: 3 7 2 8 1 4 6
• Suppose 3 is our pivot
• Split into 3 2 1 | 7 8 4 6
• Sort the left half (suppose 2 is the pivot): 1 2 3
• Sort the right half (pivot chosen from 7 8 4 6): 4 6 7 8
• Recurse and combine (no work required): 1 2 3 4 6 7 8
12-21:
Quick Sort - Partitioning
Basic Idea:
• Swap the pivot element out of the way (we'll swap it back later)
• Maintain two pointers, i and j
  • i points to the beginning of the list
  • j points to the end of the list
• Move i and j in toward the middle of the list, ensuring that all elements to the left of i are less than the pivot and all elements to the right of j are greater than the pivot
• Swap the pivot element back to the middle of the list
12-22:
Quick Sort - Partitioning
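Pseudocode:
  Pick a pivot index
  Swap A[pivotindex] and A[high] (the pivot now sits at A[high])
  Set i ← low, j ← high−1
  while (i ≤ j)
    while i ≤ j and A[i] < pivot, increment i
    while i ≤ j and A[j] > pivot, decrement j
    if i ≤ j
      swap A[i] and A[j]
      increment i, decrement j
  swap A[i] and A[high]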
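One way to turn the partitioning pseudocode into runnable Python is sketched below. The middle element is assumed as the pivot, the bounds checks inside the inner loops are an added safeguard, and the function names are hypothetical.

```python
def partition(a, low, high):
    """Partition a[low..high] around a pivot; return the pivot's final index."""
    pivot_index = (low + high) // 2  # assumption: middle element as the pivot
    a[pivot_index], a[high] = a[high], a[pivot_index]  # move pivot out of the way
    pivot = a[high]
    i, j = low, high - 1
    while i <= j:
        while i <= j and a[i] < pivot:
            i += 1
        while i <= j and a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    a[i], a[high] = a[high], a[i]  # swap the pivot back into the middle
    return i


def quick_sort_in_place(a, low=0, high=None):
    """Sort list a in place by recursive partitioning."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quick_sort_in_place(a, low, p - 1)
        quick_sort_in_place(a, p + 1, high)
```

Because both inner loops stop on elements equal to the pivot, runs of duplicates are split between the two sides rather than piling up on one.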
12-23:
Θ() for Quick Sort
• Coming up with a recurrence relation for Quick Sort is harder than for Merge Sort
• How the problem is divided depends upon the data
• Break the list into:
  • size 0, size n − 1
  • size 1, size n − 2
  • ...
  • size ⌊(n − 1)/2⌋, size ⌈(n − 1)/2⌉
  • ...
  • size n − 2, size 1
  • size n − 1, size 0
12-24:
Θ() for Quick Sort
Worst case performance occurs when we break the list into size n − 1 and size 0
T(0) = c1, for some constant c1
T(1) = c2, for some constant c2
T(n) = c3 n + T(n − 1) + T(0), for some constant c3

T(n) = c3 n + T(n − 1) + T(0)
     = T(n − 1) + c3 n + c1
12-25:
Θ() for Quick Sort
Worst case:
T(n) = T(n − 1) + c3 n + c1
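Unrolling the worst-case recurrence down to T(0) gives a closed form. A sketch of the algebra, assuming T(0) = c1 as in the definitions above and letting the recursion run all the way to n = 0:

```latex
\begin{align*}
T(n) &= T(n-1) + c_3 n + c_1 \\
     &= T(n-2) + c_3\bigl((n-1) + n\bigr) + 2c_1 \\
     &= T(n-k) + c_3 \sum_{i=n-k+1}^{n} i + k c_1 \\
     &= T(0) + c_3 \sum_{i=1}^{n} i + n c_1 \qquad (k = n) \\
     &= c_3 \frac{n(n+1)}{2} + (n+1) c_1 \in \Theta(n^2)
\end{align*}
```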
12-36:
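Θ() for Quick Sort
Recursion tree (worst case):
• Level 0: c*n (level total c*n)
• Level 1: c*(n-1) and c1 (level total c*(n-1) + c1)
• Level 2: c*(n-2) and c1 (level total c*(n-2) + c1)
• Level 3: c*(n-3) and c1 (level total c*(n-3) + c1)
• ...
• Level k: level total c*(n-k) + c1
• n levels
Total time = c*n*(n+1)/2 + n*c1
Θ(n^2)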
12-37:
Θ() for Quick Sort
Best case performance occurs when we break the list into size ⌊(n − 1)/2⌋ and size ⌈(n − 1)/2⌉
T(0) = c1, for some constant c1
T(1) = c2, for some constant c2
T(n) = c3 n + 2T(n/2), for some constant c3
This is the same as Merge Sort: Θ(n lg n)
12-38:
Quick Sort?
• If Quick Sort is Θ(n^2) on some lists, why is it called quick?
  • Most lists give a running time of Θ(n lg n)
    • Average case running time is Θ(n lg n)
  • Constants are very small
    • Constants don't matter when complexity is different
    • Constants do matter when complexity is the same
• What lists will cause Quick Sort to have Θ(n^2) performance?
12-39:
Quick Sort - Worst Case
• With the first or last element as the pivot, Quick Sort has worst-case performance when:
  • The list is sorted (or almost sorted)
  • The list is inverse sorted (or almost inverse sorted)
• Many lists we want to sort are almost sorted!
• How can we fix Quick Sort?
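To see the quadratic behaviour concretely, the sketch below counts comparisons against the pivot. It assumes a naive rule that always picks the last element as the pivot; the function name is hypothetical.

```python
def count_comparisons(items):
    """Count element-vs-pivot comparisons for a quick sort that always
    picks the last element as the pivot (an assumed, naive pivot rule)."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[-1], items[:-1]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # One comparison per non-pivot element, plus the two recursive calls.
    return len(rest) + count_comparisons(smaller) + count_comparisons(larger)
```

On a sorted or inverse sorted list of n = 100 elements, every partition is size n − 1 and size 0, so the count is n(n − 1)/2 = 4950.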
12-40:
Better Partitions
• Pick the middle element as the pivot
  • Sorted and reverse sorted lists give good performance
• Pick a random element as the pivot
  • No single list always gives bad performance
• Pick the median of 3 elements
  • First, Middle, Last
  • 3 Random Elements
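Two of these pivot rules can be sketched as small helpers that return a pivot index for a[low..high]. The function names are hypothetical, and the median-of-3 version uses the first, middle, and last elements.

```python
import random


def random_pivot_index(low, high):
    """Pick a pivot index uniformly at random from low..high."""
    return random.randint(low, high)  # randint includes both endpoints


def median_of_three_index(a, low, high):
    """Return the index of the median of a[low], a[mid], and a[high]."""
    mid = (low + high) // 2
    # Sort (value, index) pairs and take the middle pair's index.
    candidates = sorted([(a[low], low), (a[mid], mid), (a[high], high)])
    return candidates[1][1]
```

On an already sorted list the median of first, middle, and last is the middle element, so the partition is balanced.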
12-41:
Improving Quick Sort
• Insertion Sort runs faster than Quick Sort on small lists
  • Why?
• We can combine Quick Sort & Insertion Sort
  • When lists get small, run Insertion Sort instead of a recursive call to Quick Sort
  • When lists get small, stop! After the call to Quick Sort, the list will be almost sorted, so finish the job with a single call to Insertion Sort
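A sketch of the first combination (switch to Insertion Sort once a sublist is small). The cutoff value, the function names, and the use of a Lomuto-style partition with a middle pivot are all assumptions made for illustration; the slides do not specify them.

```python
CUTOFF = 8  # assumed threshold; the slides do not give a value


def insertion_sort(a, low, high):
    """Sort a[low..high] in place."""
    for k in range(low + 1, high + 1):
        value = a[k]
        m = k - 1
        while m >= low and a[m] > value:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = value


def lomuto_partition(a, low, high):
    """Lomuto-style partition with the middle element as pivot (assumption)."""
    mid = (low + high) // 2
    a[mid], a[high] = a[high], a[mid]
    pivot = a[high]
    store = low
    for k in range(low, high):
        if a[k] < pivot:
            a[k], a[store] = a[store], a[k]
            store += 1
    a[store], a[high] = a[high], a[store]
    return store


def hybrid_quick_sort(a, low=0, high=None):
    """Quick Sort that hands small sublists to Insertion Sort."""
    if high is None:
        high = len(a) - 1
    if high - low + 1 <= CUTOFF:
        insertion_sort(a, low, high)
        return
    p = lomuto_partition(a, low, high)
    hybrid_quick_sort(a, low, p - 1)
    hybrid_quick_sort(a, p + 1, high)
```

The second combination would instead return immediately on small sublists and call insertion_sort once over the whole list at the end.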
12-42:
Heap Sort
• Build a heap out of the data
• Repeat:
  • Remove the largest element from the heap, place it at the end of the list
• Until all elements have been removed from the heap
• The list is now sorted
• Example: 3 1 7 2 5 4
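A minimal in-place sketch of these steps. It assumes an array-based max-heap in which the children of index i sit at 2i + 1 and 2i + 2, so the heap and the sorted portion share one list; the function names are hypothetical.

```python
def sift_down(a, root, size):
    """Restore the max-heap property for the subtree rooted at index root,
    considering only the first `size` elements of a."""
    while True:
        child = 2 * root + 1
        if child >= size:
            return
        # Pick the larger of the two children.
        if child + 1 < size and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heap_sort(a):
    """Sort list a in place."""
    n = len(a)
    # Build a max-heap bottom-up.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Repeatedly move the largest element to the end of the list,
    # then shrink the heap by one and repair it.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
```

Applied to the example list 3 1 7 2 5 4, heap_sort leaves the list as 1 2 3 4 5 7.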
12-43:
Θ() for Heap Sort
• Building the heap takes time Θ(n)
• Each of the n RemoveMax calls takes time O(lg n)
• Total time: O(n lg n) (also Θ(n lg n))