
Different types of sorting algorithms.

May 23, 2013

Sorting

  1. Comparable elements
  2. Bubblesort
    1. compare 1st and 2nd elements
    2. if 1st larger than 2nd, swap
    3. compare 2nd and 3rd, and swap if necessary
    4. continue until compare the last two elements
    5. the largest element is now the last element in the array.
    6. repeat starting from the beginning until no swaps are performed (i.e.,
      the array is sorted)
    7. each time you go through the elements bubbling up the largest element
    8. no need to revisit the last i elements on the ith pass, since the end
      elements are already sorted (a sketch in Java follows this item)
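
A minimal Java sketch of the bubble sort just described; the class and method names are my own, since the notes include no code:

    class BubbleSortSketch {
        static void bubbleSort(int[] a) {
            boolean swapped = true;
            // after the ith pass, the last i elements are already in place
            for (int pass = 0; swapped && pass < a.length - 1; pass++) {
                swapped = false;
                for (int i = 0; i < a.length - 1 - pass; i++) {
                    if (a[i] > a[i + 1]) {       // 1st larger than 2nd?
                        int tmp = a[i];          // then swap
                        a[i] = a[i + 1];
                        a[i + 1] = tmp;
                        swapped = true;          // another pass is needed
                    }
                }
            }
        }
    }
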
  3. Selection Sort
    1. array to be sorted: A
    2. array to be returned: B
    3. find the smallest element in A and put it in B
    4. mark its space in A with null so it won’t be chosen again
    5. repeat the last two steps until B is the sorted array (see the sketch
      below)
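
A sketch of exactly this two-array variant, assuming the input itself contains no nulls (the usual in-place variant swaps the minimum into position instead of using a second array):

    class SelectionSortSketch {
        // B is filled smallest-first; chosen slots in A are nulled out
        static Integer[] selectionSort(Integer[] a) {
            Integer[] b = new Integer[a.length];
            for (int k = 0; k < b.length; k++) {
                int minIdx = -1;
                for (int i = 0; i < a.length; i++) {
                    if (a[i] != null && (minIdx == -1 || a[i] < a[minIdx])) {
                        minIdx = i;
                    }
                }
                b[k] = a[minIdx];    // copy next-smallest into B
                a[minIdx] = null;    // mark so it won't be chosen again
            }
            return b;
        }
    }
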
  4. Insertion Sort
    1. algorithm
      • makes a pass for each element
      • everything before the current element is already sorted
      • puts the element in its appropriate place in the sorted half of the
        array by checking each element starting from the back of the sorted part
    2. Code Methods: insertionsort (sketched after this item)
    3. Worst Case O(N^2) – when the array is reverse sorted
    4. Best Case O(N) – when the array is already sorted
    5. Swap Number
      • = number of inversions (i.e., i < j but a[i] > a[j])
      • Running Time = O(I + N)
      • Notice that for a sorted array I = 0
      • For a reverse-sorted array I = O(N^2)
      • Average number of inversions in an array of N distinct elements is
        N(N − 1)/4 = O(N^2)
    6. Average Case O(N^2)
    7. Any algorithm that sorts by exchanging adjacent elements requires
      Ω(N^2) time on average.

      • Including Bubblesort and Selection Sort
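
The insertionsort method referenced above is not reproduced in the notes; here is a plausible sketch of it:

    class InsertionSortSketch {
        static void insertionSort(int[] a) {
            // everything left of p is sorted; slide a[p] back into place
            for (int p = 1; p < a.length; p++) {
                int tmp = a[p];
                int j = p;
                while (j > 0 && a[j - 1] > tmp) {
                    a[j] = a[j - 1];    // shift larger elements right
                    j--;
                }
                a[j] = tmp;             // drop the element into its slot
            }
        }
    }
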
  5. Shellsort
    1. invented by Donald Shell
    2. Algorithm
      1. start with separation = number of elements / 2
      2. compare a[i] and a[i + separation]
      3. swap if necessary
      4. do for each element of the array, i = 0 to a.length – separation
      5. set separation = separation / 2
      6. repeat the passes until separation reaches 1
    3. separation calculation
      1. Shell suggested dividing by 2
      2. other sequences give great improvement
      3. note: figure 7.3 does not divide by 2 properly
    4. Code Methods: shellsort (sketched after this Shellsort item)
    5. Average Case Shellsort : Open Problem
    6. Worst Case O(N^2)
      • 1 9 2 10 3 11 4 12 5 13 6 14 7 15 8 16
      • subsequent increments don’t help because they are not relatively prime
      • the last pass is an insertion sort of N(N − 1)/2 comparisons
      • the second-to-last pass is N(N/2 − 1)/2 comparisons, and so on
      • toss out the minus ones (we are rounding up, so this only increases the
        bound)
      • N·N/2 + N·N/(2·2) + N·N/(4·2) + …
      • factor out N·N/2 => (N^2/2)(1 + 1/2 + 1/4 + …)
      • the series is less than 2, so we get (N^2/2)(2) = N^2 => O(N^2)
      • OR …
      • each pass with increment h_k consists of h_k insertion
        sorts of N/h_k elements each
      • insertion sort is O(n^2) on n elements
      • each pass is thus O(h_k · (N/h_k)^2)
      • or O(N^2/h_k)
      • h_k changes for each pass
      • h_k in its worst case is 1, 2, 4, 8, …
      • all other cases divide down smaller by throwing out the remainder
      • so the sum of 1/h_k is 1 + 1/2 + 1/4 + 1/8 + … < 2
      • so we have at most 2N^2 = O(N^2)
    7. Hibbard’s increments
      1. Use relatively prime gaps
      2. 1, 3, 7, …, 2^k − 1
      3. Worst Case: O(N^(3/2))
      4. I will spare you the proof.
      5. Average Case: O(N^(5/4)) – not proved yet
    8. Sedgewick increments
      1. 1, 5, 19, 41, 109, …
      2. terms are either 9·4^i − 9·2^i + 1 or
        4^i − 3·2^i + 1
      3. Worst Case: O(N^(4/3))
      4. Average Case: O(N^(7/6)) – not proved
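
A sketch of shellsort using Shell’s original halving increments, as step 5.2 above describes; swapping in Hibbard’s or Sedgewick’s sequence would only change the gap schedule:

    class ShellSortSketch {
        static void shellsort(int[] a) {
            // each pass is a gap-separated insertion sort
            for (int gap = a.length / 2; gap > 0; gap /= 2) {
                for (int i = gap; i < a.length; i++) {
                    int tmp = a[i];
                    int j = i;
                    while (j >= gap && a[j - gap] > tmp) {
                        a[j] = a[j - gap];   // compare a[j] and a[j - gap]
                        j -= gap;
                    }
                    a[j] = tmp;
                }
            }
        }
    }
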
  6. Heapsort
    1. Worst Case: O(N log N)
    2. Slower in practice than Shellsort with Sedgewick’s increments
    3. Code Methods: leftChild, percDown, heapsort (sketched after this item)
    4. algorithm:
      1. put elements in array
      2. build heap – O(N)
      3. deleteMin N times and place back in array – O(N log N)
    5. uses two arrays
      1. too much space for large arrays
      2. could use one, by placing each deleteMin result at the back of the array
      3. the array will then be in reverse order
      4. could instead delete the max
      5. and place the max at the back of the array when deleting it
      6. then one array suffices
      7. similar to selection sort
    6. array for heapsort starts in position 0 as opposed to a heap where it
      starts in position 1
    7. Analysis
      1. Building the heap: 2N comparisons
      2. deleteMin: log N each (N of them)
      3. O(N log N)
      4. Performance is extremely consistent
      5. On average uses only slightly fewer comparisons than the worst-case bound
      6. The average case is difficult to determine because successive deleteMax
        operations destroy the heap’s randomness
      7. Average number of comparisons: 2N log N − O(N log log N)
      8. Notice that this is still O(N log N)
      9. Best Case: N log N − O(N)
      10. Notice that this is still O(N log N)
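
A sketch of the one-array, delete-max variant described in 6.5, using the leftChild/percDown/heapsort method names the notes cite; the bodies are my reconstruction:

    class HeapSortSketch {
        static int leftChild(int i) {
            return 2 * i + 1;            // heap stored starting at position 0
        }

        // percolate a[i] down within a[0..n-1], maintaining max-heap order
        static void percDown(int[] a, int i, int n) {
            int tmp = a[i];
            int child;
            for (; leftChild(i) < n; i = child) {
                child = leftChild(i);
                if (child != n - 1 && a[child + 1] > a[child]) child++;
                if (a[child] > tmp) a[i] = a[child];
                else break;
            }
            a[i] = tmp;
        }

        static void heapsort(int[] a) {
            for (int i = a.length / 2 - 1; i >= 0; i--)
                percDown(a, i, a.length);          // build max-heap, O(N)
            for (int i = a.length - 1; i > 0; i--) {
                int tmp = a[0];                    // deleteMax:
                a[0] = a[i];                       // move max to the back
                a[i] = tmp;
                percDown(a, 0, i);                 // re-heapify the rest
            }
        }
    }
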
  7. Mergesort
    1. Code Methods: mergesort (2), merge (sketched after this item)
    2. Worst Case: O(N log N)
    3. Recursively sorts, then merges, two sorted lists
    4. Time to merge two sorted lists is linear (at most N − 1 comparisons)
    5. 1 13 24 26 merged with 2 15 27 38 gives 1 2 13 15 24 26 27 38
    6. Classic Divide and Conquer Strategy
    7. If more than one element, divide and merge sort the first and second
      half
    8. Analysis
      1. Recurrence Relation
      2. T(1) = 1
      3. T(N) = 2T(N/2) + N
      4. T(N)/N = T(N/2)/(N/2) + 1
      5. T(N/2)/(N/2) = T(N/4)/(N/4) + 1
      6. … down to T(2)/2 = T(1)/1 + 1
      7. Sum up all of the equations; the intermediate terms telescope away
      8. T(N)/N = T(1)/1 + log N <- this log N is the sum of the 1s
      9. T(N) = N log N + N = O(N log N)
    9. Uses extra memory to merge and copy back into array
    10. Merging is cornerstone of external sorting routines
    11. Not often used for main memory sorts
    12. Can also be done non-recursively (bottom-up)
    13. Or can use less memory – a much more complex algorithm (impractical)
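
A sketch of the two mergesort overloads and the linear merge; the single reusable temp array reflects item 9’s note about extra memory, and all names are mine:

    class MergeSortSketch {
        static void mergeSort(int[] a) {
            int[] tmp = new int[a.length];           // one reusable temp array
            mergeSort(a, tmp, 0, a.length - 1);
        }

        static void mergeSort(int[] a, int[] tmp, int left, int right) {
            if (left >= right) return;               // 0 or 1 element: done
            int center = (left + right) / 2;
            mergeSort(a, tmp, left, center);         // sort first half
            mergeSort(a, tmp, center + 1, right);    // sort second half
            merge(a, tmp, left, center + 1, right);  // linear-time merge
        }

        static void merge(int[] a, int[] tmp, int leftPos, int rightPos, int rightEnd) {
            int leftEnd = rightPos - 1;
            int pos = leftPos;
            int start = leftPos;
            while (leftPos <= leftEnd && rightPos <= rightEnd)  // take smaller head
                tmp[pos++] = (a[leftPos] <= a[rightPos]) ? a[leftPos++] : a[rightPos++];
            while (leftPos <= leftEnd)                          // drain left run
                tmp[pos++] = a[leftPos++];
            while (rightPos <= rightEnd)                        // drain right run
                tmp[pos++] = a[rightPos++];
            for (int i = start; i <= rightEnd; i++)             // copy back
                a[i] = tmp[i];
        }
    }
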
  8. Quicksort
    1. Code Methods: quicksort (2), median3 (sketched after this item)
    2. Fastest known sorting algorithm in practice
    3. Worst Case: O(N^2)
    4. Average Case: O(N log N)
    5. Worst Case can be made very unlikely with little effort
    6. Divide and Conquer Algorithm
    7. Algorithm
      1. If the number of elements in S is 0 or 1, return
      2. pick any element v in S. This is called the pivot.
      3. Partition the remaining elements of S into two groups: S1, those below
        the pivot (numerically), and S2, those above it
      4. return quicksort of S1, followed by v, followed by quicksort of S2
    8. Hope that half the keys are greater than the pivot and the other half less
    9. Subproblems are not guaranteed to be of equal size which is potentially
      bad
    10. Faster than mergesort because partitioning step can be performed in
      place and efficiently
    11. Picking the Pivot
      1. Pick first element
        1. Awful performance if list is sorted or reverse sorted
        2. All elements will go in S1 or S2
        3. Will take quadratic time and do nothing at all
        4. Presorted input or mostly sorted input is quite frequent
        5. Horrible pivot point
      2. Pick larger of the first two elements
        1. bad for the same reasons as above
      3. Pick pivot randomly
        1. Good solution
        2. Unless random number generator has a flaw (which is common)
        3. Random number generation expensive
      4. Median of Three
        1. Best choice would be the median of the array
        2. Would take too long to compute
        3. Take the median of the first, last, and middle elements
        4. eliminates the sorted case
        5. reduces the running time of quicksort by about 5%
    12. Partitioning Strategy
      1. Swap pivot into last element
      2. Assuming all distinct elements
      3. while i less than j
      4. Move i to right while elements less than pivot
      5. Move j to left while elements greater than pivot
      6. When i and j have stopped – swap elements and continue
      7. Once i and j have crossed, stop
      8. i is now at the leftmost element greater than the pivot
      9. swap a[i] and the pivot
      10. Duplicates equal to the pivot
        1. should both i and j go past elements equal to the pivot?
        2. if only i does, all duplicates will end up in S1
        3. if both go past and all elements are the same – could create very uneven
          subarrays
        4. giving O(N^2) behavior
        5. best to make both stop and swap, since the extra swaps are better
          than uneven subarrays
        6. Why would we sort identical elements?
        7. Think of sorting 100,000 elements where 5,000 are identical
        8. Eventually quicksort will be called on those 5,000 elements (since
          quicksort is recursive)
        9. We will want these to be sorted efficiently
    13. Small Arrays
      1. N <= 20
      2. Insertion sort is faster
      3. Use insertion sort inside quicksort for arrays smaller than the cutoff
      4. Faster by about 15% in running time
      5. A good cutoff is N = 10, although any value between 5 and 20 is fine
      6. Also avoids nasty cases – such as taking the median of three elements
        when there are only two
    14. Routines
      1. Driver calls quicksort method that specifies the start and stop of
        array to be sorted
      2. Median of Three Pivot selection
        1. returns the pivot
        2. places the smallest of left, center, and right at the left
        3. places the largest at the right
        4. places the median in the middle, then
        5. swaps the pivot into the last element − 1 position
      3. Heart of Quicksort
        1. Calls insertion sort on small arrays
        2. Skips the outermost elements, already handled by median3, via the ++ and −−
        3. restores the pivot into the center from right − 1
        4. Need to advance i and j inside the loop (i.e., not only in the while
          test), because otherwise i and j would never change when both equal the
          pivot
    15. Analysis
      1. Worst Case: O(N^2)
      2. Best Case: O(N log N), using the same proof as mergesort
      3. Average Case: O(N log N)
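
A sketch of the quicksort, median3, and cutoff machinery from items 8.11 to 8.14; this is my reconstruction, and CUTOFF and the class name are assumptions:

    class QuickSortSketch {
        static final int CUTOFF = 10;   // notes: anything from 5 to 20 is fine

        static void quicksort(int[] a) {
            quicksort(a, 0, a.length - 1);
        }

        static void quicksort(int[] a, int left, int right) {
            if (left + CUTOFF > right) {          // small array: insertion sort
                insertionSort(a, left, right);
                return;
            }
            int pivot = median3(a, left, right);  // pivot hidden at right - 1
            int i = left, j = right - 1;
            for (;;) {
                while (a[++i] < pivot) {}         // both i and j stop on duplicates
                while (a[--j] > pivot) {}
                if (i < j) swap(a, i, j);
                else break;                       // i and j have crossed
            }
            swap(a, i, right - 1);                // restore pivot
            quicksort(a, left, i - 1);            // sort S1
            quicksort(a, i + 1, right);           // sort S2
        }

        // sorts a[left], a[center], a[right]; hides the median at right - 1
        static int median3(int[] a, int left, int right) {
            int center = (left + right) / 2;
            if (a[center] < a[left]) swap(a, left, center);
            if (a[right] < a[left]) swap(a, left, right);
            if (a[right] < a[center]) swap(a, center, right);
            swap(a, center, right - 1);
            return a[right - 1];
        }

        static void insertionSort(int[] a, int left, int right) {
            for (int p = left + 1; p <= right; p++) {
                int tmp = a[p];
                int j = p;
                for (; j > left && a[j - 1] > tmp; j--) a[j] = a[j - 1];
                a[j] = tmp;
            }
        }

        static void swap(int[] a, int i, int j) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }
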
  9. Selection Problem
    1. Quickselect: Code (sketched after this item)
    2. |S| is the number of elements in S
    3. Algorithm
      1. if |S| is 1, then k = 1; return the element
      2. if |S| <= cutoff (e.g., 10), then sort S and return the kth smallest
      3. pick a pivot
      4. partition into S1 and S2
      5. if k <= |S1|, the kth smallest is in S1; return quickselect(S1, k)
      6. if k = 1 + |S1|, then return the pivot
      7. otherwise return quickselect(S2, k − |S1| − 1)
    4. makes only one recursive call instead of two
    5. Worst Case: O(N^2) – when S1 or S2 is empty (not saving the
      recursive call)
    6. Average Case: O(N)
    7. Array starts at zero, so the kth smallest element ends up in position k − 1
    8. If median-of-three is used, the worst case becomes almost impossible
    9. Does destroy the initial ordering
    10. If that is not desirable, a copy must be made
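
A sketch of quickselect, reusing median3, swap, and the partition loop from the quicksort sketch above; here k stays the global 1-based rank, which amounts to the same bookkeeping as k − |S1| − 1:

    class QuickSelectSketch {
        // afterwards the kth smallest element sits at index k - 1
        static void quickSelect(int[] a, int left, int right, int k) {
            if (left + 10 > right) {                    // small: just sort it
                java.util.Arrays.sort(a, left, right + 1);
                return;
            }
            int pivot = QuickSortSketch.median3(a, left, right);
            int i = left, j = right - 1;
            for (;;) {
                while (a[++i] < pivot) {}
                while (a[--j] > pivot) {}
                if (i < j) QuickSortSketch.swap(a, i, j);
                else break;
            }
            QuickSortSketch.swap(a, i, right - 1);      // pivot now at index i
            if (k <= i)                                 // kth smallest is in S1
                quickSelect(a, left, i - 1, k);
            else if (k > i + 1)                         // kth smallest is in S2
                quickSelect(a, i + 1, right, k);
            // otherwise the pivot at index i == k - 1 is the answer
        }
    }

Note the single recursive call per level, which is what drops the average case from O(N log N) to O(N).
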
  10. Decision Trees and Lower Bounds
    1. Any sorting algorithm that uses comparisons requires Ω(N log N) time
      in the worst case
    2. merge sort and heap sort are optimal to within a constant factor
    3. Decision trees can represent all possible outcomes, and can be used to
      prove that the number of comparisons must be at least log(N!)
    4. Can only draw small decision trees (3 elements); they get too large
    5. Let T be a binary tree of depth d; then T has at most 2^d
      leaves

      • Induction
      • Basis: if d = 0, there is at most one leaf
      • Otherwise, we have a root, which cannot be a leaf, and a right and
        left subtree, each of depth at most d − 1. By the induction hypothesis, each
        can have at most 2^(d−1) leaves, giving a total of at most
        2^d leaves.
    6. log(N!) = Θ(N log N), so the lower bound is Ω(N log N) (a short
      derivation follows this item)
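
The standard two-line estimate behind that last item, not spelled out in the notes, in LaTeX:

    \log(N!) = \log N + \log(N-1) + \cdots + \log 2 \le N \log N
    \log(N!) \ge \log N + \log(N-1) + \cdots + \log(N/2)
             \ge \tfrac{N}{2} \log \tfrac{N}{2} = \Omega(N \log N)
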
  11. Bucket Sort
    1. uses a count array of size max element + 1
    2. counts the number of occurrences of each value
    3. useful when the input consists only of small integers (a sketch follows
      this item)
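
A sketch of the counting flavor of bucket sort described above, assuming non-negative int keys no larger than maxVal:

    class BucketSortSketch {
        static void bucketSort(int[] a, int maxVal) {
            int[] count = new int[maxVal + 1];   // one bucket per possible value
            for (int x : a) count[x]++;          // count occurrences of each value
            int pos = 0;
            for (int v = 0; v <= maxVal; v++)    // write values back in order
                while (count[v]-- > 0) a[pos++] = v;
        }
    }
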
  12. External Sorting
    1. Too large for main memory – skipping section
    2. Multiway Merge
    3. Polyphase Merge
    4. Replacement Selection
