ARRAYS and ALGORITHMS (2)

Here we'll look at some more routines.
---------
SELECTION SORT
This is one alternative to the bubble sort. It does tend to be slightly faster. We should note that the best sorting routines tend to be much more difficult to write and understand for people new to programming. With the bubble sort, we compared successive pairs of values and switched values as needed. What the selection sort does is to find the smallest value in the list, and put it into slot 0. Then we find the next smallest value and put it into slot 1, etc. We'll assume we have a list with 8 elements: 17, 67, -2, 33, 45, 19, 106, 3:

for i in range(7):                  #on each pass, we will find the smallest element in the remaining list
        for j in range(i+1,8,1):      #these are the other values to look at
               if mylist[i] > mylist[j]:      #found a smaller value--so swap
                       temp = mylist[i]
                       mylist[i] = mylist[j]
                       mylist[j] = temp

watch what happens.
1) i=0   j=1      why? j starts with the value i+1, so if i was 3, j would start with 4, etc
              is mylist[0] > mylist[1]? is 17 > 67? no--so do not swap values             17, 67, -2, 33, 45, 19, 106, 3
    i=0 j=2   is mylist[0] > mylist[2]? is 17 > -2? yes--so swap:                            -2, 67, 17, 33, 45, 19, 106, 3
    i=0 j=3    is mylist[0] > mylist[3]? is -2 > 33? no--do not swap            -2, 67, 17, 33, 45, 19, 106, 3
i=0, j=4, 5, 6 -2 is not greater than 45, 19, or 106 no swapping
i=0, j=7 -2 is not graeter than 3--do not swap.    -2, 67, 17, 33, 45, 19, 106, 3
the smallest value is in slot 0. the rest of the list is not (yet) sorted. now we want the next smallest value to go into slot 1:
2) i=1 j=2 is 67>17? yes. swap.    -2, 17, 67, 33, 45, 19, 106, 3
i=1, j=3, 4, 5, 6 17 is less than 33, 45, 19, and 106--no swapping
    i=1 j=7 is 17 > 3? yes--so swap: -2, 3, 67, 33, 45, 19, 106, 17
after two passes, the two smallest elements are in slots 0 and 1--the rest of the list is not (yet) sorted.
3) i=2 j=3 compare 67 and 33. etc. the 3rd smallest value--17--will wind up in slot 2 on this pass.
   etc.
Note that the list we're comparing items in gets shorter and shorter each pass we make through it--this is one of the reasons that the selection short usually runs faster than the bubble sort. "Usually?" yes--suppose we have a list which is already sorted. The bubble sort makes one pass through the list--nothing needs to be swapped--we're all done. The selection sort is still making lots of passes through the same sorted list. But still--on average, the selection sort is faster.
-------------------------------------------------
SEARCHING.
Suppose we have a list of 100 names (or numbers) and this list has already been sorted. How can we find a particular value in this list?    We'll do this in much the same way that you would find a name in a phone book.

We want 3 variables: startpoint, endpoint, and midpoint. For a list, or part of a list, startpoint will the the index of the first value in the list, endpoint will be the index of the last value in the list. The middle value will be at midpoint.

We are looking for NAME in mylist.

Initially--we set startpoint to 0, endpoint to 99--these are the beginning and end of the whole 100-value list. midpoint will be (startpoint + endpoint)/2: thus initially, midpoint will be 50 (we'll round up). We look at the name in mylist[midpoint]. How does this compare to NAME?

1) mylist[midpoint] equals NAME. we've found what we were looking for.
2) mylist[midpoint] < NAME In this case, NAME is in the half of the list from midpoint to endpoint. So what we will do now is to    say:
    startpoint = midpoint
    (endpoint doesn't change)
   midpoint = (startpoint + endpoint)/2   and we repeat the process--in this example, startpoint=50, endpoint=99, and midpoint = 75.
   we only have half of the previous list to check.
3) mylist[midpoint] > NAME. In this case, NAME is in the half of the list from startpoint to midpoint, so we now say:
     (startpoint doesn't change)
     endpoint = midpoint
     midpoint = (startpoint + endpoint)/2    and here we search mylist from slot 0 to slot 50 by checking slot 25, etc

This is actually a very efficient process--it's also how you yourselves find names in a phone book, or on a page in the phone book. On each pass, we throw out half of the remaining names: If our original list was size 100, on the next pass we're searching a list of size 50, then searching a list of size 25, etc. Eventually we reach a list of size 1. Imagine a VERY small town with only 1 inhabitant (Ludwig Snarf, e.g.). There's a phone book for Snarfville--and there's only one name in the phone book. (It's a very small phone book). How long does it take you to look up Ludwig Snarf's phone number?

If you start with 2 and divide it in half, you get 1. If you start with 4 and divide it in half twice, you get 1. If you start with 64 and divide it in half 6 times you get 1. If you start with 1 million and divide it in half 20 times you get 1. Why? 4 is 2 to the second power. 64 is 2 to the 6th, 1 million is 2 to the 20th power. If you like mathy stuff, we say that 20 is log-base-2 of 1,000,000. This means that if we start with a list of size 1 million and divide it in half 20 times we get 1--in a list of 1 million names chopping the list as shown above gets us to a list of size 1.   How many more chops do we need to do if the list was 2 million rather than 1 million? Just one more chop--that's the nice thing about this algorithm