10.6 Shellsort

When lists start to get longer insertion sort gets slower and slower (unless the list is almost in order).

Example

Suppose we had a list that consists of these elements:

1 2 3 4 8 10 15 14

The list is almost in order. For an insertion sort we'd need to do the following.

Even if the list consisted of 1000 negative items and then 1 2 3 4 8 10 15 14 it wouldn't take much more time to sort.

If however, the list was reversed we would need to do far more work since the items would need to be moved around a lot to get them to where they should go.

14 15 10 8 4 3 2 1

First pass: 14 15 is in proper position so no moving
Second pass: 14 and 15 must move over so 10 can go before them. List is now: 10 14 15 8 4 3 2 1
Third pass: 10 14 and 15 must move over so 8 can go before them. List is now: 8 10 14 15 4 3 2 1
.
.
.
Last pass: 2 3 4 8 10 14 and 15 all must move over so 1 can go before them.

All this moving and comparing takes time. The longer the list the more this has to happen (if the items are not almost already in order).


What about selection sort? It gets vastly slower with large lists as well. And it doesn't really make any difference if the list is almost in order.

Example

Suppose we had a list that consists of these elements:

1 2 3 4 8 10 15 14

For a selection sort we need to find the largest item. That is 15. Swap it with the 14. The list is now 1 2 3 4 8 10 14 15. The list is in oder but our selection sort doesn't know that.
So we then find the next largest in the rest of the list. That is 14. It doesn't have to move.
Then we find next largest which is 10 and doesn't have to move. And so on until we get to the 2. The fact the list is almost in order doesn't help much (just saves some swaps but not the number of comparisons).


So how do we sort large unordered lists? We try to take advantage of the fact that insertion sort works well for short and also for almost ordered lists. That's where Shellsort comes in. It uses insertion sort on every kth element in a list.

Def: A list is k-sorted if, starting at any point in the list, every kth element is in order.

Example

Here is how we would 4-sort the following list:

30 22 90 31 8 15 55 73 29 44 74 85 12 13

We first make 4 sublists consisting of elements 1, 5, 9,... and 2, 6, 10,... and 3, 7, 11,... and 4, 8, 12, ...

30                     8                     29                   12
      22                   15                     44                  13
            90                   55                     74
                   31                  73                    85
Sort each sublist to obtain:

8                     12                     29                   30
      13                   15                     22                  44
            55                   74                     90
                   31                  73                    85
This gives the 4-sorted list:

8 13 55 31 12 15 74 73 29 22 90 85 30 44

Notice that this list is much closer to being sorted than the original. And some elements have moved a long way towards where they should be in one pass (note how far the 90 and 13 moved for instance).


Shellsort works by using insertion sort to create sorted sublists. It starts with relatively large values of k and keeps using smaller ones until eventually k is 1. A 1-sorted list is completely sorted.

It has yet to be established what the optimum sequence of k values are. However, one sequence that works well is ..., 40, 13, 4, 1. Each term is one more than 3 times as large as the term on the right. You start with the largest possible value of k that is less than the number of items to be sorted.

In the homework you will be asked to implement this sort.

Let's look at what happens with long lists and the different kinds of sorts we've seen so far. Each of these sorts will be on arrays of 50 000 ints.

You may want to try refreshing/reloading your browser so you can try this applet a few times to see if the results vary. The computer is sharing its resources to run these sorts concurrently. Even so some of the sorts may have to wait for their turn. Also if your computer is doing anything else that will effect the results.

Did you notice how selection sort speeds up at the end? Why is that?