A Fast Easy Sort

B y S t e p h e n L a c e y a n d R I c h a r d B o x

A bubble sort is the first standard sorting algorithm most programmers learn how to code. Those who continue to write programs for mainframes, personal computers, or even programmable calculators may continue to use a bubble sort simply because it's intuitive, it's easy to write and debug, and it consumes little memory. A bubble sort would most likely be the standard sorting routine today if it were not so incredibly slow for most lists. However, a few simple modifications to this classic routine make it a fast and efficient sort for all kinds of lists.

Turtles and Rabbits

In a bubble sort, each element is compared to the next; if the two are out of order, they are swapped. It takes many passes through a list to finally get all the elements in order. A bubble sort is finished sorting when it makes a pass that does not require any swaps. This type of sorting routine is called a bubble sort because of the way low values appear to "bubble" up to the top of the list. The comparison of two values can be the most time-consuming part of any sorting routine, and bubble sorts require comparisons proportional to the list size squared, or N2. Faster sorting methods, such as Quick-sort, require comparisons proportional to N*log2N.

Bubble sorts are also slow because they are susceptible to the birth of elements we call turtles. A turtle (in an ascending sort) is a relatively low value located near the end of a list. During a bubble sort, this element moves up only one position for each pass (or stroke), so a single turtle can cause maximal slowing. Almost every long random list contains a turtle.

On the other hand, a high- value element near the top of a list (a rabbit) is harmless. If you reverse the direction of the stroke, turtles become rabbits and rabbits become turtles. The worst possible turtle is the lowest relative value at the end of a list. It forces a bubble sort to make (N-1)2 comparisons. This means that a bubble sort of a 1000-item list could require nearly a million comparisons.

Combsort is our simple modification of a bubble sort. It eliminates turtles by allowing the distance between compared elements (the gap, which is always 1 in a bubble sort) to be greater than 1.

For the first stroke, the gap is the list length divided by 1.3 (the shrink factor, an important value, as we explain later.) Before each subsequent stroke, the gap is reduced to the previous gap divided by 1.3; if this quotient becomes less than 1, the gap is reduced to 1, collapsing Combsort into a bubble sort. A swap moves across the entire gap, causing turtles to jump rather than crawl. The key to speeding up turtles, thus speeding up the sort, is to give the turtles just enough opportunities to make large jumps.

Successively shrinking the gap is analogous to combing long, tangled hair stroking first with your fingers alone then with a pick comb that has widely spaced teeth, followed by finer combs with progressively closer teeth. Combsort has a similar shrinking effect on the gap (hence the name comb sort) Each stroke presorts the list (i.e., it kills or wounds some turtles) Therefore, by the time the gap has declined to unity (a bubble sort), all the elements are so close to their final positions that a bubble sort is efficient.

The Importance of 1.3 We came up with a shrink factor of 1.3 empirically by testing Combsort on over 200,000 random lists. As shown in figure 1, we plotted sort time~ versus shrink factors from 1.1 to 1.45 for random lists varying in length from 1000 to 1040 elements. This is a cornucopia plot, which means that the data follows a relatively tight curve at the small shrink factors but starts to vary widely as the shrink factor increases. We looked for the best situation, a balance between short times and predictability, and we found that the optimum shrink factor (SF) is near 1.3. SF = 1.15 is slowed by excess comparisons; SF =1.45 is erratically slowed be-cause too few turtles are killed. From this and similar plots, it became clear to us that 1.3 is the best marriage of consistency and speed. We tried dividing the list of gaps (the gap list) into geometric halves and using a different shrink factor for each half. A three dimensional contour plot of time versus SF1 and SF2 looks like a crater with the ideal points at SF1 = 1.37 and SF2=l.24. This routine averages 1 per-cent faster than the single shrink factor of 1.3, but it forms turtles about 3 percent of the time. This is an unacceptable tradeoff. The performance is not consistent enough, because SF1 shrinks the gap too aggressively for some lists. We tried initializing the gap at 50 random positions, but there was no improvement. We designed routines that use the number of swaps on a completed stroke to determine how much to shrink the gap for the next stroke. However, none of these methods was as consistent as Combsort, particularly with nonrandom lists. And also linear approaches to the decrease in gap, such as gap list = 0.9N, 0.8N, 0.7N 1, are substantially slower than Combsort. We also experimented with an 'exponential" shrink factor. Before each stroke, the gap was divided by a shrink factor, and the shrink factor was divided by a second factor. This routine was almost as fast as Combsort but was susceptible to turtles, much the same as Combsort with two shrink factors.

Rule of 11

The one improvement we found was to eliminate gap sizes of 9 and 10 but always include 11; the gap size becomes 11. With shrink factors near 1.3, the gap list can conclude its journey toward 1 in only three ways:

9 6 4 3 2 1
10 7 5 3 2 1
11 8 6 4 3 2 1

In the third case, all mini-turtles are killed before the gap reaches 1. In the first two cases, about 8 percent of lists have surviving mini-turtles, causing the routine to slow by 15 percent to 20 percent. The result is Combsort11

You may wonder whether there are nonrandom lists that decrease performance. This is an important question, because certain list types cause some implementations of Quicksort (the fastest known Sort) to degenerate into an N2 sort.

For example, a presorted list is best sorted by a bubble sort, while a reverse- sorted list is the worst case for a bubble sort. Combsort1l sorts both list types faster than it would sort a random list. In fact, all the partially ordered lists we tested Sort faster with Combsort1l than a random list does. Since we were unable to design an algorithm better than Conbsort11 to select a gap list, we sought to demonstrate by exhaustion that a better gap list cannot be generated. We refer to the difference between any list and its sorted counterpart as the tangle. At any point in a sort, the tangle is calculated by summing the square of the distance each element currently is from its sorted position. Tangle = 0 when the list is sorted. We square the distances to emphasize turtles. The efficiency of a stroke for a given gap at any point in the sort is defined as the percent of change in the tangle per comparison. We plotted efficiency versus gap on a list of 100 elements, trying every possible gap. After the ideal gap had been estimated and applied, the same process of finding the tangle and efficiency was repeated for each stroke. These putative ideal gap lists were all strikingly similar to a gap list generated by 1.3, except that they decreased faster when the gap fell below 13. This failure to follow the Rule of 11 allowed fast sorting in special cases but did not have general applicability. Even our "proof by exhaustion" revealed no gap list consistently superior to the one that was calculated by the shrink factor method using SF=l.3.

Figure 1: The number of compares (approximately equivalent to the sorting time) for different shrink factors. The random lists have from 1000 to 1040 elements. There are better shrink factors than 1. 3 for individual lists, but 1.3 gives the best balance of consistency and speed for a greater variety of lists. The sorting routine is Combsort11.