Wednesday, April 20, 2011

Is there ever a good reason to use Insertion Sort?

For general-purpose sorting, the answer appears to be no, as quick sort, merge sort and heap sort tend to perform better in the average- and worst-case scenarios. However, insertion sort appears to excel at incremental sorting, that is, adding elements to a list one at a time over an extended period of time while keeping the list sorted, especially if the insertion sort is implemented as a linked list (O(log n) average case vs. O(n)). However, a heap seems to be able to perform just (or nearly) as well for incremental sorting (adding or removing a single element from a heap has a worst-case scenario of O(log n)). So what exactly does insertion sort have to offer over other comparison-based sorting algorithms or heaps?

From stackoverflow
  • Most sorting procedures will use quicksort and then insertion sort for very small data sets.

  • From http://www.sorting-algorithms.com/insertion-sort:

    Although it is one of the elementary sorting algorithms with O(n2) worst-case time, insertion sort is the algorithm of choice either when the data is nearly sorted (because it is adaptive) or when the problem size is small (because it has low overhead).

    For these reasons, and because it is also stable, insertion sort is often used as the recursive base case (when the problem size is small) for higher overhead divide-and-conquer sorting algorithms, such as merge sort or quick sort.

    j_random_hacker : +1. Insertion sort's inner loop just happens to be a good fit for modern CPUs and caches -- it's a very tight loop that accesses memory in increasing order only.
    guns : Well, quicksort can be implemented as a stable sort, but since it's optimal for randomized sets, I think the efficient qsort functions randomize the data deliberately before sorting.
    sykora : Insertion sort is also good because it's useful in online situation, when you get one element at a time.
  • An important concept in analysis of algorithms is asymptotic analysis. In the case of two algorithms with different asymptotic running times, such as one O(n^2) and one O(nlogn) as is the case with insertion sort and quicksort respectively, it is not definite that one is faster than the other.

    The important distinction with this sort of analysis is that for sufficiently large N, one algorithm will be faster than another. When analyzing an algorithm down to a term like O(nlogn), you drop constants. When realistically analyzing the running of an algorithm, those constants will be important only for situations of small n.

    So what does this mean? That means for certain small n, some algorithms are faster. This article from EmbeddedGurus.net includes an interesting perspective on choosing different sorting algorithms in the case of a limited space (16k) and limited memory system. Of course, the article references only sorting a list of 20 integers, so larger orders of n is irrelevant. Shorter code and less memory consumption (as well as avoiding recursion) were ultimately more important decisions.

    Insertion sort has low overhead, it can be written fairly succinctly, and it has several two key benefits: it is stable, and it has a fairly fast running case when the input is nearly sorted.

  • If you're talking about maintaining a sorted list, there is no advantage over some kind of tree, it's just slower.

    Well, maybe it consumes less memory or is a simpler implementation.

    Inserting into a sorted list will involve a scan, which means that each insert is O(n), therefore sorting n items becomes O(n^2)

    Inserting into a container such as a balanced tree, is typically log(n), therefore the sort is O(n log(n)) which is of course better.

    But for small lists it hardly makes any difference. You might use an insert sort if you have to write it yourself without any libraries, the lists are small and/or you don't care about performance.

  • YES,

    Insertion sort is better than Quick Sort on short lists.

    In fact an optimal Quick Sort has a size threshold that it stops at, and then the entire array is sorted by insertion sort over the threshold limits.

    Also...

    For maintaining a scoreboard, Binary Insertion Sort may be as good as it gets.

    See this page: http://www.pathcom.com/~vadco/binary.html

    All the Best,

    HeavyJ

0 comments:

Post a Comment