This blog post discusses stable sort implementations in various languages, and it also presents a non-recursive, fast and compact mergesort implementation in C++. The implementation runs the original mergesort algorithm in an iterative way with some optimizations, so it is stable.
Please note that C++ STL has a stable sort built in: std::stable_sort in #include <algorithm>. If it's OK to use STL in your C++ code, use that, and you won't need any custom code from this blog post.
About compactness: when compiled statically with GCC 4.1.2 (g++ -fno-rtti -fno-exceptions -s -O2 -static) for i386, the mergesort in this blog post produces an executable 4084 bytes smaller than with std::stable_sort.
About speed: between 10% and 16% faster than std::stable_sort, see speed measurements below.
The fast, non-recursive, stable sort implementation in C++
It's a non-recursive, stable mergesort implementation, with O(n*log(n)) worst-case speed.
Some convenience functions to call it:
Example invocation:
A slow, non-recursive stable sort implementation in C++
If you don't care about speed, an insertion sort would do: it's simple (short), stable and non-recursive. Unfortunately it's slow: O(n2) in worst case and average.
C++ stable sort speed measurements
10000 random int vectors (of random size smaller than 16384) were sorted in place in a C++ program running on Linux 2.6.35 running on an Intel Core 2 Duo CPU T6600 @ 2.20GHz. The total user time was measured, including the sort, the running of std::stable_sort, and the verification of the sort output (i.e. comparing it to the output of std::stable_sort. The insmax parameter below indicates the size of the sublists with were sorted using insertion sort before running the mergesort (the implementation above uses insmax=2, which is implemented in the Unfold... loop).
The raw time measurement results:
It looks like that the mergesort algorithm proposed in this blog post (insmax=2 in the measurements above) is between 10% and 16% faster than std::stable_sort. It looks like that increasing insmax from 2 to 8 could yield an additional 1.18% speed increase.
Stable sort in other programming languages
Java java.util.Collections.sort and java.util.Arrays.sort are stable. In OpenJDK 1.6, for primitive types, it uses an optimized (tuned) quicksort, and for objects (with a comparator) it uses a recursive, stable mergesort. Even though quicksort is not stable, this doesn't matter for primitive types, because there is no comparator (so it's always in ascending order), and it's impossible to distinguish different instances of the same primitive type value.
C qsort function is not stable, because it uses quicksort. FreeBSD, Mac OS X and other BSD systems have the mergesort function in their standard library (in libc, stdlib/merge.c), Linux glibc doesn't have it. As an alternative, it's straightforward to convert the mergesort implementation in this blog post from C++ to C.
CPython's sorted and list.sort functions are implemented with a stable, recursive mergesort (and binary insertion sort for short inputs) with a sophisticated (complicated, long) code containing lots of optimizations: see the files Objects/listsort.txt and Objects/listobject.c in the Python source code for documentation and implementation. It looks like it's not recursive.
Perl sort uses mergesort by default, so it's stable, but the default can change in future versions. Add use sort 'stable'; to the beginning of your code to get a guaranteed stable sort. See the sort pragma page for details.
Ruby 2.0 Array#sort doesn't say if it's stable or not, but a quick search for the string qsort in array.c reveals that it uses quicksort, so it's not stable. (a bit slow) stable sorting can be added easily:
... as explained here. The fundamental idea in this trick is to compare the indexes if the values are equal. This fundamental idea can be implemented in any programming language, although it may have considerable memory and/or time overhead.
JavaScript's Array.prototype.sort tends to be stable in modern browsers (see the details), but unstable on old browsers.
C# Array.Sort is not stable (because it uses quicksort), but Enumerable.OrderBy and Enumberable.ThenBy are stable (because they use mergesort. See more info here.
Other stable sorting algorithms
Heapsort and quicksort are not stable. Some variations of mergesort (such as Algorithm 5.2.4N and Algorithm 5.2.4S in The Art of Computer programming, Volume 3 by Knuth) are not stable.
This page lists stable sorting algorithms known to mankind. Be careful with the implementations though: even if the algorithm itself is stable, some optimized, otherwise modified or buggy implementations might not be. The algorithms are:
O(n*log(n)), stable, comparison-based: mergesort, cascade merge sort, oscillating merge sort.
O(n2), stable, comparison-based: bubble sort, cocktail sort, insertion sort, binary insertion sort (where the insertion index is found using binary search), gnome sort, library sort, odd-even sort.
O(n2), stable, not comparison-based: bucket sort, proxmap sort.
faster than O(n2), stable, not comparison-based: counting sort, pigeonhole sort, radix sort.
So it looks like that we don't know of an alternative of mergesort for comparison-based sorting in O(n*log(n)) time, the others above are slower. Maybe Edsger Dijkstra[citation needed] can come up with an alternative.