Performance of the Java sorting algorithm

Generally, Java's standard sorting method (Arrays.sort() and its derivative Collections.sort()) does what you would expect from a library routine: it is general-purpose, correct, well-tested and provides adequate performance in many cases. The purpose of this page is therefore not to advocate that everyone goes off and writes their own sorting routine. However, sometimes it is useful to know what performance you can expect from the library routine, and what the few specialised cases are where it's worth replacing it.

Java's sorting algorithm

The sorting algorithm used (in Sun's implementation at least) is a version of mergesort1. Mergesort is based on a "divide and conquer" strategy: we recursively divide the list into two halves and sort each half, then merge the two sorted halves back together. As we break the data down into smaller and smaller parts, we eventually reach a critical size where it's not worth dividing any more, and we sort in situ. Java uses an insertion sort for these 'atomic' sorts. Together, this gives Java's sort routine the following characteristics:

You might have read the last point and thought "well, duh!". But it turns out that more-or-less doubling the sort time as the data size is doubled is actually good going. Various less efficient sorting algorithms (including the insertion sort on its own) don't manage this, or are more likely to hit "worse cases", where the time effectively quadruples as the list size doubles.

Sort performance in practice

Holding this in our head, on the next page we look at some observed measurements of performance of Java's sorting algorithm.

1. I understand that a different algorithm will be used in Java 7. Even more reason not to rewrite your code on the basis of the particular algorithm used in today's version of the library.
2. As we'll see, mergesort is usually analysed as an O n log n sort. In practical terms, this means that when we double the number of elements, the sort time is actually "slightly worse than double" (multiplying by 2.2 gives a good guide in many cases).