In computer science, a sorting algorithm is an algorithm that puts elements of a list into a certain total order. The most frequently used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the efficiency of other algorithms (such as search and merge algorithms) that require input data to be in sorted lists. Sorting is also often useful for canonicalizing data and for producing human-readable output. More formally, the output of any sorting algorithm must satisfy two conditions: the output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order), and the output is a permutation (a reordering, yet retaining all of the original elements) of the input.
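These two conditions translate directly into code; the sketch below (using a hypothetical helper name, `is_valid_sort`, not from any library) checks both for a candidate output:

```python
from collections import Counter

def is_valid_sort(output, original):
    """Check the two conditions a sorting algorithm's output must satisfy:
    nondecreasing order, and being a permutation of the input."""
    nondecreasing = all(a <= b for a, b in zip(output, output[1:]))
    permutation = Counter(output) == Counter(original)
    return nondecreasing and permutation

print(is_valid_sort([1, 2, 2, 3], [3, 2, 1, 2]))  # True
print(is_valid_sort([1, 3], [3, 2, 1, 2]))        # False: not a permutation
```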
Further, the input data is often stored in an array, which allows random access, rather than a linked list, which allows only sequential access; many algorithms, however, can be applied to either type of data after suitable modification.
Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide-and-conquer algorithms, data structures such as heaps and binary trees, randomized algorithms, best, worst and average case analysis, time–space tradeoffs, and upper and lower bounds.
More formally, the data being sorted can be represented as a record or tuple of values, and the part of the data that is used for sorting is called the key. In the card example, cards are represented as a record (rank, suit), and the key is the rank. A sorting algorithm is stable if whenever there are two records R and S with the same key, and R appears before S in the original list, then R will always appear before S in the sorted list.
When equal elements are indistinguishable, such as with integers, or more generally, any data where the entire element is the key, stability is not an issue. Stability is also not an issue if all keys are different.
Unstable sorting algorithms can be specially implemented to be stable. One way of doing this is to artificially extend the key comparison, so that comparisons between two objects with otherwise equal keys are decided using the order of the entries in the original input list as a tiebreaker. Remembering this order, however, may require additional time and space.
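As a sketch of this decorate-with-index technique (the helper name is illustrative, not from any particular library), the original position can be appended to each comparison key so that ties are broken by input order:

```python
def stable_via_index(items, key):
    # Extend each comparison key with the element's original position;
    # the position breaks ties between equal keys, so even an unstable
    # underlying sort would produce a stable result.
    decorated = [(key(x), i) for i, x in enumerate(items)]
    decorated.sort()
    return [items[i] for _, i in decorated]

records = [("b", 2), ("a", 1), ("b", 1)]
print(stable_via_index(records, key=lambda r: r[0]))
# [('a', 1), ('b', 2), ('b', 1)] -- the two 'b' records keep input order
```

Remembering the positions costs O(n) extra space, reflecting the time/space overhead mentioned above.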
One application for stable sorting algorithms is sorting a list using a primary and secondary key. For example, suppose we wish to sort a hand of cards such that the suits are in the order clubs (♣), diamonds (♦), hearts (♥), spades (♠), and within each suit, the cards are sorted by rank. This can be done by first sorting the cards by rank (using any sort), and then doing a stable sort by suit:
Within each suit, the stable sort preserves the ordering by rank that was already done. This idea can be extended to any number of keys, and is leveraged by radix sort. The same effect can be achieved with an unstable sort by using a lexicographic key comparison, which, e.g., compares first by suit, and then compares by rank if the suits are the same.
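A minimal sketch of both approaches (Python's built-in sort is stable, so it serves for the two-pass version; the cards and suit ordering are illustrative):

```python
cards = [(5, "hearts"), (2, "spades"), (5, "clubs"), (9, "diamonds"), (3, "clubs")]
suit_order = {"clubs": 0, "diamonds": 1, "hearts": 2, "spades": 3}

# Two stable passes: secondary key (rank) first, then primary key (suit).
by_rank = sorted(cards, key=lambda c: c[0])
by_suit_then_rank = sorted(by_rank, key=lambda c: suit_order[c[1]])

# Equivalent single pass with a lexicographic key (works even with an
# unstable sort, since the full key distinguishes the records).
one_pass = sorted(cards, key=lambda c: (suit_order[c[1]], c[0]))

assert by_suit_then_rank == one_pass
print(by_suit_then_rank)
# [(3, 'clubs'), (5, 'clubs'), (9, 'diamonds'), (5, 'hearts'), (2, 'spades')]
```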
These are all comparison sorts, and so cannot perform better than O(n log n) in the average or worst case.
Comparison sorts

| Name | Best | Average | Worst | Memory | Stable | Method | Other notes |
|---|---|---|---|---|---|---|---|
| Quicksort | $n \log n$ | $n \log n$ | $n^2$ | $\log n$ | No | Partitioning | The typical in-place sort is not stable; stable versions exist. Quicksort is usually done in place with $O(\log n)$ stack space. |
| Merge sort | $n \log n$ | $n \log n$ | $n \log n$ | $n$ | Yes | Merging | Highly parallelizable (up to $O(\log n)$ using the Three Hungarians' Algorithm or, more practically, Cole's parallel merge sort) for processing large amounts of data. A hybrid with block sort is $O(1)$ memory. |
| In-place merge sort | — | — | $n \log^2 n$ | $1$ | Yes | Merging | Can be implemented as a stable sort based on stable in-place merging. |
| Heapsort | $n \log n$ | $n \log n$ | $n \log n$ | $1$ | No | Selection | If all keys are distinct, the best case is $\Theta(n \log n)$; with all keys equal it is $O(n)$. |
| Insertion sort | $n$ | $n^2$ | $n^2$ | $1$ | Yes | Insertion | $O(n + d)$ in the worst case over sequences that have $d$ inversions. |
| Introsort | $n \log n$ | $n \log n$ | $n \log n$ | $\log n$ | No | Partitioning & Selection | Used in several STL implementations. |
| Selection sort | $n^2$ | $n^2$ | $n^2$ | $1$ | No | Selection | Stable with $O(n)$ extra space, for example using lists. |
| Timsort | $n$ | $n \log n$ | $n \log n$ | $n$ | Yes | Insertion & Merging | Makes $n - 1$ comparisons when the data is already sorted or reverse sorted. |
| Cubesort | $n$ | $n \log n$ | $n \log n$ | $n$ | Yes | Insertion | Makes $n - 1$ comparisons when the data is already sorted or reverse sorted. |
| Shell sort | $n \log n$ | $n^{4/3}$ | $n^{3/2}$ | $1$ | No | Insertion | Small code size, no use of call stack, reasonably fast; useful where memory is at a premium such as embedded and older mainframe applications. There is a worst-case $O(n (\log n)^2)$ gap sequence, but it loses the $O(n \log n)$ best-case time. |
| Bubble sort | $n$ | $n^2$ | $n^2$ | $1$ | Yes | Exchanging | Tiny code size. |
| Binary tree sort | $n \log n$ | $n \log n$ | $n \log n$ (balanced) | $n$ | Yes | Insertion | When using a self-balancing binary search tree. |
| Cycle sort | $n^2$ | $n^2$ | $n^2$ | $1$ | No | Insertion | In-place with a theoretically optimal number of writes. |
| Library sort | $n$ | $n \log n$ | $n^2$ | $n$ | Yes | Insertion | |
| Patience sorting | $n$ | $n \log n$ | $n \log n$ | $n$ | No | Insertion & Selection | Finds all the longest increasing subsequences in $O(n \log n)$. |
| Smoothsort | $n$ | $n \log n$ | $n \log n$ | $1$ | No | Selection | An adaptive variant of heapsort based on the Leonardo numbers rather than a traditional binary heap. |
| Strand sort | $n$ | $n^2$ | $n^2$ | $n$ | Yes | Selection | |
| Tournament sort | $n \log n$ | $n \log n$ | $n \log n$ | $n$ | No | Selection | Variation of heapsort. |
| Cocktail sort | $n$ | $n^2$ | $n^2$ | $1$ | Yes | Exchanging | |
| Comb sort | $n \log n$ | $n^2$ | $n^2$ | $1$ | No | Exchanging | Faster than bubble sort on average. |
| Gnome sort | $n$ | $n^2$ | $n^2$ | $1$ | Yes | Exchanging | Tiny code size. |
| UnShuffle sort | $n$ | $kn$ | $kn$ | $n$ | No | Distribution & Merge | No exchanges are performed. The parameter $k$ is proportional to the entropy of the input; $k = 1$ for ordered or reverse-ordered input. |
| Franceschini's method | — | $n \log n$ | $n \log n$ | $1$ | Yes | ? | |
| Block sort | $n$ | $n \log n$ | $n \log n$ | $1$ | Yes | Insertion & Merging | Combines a block-based $O(n)$ in-place merge algorithm with a bottom-up merge sort. |
| Odd–even sort | $n$ | $n^2$ | $n^2$ | $1$ | Yes | Exchanging | Can be run on parallel processors easily. |
The following table describes integer sorting algorithms and other sorting algorithms that are not comparison sorts. As such, they are not limited by an $\Omega(n \log n)$ lower bound. Complexities below assume $n$ items to be sorted, with keys of size $k$, digit size $d$, and $r$ the range of numbers to be sorted. Many of them are based on the assumption that the key size is large enough that all entries have unique key values, and hence that $n \ll 2^k$, where $\ll$ means "much less than". In the unit-cost random access machine model, algorithms with running time of $n \cdot \frac{k}{d}$, such as radix sort, still take time proportional to $\Theta(n \log n)$, because $n$ is limited to be not more than $2^{\frac{k}{d}}$, and a larger number of elements to sort would require a bigger $k$ in order to store them in memory.
Noncomparison sorts

| Name | Best | Average | Worst | Memory | Stable | $n \ll 2^k$ | Notes |
|---|---|---|---|---|---|---|---|
| Pigeonhole sort | — | $n + 2^k$ | $n + 2^k$ | $2^k$ | Yes | Yes | |
| Bucket sort (uniform keys) | — | $n + k$ | $n^2 \cdot k$ | $n \cdot k$ | Yes | No | Assumes a uniform distribution of elements from the domain in the array. |
| Bucket sort (integer keys) | — | $n + r$ | $n + r$ | $n + r$ | Yes | Yes | If $r$ is $O(n)$, then average time complexity is $O(n)$. |
| Counting sort | — | $n + r$ | $n + r$ | $n + r$ | Yes | Yes | If $r$ is $O(n)$, then average time complexity is $O(n)$. |
| LSD radix sort | — | $n \cdot \frac{k}{d}$ | $n \cdot \frac{k}{d}$ | $n + 2^d$ | Yes | No | |
| MSD radix sort | — | $n \cdot \frac{k}{d}$ | $n \cdot \frac{k}{d}$ | $n + 2^d$ | Yes | No | The stable version uses an external array of size $n$ to hold all of the bins. |
| MSD radix sort (in-place) | — | $n \cdot \frac{k}{d}$ | $n \cdot \frac{k}{d}$ | $2^d$ | No | No | $\frac{k}{d}$ recursion levels, $2^d$ for the count array. |
| Spreadsort | $n$ | $n \cdot \frac{k}{d}$ | $n \cdot \left( \frac{k}{s} + d \right)$ | $\frac{k}{d} \cdot 2^d$ | No | No | Asymptotics are based on the assumption that $n \ll 2^k$, but the algorithm does not require this. |
| Burstsort | — | $n \cdot \frac{k}{d}$ | $n \cdot \frac{k}{d}$ | $n \cdot \frac{k}{d}$ | No | No | Has a better constant factor than radix sort for sorting strings, though it relies somewhat on specifics of commonly encountered strings. |
| Flashsort | $n$ | $n + r$ | $n^2$ | $n$ | No | No | Requires a uniform distribution of elements from the domain in the array to run in linear time. If the distribution is extremely skewed, it can go quadratic if the underlying sort is quadratic (it is usually an insertion sort). The in-place version is not stable. |
| Postman sort | — | $n \cdot \frac{k}{d}$ | $n \cdot \frac{k}{d}$ | $n + 2^d$ | — | No | A variation of bucket sort, which works very similarly to MSD radix sort. Specific to postal service needs. |
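To make the $n \cdot \frac{k}{d}$ running time concrete, here is a minimal LSD radix sort sketch, with the digit size fixed at 8 bits (base 256) as an illustrative choice:

```python
def lsd_radix_sort(nums, base=256):
    """LSD radix sort for non-negative integers: repeatedly apply a stable
    counting pass to each digit, least-significant digit first."""
    if not nums:
        return []
    passes = 0
    largest = max(nums)
    while base ** passes <= largest:  # number of digits = ceil(k / d)
        passes += 1
    for p in range(passes):
        buckets = [[] for _ in range(base)]
        for x in nums:
            buckets[(x // base ** p) % base].append(x)  # stable per digit
        nums = [x for bucket in buckets for x in bucket]
    return nums

print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```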
Samplesort can be used to parallelize any of the noncomparison sorts by efficiently distributing data into several buckets and then passing the sorting down to several processors, with no need to merge, as the buckets are already sorted between each other.
Some algorithms are slow compared to those discussed above, such as bogosort with unbounded run time and stooge sort, which has $O(n^{2.7})$ run time. These sorts are usually described for educational purposes to demonstrate how the run time of algorithms is estimated. The following table describes some sorting algorithms that are impractical for real-life use in traditional software contexts due to extremely poor performance or specialized hardware requirements.
| Name | Best | Average | Worst | Memory | Stable | Comparison | Other notes |
|---|---|---|---|---|---|---|---|
| Bead sort | $n$ | $S$ | $S$ | $n^2$ | N/A | No | Works only with positive integers. Requires specialized hardware for it to run in guaranteed $O(n)$ time. A software implementation is possible, but the running time will be $O(S)$, where $S$ is the sum of all the integers to be sorted; for small integers it can be considered linear. |
| Pancake sorting | — | $n$ | $n$ | — | No | Yes | Count is the number of flips. |
| Spaghetti sort | $n$ | $n$ | $n$ | $n^2$ | Yes | Polling | This is a linear-time, analog algorithm for sorting a sequence of items, requiring $O(n)$ stack space, and the sort is stable. It requires $n$ parallel processors. See spaghetti sort#Analysis. |
| Sorting network | varies | varies | varies | varies | Varies (stable sorting networks require more comparisons) | Yes | The order of comparisons is set in advance based on a fixed network size. Impractical for more than 32 items. |
| Bitonic sorter | $\log^2 n$ | $\log^2 n$ | $\log^2 n$ | $1$ | No | Yes | An effective variation of sorting networks; the stated times are for parallel execution. |
| Bogosort | $n$ | $n \cdot n!$ | unbounded | $1$ | No | Yes | Random shuffling. Used for example purposes only, as sorting with unbounded worst-case running time. |
| Stooge sort | $n^{\log 3 / \log 1.5}$ | $n^{\log 3 / \log 1.5}$ | $n^{\log 3 / \log 1.5}$ | $n$ | No | Yes | Slower than most sorting algorithms (even naive ones), with a time complexity of $O(n^{\log 3 / \log 1.5}) \approx O(n^{2.7})$. |
Theoretical computer scientists have detailed other sorting algorithms that provide better than O(n log n) time complexity assuming additional constraints on the input.
For more restricted data, such as numbers in a fixed interval, distribution sorts such as counting sort or radix sort are widely used. Bubble sort and variants are rarely used in practice, but are commonly found in teaching and theoretical discussions.
When physically sorting objects, such as alphabetizing papers (such as tests or books), people generally use insertion sorts intuitively for small sets. For larger sets, people often first bucket, such as by initial letter, and multiple bucketing allows practical sorting of very large sets. Often space is relatively cheap, such as by spreading objects out on the floor or over a large area, but operations are expensive, particularly moving an object a large distance; locality of reference is important. Merge sorts are also practical for physical objects, particularly as two hands can be used, one for each list to merge, while other algorithms, such as heapsort or quicksort, are poorly suited for human use. Other algorithms, such as library sort, a variant of insertion sort that leaves spaces, are also practical for physical use.
Selection sort finds the minimum value, swaps it with the value in the first position, and repeats these steps for the remainder of the list. It does no more than n swaps and is thus useful where swapping is very expensive.
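A minimal sketch of this minimum-selection strategy (selection sort):

```python
def selection_sort(a):
    """In-place selection sort: at most n - 1 swaps, one per position."""
    for i in range(len(a) - 1):
        # Find the position of the smallest remaining value.
        smallest = min(range(i, len(a)), key=a.__getitem__)
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]  # single swap per pass
    return a

print(selection_sort([64, 25, 12, 22, 11]))  # [11, 12, 22, 25, 64]
```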
While these algorithms are asymptotically efficient on random data, various modifications are used for practical efficiency on real-world data. First, the overhead of these algorithms becomes significant on smaller data, so often a hybrid algorithm is used, commonly switching to insertion sort once the data is small enough. Second, the algorithms often perform poorly on already sorted or almost sorted data, which are common in real-world data and can be sorted in O(n) time by appropriate algorithms. Finally, they may also be unstable, and stability is often a desirable property in a sort. Thus more sophisticated algorithms are often employed, such as Timsort (based on merge sort) or introsort (based on quicksort, falling back to heapsort).
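As an illustration of the hybrid idea, here is a quicksort sketch that falls back to insertion sort below a small, tunable cutoff (the threshold of 16 is an arbitrary assumption, not a recommendation):

```python
CUTOFF = 16  # size below which insertion sort's low overhead wins (assumed)

def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place by insertion."""
    for i in range(lo + 1, hi + 1):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def hybrid_quicksort(a, lo=0, hi=None):
    """Quicksort on large ranges; insertion sort on small ones."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(a, lo, hi)
        return a
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:  # Hoare-style partition around the pivot value
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    hybrid_quicksort(a, lo, j)
    hybrid_quicksort(a, i, hi)
    return a
```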
Merge sort has seen a relatively recent surge in popularity for practical implementations, due to its use in the sophisticated algorithm Timsort, which is used for the standard sort routine in the programming languages Python and Java (as of JDK 7). Merge sort itself is the standard routine in Perl, among others, and has been used in Java at least since 2000 in JDK 1.3.
The important caveat about quicksort is that its worst-case performance is O(n²); while this is rare, in naive implementations (choosing the first or last element as pivot) it occurs for sorted data, which is a common case. The most complex issue in quicksort is thus choosing a good pivot element, as consistently poor choices of pivots can result in drastically slower O(n²) performance, while good choice of pivots yields O(n log n) performance, which is asymptotically optimal. For example, if at each step the median is chosen as the pivot then the algorithm works in O(n log n). Finding the median, such as by the median-of-medians selection algorithm, is however an O(n) operation on unsorted lists and therefore exacts significant overhead in sorting. In practice, choosing a random pivot almost certainly yields O(n log n) performance.
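A sketch of the random-pivot strategy, written out-of-place for clarity (trading the usual in-place partitioning for readability):

```python
import random

def randomized_quicksort(a):
    """Quicksort with a uniformly random pivot: expected O(n log n) on any
    input, sidestepping the sorted-input worst case of naive pivot rules."""
    if len(a) <= 1:
        return list(a)
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]  # grouping equals avoids O(n^2) on duplicates
    greater = [x for x in a if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```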
The worst-case time complexity of Shellsort is an open problem and depends on the gap sequence used, with known complexities ranging from O(n²) to O(n^{4/3}) and Θ(n log² n). This, combined with the fact that Shellsort is in-place, needs only a relatively small amount of code, and does not require use of the call stack, makes it useful in situations where memory is at a premium, such as in embedded systems and operating system kernels.
A bucket sort works best when the elements of the data set are evenly distributed across all buckets.
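A minimal sketch for keys uniformly distributed in [0, 1) (the bucket count of 10 is an arbitrary assumption):

```python
def bucket_sort(values, num_buckets=10):
    """Bucket sort for floats in [0, 1): scatter into evenly spaced
    buckets, sort each small bucket, then concatenate them in order."""
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        buckets[int(v * num_buckets)].append(v)
    for b in buckets:
        b.sort()  # buckets stay small when the input is uniform
    return [v for b in buckets for v in b]

print(bucket_sort([0.42, 0.32, 0.23, 0.52, 0.25, 0.47]))
# [0.23, 0.25, 0.32, 0.42, 0.47, 0.52]
```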
For example, the popular recursive quicksort algorithm provides quite reasonable performance with adequate RAM, but due to the recursive way that it copies portions of the array it becomes much less practical when the array does not fit in RAM, because it may cause a number of slow copy or move operations to and from disk. In that scenario, another algorithm may be preferable even if it requires more total comparisons.
One way to work around this problem, which works well when complex records (such as in a relational database) are being sorted by a relatively small key field, is to create an index into the array and then sort the index, rather than the entire array. (A sorted version of the entire array can then be produced with one pass, reading from the index, but often even that is unnecessary, as having the sorted index is adequate.) Because the index is much smaller than the entire array, it may fit easily in memory where the entire array would not, effectively eliminating the diskswapping problem. This procedure is sometimes called "tag sort".
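A small sketch of the tag-sort idea, sorting an index of positions instead of the (hypothetically large) records themselves:

```python
records = [
    {"key": 30, "payload": "large blob A"},
    {"key": 10, "payload": "large blob B"},
    {"key": 20, "payload": "large blob C"},
]

# Sort an index of positions by key instead of moving the records.
index = sorted(range(len(records)), key=lambda i: records[i]["key"])
print(index)                                # [1, 2, 0]
print([records[i]["key"] for i in index])   # [10, 20, 30]
```

The index holds one small integer per record, so it can fit in memory even when the records themselves cannot.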
Another technique for overcoming the memory-size problem is external sorting, for example combining two algorithms in a way that takes advantage of the strength of each to improve overall performance. For instance, the array might be subdivided into chunks of a size that will fit in RAM, the contents of each chunk sorted using an efficient algorithm (such as quicksort), and the results merged using a k-way merge similar to that used in merge sort. This is faster than performing either merge sort or quicksort over the entire list. (Donald Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition. Addison-Wesley, 1998, Section 5.4: External Sorting, pp. 248–379; Ellis Horowitz and Sartaj Sahni, Fundamentals of Data Structures, H. Freeman & Co.)
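A toy sketch of this chunk-then-merge scheme; here the sorted runs stay in memory for brevity, whereas a real external sort would write each run to disk and stream it back during the merge:

```python
import heapq

def external_sort(values, chunk_size=4):
    """Sort RAM-sized chunks independently, then k-way merge the runs
    (chunk_size of 4 is an illustrative stand-in for 'what fits in RAM')."""
    runs = [sorted(values[i:i + chunk_size])
            for i in range(0, len(values), chunk_size)]
    return list(heapq.merge(*runs))  # k-way merge, as in merge sort

print(external_sort([9, 4, 7, 1, 8, 2, 6, 3, 5]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```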
Techniques can also be combined. For sorting very large sets of data that vastly exceed system memory, even the index may need to be sorted using an algorithm or combination of algorithms designed to perform reasonably with virtual memory, i.e., to reduce the amount of swapping required.
A kind of opposite of a sorting algorithm is a shuffling algorithm. These are fundamentally different because they require a source of random numbers. Shuffling can also be implemented by a sorting algorithm, namely by a random sort: assigning a random number to each element of the list and then sorting based on the random numbers. This is generally not done in practice, however, as there is a well-known, simple, and efficient algorithm for shuffling: the Fisher–Yates shuffle.
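Both approaches in miniature (random-sort shuffling versus the built-in Fisher–Yates implementation):

```python
import random

items = list(range(10))

# Shuffling via sorting: tag each element with a random number, then sort
# by the tags; the element order becomes a random permutation (assuming
# no tag collisions, which are vanishingly unlikely for floats).
shuffled = [x for _, x in sorted((random.random(), x) for x in items)]
print(sorted(shuffled) == items)  # True: same elements, new order

# The practical approach: Fisher-Yates, as implemented by random.shuffle.
random.shuffle(items)
```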
Sorting algorithms have also been designed for parallel computers. These algorithms can all be run on a single instruction stream, multiple data stream (SIMD) computer. Habermann's parallel neighbor-sort (or the glory of the induction principle) sorts k elements using k processors in k steps. A later article (http://repository.cmu.edu/cgi/viewcontent.cgi?article=2876&context=compsci) introduces optimal algorithms for parallel computers, where rk elements can be sorted using k processors in k steps.

