Skip to content

Sorters

Morwenn edited this page Jun 12, 2016 · 74 revisions

cpp-sort uses function objects called sorters instead of regular function templates in order to implement sorting algorithms. The library provides two categories of sorters: generic sorters that will sort a collection with any given comparison function, and type-specific sorters which will be optimized to sort collections of a given type, and generally don't allow to use custom comparison functions due to the way they work. Every comparison sorter as well as some type-specific sorters may also take an additional projection parameter, allowing the algorithm to "view" the data to sort through an on-the-fly transformation.

While these function objects offer little more than regular sorting functions by themselves, they can be used together with sorter adapters to craft more elaborate sorters effortlessly. Every sorter is available in its own file. However, all the available sorters can be included at once with the following line:

#include <cpp-sort/sorters.h>

Note that for every foobar_sorter described in this page, there is a corresponding foobar_sort global instance that allows not to care about the sorter abstraction as long as it is not needed (the instances are usable as regular function templates). The only sorter without a corresponding global instance is default_sorter since it mainly exists as a fallback sorter for the functions cppsort::sort and cppsort::stable_sort when they are called without an explicit sorter.

If you want to read more about sorters and/or write your own one, then you should have a look at the dedicated page or at a specific example.

Comparison sorters

The following sorters are available and will work with any type for which std::less works and should accept any well-formed comparison function:

block_sorter<>

#include <cpp-sort/sorters/block_sorter.h>

Implements a block sort.

Best Average Worst Memory Stable Iterators
n n log n n log n 1 Yes Random-access

Note: block_sorter is a buffered sorter whose default specialization allocates a fixed buffer of 512 elements to achieve O(1) auxiliary memory. This memory complexity may change depending of the buffer provider passed to it.

template<
    typename BufferProvider = utility::fixed_buffer<512>
>
struct block_sorter;

Whether this sorter works with types that are not default-constructible depends on the memory allocation strategy of the buffer provider. The default specialization does not work with such types.

default_sorter

#include <cpp-sort/sorters/default_sorter.h>

Sorter striving to use a sorting algorithm as optimized as possible. It is the fallback sorter used by cppsort::sort when no sorter is given. The current implementation defines it as follows:

struct default_sorter:
    self_sort_adapter<
        hybrid_adapter<
            small_array_adapter<
                low_comparisons_sorter,
                std::make_index_sequence<14u>
            >,
            quick_sorter,
            pdq_sorter
        >
    >
{};

The adapter stable_adapter has an explicit specialization for default_sorter defined as follows:

template<>
struct stable_adapter<default_sorter>:
    merge_sorter
{};

grail_sorter<>

#include <cpp-sort/sorters/grail_sorter.h>

Implements a Grail sort.

Best Average Worst Memory Stable Iterators
? n log n n log n 1 Yes Random-access

Note: grail_sorter is a buffered sorter whose default specialization achieves O(1) auxiliary memory by not actualy allocating any additional memomry. The memory complexity may change depending of the buffer provider passed to it.

template<
    typename BufferProvider = utility::fixed_buffer<0>
>
struct grail_sorter;

Whether this sorter works with types that are not default-constructible depends on the memory allocation strategy of the buffer provider. The default specialization works with such types.

heap_sorter

#include <cpp-sort/sorters/heap_sorter.h>

Implements a heapsort.

Best Average Worst Memory Stable Iterators
n log n n log n n log n 1 No Random-access

insertion_sorter

#include <cpp-sort/sorters/insertion_sorter.h>

Implements an insertion sort.

Best Average Worst Memory Stable Iterators
n 1 Yes Forward

This sorter also has the following dedicated algorithms when used together with container_aware_adapter:

Container Best Average Worst Memory Stable
std::list n 1 Yes
std::forward_list n 1 Yes

merge_insertion_sorter

#include <cpp-sort/sorters/merge_insertion_sorter.h>

Implements the Ford-Johnson merge-insertion sort. I am no researcher and I couldn't find the algorithm's complexity on the internet, so you will have nice interrogation marks instead. This algorithm isn't meant to actually be used and is mostly interesting from a computer science point of view: for really small collections, it has an optimal worst case for the number of comparisons performed; it has been proven that for some sizes, no algorithm can perform less comparisons. That said, the algorithm has a rather big memory overhead and performs many move operations; it is really too slow for any real world use.

Best Average Worst Memory Stable Iterators
? ? ? ? No Random-access

merge_sorter

#include <cpp-sort/sorters/merge_sorter.h>

Implements a merge sort.

Best Average Worst Memory Stable Iterators
n log n n log n n log n n Yes Forward
n log n n log n n log² n log n Yes Forward

Note: when additional memory is available, merge_sorter runs in O(n log n), however if there is no additional memory available, it uses a O(n log² n) algorithm instead. The merging algorithm is memory adaptative, so even if it can only allocate a bit of memory instead of all the memory it needs, it will still take advantage of this additional memory.

This sorter also has the following dedicated algorithms when used together with container_aware_adapter:

Container Best Average Worst Memory Stable
std::list n log n n log n n log n log n Yes
std::forward_list n log n n log n n log n log n Yes

pdq_sorter

#include <cpp-sort/sorters/pdq_sorter.h>

Implements a pattern-defeating quicksort.

Best Average Worst Memory Stable Iterators
n n log n n log n log n No Random-access

poplar_sorter

#include <cpp-sort/sorters/poplar_sorter.h>

Implements a poplar sort, which is a heapsort derivate described by Coenraad Bron and Wim H. Hesselink in Smoothsort revisited. It builds a forest of perfect max heaps whose roots are stored on the right, then unheaps the elements to sort the collection.

Best Average Worst Memory Stable Iterators
n long n n log n n log n log n No Random-access

Note: this sorter is a bit faster or a bit slower than smooth_sorter depending on the patterns in the data to sort. I don't think it has any real advantage over heap_sorter in production code.

quick_sorter

#include <cpp-sort/sorters/quick_sorter.h>

Implements a simple median-of-9 quicksort.

Best Average Worst Memory Stable Iterators
n n log n log n No Forward

selection_sorter

#include <cpp-sort/sorters/selection_sorter.h>

Implements a selection sort.

Best Average Worst Memory Stable Iterators
1 No Forward

This sorter also has the following dedicated algorithms when used together with container_aware_adapter:

Container Best Average Worst Memory Stable
std::list 1 Yes
std::forward_list 1 Yes

smooth_sorter

#include <cpp-sort/sorters/smooth_sorter.h>

Implements a smoothsort.

Best Average Worst Memory Stable Iterators
n n log n n log n 1 No Random-Access

Note: while the complexity guarantees of this algorithm are optimal, this smoothsort isn't actually performant in practice. Except for some specific patterns (where tim_sorter, pdq_sorter or verge_sorter are far better anyway), it is always almost twice as slow as heap_sorter. Huge collections and/or huge objects may make a difference, but I have yet to see a case where this is the ideal sorting algorithm.

std_sorter

#include <cpp-sort/sorters/std_sorter.h>

Uses the standard library std::sort to sort a collection. While the complexity guarantees are only partial in the standard, here is what's expected:

Best Average Worst Memory Stable Iterators
n n log n n log n log n No Random-access

The adapter stable_adapter has an explicit specialization for std_sorter which calls std::stable_sort instead. Its complexity depends on whether it can allocate additional memory or not. While the complexity guarantees are only partial in the standard, here is what's expected:

Best Average Worst Memory Stable Iterators
n log n n log n n log n n Yes Random-access
n log² n n log² n n log² n 1 Yes Random-access

Note: std::sort and std::stable_sort are likely not able to handle proxy iterators, therefore trying to use std_sorter with code that relies on proxy iterators (e.g. schwartz_adapter) is deemed to cause errors. However, some standard libraries provide overloads of standard algorithms for some containers; for example, libc++ has an overload of std::sort for bit iterators, which means that std_sorter could the the best choice to sort an std::vector<bool>.

tim_sorter

#include <cpp-sort/sorters/tim_sorter.h>

Implements a timsort.

Best Average Worst Memory Stable Iterators
n n log n n log n n Yes Random-access

Note: while the sort is stable and the complexity guarantees are good enough, this sorter is rather slow compared to the some other ones when the data distribution is random.

verge_sorter

#include <cpp-sort/sorters/verge_sorter.h>

Implements a vergesort.

Best Average Worst Memory Stable Iterators
n n log n n log n n No Random-access
n n log n n No Bidirectional

Note: while the worst-case complexity is n², I don't even know how to reach such a case: the fallback quicksort tends to have its worst-case complexity for patterns that the vergesort layer typically defeats. Therefore, it may be impossible for this sort to have a quadratic behavior. If you ever find a way to trigger a quadratic behavior or to prove that it doesn't have one, don't hesitate to report it.

Type-specific sorters

The following sorters are available but will only work for some specific types instead of using a user-provided comparison function. Some of them also accept projections as long as the result of the projection can be handled by the sorter.

counting_sorter

#include <cpp-sort/sorters/counting_sorter.h>

counting_sorter implements a simple counting sort. This sorter also supports reverse sorting with std::greater<>.

Best Average Worst Memory Stable Iterators
n n+r n+r n+r No* Forward

Note: this sorter works with any type satisfying the trait std::is_integral. It can be insanely faster than other sorting algorithms when there are only a few different values in a tight range (e.g. values between 0 and 100 in an array of 10000 elements), but will be far too slow and eat too much memory if the range is wider than the number of elements (e.g. an array with two elements whose values are 0 and 100000). No memory is used if the collection is already sorted.

Since the original integers are discarded and overwritten, whether the algorithm is stable or not does not mean much. Moreover, it can only sort integers, so the potential stability problems wouldn't even be observable.

spread_sorter

#include <cpp-sort/sorters/spread_sorter.h>

spread_sorter implements a spreadsort.

Best Average Worst Memory Stable Iterators
n n*(k/d) n*(k/s+d) n*(k/d) No Random-access

It comes into three main flavours (available individually if needed):

  • integer_spread_sorter works with any type satisfying the trait std::is_integral.
  • float_spread_sorter works with any type satisfying the trait std::numeric_limits::is_iec559 whose size is the same as std::uint32_t or std::uin64_t.
  • string_spread_sorter works with std::string and std::wstring (if wchar_t is 2 bytes). This sorter also supports reverse sorting with std::greater<>.

These sorters accept projections as long as their simplest form can handle the result of the projection. The three of them are aggregated into one main sorter the following way:

struct spread_sorter:
    hybrid_adapter<
        integer_spread_sorter,
        float_spread_sorter,
        string_spread_sorter
    >
{};
Clone this wiki locally