An implementation of multi-key sort
Hi hackers,
I'd like to submit an implementation of multi-key sort for review. Please
see the attached code. Thanks in advance for your response.
Overview
--------
MKsort (multi-key sort) is an alternative to the standard qsort algorithm
that performs better in particular sort scenarios, namely when the data
set has multiple keys to be sorted.
The implementation is based on the paper:
Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
Searching Strings", Jan 1997 [1]
The patch applies mksort only to tuple sort. In theory it could be applied
to general-purpose sorting whenever multiple sort keys are available, but
that is relatively difficult in practice because a uniform interface is
needed to manipulate the keys. So I limit the usage of mksort to sorting
SortTuple.
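To make the idea concrete, here is a minimal standalone sketch of multi-key
quicksort over rows of integer keys. It is not the code in the patch: Row,
NKEYS and mk_sort are hypothetical names, and it uses a simple three-way
("Dutch national flag") partition rather than the split-end partition the
patch uses. The point is only the recursion shape: the less and greater
parts are sorted again on the same key, while the equal part moves on to
the next key.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3                 /* hypothetical fixed number of sort keys */

typedef struct Row
{
    int         key[NKEYS];
} Row;

static void
row_swap(Row *x, size_t a, size_t b)
{
    Row         t = x[a];

    x[a] = x[b];
    x[b] = t;
}

/*
 * Sort x[0..n-1] on keys depth..NKEYS-1, assuming all rows already agree
 * on keys 0..depth-1.
 */
static void
mk_sort(Row *x, size_t n, int depth)
{
    size_t      lt = 0;         /* x[0..lt) is less than the pivot */
    size_t      i = 0;          /* x[lt..i) is equal to the pivot */
    size_t      gt = n;         /* x[gt..n) is greater than the pivot */
    int         pivot;

    assert(x != NULL || n == 0);

    if (n <= 1 || depth >= NKEYS)
        return;

    pivot = x[n / 2].key[depth];

    while (i < gt)
    {
        int         v = x[i].key[depth];

        if (v < pivot)
            row_swap(x, lt++, i++);
        else if (v > pivot)
            row_swap(x, i, --gt);
        else
            i++;
    }

    mk_sort(x, lt, depth);                  /* less part: same key */
    mk_sort(x + lt, gt - lt, depth + 1);    /* equal part: next key */
    mk_sort(x + gt, n - gt, depth);         /* greater part: same key */
}
```

Calling mk_sort(rows, nrows, 0) sorts the whole array. Keys that were
already found equal are never compared again, which is where the saving
over a full multi-column comparator comes from.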
Compared to classic quicksort, it can give a significant performance
improvement once multiple keys are available. A rough test shows a ~129%
improvement over qsort for ORDER BY on 6 keys, and ~52% for CREATE INDEX
on the same data set. (See the section "Performance Test" for more
details.)
Author: Yao Wang <yaowangm@outlook.com>
Co-author: Hongxu Ma <interma@outlook.com>
Scope
-----
The interface of mksort is pretty simple: in tuplesort_sort_memtuples(),
mksort_tuple() is invoked instead of qsort_tuple() if mksort is applicable.
The main logic in mksort_tuple() applies the mksort algorithm to
SortTuple, and a callback mechanism handles sort-variant-specific issues,
e.g. comparing different datums, as qsort_tuple() does. It also handles
the complexity of "abbreviated keys".
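For readers unfamiliar with abbreviated keys, the following standalone
sketch shows the general pattern: compare a cheap fixed-size abbreviation
first, and fall back to the full value only when the abbreviations tie.
AbbrevStr, make_abbrev and abbrev_compare are hypothetical names for
illustration; the patch itself goes through ApplySortComparator() and
ApplySortAbbrevFullComparator() instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct AbbrevStr
{
    uint64_t    abbrev;     /* first 8 bytes, packed most significant first */
    const char *full;       /* the authoritative value */
} AbbrevStr;

/* Pack the first 8 bytes of s so that integer order matches byte order. */
static AbbrevStr
make_abbrev(const char *s)
{
    AbbrevStr   r;
    size_t      len = strlen(s);

    r.abbrev = 0;
    r.full = s;
    for (size_t i = 0; i < 8; i++)
    {
        unsigned char c = (i < len) ? (unsigned char) s[i] : 0;

        r.abbrev = (r.abbrev << 8) | c;
    }
    return r;
}

/* Compare abbreviations first; only a tie needs the full comparison. */
static int
abbrev_compare(const AbbrevStr *a, const AbbrevStr *b)
{
    assert(a != NULL && b != NULL);

    if (a->abbrev != b->abbrev)
        return (a->abbrev < b->abbrev) ? -1 : 1;

    return strcmp(a->full, b->full);
}
```

Two strings sharing a long common prefix, like the c6 values in the
performance test, have equal abbreviations in this sketch, so every
comparison between them falls through to the full comparison.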
A small difference from the classic mksort algorithm: for IndexTuple, when
all the columns are equal, an additional comparison based on ItemPointer
is performed to determine the order. This keeps the result consistent
with the existing qsort.
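The tie-break order is simply (block number, offset number). A minimal
standalone sketch, with a hypothetical Tid struct standing in for the
real ItemPointerData:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a heap TID: block number plus offset in block. */
typedef struct Tid
{
    uint32_t    block;
    uint16_t    offset;
} Tid;

/* Order by block first, then by offset within the block. */
static int
tid_compare(const Tid *a, const Tid *b)
{
    assert(a != NULL && b != NULL);

    if (a->block != b->block)
        return (a->block < b->block) ? -1 : 1;
    if (a->offset != b->offset)
        return (a->offset < b->offset) ? -1 : 1;

    /* Two distinct index tuples are not expected to share a TID. */
    return 0;
}
```

Since the TID is different for every tuple, sorting the equal group with
this comparator gives a total order, so the output no longer depends on
which algorithm sorted the leading columns.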
I did consider implementing mksort with a template mechanism like qsort
(see sort_template.h), but it seems unnecessary because all concrete
tuple types that need to be handled are derived from SortTuple. Using
callbacks to isolate type-specific features is good enough.
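As a rough illustration of the callback approach (not the actual
signatures in the patch, which are MksortGetDatumFunc and
MksortHandleDupFunc), the sketch below writes the comparison once and
leaves the tuple layout to a function pointer. SketchTuple, GetKeyFunc,
PairRow and the NULLs-last rule are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generic element: the sort code never looks inside payload. */
typedef struct SketchTuple
{
    const void *payload;
} SketchTuple;

/* Callback that fetches the key at a given depth for one tuple layout. */
typedef void (*GetKeyFunc) (const SketchTuple *t, int depth,
                            long *value, bool *is_null);

/* Generic comparison; all layout knowledge lives behind get_key. */
static int
compare_at_depth(const SketchTuple *a, const SketchTuple *b,
                 int depth, GetKeyFunc get_key)
{
    long        va, vb;
    bool        na, nb;

    assert(get_key != NULL);

    get_key(a, depth, &va, &na);
    get_key(b, depth, &vb, &nb);

    if (na || nb)
        return (na == nb) ? 0 : (na ? 1 : -1);  /* NULLs last (assumption) */

    return (va > vb) - (va < vb);
}

/* One hypothetical layout: two long columns with per-column null flags. */
typedef struct PairRow
{
    long        col[2];
    bool        isnull[2];
} PairRow;

static void
get_key_pair_row(const SketchTuple *t, int depth, long *value, bool *is_null)
{
    const PairRow *row = (const PairRow *) t->payload;

    *value = row->col[depth];
    *is_null = row->isnull[depth];
}
```

Supporting another layout then only means supplying another getter; the
sort routine itself stays unchanged, which is the reason a template
expansion per tuple type did not seem necessary.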
Note that not all tuple types are supported by mksort. Please see the
comments inside tuplesort_sort_memtuples().
Test Cases
----------
The changes to test cases include:
* Generally, mksort should generate exactly the same result as qsort.
However, some test cases don't. The reason is that the SQL doesn't specify
an order on all possible columns, e.g. "select c1, c2 from t1 order by c1"
will generate different results between mksort/qsort when c1 values are
equal, and the solution is to order by c2 as well ("select c1, c2 from t1
order by c1, c2"). (e.g. geometry)
* Some cases need to be updated to display the new sort method "multi-key
sort" in the EXPLAIN output. (e.g. incremental_sort)
* regress/tuplesort was updated with new cases to cover some scenarios of
mksort.
Performance Test
----------------
The script I used to configure the build:
CFLAGS="-O3 -fargument-noalias-global -fno-omit-frame-pointer -g"
./configure --prefix=$PGHOME --with-pgport=5432 --with-perl --with-openssl
--with-python --with-pam --with-blocksize=16 --with-wal-blocksize=16
--with-perl --enable-tap-tests --with-gssapi --with-ldap
I used the following script for a rough test of ORDER BY:
\timing on
create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t1 values (generate_series(1,499999), 0, 0, 0, 0,
'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (c1 % 5)::text;
-- Use a large work mem to ensure the entire sort happens in memory
set work_mem='1GB';
-- switch between qsort/mksort
set enable_mk_sort=off;
explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;
Results:
mksort:
1341.283 ms (00:01.341)
1379.098 ms (00:01.379)
1369.868 ms (00:01.370)
qsort:
3137.277 ms (00:03.137)
3147.771 ms (00:03.148)
3131.887 ms (00:03.132)
The perf improvement is ~129%.
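The percentage can be reproduced from the timings above if the median of
the three runs is taken for each method and the improvement is computed
as (slow / fast - 1) * 100. That the median was used is an assumption
inferred from the numbers; median3 and improvement_pct are hypothetical
helper names.

```c
#include <assert.h>

/* Median of exactly three timings (milliseconds). */
static double
median3(const double t[3])
{
    double      a = t[0], b = t[1], c = t[2];

    if ((a <= b && b <= c) || (c <= b && b <= a))
        return b;
    if ((b <= a && a <= c) || (c <= a && a <= b))
        return a;
    return c;
}

/* Improvement of "fast" over "slow" in percent: (slow / fast - 1) * 100. */
static double
improvement_pct(const double slow[3], const double fast[3])
{
    double      f = median3(fast);

    assert(f > 0.0);
    return (median3(slow) / f - 1.0) * 100.0;
}
```

With the qsort timings as "slow" and the mksort timings as "fast", the
ORDER BY numbers above give about 129%.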
Another perf test for CREATE INDEX:
create index idx_t1_mk on t1 (c6, c5, c4, c3, c2, c1);
Results:
mksort:
1147.207 ms (00:01.147)
1200.501 ms (00:01.201)
1235.657 ms (00:01.236)
qsort:
1852.957 ms (00:01.853)
1824.209 ms (00:01.824)
1808.781 ms (00:01.809)
The perf improvement is ~52%.
Another test uses one of the TPC-H [2] queries:
set work_mem='1GB';
-- query rewritten from TPCH-Q1, and there are 6001215 rows in lineitem
explain analyze select
l_returnflag,l_linestatus,l_quantity,l_shipmode
from
lineitem
where
l_shipdate <= date'1998-12-01' - interval '65 days'
order by
l_returnflag,l_linestatus,l_quantity,l_shipmode;
Result:
qsort:
14582.626 ms
14524.188 ms
14524.111 ms
mksort:
11390.891 ms
11647.065 ms
11546.791 ms
The perf improvement is ~25.8%.
[1]: https://www.cs.tufts.edu/~nr/cs257/archive/bob-sedgewick/fast-strings.pdf
[2]: https://www.tpc.org/tpch/
Thanks,
Yao Wang
Attachments:
0001-Implement-multi-key-sort.patch
From 1d583dec54ecd47912a5d68038bc9a07c2339c45 Mon Sep 17 00:00:00 2001
From: Yao Wang <yaowangm@outlook.com>
Date: Tue, 7 May 2024 08:11:13 +0000
Subject: [PATCH] Implement multi-key sort
MKsort (multi-key sort) is an alternative to the standard qsort algorithm
that performs better in particular sort scenarios, namely when the data
set has multiple keys to be sorted. Compared to classic quicksort, it
can give a significant performance improvement once multiple keys are
available.
Author: Yao Wang <yaowangm@outlook.com>
Co-author: Hongxu Ma <interma@outlook.com>
---
src/backend/utils/misc/guc_tables.c | 11 +
src/backend/utils/sort/mksort_tuple.c | 358 +++++++++++++++++
src/backend/utils/sort/tuplesort.c | 44 ++
src/backend/utils/sort/tuplesortvariants.c | 300 +++++++++++++-
src/include/c.h | 4 +
src/include/utils/tuplesort.h | 34 +-
src/test/regress/expected/geometry.out | 4 +-
.../regress/expected/incremental_sort.out | 12 +-
src/test/regress/expected/tuplesort.out | 375 ++++++++++++++++++
src/test/regress/expected/window.out | 58 +--
src/test/regress/sql/geometry.sql | 2 +-
src/test/regress/sql/tuplesort.sql | 59 +++
src/test/regress/sql/window.sql | 22 +-
13 files changed, 1215 insertions(+), 68 deletions(-)
create mode 100644 src/backend/utils/sort/mksort_tuple.c
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3fd0b14dd8..b8fe447d68 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -103,6 +103,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool ignore_invalid_pages;
+extern bool enable_mk_sort;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -839,6 +840,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_mk_sort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables multi-key sort."),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_mk_sort,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
diff --git a/src/backend/utils/sort/mksort_tuple.c b/src/backend/utils/sort/mksort_tuple.c
new file mode 100644
index 0000000000..949e9bbf8d
--- /dev/null
+++ b/src/backend/utils/sort/mksort_tuple.c
@@ -0,0 +1,358 @@
+/*
+ * MKsort (multi-key sort) is an alternative to the standard qsort algorithm
+ * that performs better in particular sort scenarios, namely when the
+ * data set has multiple keys to be sorted.
+ *
+ * The sorting algorithm blends Quicksort and radix sort; Like regular
+ * Quicksort, it partitions its input into sets less than and greater than a
+ * given value; like radix sort, it moves on to the next field once the current
+ * input is known to be equal in the given field.
+ *
+ * The implementation is based on the paper:
+ * Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
+ * Searching Strings", Jan 1997
+ *
+ * Some adjustments related to additional handling for equal tuples have
+ * been made to stay consistent with the postgres implementation of
+ * qsort.
+ *
+ * For now, mksort_tuple() is called in tuplesort_sort_memtuples() as a
+ * replacement of qsort_tuple() when specific conditions are satisfied.
+ */
+
+/* Swap two tuples in sort tuple array */
+static inline void
+mksort_swap(int a,
+ int b,
+ SortTuple *x)
+{
+ SortTuple t;
+
+ if (a == b)
+ return;
+ t = x[a];
+ x[a] = x[b];
+ x[b] = t;
+}
+
+/* Swap tuples by batch in sort tuple array */
+static inline void
+mksort_vec_swap(int a,
+ int b,
+ int size,
+ SortTuple *x)
+{
+ while (size-- > 0)
+ {
+ mksort_swap(a, b, x);
+ a++;
+ b++;
+ }
+}
+
+/*
+ * Check whether current datum (at specified tuple and depth) is null
+ * Note that the input x means a specified tuple provided by caller but not
+ * a tuple array, so tupleIndex is unnecessary
+ */
+static inline bool
+check_datum_null(SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum;
+ bool isNull;
+
+ /* Since we have a specified tuple, the tupleIndex is always 0 */
+ state->base.mksortGetDatumFunc(x, 0, depth, state, &datum, &isNull, false);
+
+ /*
+ * Note: for "abbreviated key", we don't need to handle more here because
+ * if "abbreviated key" of a datum is null, the "full" datum must be null.
+ */
+
+ return isNull;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * If "abbreviated key" is disabled:
+ * get specified datums and compare them by ApplySortComparator().
+ * If "abbreviated key" is enabled:
+ * Only first datum may be abbr key according to the design (see the comments
+ * of struct SortTuple), so different operations are needed for different
+ * datum.
+ * For first datum (depth == 0): get first datums ("abbr key" version) and
+ * compare them by ApplySortComparator(). If they are equal, get "full"
+ * version and compare again by ApplySortAbbrevFullComparator().
+ * For other datums: get specified datums and compare them by
+ * ApplySortComparator() as regular routine does.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mksort_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum1, datum2;
+ bool isNull1, isNull2;
+ SortSupport sortKey;
+ int ret = 0;
+
+ Assert(state->base.mksortGetDatumFunc);
+
+ sortKey = state->base.sortKeys + depth;
+ state->base.mksortGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, false);
+ state->base.mksortGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, false);
+
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+ /*
+ * If "abbreviated key" is enabled, and we are in the first depth, it means
+ * only "abbreviated keys" are compared. If the two datums are determined to
+ * be equal by ApplySortComparator(), we need to perform an extra "full"
+ * comparing by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter &&
+ depth == 0 &&
+ ret == 0)
+ {
+ /* Fetch "full" datum by setting useFullKey = true */
+ state->base.mksortGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, true);
+ state->base.mksortGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, true);
+
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ }
+
+ return ret;
+}
+
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Verify whether the SortTuple list is ordered or not at specified depth
+ */
+static void
+mksort_verify(SortTuple *x,
+ int n,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret;
+
+ for (int i = 0;i < n - 1;i++)
+ {
+ ret = mksort_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ Assert(ret <= 0);
+ }
+}
+#endif
+
+/*
+ * Main routine of multi-key sort
+ *
+ * seenNull indicates whether we have seen NULL in any datum we checked
+ */
+static void
+mksort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
+{
+ /*
+ * In the process, the tuple array consists of five parts:
+ * left equal, less, not-processed, greater, right equal
+ *
+ * lessStart indicates the first position of less part
+ * lessEnd indicates the next position after less part
+ * greaterStart indicates the prior position before greater part
+ * greaterEnd indicates the latest position of greater part
+ * the range between lessEnd and greaterStart (inclusive) is not-processed
+ */
+ int lessStart, lessEnd, greaterStart, greaterEnd, tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+
+ Assert(depth <= state->base.nKeys);
+ Assert(state->base.sortKeys);
+ Assert(state->base.mksortGetDatumFunc);
+
+ if (n <= 1)
+ return;
+
+ /* If we have exceeded the max depth, return immediately */
+ if (depth == state->base.nKeys)
+ return;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Select pivot by random and move it to the first position */
+ lessStart = pg_prng_int64p(&pg_global_prng_state) % n;
+ mksort_swap(0, lessStart, x);
+ pivot = x;
+
+ lessStart = 1;
+ lessEnd = 1;
+ greaterStart = n - 1;
+ greaterEnd = n - 1;
+
+ /* Sort the array to three parts: lesser, equal, greater */
+ while (true)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Compare the left end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare lessEnd and pivot at current depth */
+ dist = mksort_compare_datum(x + lessEnd,
+ pivot,
+ depth,
+ state);
+
+ if (dist > 0)
+ break;
+
+ /* If lessEnd is equal to pivot, move it to lessStart */
+ if (dist == 0)
+ {
+ mksort_swap(lessEnd, lessStart, x);
+ lessStart++;
+ }
+ lessEnd++;
+ }
+
+ /* Compare the right end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare greaterStart and pivot at current depth */
+ dist = mksort_compare_datum(x + greaterStart,
+ pivot,
+ depth,
+ state);
+
+ if (dist < 0)
+ break;
+
+ /* If greaterStart is equal to pivot, move it to greaterEnd */
+ if (dist == 0)
+ {
+ mksort_swap(greaterStart, greaterEnd, x);
+ greaterEnd--;
+ }
+ greaterStart--;
+ }
+
+ if (lessEnd > greaterStart)
+ break;
+ mksort_swap(lessEnd, greaterStart, x);
+ lessEnd++;
+ greaterStart--;
+ }
+
+ /*
+ * Now the array has four parts:
+ * left equal, lesser, greater, right equal
+ * Note greaterStart is less than lessEnd now
+ */
+
+ /* Move the left equal part to middle */
+ dist = Min(lessStart, lessEnd - lessStart);
+ mksort_vec_swap(0, lessEnd - dist, dist, x);
+
+ /* Move the right equal part to middle */
+ dist = Min(greaterEnd - greaterStart, n - greaterEnd - 1);
+ mksort_vec_swap(lessEnd, n - dist, dist, x);
+
+ /*
+ * Now the array has three parts:
+ * lesser, equal, greater
+ * Note that one or two parts may have no element at all.
+ */
+
+ /* Recursively sort the lesser part */
+
+ /* dist means the size of less part */
+ dist = lessEnd - lessStart;
+ mksort_tuple(x,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+ /* Recursively sort the equal part */
+
+ /*
+ * (x + dist) means the first tuple in the equal part
+ * Since all tuples have equal datums at current depth, we just check any one
+ * of them to determine whether we have seen null datum.
+ */
+ isDatumNull = check_datum_null(x + dist, depth, state);
+
+ /* (lessStart + n - greaterEnd - 1) means the size of equal part */
+ tupCount = lessStart + n - greaterEnd - 1;
+
+ if (depth < state->base.nKeys - 1)
+ {
+ mksort_tuple(x + dist,
+ tupCount,
+ depth + 1,
+ state,
+ seenNull || isDatumNull);
+ } else {
+ /*
+ * We have reached the max depth: call mksortHandleDupFunc to handle duplicated
+ * tuples if necessary, e.g. checking uniqueness or extra comparison
+ */
+
+ /*
+ * Call mksortHandleDupFunc if:
+ * 1. mksortHandleDupFunc is filled
+ * 2. the size of equal part > 1
+ */
+ if (state->base.mksortHandleDupFunc &&
+ (tupCount > 1))
+ {
+ state->base.mksortHandleDupFunc(x + dist,
+ tupCount,
+ seenNull || isDatumNull,
+ state);
+ }
+ }
+
+ /* Recursively sort the greater part */
+
+ /* dist means the size of greater part */
+ dist = greaterEnd - greaterStart;
+ mksort_tuple(x + n - dist,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+#ifdef USE_ASSERT_CHECKING
+ mksort_verify(x,
+ n,
+ depth,
+ state);
+#endif
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 7c4d6dc106..c865772a7a 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -108,6 +108,7 @@
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/tuplesort.h"
+#include "common/pg_prng.h"
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -128,6 +129,7 @@ bool trace_sort = false;
bool optimize_bounded_sort = true;
#endif
+bool enable_mk_sort = true;
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
@@ -337,6 +339,9 @@ struct Tuplesortstate
#ifdef TRACE_SORT
PGRUsage ru_start;
#endif
+
+ /* Whether multi-key sort is used */
+ bool mksortUsed;
};
/*
@@ -622,6 +627,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#define ST_DEFINE
#include "lib/sort_template.h"
+#include "mksort_tuple.c"
+
/*
* tuplesort_begin_xxx
*
@@ -690,6 +697,7 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
state->base.sortopt = sortopt;
state->base.tuples = true;
state->abbrevNext = 10;
+ state->mksortUsed = false;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -2559,6 +2567,8 @@ tuplesort_get_stats(Tuplesortstate *state,
case TSS_SORTEDINMEM:
if (state->boundUsed)
stats->sortMethod = SORT_TYPE_TOP_N_HEAPSORT;
+ else if (state->mksortUsed)
+ stats->sortMethod = SORT_TYPE_MKSORT;
else
stats->sortMethod = SORT_TYPE_QUICKSORT;
break;
@@ -2592,6 +2602,8 @@ tuplesort_method_name(TuplesortMethod m)
return "external sort";
case SORT_TYPE_EXTERNAL_MERGE:
return "external merge";
+ case SORT_TYPE_MKSORT:
+ return "multi-key sort";
}
return "unknown";
@@ -2717,6 +2729,38 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
+ /*
+ * Apply multi-key sort when:
+ * 1. enable_mk_sort is set
+ * 2. There are multiple keys available
+ * 3. mksortGetDatumFunc is filled, which implies that current tuple
+ * type is supported by mksort. (By now only Heap tuple and Btree
+ * Index tuple are supported, and more types may be supported in
+ * future.)
+ *
+ * A summary of tuple types supported by mksort:
+ *
+ * HeapTuple: supported
+ * IndexTuple(btree): supported
+ * IndexTuple(hash): not supported because there is only one key
+ * DatumTuple: not supported because there is only one key
+ * HeapTuple(for cluster): not supported yet
+ * IndexTuple(gist): not supported yet
+ * IndexTuple(brin): not supported yet
+ */
+ if (enable_mk_sort &&
+ state->base.nKeys > 1 &&
+ state->base.mksortGetDatumFunc != NULL)
+ {
+ state->mksortUsed = true;
+ mksort_tuple(state->memtuples,
+ state->memtupcount,
+ 0,
+ state,
+ false);
+ return;
+ }
+
/*
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 05a853caa3..1ae7947cc5 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -30,6 +30,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/tuplesort.h"
+#include "miscadmin.h"
/* sort-type codes for sort__start probes */
@@ -92,6 +93,41 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
+static Datum mksort_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static Datum mksort_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static void
+mksort_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
+static int
+mksort_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
+
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
+
+static void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
+
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
* the tuplesort_begin_cluster.
@@ -163,6 +199,14 @@ typedef struct BrinSortTuple
/* Size of the BrinSortTuple, given length of the BrinTuple. */
#define BRINSORTTUPLE_SIZE(len) (offsetof(BrinSortTuple, tuple) + (len))
+#define ST_SORT qsort_tuple_by_itempointer
+#define ST_ELEMENT_TYPE SortTuple
+#define ST_COMPARE(a, b, state) mksort_compare_equal_index_btree(a, b, state)
+#define ST_COMPARE_ARG_TYPE Tuplesortstate
+#define ST_CHECK_FOR_INTERRUPTS
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
Tuplesortstate *
tuplesort_begin_heap(TupleDesc tupDesc,
@@ -200,6 +244,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
base->removeabbrev = removeabbrev_heap;
base->comparetup = comparetup_heap;
base->comparetup_tiebreak = comparetup_heap_tiebreak;
+ base->mksortGetDatumFunc = mksort_get_datum_heap;
base->writetup = writetup_heap;
base->readtup = readtup_heap;
base->haveDatum1 = true;
@@ -388,6 +433,8 @@ tuplesort_begin_index_btree(Relation heapRel,
base->removeabbrev = removeabbrev_index;
base->comparetup = comparetup_index_btree;
base->comparetup_tiebreak = comparetup_index_btree_tiebreak;
+ base->mksortGetDatumFunc = mksort_get_datum_index_btree;
+ base->mksortHandleDupFunc = mksort_handle_dup_index_btree;
base->writetup = writetup_index;
base->readtup = readtup_index;
base->haveDatum1 = true;
@@ -1563,25 +1610,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
* attribute in order to ensure that all keys in the index are physically
* unique.
*/
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
}
static int
@@ -1888,3 +1917,236 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
+
+/*
+ * Get specified datum from SortTuple (HeapTuple) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_heap() for details.
+ */
+static Datum
+mksort_get_datum_heap(SortTuple *x,
+ int tupleIndex,
+ int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc = NULL;
+ HeapTupleData heapTuple;
+ AttrNumber attno;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys + depth;
+
+ Assert(state);
+ Assert(depth < base->nKeys);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * may be involved to "abbreviated key", so only the first datum need to
+ * be checked with "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ tupDesc = (TupleDesc)base->arg;
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *datum = sortTuple->datum1;
+ *isNull = sortTuple->isnull1;
+ return *datum;
+ }
+
+ /* For any datums which depth > 0, extract it from sortTuple->tuple */
+ heapTuple.t_len = ((MinimalTuple) sortTuple->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ heapTuple.t_data = (HeapTupleHeader) ((char *) sortTuple->tuple - MINIMAL_TUPLE_OFFSET);
+ attno = sortKey->ssup_attno;
+ *datum = heap_getattr(&heapTuple, attno, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Get specified datum from SortTuple (IndexTuple for btree index) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_index_btree() for details.
+ */
+static Datum
+mksort_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc;
+ IndexTuple indexTuple;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ Assert(state);
+ Assert(depth < base->nKeys);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * may be involved to "abbreviated key", so only the first datum need to
+ * be checked with "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *isNull = sortTuple->isnull1;
+ *datum = sortTuple->datum1;
+ return *datum;
+ }
+
+ indexTuple = (IndexTuple) sortTuple->tuple;
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+
+ /*
+ * Set parameter attnum = depth + 1 because attnum starts from 1 but depth
+ * starts from 0
+ */
+ *datum = index_getattr(indexTuple, depth + 1, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Handle duplicated SortTuples (IndexTuple for btree index during mksort)
+ * x: the duplicated tuple list
+ * tupleCount: count of the tuples
+ */
+static void
+mksort_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ /* If enforceUnique is enabled and we never saw NULL, raise error */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
+ {
+ Assert(base->comparetup == comparetup_index_btree);
+
+ /*
+ * x means the first tuple of duplicated tuple list
+ * Since they are duplicated, simply pick up the first one
+ * to raise error
+ */
+ raise_error_of_dup_index((IndexTuple)(x->tuple), state);
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ qsort_tuple_by_itempointer(x,
+ tupleCount,
+ state);
+}
+
+/*
+ * Compare two btree index tuples by ItemPointer
+ * It is a callback function for qsort_tuple_by_itempointer() called by
+ * mksort_handle_dup_index_btree()
+ */
+static int
+mksort_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state)
+{
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
+}
+
+/* Compare two index tuples by ItemPointer */
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2)
+{
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+/* Raise error for duplicated tuple when creating unique index */
+static void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state)
+{
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ index_deform_tuple((IndexTuple)x, tupDesc, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+}
diff --git a/src/include/c.h b/src/include/c.h
index dc1841346c..f7c368cd16 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -857,12 +857,14 @@ typedef NameData *Name;
#define Assert(condition) ((void)true)
#define AssertMacro(condition) ((void)true)
+#define AssertImply(condition1, condition2) ((void)true)
#elif defined(FRONTEND)
#include <assert.h>
#define Assert(p) assert(p)
#define AssertMacro(p) ((void) assert(p))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
#else /* USE_ASSERT_CHECKING && !FRONTEND */
@@ -886,6 +888,8 @@ typedef NameData *Name;
((void) ((condition) || \
(ExceptionalCondition(#condition, __FILE__, __LINE__), 0)))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
+
#endif /* USE_ASSERT_CHECKING && !FRONTEND */
/*
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index e7941a1f09..d3f27b49dc 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -29,7 +29,6 @@
#include "utils/relcache.h"
#include "utils/sortsupport.h"
-
/*
* Tuplesortstate and Sharedsort are opaque types whose details are not
* known outside tuplesort.c.
@@ -79,9 +78,10 @@ typedef enum
SORT_TYPE_QUICKSORT = 1 << 1,
SORT_TYPE_EXTERNAL_SORT = 1 << 2,
SORT_TYPE_EXTERNAL_MERGE = 1 << 3,
+ SORT_TYPE_MKSORT = 1 << 4,
} TuplesortMethod;
-#define NUM_TUPLESORTMETHODS 4
+#define NUM_TUPLESORTMETHODS 5
typedef enum
{
@@ -155,6 +155,21 @@ typedef struct
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
+typedef Datum
+(*MksortGetDatumFunc) (SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+typedef void
+(*MksortHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
/*
* The public part of a Tuple sort operation state. This data structure
* contains the definition of sort-variant-specific interface methods and
@@ -249,6 +264,21 @@ typedef struct
bool tuples; /* Can SortTuple.tuple ever be set? */
void *arg; /* Specific information for the sort variant */
+
+ /*
+ * Function pointer, referencing a function to get specified datum from
+ * SortTuple list with multi-key.
+ * Used by mksort_tuple().
+ */
+ MksortGetDatumFunc mksortGetDatumFunc;
+
+ /*
+ * Callback that handles duplicate tuples found in a SortTuple array
+ * that is sorted on multiple keys.
+ * Used by mksort_tuple().
+ * Currently this pointer is set only for btree index tuples.
+ */
+ MksortHandleDupFunc mksortHandleDupFunc;
} TuplesortPublic;
/* Sort parallel code from state for sort__start probes */
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46b..094d22861c 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -4273,7 +4273,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
circle | point | distance
----------------+-------------------+---------------
<(1,2),3> | (-3,4) | 1.472135955
@@ -4310,8 +4310,8 @@ SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
<(3,5),0> | (Infinity,1e+300) | Infinity
<(1,2),3> | (1e+300,Infinity) | Infinity
<(5,1),3> | (1e+300,Infinity) | Infinity
- <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,2),3> | (Infinity,1e+300) | Infinity
+ <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,3),5> | (1e+300,Infinity) | Infinity
<(1,3),5> | (Infinity,1e+300) | Infinity
<(100,200),10> | (1e+300,Infinity) | Infinity
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..e8dba83389 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -520,13 +520,13 @@ select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 55');
- explain_analyze_without_memory
----------------------------------------------------------------------------------------------------------------
+ explain_analyze_without_memory
+--------------------------------------------------------------------------------------------------------------------
Limit (actual rows=55 loops=1)
-> Incremental Sort (actual rows=55 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 2 Sort Methods: top-N heapsort, multi-key sort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
@@ -554,7 +554,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Group Count": 2, +
"Sort Methods Used": [ +
"top-N heapsort", +
- "quicksort" +
+ "multi-key sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
@@ -728,7 +728,7 @@ select explain_analyze_without_memory('select * from (select * from t order by a
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 1 Sort Method: multi-key sort Average Memory: NNkB Peak Memory: NNkB
Pre-sorted Groups: 5 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=1000 loops=1)
Sort Key: t.a
@@ -756,7 +756,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Full-sort Groups": { +
"Group Count": 1, +
"Sort Methods Used": [ +
- "quicksort" +
+ "multi-key sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..cd08ce8b3c 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -703,3 +703,378 @@ EXPLAIN (COSTS OFF) :qry;
(10 rows)
COMMIT;
+-- Test cases for multi-key sort
+set work_mem='100MB';
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 50) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+---+----+------
+ 0 | 0 | 3417
+ 0 | 5 | 98f1
+ 0 | 5 | c0c7
+ 0 | 10 | d3d9
+ 0 | 10 | d645
+ 1 | 1 | c16a
+ 1 | 1 | c4ca
+ 1 | 6 | 3c59
+ 1 | 11 | 3416
+ 1 | 11 | 6512
+ 2 | 2 | 6364
+ 2 | 2 | c81e
+ 2 | 7 | b6d7
+ 2 | 12 | a1d0
+ 2 | 12 | c20a
+ 3 | 3 | 182b
+ 3 | 3 | eccb
+ 3 | 8 | 3769
+ 3 | 13 | 17e6
+ 3 | 13 | c51c
+ 4 | 4 | a87f
+ 4 | 4 | e369
+ 4 | 9 | 1ff1
+ 4 | 14 | aab3
+ 4 | 14 | f717
+ 5 | 0 | 6c83
+ 5 | 0 | 9bf3
+ 5 | 5 | 1c38
+ 5 | 5 | e4da
+ 5 | 10 | 8e29
+ 6 | 1 | c74d
+ 6 | 1 | d9d4
+ 6 | 6 | 1679
+ 6 | 6 | 19ca
+ 6 | 11 | 4e73
+ 7 | 2 | 67c6
+ 7 | 2 | 70ef
+ 7 | 7 | 8f14
+ 7 | 7 | a5bf
+ 7 | 12 | 02e7
+ 8 | 3 | 642e
+ 8 | 3 | 6f49
+ 8 | 8 | a577
+ 8 | 8 | c9f0
+ 8 | 13 | 33e7
+ 9 | 4 | 1f0e
+ 9 | 4 | f457
+ 9 | 9 | 45c4
+ 9 | 9 | d67d
+ 9 | 14 | 6ea9
+(50 rows)
+
+drop table mksort_simple_tbl;
+-- test table with abbr keys
+create table abbr_tbl (a int, b varchar(100), c uuid);
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+--(see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+select c, b, a from abbr_tbl order by c, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+(50 rows)
+
+select c, b, a from abbr_tbl order by c desc, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+(50 rows)
+
+select c, b, a from abbr_tbl order by c, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+NOTICE: index "idx_abbr_tbl" does not exist, skipping
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+---
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+(1 row)
+
+-- Uniqueness check of CREATE INDEX
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+ERROR: could not create unique index "idx_abbr_tbl"
+DETAIL: Key (c, b, a)=(00000000-0000-0000-0000-000000000001, aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1, 1) is duplicated.
+drop table abbr_tbl;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ae4e8851f8..2de20ca1d0 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -18,13 +18,13 @@ INSERT INTO empsalary VALUES
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 25100
develop | 9 | 4500 | 25100
- develop | 11 | 5200 | 25100
develop | 10 | 5200 | 25100
+ develop | 11 | 5200 | 25100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 7400
personnel | 2 | 3900 | 7400
@@ -33,13 +33,13 @@ SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM emps
sales | 1 | 5000 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 7 | 4200 | 1
develop | 9 | 4500 | 2
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
personnel | 5 | 3500 | 1
personnel | 2 | 3900 | 2
@@ -90,18 +90,18 @@ SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PA
sales | 4 | 4800 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
depname | empno | salary | rank
-----------+-------+--------+------
- develop | 7 | 4200 | 1
- personnel | 5 | 3500 | 1
sales | 3 | 4800 | 1
sales | 4 | 4800 | 1
+ personnel | 5 | 3500 | 1
+ develop | 7 | 4200 | 1
personnel | 2 | 3900 | 2
develop | 9 | 4500 | 2
sales | 1 | 5000 | 3
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
(10 rows)
@@ -3749,23 +3749,24 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
empno | depname | rn | rnk | cnt
-------+-----------+----+-----+-----
- 8 | develop | 1 | 1 | 1
- 10 | develop | 2 | 2 | 1
- 11 | develop | 3 | 3 | 1
- 9 | develop | 4 | 4 | 2
- 7 | develop | 5 | 4 | 2
- 2 | personnel | 1 | 1 | 1
- 5 | personnel | 2 | 2 | 1
1 | sales | 1 | 1 | 1
+ 2 | personnel | 1 | 1 | 1
3 | sales | 2 | 2 | 1
4 | sales | 3 | 3 | 1
+ 5 | personnel | 2 | 2 | 1
+ 7 | develop | 4 | 4 | 1
+ 8 | develop | 1 | 1 | 1
+ 9 | develop | 5 | 5 | 1
+ 10 | develop | 2 | 2 | 1
+ 11 | develop | 3 | 3 | 1
(10 rows)
-- Test pushdown of quals into a subquery containing window functions
@@ -4106,17 +4107,17 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
empno | depname | salary | c
-------+-----------+--------+---
+ 1 | sales | 5000 | 1
+ 2 | personnel | 3900 | 1
+ 3 | sales | 4800 | 3
+ 4 | sales | 4800 | 3
+ 5 | personnel | 3500 | 2
8 | develop | 6000 | 1
10 | develop | 5200 | 3
11 | develop | 5200 | 3
- 2 | personnel | 3900 | 1
- 5 | personnel | 3500 | 2
- 1 | sales | 5000 | 1
- 4 | sales | 4800 | 3
- 3 | sales | 4800 | 3
(8 rows)
-- Ensure we get the correct run condition when the window function is both
@@ -4468,14 +4469,15 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
depname | empno | salary | enroll_date | first_emp | last_emp
-----------+-------+--------+-------------+-----------+----------
+ develop | 7 | 4200 | 01-01-2008 | 4 | 1
develop | 8 | 6000 | 10-01-2006 | 1 | 5
- develop | 7 | 4200 | 01-01-2008 | 5 | 1
personnel | 2 | 3900 | 12-23-2006 | 1 | 2
personnel | 5 | 3500 | 12-10-2007 | 2 | 1
sales | 1 | 5000 | 10-01-2006 | 1 | 3
diff --git a/src/test/regress/sql/geometry.sql b/src/test/regress/sql/geometry.sql
index c3ea368da5..1f47f07f31 100644
--- a/src/test/regress/sql/geometry.sql
+++ b/src/test/regress/sql/geometry.sql
@@ -403,7 +403,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
-- To polygon
SELECT f1, f1::polygon FROM CIRCLE_TBL WHERE f1 >= '<(0,0),1>';
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..65ecbbd5c9 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -305,3 +305,62 @@ EXPLAIN (COSTS OFF) :qry;
:qry;
COMMIT;
+
+-- Test cases for multi-key sort
+
+set work_mem='100MB';
+
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 50) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+drop table mksort_simple_tbl;
+
+-- test table with abbr keys
+
+create table abbr_tbl (a int, b varchar(100), c uuid);
+
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+--(see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+
+select c, b, a from abbr_tbl order by c, b, a;
+select c, b, a from abbr_tbl order by c desc, b, a;
+select c, b, a from abbr_tbl order by c, b desc, a;
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+
+-- Uniqueness check of CREATE INDEX
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop table abbr_tbl;
\ No newline at end of file
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6de5493b05..46359cb796 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -21,9 +21,9 @@ INSERT INTO empsalary VALUES
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
@@ -31,7 +31,7 @@ GROUP BY four, ten ORDER BY four, ten;
SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
@@ -1146,11 +1146,12 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
-- Test pushdown of quals into a subquery containing window functions
@@ -1332,7 +1333,7 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
@@ -1510,10 +1511,11 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
-- cleanup
DROP TABLE empsalary;
--
2.25.1
On 22/05/2024 15:48, Wang Yao wrote:
Comparing to classic quick sort, it can get significant performance
improvement once multiple keys are available. A rough test shows it got
~129% improvement than qsort for ORDER BY on 6 keys, and ~52% for CREATE
INDEX on the same data set. (See more details in section "Performance
Test")
Impressive. Did you test the performance of the cases where MK-sort
doesn't help, to check if there is a performance regression?
--
Heikki Linnakangas
Neon (https://neon.tech)
No obvious performance regression is expected, because PG follows the
original qsort code path when mksort is disabled. In that case, the only
extra cost is the check in tuplesort_sort_memtuples() that decides whether
to enter the mksort code path.
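To make the cost of that check concrete, here is a minimal standalone sketch of the gate. The struct SortGate and the function use_mksort() are hypothetical stand-ins invented for illustration; the real check reads enable_mk_sort, state->base.nKeys and state->base.mksortGetDatumFunc directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the fields the check reads.
 * This is not the real Tuplesortstate.
 */
typedef struct SortGate
{
    int  nkeys;          /* number of sort keys */
    bool has_get_datum;  /* true when a datum accessor is registered */
} SortGate;

/*
 * Returns true when the multi-key path would be taken.
 * With the GUC off, evaluation stops at the first operand.
 */
static bool
use_mksort(bool enable_mk_sort, const SortGate *gate)
{
    assert(gate != NULL);
    return enable_mk_sort && gate->nkeys > 1 && gate->has_get_datum;
}
```

The check runs once per in-memory sort, not once per comparison, so with the GUC off it amounts to a single boolean test before falling through to qsort.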
An experiment I ran today also confirms this:
Mksort disabled:
2949.287 ms
2955.258 ms
2947.262 ms
No mksort code:
2947.094 ms
2946.419 ms
2953.215 ms
The two sets of timings are almost the same.
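To put a number on it, the small sketch below averages each group of three runs and reports the relative difference. The helper names mean_ms() and relative_diff_pct() are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timing samples, in milliseconds. */
static double
mean_ms(const double *samples, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double) n;
}

/* Difference of a relative to b, expressed as a percentage of b. */
static double
relative_diff_pct(double a, double b)
{
    return (a - b) / b * 100.0;
}
```

Fed with the six timings above, the means come out at roughly 2950.6 ms (mksort disabled) and 2948.9 ms (no mksort code), a gap of about 1.7 ms or 0.06%. That is smaller than the spread within either group, so with only three runs per group it is indistinguishable from noise.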
I also updated the code with some small enhancements. Please see the
latest version in the attachment.
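For reviewers who have not read the paper, the core idea can be sketched in a few lines of standalone C. This is only an illustration under simplifying assumptions: the Row type, NKEYS and mk_sort() are hypothetical, keys are plain ints, the pivot is the middle element instead of a random one, and the partition is a plain three-way (Dijkstra-style) partition, not the scheme in the patch that parks equal elements at both ends first. NULLs, abbreviated keys and duplicate handling are all left out.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3                 /* illustrative number of sort keys */

typedef struct Row
{
    int key[NKEYS];
} Row;

static void
swap_rows(Row *a, Row *b)
{
    Row t = *a;

    *a = *b;
    *b = t;
}

/*
 * Sort rows[0..n-1] on key[depth..NKEYS-1], assuming all rows already
 * agree on key[0..depth-1].
 */
static void
mk_sort(Row *rows, size_t n, int depth)
{
    size_t lt = 0;              /* rows[0..lt-1] are less than the pivot */
    size_t i = 0;               /* rows[lt..i-1] are equal to the pivot */
    size_t gt = n;              /* rows[gt..n-1] are greater than the pivot */
    int    pivot;

    if (n <= 1 || depth == NKEYS)
        return;

    pivot = rows[n / 2].key[depth];

    /* Three-way partition on the key at the current depth only */
    while (i < gt)
    {
        int v = rows[i].key[depth];

        if (v < pivot)
            swap_rows(&rows[lt++], &rows[i++]);
        else if (v > pivot)
            swap_rows(&rows[i], &rows[--gt]);
        else
            i++;
    }

    mk_sort(rows, lt, depth);                /* less part: same key */
    mk_sort(rows + lt, gt - lt, depth + 1);  /* equal part: next key */
    mk_sort(rows + gt, n - gt, depth);       /* greater part: same key */
}
```

The saving comes from the middle recursive call: rows known to be equal on the current key are never compared on that key again, whereas a regular qsort comparator re-walks the keys from the first one on every comparison.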
Thanks,
Yao Wang
________________________________
From: Heikki Linnakangas <hlinnaka@iki.fi>
Sent: May 22, 2024 23:29
To: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Cc: interma@outlook.com <interma@outlook.com>
Subject: Re: An implementation of multi-key sort
Attachments:
v2-Implement-multi-key-sort.patch (application/octet-stream)
From 3f1436ad4157fcff29c3b53f8cf3ecd96f2fbf6d Mon Sep 17 00:00:00 2001
From: Yao Wang <yaowangm@outlook.com>
Date: Tue, 7 May 2024 08:11:13 +0000
Subject: [PATCH] Implement multi-key sort
MKsort (multi-key sort) is an alternative to the standard qsort algorithm
that performs better in particular sort scenarios, i.e. when the data set
is sorted on multiple keys. Compared to classic quicksort, it can bring a
significant performance improvement once multiple keys are available.
Author: Yao Wang <yaowangm@outlook.com>
Co-author: Hongxu Ma <interma@outlook.com>
---
src/backend/utils/misc/guc_tables.c | 11 +
src/backend/utils/sort/mksort_tuple.c | 358 +++++++++++++++++
src/backend/utils/sort/tuplesort.c | 44 ++
src/backend/utils/sort/tuplesortvariants.c | 313 +++++++++++++--
src/include/c.h | 4 +
src/include/utils/tuplesort.h | 34 +-
src/test/regress/expected/geometry.out | 4 +-
.../regress/expected/incremental_sort.out | 12 +-
src/test/regress/expected/tuplesort.out | 375 ++++++++++++++++++
src/test/regress/expected/window.out | 58 +--
src/test/regress/sql/geometry.sql | 2 +-
src/test/regress/sql/tuplesort.sql | 59 +++
src/test/regress/sql/window.sql | 22 +-
13 files changed, 1216 insertions(+), 80 deletions(-)
create mode 100644 src/backend/utils/sort/mksort_tuple.c
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3fd0b14dd8..b8fe447d68 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -103,6 +103,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool ignore_invalid_pages;
+extern bool enable_mk_sort;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -839,6 +840,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_mk_sort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables the use of multi-key sort."),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_mk_sort,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
diff --git a/src/backend/utils/sort/mksort_tuple.c b/src/backend/utils/sort/mksort_tuple.c
new file mode 100644
index 0000000000..949e9bbf8d
--- /dev/null
+++ b/src/backend/utils/sort/mksort_tuple.c
@@ -0,0 +1,358 @@
+/*
+ * MKsort (multi-key sort) is an alternative to the standard qsort algorithm
+ * that performs better in particular sort scenarios, i.e. when the data set
+ * is sorted on multiple keys.
+ *
+ * The sorting algorithm blends Quicksort and radix sort: like regular
+ * Quicksort, it partitions its input into sets less than and greater than a
+ * given value; like radix sort, it moves on to the next field once the current
+ * input is known to be equal in the given field.
+ *
+ * The implementation is based on the paper:
+ * Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
+ * Searching Strings", Jan 1997
+ *
+ * Some additional handling for equal tuples has been added to keep the
+ * results consistent with the existing PostgreSQL qsort
+ * implementation.
+ *
+ * For now, mksort_tuple() is called in tuplesort_sort_memtuples() as a
+ * replacement of qsort_tuple() when specific conditions are satisfied.
+ */
+
+/* Swap two tuples in sort tuple array */
+static inline void
+mksort_swap(int a,
+ int b,
+ SortTuple *x)
+{
+ SortTuple t;
+
+ if (a == b)
+ return;
+ t = x[a];
+ x[a] = x[b];
+ x[b] = t;
+}
+
+/* Swap tuples by batch in sort tuple array */
+static inline void
+mksort_vec_swap(int a,
+ int b,
+ int size,
+ SortTuple *x)
+{
+ while (size-- > 0)
+ {
+ mksort_swap(a, b, x);
+ a++;
+ b++;
+ }
+}
+
+/*
+ * Check whether current datum (at specified tuple and depth) is null
+ * Note that the input x means a specified tuple provided by caller but not
+ * a tuple array, so tupleIndex is unnecessary
+ */
+static inline bool
+check_datum_null(SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum;
+ bool isNull;
+
+ /* Since we have a specified tuple, the tupleIndex is always 0 */
+ state->base.mksortGetDatumFunc(x, 0, depth, state, &datum, &isNull, false);
+
+ /*
+ * Note: for "abbreviated key", we don't need to handle more here because
+ * if "abbreviated key" of a datum is null, the "full" datum must be null.
+ */
+
+ return isNull;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * If "abbreviated key" is disabled:
+ * get specified datums and compare them by ApplySortComparator().
+ * If "abbreviated key" is enabled:
+ * Only first datum may be abbr key according to the design (see the comments
+ * of struct SortTuple), so different operations are needed for different
+ * datum.
+ * For first datum (depth == 0): get first datums ("abbr key" version) and
+ * compare them by ApplySortComparator(). If they are equal, get "full"
+ * version and compare again by ApplySortAbbrevFullComparator().
+ * For other datums: get specified datums and compare them by
+ * ApplySortComparator() as regular routine does.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mksort_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum1, datum2;
+ bool isNull1, isNull2;
+ SortSupport sortKey;
+ int ret = 0;
+
+ Assert(state->base.mksortGetDatumFunc);
+
+ sortKey = state->base.sortKeys + depth;
+ state->base.mksortGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, false);
+ state->base.mksortGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, false);
+
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+ /*
+ * If "abbreviated key" is enabled, and we are in the first depth, it means
+ * only "abbreviated keys" are compared. If the two datums are determined to
+ * be equal by ApplySortComparator(), we need to perform an extra "full"
+ * comparing by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter &&
+ depth == 0 &&
+ ret == 0)
+ {
+ /* Fetch "full" datum by setting useFullKey = true */
+ state->base.mksortGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, true);
+ state->base.mksortGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, true);
+
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ }
+
+ return ret;
+}
+
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Verify whether the SortTuple list is ordered or not at specified depth
+ */
+static void
+mksort_verify(SortTuple *x,
+ int n,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret;
+
+ for (int i = 0; i < n - 1; i++)
+ {
+ ret = mksort_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ Assert(ret <= 0);
+ }
+}
+#endif
+
+/*
+ * Main routine of multi-key sort
+ *
+ * seenNull indicates whether we have seen NULL in any datum we checked
+ */
+static void
+mksort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
+{
+ /*
+ * In the process, the tuple array consists of five parts:
+ * left equal, less, not-processed, greater, right equal
+ *
+ * lessStart indicates the first position of the less part
+ * lessEnd indicates the position just after the less part
+ * greaterStart indicates the position just before the greater part
+ * greaterEnd indicates the last position of the greater part
+ * the range between lessEnd and greaterStart (inclusive) is not-processed
+ */
+ int lessStart, lessEnd, greaterStart, greaterEnd, tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+
+ Assert(depth <= state->base.nKeys);
+ Assert(state->base.sortKeys);
+ Assert(state->base.mksortGetDatumFunc);
+
+ if (n <= 1)
+ return;
+
+ /* If we have exceeded the max depth, return immediately */
+ if (depth == state->base.nKeys)
+ return;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Select a pivot at random and move it to the first position */
+ lessStart = pg_prng_int64p(&pg_global_prng_state) % n;
+ mksort_swap(0, lessStart, x);
+ pivot = x;
+
+ lessStart = 1;
+ lessEnd = 1;
+ greaterStart = n - 1;
+ greaterEnd = n - 1;
+
+ /* Partition the array into three parts: lesser, equal, greater */
+ while (true)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Compare the left end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare lessEnd and pivot at current depth */
+ dist = mksort_compare_datum(x + lessEnd,
+ pivot,
+ depth,
+ state);
+
+ if (dist > 0)
+ break;
+
+ /* If lessEnd is equal to pivot, move it to lessStart */
+ if (dist == 0)
+ {
+ mksort_swap(lessEnd, lessStart, x);
+ lessStart++;
+ }
+ lessEnd++;
+ }
+
+ /* Compare the right end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare greaterStart and pivot at current depth */
+ dist = mksort_compare_datum(x + greaterStart,
+ pivot,
+ depth,
+ state);
+
+ if (dist < 0)
+ break;
+
+ /* If greaterStart is equal to pivot, move it to greaterEnd */
+ if (dist == 0)
+ {
+ mksort_swap(greaterStart, greaterEnd, x);
+ greaterEnd--;
+ }
+ greaterStart--;
+ }
+
+ if (lessEnd > greaterStart)
+ break;
+ mksort_swap(lessEnd, greaterStart, x);
+ lessEnd++;
+ greaterStart--;
+ }
+
+ /*
+ * Now the array has four parts:
+ * left equal, lesser, greater, right equal
+ * Note greaterStart is less than lessEnd now
+ */
+
+ /* Move the left equal part to middle */
+ dist = Min(lessStart, lessEnd - lessStart);
+ mksort_vec_swap(0, lessEnd - dist, dist, x);
+
+ /* Move the right equal part to middle */
+ dist = Min(greaterEnd - greaterStart, n - greaterEnd - 1);
+ mksort_vec_swap(lessEnd, n - dist, dist, x);
+
+ /*
+ * Now the array has three parts:
+ * lesser, equal, greater
+ * Note that one or two parts may have no element at all.
+ */
+
+ /* Recursively sort the lesser part */
+
+ /* dist means the size of less part */
+ dist = lessEnd - lessStart;
+ mksort_tuple(x,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+ /* Recursively sort the equal part */
+
+ /*
+ * (x + dist) means the first tuple in the equal part
+ * Since all tuples have equal datums at current depth, we just check any one
+ * of them to determine whether we have seen null datum.
+ */
+ isDatumNull = check_datum_null(x + dist, depth, state);
+
+ /* (lessStart + n - greaterEnd - 1) means the size of equal part */
+ tupCount = lessStart + n - greaterEnd - 1;
+
+ if (depth < state->base.nKeys - 1)
+ {
+ mksort_tuple(x + dist,
+ tupCount,
+ depth + 1,
+ state,
+ seenNull || isDatumNull);
+ } else {
+ /*
+ * We have reached the max depth: call mksortHandleDupFunc to handle
+ * duplicated tuples if necessary, e.g. checking uniqueness or extra comparing
+ */
+
+ /*
+ * Call mksortHandleDupFunc if:
+ * 1. mksortHandleDupFunc is filled
+ * 2. the size of equal part > 1
+ */
+ if (state->base.mksortHandleDupFunc &&
+ (tupCount > 1))
+ {
+ state->base.mksortHandleDupFunc(x + dist,
+ tupCount,
+ seenNull || isDatumNull,
+ state);
+ }
+ }
+
+ /* Recursively sort the greater part */
+
+ /* dist means the size of greater part */
+ dist = greaterEnd - greaterStart;
+ mksort_tuple(x + n - dist,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+#ifdef USE_ASSERT_CHECKING
+ mksort_verify(x,
+ n,
+ depth,
+ state);
+#endif
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 7c4d6dc106..c865772a7a 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -108,6 +108,7 @@
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/tuplesort.h"
+#include "common/pg_prng.h"
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -128,6 +129,7 @@ bool trace_sort = false;
bool optimize_bounded_sort = true;
#endif
+bool enable_mk_sort = true;
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
@@ -337,6 +339,9 @@ struct Tuplesortstate
#ifdef TRACE_SORT
PGRUsage ru_start;
#endif
+
+ /* Whether multi-key sort is used */
+ bool mksortUsed;
};
/*
@@ -622,6 +627,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#define ST_DEFINE
#include "lib/sort_template.h"
+#include "mksort_tuple.c"
+
/*
* tuplesort_begin_xxx
*
@@ -690,6 +697,7 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
state->base.sortopt = sortopt;
state->base.tuples = true;
state->abbrevNext = 10;
+ state->mksortUsed = false;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -2559,6 +2567,8 @@ tuplesort_get_stats(Tuplesortstate *state,
case TSS_SORTEDINMEM:
if (state->boundUsed)
stats->sortMethod = SORT_TYPE_TOP_N_HEAPSORT;
+ else if (state->mksortUsed)
+ stats->sortMethod = SORT_TYPE_MKSORT;
else
stats->sortMethod = SORT_TYPE_QUICKSORT;
break;
@@ -2592,6 +2602,8 @@ tuplesort_method_name(TuplesortMethod m)
return "external sort";
case SORT_TYPE_EXTERNAL_MERGE:
return "external merge";
+ case SORT_TYPE_MKSORT:
+ return "multi-key sort";
}
return "unknown";
@@ -2717,6 +2729,38 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
+ /*
+ * Apply multi-key sort when:
+ * 1. enable_mk_sort is set
+ * 2. There are multiple keys available
+ * 3. mksortGetDatumFunc is filled, which implies that the current tuple
+ * type is supported by mksort. (Currently only heap tuples and btree
+ * index tuples are supported; more types may be supported in the
+ * future.)
+ *
+ * A summary of tuple types supported by mksort:
+ *
+ * HeapTuple: supported
+ * IndexTuple(btree): supported
+ * IndexTuple(hash): not supported because there is only one key
+ * DatumTuple: not supported because there is only one key
+ * HeapTuple(for cluster): not supported yet
+ * IndexTuple(gist): not supported yet
+ * IndexTuple(brin): not supported yet
+ */
+ if (enable_mk_sort &&
+ state->base.nKeys > 1 &&
+ state->base.mksortGetDatumFunc != NULL)
+ {
+ state->mksortUsed = true;
+ mksort_tuple(state->memtuples,
+ state->memtupcount,
+ 0,
+ state,
+ false);
+ return;
+ }
+
/*
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 05a853caa3..c105eb1e35 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -30,6 +30,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/tuplesort.h"
+#include "miscadmin.h"
/* sort-type codes for sort__start probes */
@@ -92,6 +93,41 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
+static Datum mksort_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static Datum mksort_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static void
+mksort_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
+static int
+mksort_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
+
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
+
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
+
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
* the tuplesort_begin_cluster.
@@ -163,6 +199,14 @@ typedef struct BrinSortTuple
/* Size of the BrinSortTuple, given length of the BrinTuple. */
#define BRINSORTTUPLE_SIZE(len) (offsetof(BrinSortTuple, tuple) + (len))
+#define ST_SORT qsort_tuple_by_itempointer
+#define ST_ELEMENT_TYPE SortTuple
+#define ST_COMPARE(a, b, state) mksort_compare_equal_index_btree(a, b, state)
+#define ST_COMPARE_ARG_TYPE Tuplesortstate
+#define ST_CHECK_FOR_INTERRUPTS
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
Tuplesortstate *
tuplesort_begin_heap(TupleDesc tupDesc,
@@ -200,6 +244,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
base->removeabbrev = removeabbrev_heap;
base->comparetup = comparetup_heap;
base->comparetup_tiebreak = comparetup_heap_tiebreak;
+ base->mksortGetDatumFunc = mksort_get_datum_heap;
base->writetup = writetup_heap;
base->readtup = readtup_heap;
base->haveDatum1 = true;
@@ -388,6 +433,8 @@ tuplesort_begin_index_btree(Relation heapRel,
base->removeabbrev = removeabbrev_index;
base->comparetup = comparetup_index_btree;
base->comparetup_tiebreak = comparetup_index_btree_tiebreak;
+ base->mksortGetDatumFunc = mksort_get_datum_index_btree;
+ base->mksortHandleDupFunc = mksort_handle_dup_index_btree;
base->writetup = writetup_index;
base->readtup = readtup_index;
base->haveDatum1 = true;
@@ -1543,18 +1590,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
Assert(tuple1 != tuple2);
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ raise_error_of_dup_index(tuple1, state);
}
/*
@@ -1563,25 +1599,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
* attribute in order to ensure that all keys in the index are physically
* unique.
*/
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
}
static int
@@ -1888,3 +1906,236 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
+
+/*
+ * Get specified datum from SortTuple (HeapTuple) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_heap() for details.
+ */
+static Datum
+mksort_get_datum_heap(SortTuple *x,
+ int tupleIndex,
+ int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc = NULL;
+ HeapTupleData heapTuple;
+ AttrNumber attno;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys + depth;
+
+ Assert(state);
+ Assert(depth < base->nKeys);
+
+ /*
+ * useFullKey is meaningful only when depth == 0: only the first datum can
+ * have an abbreviated key, so only the first datum ever needs to be
+ * re-checked against its "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ tupDesc = (TupleDesc)base->arg;
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *datum = sortTuple->datum1;
+ *isNull = sortTuple->isnull1;
+ return *datum;
+ }
+
+ /* Otherwise (depth > 0 or full key wanted), extract it from sortTuple->tuple */
+ heapTuple.t_len = ((MinimalTuple) sortTuple->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ heapTuple.t_data = (HeapTupleHeader) ((char *) sortTuple->tuple - MINIMAL_TUPLE_OFFSET);
+ attno = sortKey->ssup_attno;
+ *datum = heap_getattr(&heapTuple, attno, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Get specified datum from SortTuple (IndexTuple for btree index) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" matters only when abbreviated keys are in use:
+ * false - return sortTuple->datum1/isnull1 (the abbreviated key)
+ * true - return the "full" datum
+ * If abbreviated keys are disabled, useFullKey is ignored.
+ *
+ * See comparetup_index_btree() for details.
+ */
+static Datum
+mksort_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc;
+ IndexTuple indexTuple;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ Assert(state);
+ Assert(depth < base->nKeys);
+
+ /*
+ * useFullKey is meaningful only when depth == 0: only the first datum can
+ * have an abbreviated key, so only the first datum ever needs to be
+ * re-checked against its "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *isNull = sortTuple->isnull1;
+ *datum = sortTuple->datum1;
+ return *datum;
+ }
+
+ indexTuple = (IndexTuple) sortTuple->tuple;
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+
+ /*
+ * Set parameter attnum = depth + 1 because attnum starts from 1 but depth
+ * starts from 0
+ */
+ *datum = index_getattr(indexTuple, depth + 1, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Handle duplicated SortTuples (IndexTuples for a btree index) during mksort
+ * x: the duplicated tuple list; tupleCount: number of tuples in it
+ * seenNull: whether a NULL was seen among the compared key values
+ */
+static void
+mksort_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ /* Raise an error if uniqueness is enforced and no NULL excuses the duplicate */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
+ {
+ Assert(base->comparetup == comparetup_index_btree);
+
+ /*
+ * x points to the first tuple of the duplicated tuple list.
+ * Since all of them are duplicates, simply use the first one
+ * to raise the error.
+ */
+ raise_error_of_dup_index((IndexTuple) x->tuple, state);
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ qsort_tuple_by_itempointer(x,
+ tupleCount,
+ state);
+}
+
+/*
+ * Compare two btree index tuples by ItemPointer.
+ * It is the comparator for qsort_tuple_by_itempointer(), which is called by
+ * mksort_handle_dup_index_btree().
+ */
+static int
+mksort_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state)
+{
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
+}
+
+/* Compare two index tuples by ItemPointer */
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2)
+{
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+/* Raise error for duplicated tuple when creating unique index */
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state)
+{
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ index_deform_tuple(x, tupDesc, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+}
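For readers without the PostgreSQL tree at hand, the two decisions made for a run of tuples with equal keys can be sketched in standalone C. This is only an illustration, not part of the patch: `DemoTid`, `demo_tid_compare()` and `demo_must_raise_duplicate()` are hypothetical names, and `DemoTid` is a simplified stand-in for the real ItemPointer. The first function mirrors the block-then-offset ordering of tuplesort_compare_by_item_pointer(); the second mirrors the condition tested in mksort_handle_dup_index_btree().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a heap TID (block + offset). */
typedef struct DemoTid
{
	uint32_t	block;		/* heap block number */
	uint16_t	offset;		/* item offset within the block */
} DemoTid;

/*
 * Order two TIDs by block number first, then by offset.
 * Returns -1, 0 or 1.  The patch asserts that equality never happens for
 * real index tuples; this sketch simply returns 0 in that case.
 */
static int
demo_tid_compare(const DemoTid *a, const DemoTid *b)
{
	if (a->block != b->block)
		return (a->block < b->block) ? -1 : 1;
	if (a->offset != b->offset)
		return (a->offset < b->offset) ? -1 : 1;
	return 0;
}

/*
 * Decide whether a run of equal keys is a uniqueness violation.
 * A NULL among the keys excuses the duplicate unless NULLs are treated
 * as not distinct.
 */
static bool
demo_must_raise_duplicate(bool enforceUnique,
						  bool uniqueNullsNotDistinct,
						  bool seenNull)
{
	return enforceUnique && !(!uniqueNullsNotDistinct && seenNull);
}
```

With these definitions, a TID of (1, 5) sorts before (1, 7), which sorts before (2, 1); and a duplicate containing a NULL is reported only when uniqueNullsNotDistinct is set.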
diff --git a/src/include/c.h b/src/include/c.h
index dc1841346c..f7c368cd16 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -857,12 +857,14 @@ typedef NameData *Name;
#define Assert(condition) ((void)true)
#define AssertMacro(condition) ((void)true)
+#define AssertImply(condition1, condition2) ((void)true)
#elif defined(FRONTEND)
#include <assert.h>
#define Assert(p) assert(p)
#define AssertMacro(p) ((void) assert(p))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
#else /* USE_ASSERT_CHECKING && !FRONTEND */
@@ -886,6 +888,8 @@ typedef NameData *Name;
((void) ((condition) || \
(ExceptionalCondition(#condition, __FILE__, __LINE__), 0)))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
+
#endif /* USE_ASSERT_CHECKING && !FRONTEND */
/*
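The AssertImply() macro added above encodes logical implication ("if cond1 holds, cond2 must hold too"). A minimal standalone sketch of the same idea, built on the standard assert() rather than PostgreSQL's Assert(), is shown below. `DEMO_ASSERT_IMPLY`, `demo_implies()` and `demo_check_depth()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the patch's macro, but using the standard assert(). */
#define DEMO_ASSERT_IMPLY(cond1, cond2) assert(!(cond1) || (cond2))

/* Returns true when "p implies q" holds, i.e. unless p is true and q false. */
static bool
demo_implies(bool p, bool q)
{
	return !p || q;
}

/*
 * Mirrors the check used by the datum getters: asking for the full key
 * only makes sense for the first sort key (depth == 0).
 */
static int
demo_check_depth(bool useFullKey, int depth)
{
	DEMO_ASSERT_IMPLY(useFullKey, depth == 0);
	return depth;
}
```

The only combination that trips the assertion is cond1 true with cond2 false, for example useFullKey set while depth is 1.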
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index e7941a1f09..d3f27b49dc 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -29,7 +29,6 @@
#include "utils/relcache.h"
#include "utils/sortsupport.h"
-
/*
* Tuplesortstate and Sharedsort are opaque types whose details are not
* known outside tuplesort.c.
@@ -79,9 +78,10 @@ typedef enum
SORT_TYPE_QUICKSORT = 1 << 1,
SORT_TYPE_EXTERNAL_SORT = 1 << 2,
SORT_TYPE_EXTERNAL_MERGE = 1 << 3,
+ SORT_TYPE_MKSORT = 1 << 4,
} TuplesortMethod;
-#define NUM_TUPLESORTMETHODS 4
+#define NUM_TUPLESORTMETHODS 5
typedef enum
{
@@ -155,6 +155,21 @@ typedef struct
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
+typedef Datum
+(*MksortGetDatumFunc) (SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+typedef void
+(*MksortHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
/*
* The public part of a Tuple sort operation state. This data structure
* contains the definition of sort-variant-specific interface methods and
@@ -249,6 +264,21 @@ typedef struct
bool tuples; /* Can SortTuple.tuple ever be set? */
void *arg; /* Specific information for the sort variant */
+
+ /*
+ * Callback that fetches the datum of the given sort key (depth) from a
+ * tuple in the SortTuple array.
+ * Used by mksort_tuple().
+ */
+ MksortGetDatumFunc mksortGetDatumFunc;
+
+ /*
+ * Callback that handles a run of tuples whose sort keys are all equal
+ * (duplicates).
+ * Used by mksort_tuple().
+ * For now, it is set only for btree index tuples.
+ */
+ MksortHandleDupFunc mksortHandleDupFunc;
} TuplesortPublic;
/* Sort parallel code from state for sort__start probes */
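The TuplesortMethod values are single bits so that several methods can be OR-ed into one mask, which is why NUM_TUPLESORTMETHODS has to grow together with the new SORT_TYPE_MKSORT bit. The sketch below illustrates that, assuming a consumer that walks the mask bit by bit. The `Demo*` names and `demo_describe_methods()` are hypothetical, and the method names are copied from the EXPLAIN output in the regression tests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the TuplesortMethod bit flags. */
typedef enum DemoSortMethod
{
	DEMO_TOP_N_HEAPSORT = 1 << 0,
	DEMO_QUICKSORT = 1 << 1,
	DEMO_EXTERNAL_SORT = 1 << 2,
	DEMO_EXTERNAL_MERGE = 1 << 3,
	DEMO_MKSORT = 1 << 4
} DemoSortMethod;

#define DEMO_NUM_METHODS 5		/* must cover the highest bit above */

static const char *
demo_method_name(unsigned m)
{
	switch (m)
	{
		case DEMO_TOP_N_HEAPSORT: return "top-N heapsort";
		case DEMO_QUICKSORT: return "quicksort";
		case DEMO_EXTERNAL_SORT: return "external sort";
		case DEMO_EXTERNAL_MERGE: return "external merge";
		case DEMO_MKSORT: return "multi-key sort";
		default: return "unknown";
	}
}

/*
 * Write the comma-separated names of all methods set in "mask" into buf
 * and return how many were found.  Bits at or above DEMO_NUM_METHODS are
 * never visited, so a stale count would silently hide the newest method.
 */
static int
demo_describe_methods(unsigned mask, char *buf, size_t buflen)
{
	int			count = 0;

	if (buflen == 0)
		return 0;
	buf[0] = '\0';
	for (unsigned m = 1; m < (1u << DEMO_NUM_METHODS); m <<= 1)
	{
		if (mask & m)
		{
			if (count > 0)
				strncat(buf, ", ", buflen - strlen(buf) - 1);
			strncat(buf, demo_method_name(m), buflen - strlen(buf) - 1);
			count++;
		}
	}
	return count;
}
```

For a mask of DEMO_TOP_N_HEAPSORT | DEMO_MKSORT this yields "top-N heapsort, multi-key sort", matching the form seen in the incremental_sort expected output.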
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46b..094d22861c 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -4273,7 +4273,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
circle | point | distance
----------------+-------------------+---------------
<(1,2),3> | (-3,4) | 1.472135955
@@ -4310,8 +4310,8 @@ SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
<(3,5),0> | (Infinity,1e+300) | Infinity
<(1,2),3> | (1e+300,Infinity) | Infinity
<(5,1),3> | (1e+300,Infinity) | Infinity
- <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,2),3> | (Infinity,1e+300) | Infinity
+ <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,3),5> | (1e+300,Infinity) | Infinity
<(1,3),5> | (Infinity,1e+300) | Infinity
<(100,200),10> | (1e+300,Infinity) | Infinity
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..e8dba83389 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -520,13 +520,13 @@ select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 55');
- explain_analyze_without_memory
----------------------------------------------------------------------------------------------------------------
+ explain_analyze_without_memory
+--------------------------------------------------------------------------------------------------------------------
Limit (actual rows=55 loops=1)
-> Incremental Sort (actual rows=55 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 2 Sort Methods: top-N heapsort, multi-key sort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
@@ -554,7 +554,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Group Count": 2, +
"Sort Methods Used": [ +
"top-N heapsort", +
- "quicksort" +
+ "multi-key sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
@@ -728,7 +728,7 @@ select explain_analyze_without_memory('select * from (select * from t order by a
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 1 Sort Method: multi-key sort Average Memory: NNkB Peak Memory: NNkB
Pre-sorted Groups: 5 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=1000 loops=1)
Sort Key: t.a
@@ -756,7 +756,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Full-sort Groups": { +
"Group Count": 1, +
"Sort Methods Used": [ +
- "quicksort" +
+ "multi-key sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..cd08ce8b3c 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -703,3 +703,378 @@ EXPLAIN (COSTS OFF) :qry;
(10 rows)
COMMIT;
+-- Test cases for multi-key sort
+set work_mem='100MB';
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 50) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+---+----+------
+ 0 | 0 | 3417
+ 0 | 5 | 98f1
+ 0 | 5 | c0c7
+ 0 | 10 | d3d9
+ 0 | 10 | d645
+ 1 | 1 | c16a
+ 1 | 1 | c4ca
+ 1 | 6 | 3c59
+ 1 | 11 | 3416
+ 1 | 11 | 6512
+ 2 | 2 | 6364
+ 2 | 2 | c81e
+ 2 | 7 | b6d7
+ 2 | 12 | a1d0
+ 2 | 12 | c20a
+ 3 | 3 | 182b
+ 3 | 3 | eccb
+ 3 | 8 | 3769
+ 3 | 13 | 17e6
+ 3 | 13 | c51c
+ 4 | 4 | a87f
+ 4 | 4 | e369
+ 4 | 9 | 1ff1
+ 4 | 14 | aab3
+ 4 | 14 | f717
+ 5 | 0 | 6c83
+ 5 | 0 | 9bf3
+ 5 | 5 | 1c38
+ 5 | 5 | e4da
+ 5 | 10 | 8e29
+ 6 | 1 | c74d
+ 6 | 1 | d9d4
+ 6 | 6 | 1679
+ 6 | 6 | 19ca
+ 6 | 11 | 4e73
+ 7 | 2 | 67c6
+ 7 | 2 | 70ef
+ 7 | 7 | 8f14
+ 7 | 7 | a5bf
+ 7 | 12 | 02e7
+ 8 | 3 | 642e
+ 8 | 3 | 6f49
+ 8 | 8 | a577
+ 8 | 8 | c9f0
+ 8 | 13 | 33e7
+ 9 | 4 | 1f0e
+ 9 | 4 | f457
+ 9 | 9 | 45c4
+ 9 | 9 | d67d
+ 9 | 14 | 6ea9
+(50 rows)
+
+drop table mksort_simple_tbl;
+-- test table with abbr keys
+create table abbr_tbl (a int, b varchar(100), c uuid);
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+--(see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+select c, b, a from abbr_tbl order by c, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+(50 rows)
+
+select c, b, a from abbr_tbl order by c desc, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+(50 rows)
+
+select c, b, a from abbr_tbl order by c, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+NOTICE: index "idx_abbr_tbl" does not exist, skipping
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+---
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+(1 row)
+
+-- Uniqueness check of CREATE INDEX
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+ERROR: could not create unique index "idx_abbr_tbl"
+DETAIL: Key (c, b, a)=(00000000-0000-0000-0000-000000000001, aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1, 1) is duplicated.
+drop table abbr_tbl;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ae4e8851f8..2de20ca1d0 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -18,13 +18,13 @@ INSERT INTO empsalary VALUES
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 25100
develop | 9 | 4500 | 25100
- develop | 11 | 5200 | 25100
develop | 10 | 5200 | 25100
+ develop | 11 | 5200 | 25100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 7400
personnel | 2 | 3900 | 7400
@@ -33,13 +33,13 @@ SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM emps
sales | 1 | 5000 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 7 | 4200 | 1
develop | 9 | 4500 | 2
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
personnel | 5 | 3500 | 1
personnel | 2 | 3900 | 2
@@ -90,18 +90,18 @@ SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PA
sales | 4 | 4800 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
depname | empno | salary | rank
-----------+-------+--------+------
- develop | 7 | 4200 | 1
- personnel | 5 | 3500 | 1
sales | 3 | 4800 | 1
sales | 4 | 4800 | 1
+ personnel | 5 | 3500 | 1
+ develop | 7 | 4200 | 1
personnel | 2 | 3900 | 2
develop | 9 | 4500 | 2
sales | 1 | 5000 | 3
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
(10 rows)
@@ -3749,23 +3749,24 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
empno | depname | rn | rnk | cnt
-------+-----------+----+-----+-----
- 8 | develop | 1 | 1 | 1
- 10 | develop | 2 | 2 | 1
- 11 | develop | 3 | 3 | 1
- 9 | develop | 4 | 4 | 2
- 7 | develop | 5 | 4 | 2
- 2 | personnel | 1 | 1 | 1
- 5 | personnel | 2 | 2 | 1
1 | sales | 1 | 1 | 1
+ 2 | personnel | 1 | 1 | 1
3 | sales | 2 | 2 | 1
4 | sales | 3 | 3 | 1
+ 5 | personnel | 2 | 2 | 1
+ 7 | develop | 4 | 4 | 1
+ 8 | develop | 1 | 1 | 1
+ 9 | develop | 5 | 5 | 1
+ 10 | develop | 2 | 2 | 1
+ 11 | develop | 3 | 3 | 1
(10 rows)
-- Test pushdown of quals into a subquery containing window functions
@@ -4106,17 +4107,17 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
empno | depname | salary | c
-------+-----------+--------+---
+ 1 | sales | 5000 | 1
+ 2 | personnel | 3900 | 1
+ 3 | sales | 4800 | 3
+ 4 | sales | 4800 | 3
+ 5 | personnel | 3500 | 2
8 | develop | 6000 | 1
10 | develop | 5200 | 3
11 | develop | 5200 | 3
- 2 | personnel | 3900 | 1
- 5 | personnel | 3500 | 2
- 1 | sales | 5000 | 1
- 4 | sales | 4800 | 3
- 3 | sales | 4800 | 3
(8 rows)
-- Ensure we get the correct run condition when the window function is both
@@ -4468,14 +4469,15 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
depname | empno | salary | enroll_date | first_emp | last_emp
-----------+-------+--------+-------------+-----------+----------
+ develop | 7 | 4200 | 01-01-2008 | 4 | 1
develop | 8 | 6000 | 10-01-2006 | 1 | 5
- develop | 7 | 4200 | 01-01-2008 | 5 | 1
personnel | 2 | 3900 | 12-23-2006 | 1 | 2
personnel | 5 | 3500 | 12-10-2007 | 2 | 1
sales | 1 | 5000 | 10-01-2006 | 1 | 3
diff --git a/src/test/regress/sql/geometry.sql b/src/test/regress/sql/geometry.sql
index c3ea368da5..1f47f07f31 100644
--- a/src/test/regress/sql/geometry.sql
+++ b/src/test/regress/sql/geometry.sql
@@ -403,7 +403,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
-- To polygon
SELECT f1, f1::polygon FROM CIRCLE_TBL WHERE f1 >= '<(0,0),1>';
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..65ecbbd5c9 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -305,3 +305,62 @@ EXPLAIN (COSTS OFF) :qry;
:qry;
COMMIT;
+
+-- Test cases for multi-key sort
+
+set work_mem='100MB';
+
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 50) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+drop table mksort_simple_tbl;
+
+-- test table with abbr keys
+
+create table abbr_tbl (a int, b varchar(100), c uuid);
+
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+--(see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+
+select c, b, a from abbr_tbl order by c, b, a;
+select c, b, a from abbr_tbl order by c desc, b, a;
+select c, b, a from abbr_tbl order by c, b desc, a;
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+
+-- Uniqueness check of CREATE INDEX
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop table abbr_tbl;
\ No newline at end of file
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6de5493b05..46359cb796 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -21,9 +21,9 @@ INSERT INTO empsalary VALUES
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
@@ -31,7 +31,7 @@ GROUP BY four, ten ORDER BY four, ten;
SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
@@ -1146,11 +1146,12 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
-- Test pushdown of quals into a subquery containing window functions
@@ -1332,7 +1333,7 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
@@ -1510,10 +1511,11 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
-- cleanup
DROP TABLE empsalary;
--
2.25.1
On 23/05/2024 15:39, Wang Yao wrote:
No obvious perf regression is expected because PG will follow original
qsort code path when mksort is disabled. For the case, the only extra
cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
And what about the case the mksort is enabled, but it's not effective
because all leading keys are different?
--
Heikki Linnakangas
Neon (https://neon.tech)
When all leading keys are different, mksort finishes the entire sort at the
first sort key and never touches the other keys. In that case, mksort
effectively degenerates into a plain quicksort.
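To illustrate why distinct leading keys make the sort stop at the first key, here is a minimal sketch of the multi-key quicksort idea on rows of integer keys. It is not the patch's code: Row, NKEYS, mk_sort() and the cmp_at_depth counter are hypothetical names used only for illustration, and abbreviated keys, NULL handling and the SortTuple callbacks are left out.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3                 /* hypothetical: number of sort keys per row */

typedef struct Row
{
    int key[NKEYS];
} Row;

/* Illustration only: counts comparisons performed at each key depth. */
static long cmp_at_depth[NKEYS];

static void
swap_rows(Row *x, size_t a, size_t b)
{
    Row t = x[a];

    x[a] = x[b];
    x[b] = t;
}

/* Sort x[0..n), assuming all rows are already equal on keys [0, depth). */
static void
mk_sort(Row *x, size_t n, int depth)
{
    assert(depth >= 0);
    if (n <= 1 || depth >= NKEYS)
        return;

    int    pivot = x[n / 2].key[depth];
    size_t lt = 0, i = 0, gt = n;

    /* Three-way partition: [0,lt) < pivot, [lt,i) == pivot, [gt,n) > pivot */
    while (i < gt)
    {
        int v = x[i].key[depth];

        cmp_at_depth[depth]++;
        if (v < pivot)
            swap_rows(x, lt++, i++);
        else if (v > pivot)
            swap_rows(x, i, --gt);
        else
            i++;
    }

    mk_sort(x, lt, depth);                  /* "less" part: same key */
    mk_sort(x + lt, gt - lt, depth + 1);    /* "equal" part: next key */
    mk_sort(x + gt, n - gt, depth);         /* "greater" part: same key */
}
```

If every row has a distinct key[0], each "equal" partition holds a single row, so the recursion into depth + 1 returns before comparing anything and cmp_at_depth[1] stays at zero. The later keys are only touched when rows tie on the earlier ones.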
I created another data set with distinct values in all sort keys:
create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (999994 - c1)::text;
explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
Results:
MKsort:
12374.427 ms
12528.068 ms
12554.718 ms
qsort:
12251.422 ms
12279.938 ms
12280.254 ms
MKsort is slightly slower than qsort here, which can be explained by the
extra checks that MKsort performs.
Yao Wang
On Fri, May 24, 2024 at 8:36 PM Wang Yao <yaowangm@outlook.com> wrote:
Get Outlook for Android
________________________________
From: Heikki Linnakangas <hlinnaka@iki.fi>
Sent: Thursday, May 23, 2024 8:47:29 PM
To: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Cc: interma@outlook.com <interma@outlook.com>
Subject: Re: Re: An implementation of multi-key sort
On 23/05/2024 15:39, Wang Yao wrote:
No obvious perf regression is expected because PG will follow original
qsort code path when mksort is disabled. For the case, the only extra
cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
And what about the case the mksort is enabled, but it's not effective
because all leading keys are different?
--
Heikki Linnakangas
Neon (https://neon.tech)
I added two optimizations to mksort that already exist in qsort_tuple():
1. When selecting the pivot, always pick the item in the middle of the array
instead of a random one. Theoretically this has the same effect as the old
approach, but it eliminates some unstable perf test results and gives a small
perf benefit by removing the random value generator.
2. Always check whether the array is already ordered, and return
immediately if it is. The pre-ordered check has an extra cost and
hurts perf numbers on some data sets, but it improves perf
significantly on other data sets.
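The two optimizations can be sketched on a plain int array as follows. This is a simplified illustration, not the patch's code: is_presorted(), middle_pivot() and sketch_sort() are hypothetical names, and a single integer key stands in for the per-depth datum comparison done on SortTuples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Optimization 2: true if v[0..n) is already in non-decreasing order. */
static bool
is_presorted(const int *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return false;
    return true;
}

/* Optimization 1: deterministic pivot position, no random generator. */
static size_t
middle_pivot(size_t n)
{
    return n / 2;
}

static void
swap_int(int *v, size_t a, size_t b)
{
    int t = v[a];

    v[a] = v[b];
    v[b] = t;
}

/* Three-way quicksort using both optimizations. */
static void
sketch_sort(int *v, size_t n)
{
    assert(v != NULL || n == 0);

    /* One linear pass; return at once if there is nothing to do. */
    if (n <= 1 || is_presorted(v, n))
        return;

    int    pivot = v[middle_pivot(n)];
    size_t lt = 0, i = 0, gt = n;

    /* [0,lt) < pivot, [lt,i) == pivot, [gt,n) > pivot */
    while (i < gt)
    {
        if (v[i] < pivot)
            swap_int(v, lt++, i++);
        else if (v[i] > pivot)
            swap_int(v, i, --gt);
        else
            i++;
    }

    sketch_sort(v, lt);
    sketch_sort(v + gt, n - gt);
}
```

The pre-ordered check costs one extra pass over each partition that turns out not to be ordered, which is the overhead mentioned above. On input that is already ordered, it replaces the whole sort with a single linear scan.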
So far, mksort has perf results equal to or better than qsort on all data
sets I have tried.
I also updated the test cases. Please see the attached v3 patch.
Perf test results:
Data set 1 (with many duplicate values):
-----------------------------------------
create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t1 values (generate_series(1,499999), 0, 0, 0, 0,
'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (c1 % 5)::text;
Query 1:
explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;
Disable Mksort
3021.636 ms
3014.669 ms
3033.588 ms
Enable Mksort
1688.590 ms
1686.956 ms
1688.567 ms
The improvement is 78.9%, down from the previous version (129%). The
pre-ordered check is probably responsible for most of that extra cost.
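The message does not say how the percentage is derived. As an assumption, the small helper below treats the improvement as t_qsort / t_mksort - 1 and applies it to the first run of each setting; improvement_pct() is a hypothetical name used only for this illustration.

```c
#include <assert.h>

/*
 * Relative improvement in percent, assuming it is defined as
 * (old time / new time - 1) * 100.
 */
static double
improvement_pct(double t_old_ms, double t_new_ms)
{
    assert(t_new_ms > 0.0);
    return (t_old_ms / t_new_ms - 1.0) * 100.0;
}
```

Under that assumption, improvement_pct(3021.636, 1688.590) comes out at roughly 78.9, which matches the figure quoted above.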
Query 2:
create index idx_t1_mk on t1 (c6, c5, c4, c3, c2, c1);
Disable Mksort
1674.648 ms
1680.608 ms
1681.373 ms
Enable Mksort
1143.341 ms
1143.462 ms
1143.894 ms
The improvement is ~47%, also slightly lower than in the previous version (52%).
Data set 2 (with distinct values):
----------------------------------
create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (999994 - c1)::text;
Query 1:
explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
Disable Mksort
12199.963 ms
12197.068 ms
12191.657 ms
Enable Mksort
9538.219 ms
9571.681 ms
9536.335 ms
The improvement is 27.9%, which is much better than the old approach (-6.2%).
Query 2 (the data is pre-ordered):
explain analyze select c1 from t2 order by c6 desc, c5, c4, c3, c2, c1;
Enable Mksort
768.191 ms
768.079 ms
767.026 ms
Disable Mksort
768.757 ms
766.166 ms
766.149 ms
The results are almost the same because the input is already in order, so no
actual sort is performed in either case. This is also much better than the old
approach (-1198.1%).
Thanks,
Yao Wang
On Fri, May 24, 2024 at 8:50 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
When all leading keys are different, mksort will finish the entire sort at the
first sort key and never touch other keys. For the case, mksort falls back to
kind of qsort actually.
I created another data set with distinct values in all sort keys:
create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (999994 - c1)::text;
explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
Results:
MKsort:
12374.427 ms
12528.068 ms
12554.718 ms
qsort:
12251.422 ms
12279.938 ms
12280.254 ms
MKsort is a bit slower than qsort, which can be explained by extra
checks of MKsort.
Yao Wang
Attachments:
v3-Implement-multi-key-sort.patch (application/octet-stream)
From dfeba76ea15f4a56535b1534aa8a06880cdc32cb Mon Sep 17 00:00:00 2001
From: Yao Wang <yaowangm@outlook.com>
Date: Tue, 7 May 2024 08:11:13 +0000
Subject: [PATCH] Implement multi-key sort
MKsort (multi-key sort) is an alternative to the standard qsort algorithm
that performs better in particular sort scenarios, i.e. when the data
set has multiple keys to be sorted. Compared to classic quicksort, it
can deliver a significant performance improvement once multiple keys are
available.
Author: Yao Wang <yao-yw.wang@broadcom.com>
Co-author: Hongxu Ma <hongxu.ma@broadcom.com>
---
src/backend/utils/misc/guc_tables.c | 11 +
src/backend/utils/sort/mksort_tuple.c | 384 ++++++++++++++++++
src/backend/utils/sort/tuplesort.c | 44 ++
src/backend/utils/sort/tuplesortvariants.c | 313 ++++++++++++--
src/include/c.h | 4 +
src/include/utils/tuplesort.h | 34 +-
src/test/regress/expected/geometry.out | 4 +-
.../regress/expected/incremental_sort.out | 12 +-
src/test/regress/expected/tuplesort.out | 376 +++++++++++++++++
src/test/regress/expected/window.out | 58 +--
src/test/regress/sql/geometry.sql | 2 +-
src/test/regress/sql/tuplesort.sql | 66 +++
src/test/regress/sql/window.sql | 22 +-
13 files changed, 1250 insertions(+), 80 deletions(-)
create mode 100644 src/backend/utils/sort/mksort_tuple.c
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3fd0b14dd8..b8fe447d68 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -103,6 +103,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool ignore_invalid_pages;
+extern bool enable_mk_sort;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -839,6 +840,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_mk_sort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables multi-key sort."),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_mk_sort,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
diff --git a/src/backend/utils/sort/mksort_tuple.c b/src/backend/utils/sort/mksort_tuple.c
new file mode 100644
index 0000000000..85e69aa783
--- /dev/null
+++ b/src/backend/utils/sort/mksort_tuple.c
@@ -0,0 +1,384 @@
+/*
+ * MKsort (multiple-key sort) is an alternative to the standard qsort algorithm
+ * that performs better in particular sort scenarios, i.e. when the
+ * data set has multiple keys to be sorted.
+ *
+ * The sorting algorithm blends Quicksort and radix sort. Like regular
+ * Quicksort, it partitions its input into sets less than and greater than a
+ * given value; like radix sort, it moves on to the next field once the current
+ * input is known to be equal in the given field.
+ *
+ * The implementation is based on the paper:
+ * Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
+ * Searching Strings", Jan 1997
+ *
+ * Some refinements related to the additional handling of equal tuples
+ * have been added to stay consistent with the postgres qsort
+ * implementation.
+ *
+ * For now, mksort_tuple() is called in tuplesort_sort_memtuples() as a
+ * replacement of qsort_tuple() when specific conditions are satisfied.
+ */
+
+/* Swap two tuples in sort tuple array */
+static inline void
+mksort_swap(int a,
+ int b,
+ SortTuple *x)
+{
+ SortTuple t;
+
+ if (a == b)
+ return;
+ t = x[a];
+ x[a] = x[b];
+ x[b] = t;
+}
+
+/* Swap tuples by batch in sort tuple array */
+static inline void
+mksort_vec_swap(int a,
+ int b,
+ int size,
+ SortTuple *x)
+{
+ while (size-- > 0)
+ {
+ mksort_swap(a, b, x);
+ a++;
+ b++;
+ }
+}
+
+/*
+ * Check whether the current datum (at the specified tuple and depth) is null.
+ * Note that x points to a single tuple provided by the caller, not to
+ * a tuple array, so no tupleIndex is needed.
+ */
+static inline bool
+check_datum_null(SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum;
+ bool isNull;
+
+ /* Since we have a specified tuple, the tupleIndex is always 0 */
+ state->base.mksortGetDatumFunc(x, 0, depth, state, &datum, &isNull, false);
+
+ /*
+ * Note: for "abbreviated key", we don't need to handle more here because
+ * if "abbreviated key" of a datum is null, the "full" datum must be null.
+ */
+
+ return isNull;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * If "abbreviated key" is disabled:
+ * get specified datums and compare them by ApplySortComparator().
+ * If "abbreviated key" is enabled:
+ * Only the first datum may be an abbreviated key by design (see the comments
+ * on struct SortTuple), so the first datum is handled differently from the
+ * other datums.
+ * For first datum (depth == 0): get first datums ("abbr key" version) and
+ * compare them by ApplySortComparator(). If they are equal, get "full"
+ * version and compare again by ApplySortAbbrevFullComparator().
+ * For other datums: get specified datums and compare them by
+ * ApplySortComparator() as regular routine does.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mksort_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum1, datum2;
+ bool isNull1, isNull2;
+ SortSupport sortKey;
+ int ret = 0;
+
+ Assert(state->base.mksortGetDatumFunc);
+
+ sortKey = state->base.sortKeys + depth;
+ state->base.mksortGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, false);
+ state->base.mksortGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, false);
+
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+ /*
+ * If "abbreviated key" is enabled and we are at the first depth, only the
+ * abbreviated keys have been compared so far. If ApplySortComparator()
+ * reports them as equal, we need an extra comparison of the "full" datums
+ * by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter &&
+ depth == 0 &&
+ ret == 0)
+ {
+ /* Fetch "full" datum by setting useFullKey = true */
+ state->base.mksortGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, true);
+ state->base.mksortGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, true);
+
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ }
+
+ return ret;
+}
+
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Verify whether the SortTuple list is ordered or not at specified depth
+ */
+static void
+mksort_verify(SortTuple *x,
+ int n,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret;
+
+ for (int i = 0; i < n - 1; i++)
+ {
+ ret = mksort_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ Assert(ret <= 0);
+ }
+}
+#endif
+
+/*
+ * Main routine of multi-key sort
+ *
+ * seenNull indicates whether we have seen NULL in any datum we checked
+ */
+static void
+mksort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
+{
+ /*
+ * In the process, the tuple array consists of five parts:
+ * left equal, less, not-processed, greater, right equal
+ *
+ * lessStart indicates the first position of less part
+ * lessEnd indicates the next position after less part
+ * greaterStart indicates the prior position before greater part
+ * greaterEnd indicates the last position of greater part
+ * the range between lessEnd and greaterStart (inclusive) is not-processed
+ */
+ int lessStart, lessEnd, greaterStart, greaterEnd, tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+ bool strictOrdered = true;
+
+ Assert(depth <= state->base.nKeys);
+ Assert(state->base.sortKeys);
+ Assert(state->base.mksortGetDatumFunc);
+
+ if (n <= 1)
+ return;
+
+ /* If we have exceeded the max depth, return immediately */
+ if (depth == state->base.nKeys)
+ return;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Check if the array is ordered already. If yes, return immediately.
+ * Different from qsort_tuple(), the array must be strictly ordered (no
+ * equal datums). If there are equal datums, we must continue the mksort
+ * process to compare the datums at deeper levels.
+ */
+ for (int i = 0; i < n - 1; i++)
+ {
+ int ret;
+
+ CHECK_FOR_INTERRUPTS();
+ ret = mksort_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ if (ret >= 0)
+ {
+ strictOrdered = false;
+ break;
+ }
+ }
+
+ if (strictOrdered)
+ return;
+
+ /* Use the middle element as pivot and move it to the first position */
+ lessStart = n / 2;
+ mksort_swap(0, lessStart, x);
+ pivot = x;
+
+ lessStart = 1;
+ lessEnd = 1;
+ greaterStart = n - 1;
+ greaterEnd = n - 1;
+
+ /* Partition the array by comparing each tuple with the pivot */
+ while (true)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Compare the left end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare lessEnd and pivot at current depth */
+ dist = mksort_compare_datum(x + lessEnd,
+ pivot,
+ depth,
+ state);
+
+ if (dist > 0)
+ break;
+
+ /* If lessEnd is equal to pivot, move it to lessStart */
+ if (dist == 0)
+ {
+ mksort_swap(lessEnd, lessStart, x);
+ lessStart++;
+ }
+ lessEnd++;
+ }
+
+ /* Compare the right end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare greaterStart and pivot at current depth */
+ dist = mksort_compare_datum(x + greaterStart,
+ pivot,
+ depth,
+ state);
+
+ if (dist < 0)
+ break;
+
+ /* If greaterStart is equal to pivot, move it to greaterEnd */
+ if (dist == 0)
+ {
+ mksort_swap(greaterStart, greaterEnd, x);
+ greaterEnd--;
+ }
+ greaterStart--;
+ }
+
+ if (lessEnd > greaterStart)
+ break;
+ mksort_swap(lessEnd, greaterStart, x);
+ lessEnd++;
+ greaterStart--;
+ }
+
+ /*
+ * Now the array has four parts:
+ * left equal, lesser, greater, right equal
+ * Note greaterStart is less than lessEnd now
+ */
+
+ /* Move the left equal part to middle */
+ dist = Min(lessStart, lessEnd - lessStart);
+ mksort_vec_swap(0, lessEnd - dist, dist, x);
+
+ /* Move the right equal part to middle */
+ dist = Min(greaterEnd - greaterStart, n - greaterEnd - 1);
+ mksort_vec_swap(lessEnd, n - dist, dist, x);
+
+ /*
+ * Now the array has three parts:
+ * lesser, equal, greater
+ * Note that one or two parts may have no element at all.
+ */
+
+ /* Recursively sort the lesser part */
+
+ /* dist means the size of less part */
+ dist = lessEnd - lessStart;
+ mksort_tuple(x,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+ /* Recursively sort the equal part */
+
+ /*
+ * (x + dist) means the first tuple in the equal part
+ * Since all tuples have equal datums at current depth, we just check any one
+ * of them to determine whether we have seen null datum.
+ */
+ isDatumNull = check_datum_null(x + dist, depth, state);
+
+ /* (lessStart + n - greaterEnd - 1) means the size of equal part */
+ tupCount = lessStart + n - greaterEnd - 1;
+
+ if (depth < state->base.nKeys - 1)
+ {
+ mksort_tuple(x + dist,
+ tupCount,
+ depth + 1,
+ state,
+ seenNull || isDatumNull);
+ } else {
+ /*
+ * We have reached the max depth: call mksortHandleDupFunc to handle duplicated
+ * tuples if necessary, e.g. to check uniqueness or do an extra comparison
+ */
+
+ /*
+ * Call mksortHandleDupFunc if:
+ * 1. mksortHandleDupFunc is filled
+ * 2. the size of equal part > 1
+ */
+ if (state->base.mksortHandleDupFunc &&
+ (tupCount > 1))
+ {
+ state->base.mksortHandleDupFunc(x + dist,
+ tupCount,
+ seenNull || isDatumNull,
+ state);
+ }
+ }
+
+ /* Recursively sort the greater part */
+
+ /* dist means the size of greater part */
+ dist = greaterEnd - greaterStart;
+ mksort_tuple(x + n - dist,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+#ifdef USE_ASSERT_CHECKING
+ mksort_verify(x,
+ n,
+ depth,
+ state);
+#endif
+}
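
For readers who want the core idea in isolation, here is a minimal standalone
sketch of multi-key quicksort. It is not the patch's code: Row, NKEYS,
swap_rows() and mk_quicksort() are hypothetical names, the keys are plain
ints, and it uses a simple three-way (Dutch national flag) partition instead
of the "equal parts at both ends" scheme used by mksort_tuple(). The point it
shows is the same, though: the less and greater parts are sorted again at the
same depth, while the equal part moves on to the next key.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3					/* hypothetical: number of sort keys per row */

typedef struct Row
{
	int			key[NKEYS];
} Row;

/* Swap two rows of the array by index */
static void
swap_rows(Row *rows, size_t a, size_t b)
{
	Row			tmp = rows[a];

	rows[a] = rows[b];
	rows[b] = tmp;
}

/* Sort rows[0..n-1] on key[depth], then key[depth + 1], and so on */
static void
mk_quicksort(Row *rows, size_t n, int depth)
{
	size_t		lt = 0;			/* rows[0..lt) are less than the pivot */
	size_t		i = 0;			/* rows[lt..i) are equal to the pivot */
	size_t		gt = n;			/* rows[gt..n) are greater than the pivot */
	int			pivot;

	assert(depth >= 0);

	/* Nothing to do for tiny inputs or when all keys are consumed */
	if (n <= 1 || depth >= NKEYS)
		return;

	/* Use the middle element's key at this depth as pivot */
	pivot = rows[n / 2].key[depth];

	while (i < gt)
	{
		int			v = rows[i].key[depth];

		if (v < pivot)
			swap_rows(rows, lt++, i++);
		else if (v > pivot)
			swap_rows(rows, i, --gt);
		else
			i++;
	}

	/* Less and greater parts: same key. Equal part: next key. */
	mk_quicksort(rows, lt, depth);
	mk_quicksort(rows + lt, gt - lt, depth + 1);
	mk_quicksort(rows + gt, n - gt, depth);
}
```

Calling mk_quicksort(rows, n, 0) sorts the whole array. Because the equal part
never looks at key[depth] again, rows with long runs of equal leading keys
need fewer comparisons than a comparator that always restarts from the first
key.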
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 7c4d6dc106..c865772a7a 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -108,6 +108,7 @@
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/tuplesort.h"
+#include "common/pg_prng.h"
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -128,6 +129,7 @@ bool trace_sort = false;
bool optimize_bounded_sort = true;
#endif
+bool enable_mk_sort = true;
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
@@ -337,6 +339,9 @@ struct Tuplesortstate
#ifdef TRACE_SORT
PGRUsage ru_start;
#endif
+
+ /* Whether multi-key sort is used */
+ bool mksortUsed;
};
/*
@@ -622,6 +627,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#define ST_DEFINE
#include "lib/sort_template.h"
+#include "mksort_tuple.c"
+
/*
* tuplesort_begin_xxx
*
@@ -690,6 +697,7 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
state->base.sortopt = sortopt;
state->base.tuples = true;
state->abbrevNext = 10;
+ state->mksortUsed = false;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -2559,6 +2567,8 @@ tuplesort_get_stats(Tuplesortstate *state,
case TSS_SORTEDINMEM:
if (state->boundUsed)
stats->sortMethod = SORT_TYPE_TOP_N_HEAPSORT;
+ else if (state->mksortUsed)
+ stats->sortMethod = SORT_TYPE_MKSORT;
else
stats->sortMethod = SORT_TYPE_QUICKSORT;
break;
@@ -2592,6 +2602,8 @@ tuplesort_method_name(TuplesortMethod m)
return "external sort";
case SORT_TYPE_EXTERNAL_MERGE:
return "external merge";
+ case SORT_TYPE_MKSORT:
+ return "multi-key sort";
}
return "unknown";
@@ -2717,6 +2729,38 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
+ /*
+ * Apply multi-key sort when:
+ * 1. enable_mk_sort is set
+ * 2. There are multiple keys available
+ * 3. mksortGetDatumFunc is filled, which implies that current tuple
+ * type is supported by mksort. (For now only Heap tuple and Btree
+ * Index tuple are supported, and more types may be supported in
+ * future.)
+ *
+ * A summary of tuple types supported by mksort:
+ *
+ * HeapTuple: supported
+ * IndexTuple(btree): supported
+ * IndexTuple(hash): not supported because there is only one key
+ * DatumTuple: not supported because there is only one key
+ * HeapTuple(for cluster): not supported yet
+ * IndexTuple(gist): not supported yet
+ * IndexTuple(brin): not supported yet
+ */
+ if (enable_mk_sort &&
+ state->base.nKeys > 1 &&
+ state->base.mksortGetDatumFunc != NULL)
+ {
+ state->mksortUsed = true;
+ mksort_tuple(state->memtuples,
+ state->memtupcount,
+ 0,
+ state,
+ false);
+ return;
+ }
+
/*
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 05a853caa3..c105eb1e35 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -30,6 +30,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/tuplesort.h"
+#include "miscadmin.h"
/* sort-type codes for sort__start probes */
@@ -92,6 +93,41 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
+static Datum mksort_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static Datum mksort_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static void
+mksort_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
+static int
+mksort_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
+
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
+
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
+
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
* the tuplesort_begin_cluster.
@@ -163,6 +199,14 @@ typedef struct BrinSortTuple
/* Size of the BrinSortTuple, given length of the BrinTuple. */
#define BRINSORTTUPLE_SIZE(len) (offsetof(BrinSortTuple, tuple) + (len))
+#define ST_SORT qsort_tuple_by_itempointer
+#define ST_ELEMENT_TYPE SortTuple
+#define ST_COMPARE(a, b, state) mksort_compare_equal_index_btree(a, b, state)
+#define ST_COMPARE_ARG_TYPE Tuplesortstate
+#define ST_CHECK_FOR_INTERRUPTS
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
Tuplesortstate *
tuplesort_begin_heap(TupleDesc tupDesc,
@@ -200,6 +244,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
base->removeabbrev = removeabbrev_heap;
base->comparetup = comparetup_heap;
base->comparetup_tiebreak = comparetup_heap_tiebreak;
+ base->mksortGetDatumFunc = mksort_get_datum_heap;
base->writetup = writetup_heap;
base->readtup = readtup_heap;
base->haveDatum1 = true;
@@ -388,6 +433,8 @@ tuplesort_begin_index_btree(Relation heapRel,
base->removeabbrev = removeabbrev_index;
base->comparetup = comparetup_index_btree;
base->comparetup_tiebreak = comparetup_index_btree_tiebreak;
+ base->mksortGetDatumFunc = mksort_get_datum_index_btree;
+ base->mksortHandleDupFunc = mksort_handle_dup_index_btree;
base->writetup = writetup_index;
base->readtup = readtup_index;
base->haveDatum1 = true;
@@ -1543,18 +1590,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
Assert(tuple1 != tuple2);
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ raise_error_of_dup_index(tuple1, state);
}
/*
@@ -1563,25 +1599,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
* attribute in order to ensure that all keys in the index are physically
* unique.
*/
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
}
static int
@@ -1888,3 +1906,236 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
+
+/*
+ * Get specified datum from SortTuple (HeapTuple) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_heap() for details.
+ */
+static Datum
+mksort_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc = NULL;
+ HeapTupleData heapTuple;
+ AttrNumber attno;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys + depth;
+
+ Assert(state);
+ Assert(depth < base->nKeys);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * may have an "abbreviated key", so only the first datum ever needs to
+ * be fetched in its "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ tupDesc = (TupleDesc)base->arg;
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *datum = sortTuple->datum1;
+ *isNull = sortTuple->isnull1;
+ return *datum;
+ }
+
+ /* Otherwise (depth > 0 or useFullKey), extract it from sortTuple->tuple */
+ heapTuple.t_len = ((MinimalTuple) sortTuple->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ heapTuple.t_data = (HeapTupleHeader) ((char *) sortTuple->tuple - MINIMAL_TUPLE_OFFSET);
+ attno = sortKey->ssup_attno;
+ *datum = heap_getattr(&heapTuple, attno, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Get specified datum from SortTuple (IndexTuple for btree index) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_index_btree() for details.
+ */
+static Datum
+mksort_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc;
+ IndexTuple indexTuple;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ Assert(state);
+ Assert(depth < base->nKeys);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * may have an "abbreviated key", so only the first datum ever needs to
+ * be fetched in its "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *isNull = sortTuple->isnull1;
+ *datum = sortTuple->datum1;
+ return *datum;
+ }
+
+ indexTuple = (IndexTuple) sortTuple->tuple;
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+
+ /*
+ * Set parameter attnum = depth + 1 because attnum starts from 1 but depth
+ * starts from 0
+ */
+ *datum = index_getattr(indexTuple, depth + 1, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Handle duplicated SortTuples (IndexTuple for btree index) during mksort
+ * x: the duplicated tuple list
+ * tupleCount: count of the tuples
+ */
+static void
+mksort_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ /* With enforceUnique, raise error unless NULLs are distinct and one was seen */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
+ {
+ Assert(base->comparetup == comparetup_index_btree);
+
+ /*
+ * x means the first tuple of duplicated tuple list
+ * Since they are duplicated, simply pick up the first one
+ * to raise error
+ */
+ raise_error_of_dup_index((IndexTuple)(x->tuple), state);
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ qsort_tuple_by_itempointer(x,
+ tupleCount,
+ state);
+}
+
+/*
+ * Compare two btree index tuples by ItemPointer
+ * It is the comparator of qsort_tuple_by_itempointer(), which is called by
+ * mksort_handle_dup_index_btree()
+ */
+static int
+mksort_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state)
+{
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
+}
+
+/* Compare two index tuples by ItemPointer */
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2)
+{
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+/* Raise error for duplicated tuple when creating unique index */
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state)
+{
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ index_deform_tuple((IndexTuple)x, tupDesc, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+}
diff --git a/src/include/c.h b/src/include/c.h
index dc1841346c..f7c368cd16 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -857,12 +857,14 @@ typedef NameData *Name;
#define Assert(condition) ((void)true)
#define AssertMacro(condition) ((void)true)
+#define AssertImply(condition1, condition2) ((void)true)
#elif defined(FRONTEND)
#include <assert.h>
#define Assert(p) assert(p)
#define AssertMacro(p) ((void) assert(p))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
#else /* USE_ASSERT_CHECKING && !FRONTEND */
@@ -886,6 +888,8 @@ typedef NameData *Name;
((void) ((condition) || \
(ExceptionalCondition(#condition, __FILE__, __LINE__), 0)))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
+
#endif /* USE_ASSERT_CHECKING && !FRONTEND */
/*
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index e7941a1f09..d3f27b49dc 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -29,7 +29,6 @@
#include "utils/relcache.h"
#include "utils/sortsupport.h"
-
/*
* Tuplesortstate and Sharedsort are opaque types whose details are not
* known outside tuplesort.c.
@@ -79,9 +78,10 @@ typedef enum
SORT_TYPE_QUICKSORT = 1 << 1,
SORT_TYPE_EXTERNAL_SORT = 1 << 2,
SORT_TYPE_EXTERNAL_MERGE = 1 << 3,
+ SORT_TYPE_MKSORT = 1 << 4,
} TuplesortMethod;
-#define NUM_TUPLESORTMETHODS 4
+#define NUM_TUPLESORTMETHODS 5
typedef enum
{
@@ -155,6 +155,21 @@ typedef struct
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
+typedef Datum
+(*MksortGetDatumFunc) (SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+typedef void
+(*MksortHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
/*
* The public part of a Tuple sort operation state. This data structure
* contains the definition of sort-variant-specific interface methods and
@@ -249,6 +264,21 @@ typedef struct
bool tuples; /* Can SortTuple.tuple ever be set? */
void *arg; /* Specific information for the sort variant */
+
+ /*
+ * Function pointer, referencing a function to get specified datum from
+ * SortTuple list with multi-key.
+ * Used by mksort_tuple().
+ */
+ MksortGetDatumFunc mksortGetDatumFunc;
+
+ /*
+ * Function pointer, referencing a function to handle duplicated tuple
+ * from SortTuple list with multi-key.
+ * Used by mksort_tuple().
+ * For now, the function pointer is filled for only btree index tuple.
+ */
+ MksortHandleDupFunc mksortHandleDupFunc;
} TuplesortPublic;
/* Sort parallel code from state for sort__start probes */
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46b..094d22861c 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -4273,7 +4273,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
circle | point | distance
----------------+-------------------+---------------
<(1,2),3> | (-3,4) | 1.472135955
@@ -4310,8 +4310,8 @@ SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
<(3,5),0> | (Infinity,1e+300) | Infinity
<(1,2),3> | (1e+300,Infinity) | Infinity
<(5,1),3> | (1e+300,Infinity) | Infinity
- <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,2),3> | (Infinity,1e+300) | Infinity
+ <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,3),5> | (1e+300,Infinity) | Infinity
<(1,3),5> | (Infinity,1e+300) | Infinity
<(100,200),10> | (1e+300,Infinity) | Infinity
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..e8dba83389 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -520,13 +520,13 @@ select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 55');
- explain_analyze_without_memory
----------------------------------------------------------------------------------------------------------------
+ explain_analyze_without_memory
+--------------------------------------------------------------------------------------------------------------------
Limit (actual rows=55 loops=1)
-> Incremental Sort (actual rows=55 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 2 Sort Methods: top-N heapsort, multi-key sort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
@@ -554,7 +554,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Group Count": 2, +
"Sort Methods Used": [ +
"top-N heapsort", +
- "quicksort" +
+ "multi-key sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
@@ -728,7 +728,7 @@ select explain_analyze_without_memory('select * from (select * from t order by a
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 1 Sort Method: multi-key sort Average Memory: NNkB Peak Memory: NNkB
Pre-sorted Groups: 5 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=1000 loops=1)
Sort Key: t.a
@@ -756,7 +756,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Full-sort Groups": { +
"Group Count": 1, +
"Sort Methods Used": [ +
- "quicksort" +
+ "multi-key sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..4ec7b10884 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -703,3 +703,379 @@ EXPLAIN (COSTS OFF) :qry;
(10 rows)
COMMIT;
+-- Test cases for multi-key sort
+set work_mem='100MB';
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+---+----+------
+ 0 | 5 | 98f1
+ 0 | 10 | d3d9
+ 1 | 1 | c4ca
+ 1 | 11 | 6512
+ 2 | 2 | c81e
+ 2 | 12 | c20a
+ 3 | 3 | eccb
+ 3 | 13 | c51c
+ 4 | 4 | a87f
+ 4 | 14 | aab3
+ 5 | 0 | 9bf3
+ 5 | 5 | e4da
+ 6 | 1 | c74d
+ 6 | 6 | 1679
+ 7 | 2 | 70ef
+ 7 | 7 | 8f14
+ 8 | 3 | 6f49
+ 8 | 8 | c9f0
+ 9 | 4 | 1f0e
+ 9 | 9 | 45c4
+(20 rows)
+
+-- test sorting on distinct values, in which mksort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+----+----+----
+ 0 | 20 | 20
+ 1 | 19 | 19
+ 2 | 18 | 18
+ 3 | 17 | 17
+ 4 | 16 | 16
+ 5 | 15 | 15
+ 6 | 14 | 14
+ 7 | 13 | 13
+ 8 | 12 | 12
+ 9 | 11 | 11
+ 10 | 10 | 10
+ 11 | 9 | 9
+ 12 | 8 | 8
+ 13 | 7 | 7
+ 14 | 6 | 6
+ 15 | 5 | 5
+ 16 | 4 | 4
+ 17 | 3 | 3
+ 18 | 2 | 2
+ 19 | 1 | 1
+(20 rows)
+
+drop table mksort_simple_tbl;
+-- test table with abbr keys
+create table abbr_tbl (a int, b varchar(100), c uuid);
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+--(see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+select c, b, a from abbr_tbl order by c, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+(50 rows)
+
+select c, b, a from abbr_tbl order by c desc, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+(50 rows)
+
+select c, b, a from abbr_tbl order by c, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+NOTICE: index "idx_abbr_tbl" does not exist, skipping
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+---
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+(1 row)
+
+-- Uniqueness check of CREATE INDEX
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniqueness check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+ERROR: could not create unique index "idx_abbr_tbl"
+DETAIL: Key (c, b, a)=(00000000-0000-0000-0000-000000000001, aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1, 1) is duplicated.
+drop table abbr_tbl;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ae4e8851f8..2de20ca1d0 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -18,13 +18,13 @@ INSERT INTO empsalary VALUES
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 25100
develop | 9 | 4500 | 25100
- develop | 11 | 5200 | 25100
develop | 10 | 5200 | 25100
+ develop | 11 | 5200 | 25100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 7400
personnel | 2 | 3900 | 7400
@@ -33,13 +33,13 @@ SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM emps
sales | 1 | 5000 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 7 | 4200 | 1
develop | 9 | 4500 | 2
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
personnel | 5 | 3500 | 1
personnel | 2 | 3900 | 2
@@ -90,18 +90,18 @@ SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PA
sales | 4 | 4800 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
depname | empno | salary | rank
-----------+-------+--------+------
- develop | 7 | 4200 | 1
- personnel | 5 | 3500 | 1
sales | 3 | 4800 | 1
sales | 4 | 4800 | 1
+ personnel | 5 | 3500 | 1
+ develop | 7 | 4200 | 1
personnel | 2 | 3900 | 2
develop | 9 | 4500 | 2
sales | 1 | 5000 | 3
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
(10 rows)
@@ -3749,23 +3749,24 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
empno | depname | rn | rnk | cnt
-------+-----------+----+-----+-----
- 8 | develop | 1 | 1 | 1
- 10 | develop | 2 | 2 | 1
- 11 | develop | 3 | 3 | 1
- 9 | develop | 4 | 4 | 2
- 7 | develop | 5 | 4 | 2
- 2 | personnel | 1 | 1 | 1
- 5 | personnel | 2 | 2 | 1
1 | sales | 1 | 1 | 1
+ 2 | personnel | 1 | 1 | 1
3 | sales | 2 | 2 | 1
4 | sales | 3 | 3 | 1
+ 5 | personnel | 2 | 2 | 1
+ 7 | develop | 4 | 4 | 1
+ 8 | develop | 1 | 1 | 1
+ 9 | develop | 5 | 5 | 1
+ 10 | develop | 2 | 2 | 1
+ 11 | develop | 3 | 3 | 1
(10 rows)
-- Test pushdown of quals into a subquery containing window functions
@@ -4106,17 +4107,17 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
empno | depname | salary | c
-------+-----------+--------+---
+ 1 | sales | 5000 | 1
+ 2 | personnel | 3900 | 1
+ 3 | sales | 4800 | 3
+ 4 | sales | 4800 | 3
+ 5 | personnel | 3500 | 2
8 | develop | 6000 | 1
10 | develop | 5200 | 3
11 | develop | 5200 | 3
- 2 | personnel | 3900 | 1
- 5 | personnel | 3500 | 2
- 1 | sales | 5000 | 1
- 4 | sales | 4800 | 3
- 3 | sales | 4800 | 3
(8 rows)
-- Ensure we get the correct run condition when the window function is both
@@ -4468,14 +4469,15 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
depname | empno | salary | enroll_date | first_emp | last_emp
-----------+-------+--------+-------------+-----------+----------
+ develop | 7 | 4200 | 01-01-2008 | 4 | 1
develop | 8 | 6000 | 10-01-2006 | 1 | 5
- develop | 7 | 4200 | 01-01-2008 | 5 | 1
personnel | 2 | 3900 | 12-23-2006 | 1 | 2
personnel | 5 | 3500 | 12-10-2007 | 2 | 1
sales | 1 | 5000 | 10-01-2006 | 1 | 3
diff --git a/src/test/regress/sql/geometry.sql b/src/test/regress/sql/geometry.sql
index c3ea368da5..1f47f07f31 100644
--- a/src/test/regress/sql/geometry.sql
+++ b/src/test/regress/sql/geometry.sql
@@ -403,7 +403,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
-- To polygon
SELECT f1, f1::polygon FROM CIRCLE_TBL WHERE f1 >= '<(0,0),1>';
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..7812a3e2bb 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -305,3 +305,69 @@ EXPLAIN (COSTS OFF) :qry;
:qry;
COMMIT;
+
+-- Test cases for multi-key sort
+
+set work_mem='100MB';
+
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+-- test sorting on distinct values, in which mksort is supposed to be
+-- not effective, but can still generate a correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+drop table mksort_simple_tbl;
+
+-- test table with abbr keys
+
+create table abbr_tbl (a int, b varchar(100), c uuid);
+
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids that differ only in their trailing bytes should
+-- have the same abbr key but different "full" datums.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+
+select c, b, a from abbr_tbl order by c, b, a;
+select c, b, a from abbr_tbl order by c desc, b, a;
+select c, b, a from abbr_tbl order by c, b desc, a;
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+
+-- Uniqueness check of CREATE INDEX
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniqueness check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop table abbr_tbl;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6de5493b05..46359cb796 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -21,9 +21,9 @@ INSERT INTO empsalary VALUES
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
@@ -31,7 +31,7 @@ GROUP BY four, ten ORDER BY four, ten;
SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
@@ -1146,11 +1146,12 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
-- Test pushdown of quals into a subquery containing window functions
@@ -1332,7 +1333,7 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
@@ -1510,10 +1511,11 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
-- cleanup
DROP TABLE empsalary;
--
2.25.1
To be accurate, "multi-key sort" covers both "multi-key quick sort"
and "multi-key heap sort". This patch contains only the code changes for
"multi-key quick sort", which replaces the standard quick sort in
tuplesort. "Multi-key heap sort" requires an implementation of a
multi-key heap and should be treated as a separate task. The naming
needs to be clarified to avoid confusion.
The updated code changes only function/variable names and the
relevant comments, plus some minor assertion changes. Please see the
attachment.
Thanks,
Yao Wang
On Fri, May 31, 2024 at 8:09 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
I added two optimizations to mksort that already exist in qsort_tuple():
1. When selecting the pivot, always pick the item in the middle of the array
instead of a random one. Theoretically this has the same effect as the old
approach, but it eliminates some unstable perf test results and gives a small
perf benefit by removing the random value generator.
2. Always check whether the array is already ordered, and return
immediately if it is. The pre-ordered check has an extra cost and
hurts perf numbers on some data sets, but improves perf
significantly on others.
By now, mksort has perf results equal to or better than qsort on all data
sets I have used.
I also updated the test case. Please see the v3 code as attachment.
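The two optimizations can be illustrated on a plain int array. This is only a minimal sketch, not the patch code: is_strictly_ordered() and middle_pivot() are hypothetical names, and the real check compares SortTuple datums at the current depth. The point it shows is that the pre-ordered check has to be strict: equal neighbours count as "not ordered", because their later keys still need to be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): returns true when the values
 * are in strictly ascending order. Equal neighbours are reported as
 * "not ordered" because a multi-key sort would still have to compare
 * their later keys.
 */
static bool
is_strictly_ordered(const int *keys, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
    {
        if (keys[i] >= keys[i + 1])
            return false;
    }
    return true;
}

/* Hypothetical helper: deterministic pivot choice, the middle element. */
static size_t
middle_pivot(size_t n)
{
    return n / 2;
}
```

An array such as {1, 2, 2, 4} is rejected by the strict check even though it is sorted on this key, so the sort continues and the two equal entries are ordered by the next key.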
Perf test results:
Data set 1 (with mass duplicate values):
-----------------------------------------
create table t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t1 values (generate_series(1,499999), 0, 0, 0, 0,
'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (c1 % 5)::text;
Query 1:
explain analyze select c1 from t1 order by c6, c5, c4, c3, c2, c1;
Disable Mksort
3021.636 ms
3014.669 ms
3033.588 ms
Enable Mksort
1688.590 ms
1686.956 ms
1688.567 ms
The improvement is 78.9%, which is lower than in the previous version
(129%). Most of the added cost is probably the pre-ordered check.
Query 2:
create index idx_t1_mk on t1 (c6, c5, c4, c3, c2, c1);
Disable Mksort
1674.648 ms
1680.608 ms
1681.373 ms
Enable Mksort
1143.341 ms
1143.462 ms
1143.894 ms
The improvement is ~47%, which is also slightly lower than before (52%).
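For readers who want to reproduce the percentages: the figures quoted for data set 1 appear consistent with (qsort time / mksort time - 1) applied to the first run of each group. That reading is an assumption, not something stated in the thread, and improvement_pct() below is a hypothetical helper written only to show the arithmetic.

```c
#include <assert.h>

/*
 * Hypothetical helper: relative improvement in percent, taking the
 * baseline (mksort disabled) and candidate (mksort enabled) timings.
 * A positive result means the candidate is faster.
 */
static double
improvement_pct(double baseline_ms, double candidate_ms)
{
    return (baseline_ms / candidate_ms - 1.0) * 100.0;
}
```

With the first ORDER BY run, improvement_pct(3021.636, 1688.590) lands close to the quoted 78.9%, and with the first CREATE INDEX run, improvement_pct(1674.648, 1143.341) lands between 46% and 47%.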
Data set 2 (with distinct values):
----------------------------------
create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (999994 - c1)::text;
Query 1:
explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
Disable Mksort
12199.963 ms
12197.068 ms
12191.657 ms
Enable Mksort
9538.219 ms
9571.681 ms
9536.335 ms
The improvement is 27.9%, which is much better than the old approach (-6.2%).
Query 2 (the data is pre-ordered):
explain analyze select c1 from t2 order by c6 desc, c5, c4, c3, c2, c1;
Enable Mksort
768.191 ms
768.079 ms
767.026 ms
Disable Mksort
768.757 ms
766.166 ms
766.149 ms
They are almost the same since no actual sort was performed, and much
better than the old approach (-1198.1%).
Thanks,
Yao Wang
On Fri, May 24, 2024 at 8:50 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
When all leading keys are different, mksort finishes the entire sort on the
first sort key and never touches the other keys. In that case, mksort
effectively falls back to a kind of qsort.
I created another data set with distinct values in all sort keys:
create table t2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 varchar(100));
insert into t2 values (generate_series(1,499999), 0, 0, 0, 0, '');
update t2 set c2 = 999990 - c1, c3 = 999991 - c1, c4 = 999992 - c1, c5
= 999993 - c1;
update t2 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (999994 - c1)::text;
explain analyze select c1 from t2 order by c6, c5, c4, c3, c2, c1;
Results:
MKsort:
12374.427 ms
12528.068 ms
12554.718 ms
qsort:
12251.422 ms
12279.938 ms
12280.254 ms
MKsort is a bit slower than qsort, which can be explained by the extra
checks MKsort performs.
Yao Wang
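The behaviour described above (later keys are never touched when the leading key is distinct) can be seen in a simplified multi-key quick sort. This is a sketch under stated assumptions, not the patch code: Row, NKEYS, mk_sort_rows() and max_depth_seen are hypothetical names, rows hold plain ints instead of datums, and a basic three-way partition replaces the "left equal / right equal" scheme used in mk_qsort_tuple().

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3

/* Hypothetical row type for illustration; the patch sorts SortTuple. */
typedef struct Row
{
    int key[NKEYS];
} Row;

/* Deepest key index at which a partition with more than one row was seen. */
static int max_depth_seen = 0;

static void
swap_rows(Row *a, Row *b)
{
    Row t = *a;

    *a = *b;
    *b = t;
}

/*
 * Simplified multi-key quick sort: three-way partition on key[depth],
 * then recurse on "less" and "greater" at the same depth and on "equal"
 * at depth + 1.
 */
static void
mk_sort_rows(Row *x, size_t n, int depth)
{
    size_t lt = 0, i = 1, gt = n;
    int    pivot;

    if (n <= 1 || depth == NKEYS)
        return;
    if (depth > max_depth_seen)
        max_depth_seen = depth;

    /* Middle element as pivot, moved to the first position */
    swap_rows(&x[0], &x[n / 2]);
    pivot = x[0].key[depth];

    /* Invariant: x[0..lt) < pivot, x[lt..i) == pivot, x[gt..n) > pivot */
    while (i < gt)
    {
        if (x[i].key[depth] < pivot)
            swap_rows(&x[lt++], &x[i++]);
        else if (x[i].key[depth] > pivot)
            swap_rows(&x[i], &x[--gt]);
        else
            i++;
    }

    mk_sort_rows(x, lt, depth);                 /* less part */
    mk_sort_rows(x + lt, gt - lt, depth + 1);   /* equal part, next key */
    mk_sort_rows(x + gt, n - gt, depth);        /* greater part */
}
```

When every leading key is distinct, each equal part holds a single row, so the recursion at depth + 1 returns immediately and max_depth_seen stays at 0. All the work is ordinary quick sort partitioning on the first key, plus whatever bookkeeping the multi-key variant adds.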
On Fri, May 24, 2024 at 8:36 PM Wang Yao <yaowangm@outlook.com> wrote:
________________________________
From: Heikki Linnakangas <hlinnaka@iki.fi>
Sent: Thursday, May 23, 2024 8:47:29 PM
To: Wang Yao <yaowangm@outlook.com>; PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Cc: interma@outlook.com <interma@outlook.com>
Subject: Re: Reply: An implementation of multi-key sort
On 23/05/2024 15:39, Wang Yao wrote:
No obvious perf regression is expected because PG will follow original
qsort code path when mksort is disabled. For the case, the only extra
cost is the check in tuplesort_sort_memtuples() to enter mksort code path.
And what about the case the mksort is enabled, but it's not effective
because all leading keys are different?
--
Heikki Linnakangas
Neon (https://neon.tech)
Attachments:
v4-Implement-multi-key-quick-sort.patch (application/octet-stream)
From fa8825580522c03984a5542a25953bb1fe7ecca3 Mon Sep 17 00:00:00 2001
From: Yao Wang <yaowangm@outlook.com>
Date: Tue, 7 May 2024 08:11:13 +0000
Subject: [PATCH] Implement multi-key quick sort
MK qsort (multi-key quick sort) is an alternative to the standard qsort
algorithm, which performs better in particular sort scenarios, i.e. when the
data set has multiple keys to be sorted. Compared to classic quick sort, it can
give a significant performance improvement once multiple keys are available.
Author: Yao Wang <yao-yw.wang@broadcom.com>
Co-author: Hongxu Ma <hongxu.ma@broadcom.com>
---
src/backend/utils/misc/guc_tables.c | 11 +
src/backend/utils/sort/mk_qsort_tuple.c | 388 ++++++++++++++++++
src/backend/utils/sort/tuplesort.c | 44 ++
src/backend/utils/sort/tuplesortvariants.c | 313 ++++++++++++--
src/include/c.h | 4 +
src/include/utils/tuplesort.h | 36 +-
src/test/regress/expected/geometry.out | 4 +-
.../regress/expected/incremental_sort.out | 12 +-
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/expected/tuplesort.out | 376 +++++++++++++++++
src/test/regress/expected/window.out | 58 +--
src/test/regress/sql/geometry.sql | 2 +-
src/test/regress/sql/tuplesort.sql | 66 +++
src/test/regress/sql/window.sql | 22 +-
14 files changed, 1254 insertions(+), 85 deletions(-)
create mode 100644 src/backend/utils/sort/mk_qsort_tuple.c
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3fd0b14dd8..5aee20f422 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -103,6 +103,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool ignore_invalid_pages;
+extern bool enable_mk_sort;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -839,6 +840,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_mk_sort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables multi-key sort"),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_mk_sort,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
diff --git a/src/backend/utils/sort/mk_qsort_tuple.c b/src/backend/utils/sort/mk_qsort_tuple.c
new file mode 100644
index 0000000000..9c5715380a
--- /dev/null
+++ b/src/backend/utils/sort/mk_qsort_tuple.c
@@ -0,0 +1,388 @@
+/*
+ * MK qsort (multi-key quick sort) is an alternative to the standard qsort
+ * algorithm, which performs better in particular sort scenarios, i.e. when
+ * the data set has multiple keys to be sorted.
+ *
+ * The sorting algorithm blends Quicksort and radix sort; like regular
+ * Quicksort, it partitions its input into sets less than and greater than a
+ * given value; like radix sort, it moves on to the next field once the current
+ * input is known to be equal in the given field.
+ *
+ * The implementation is based on the paper:
+ * Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
+ * Searching Strings", Jan 1997
+ *
+ * Some additional handling for equal tuples has been added to stay
+ * consistent with the behavior of the existing postgres qsort
+ * implementation.
+ *
+ * For now, mk_qsort_tuple() is called in tuplesort_sort_memtuples() as a
+ * replacement of qsort_tuple() when specific conditions are satisfied.
+ */
+
+/* Swap two tuples in sort tuple array */
+static inline void
+mkqs_swap(int a,
+ int b,
+ SortTuple *x)
+{
+ SortTuple t;
+
+ if (a == b)
+ return;
+ t = x[a];
+ x[a] = x[b];
+ x[b] = t;
+}
+
+/* Swap tuples by batch in sort tuple array */
+static inline void
+mkqs_vec_swap(int a,
+ int b,
+ int size,
+ SortTuple *x)
+{
+ while (size-- > 0)
+ {
+ mkqs_swap(a, b, x);
+ a++;
+ b++;
+ }
+}
+
+/*
+ * Check whether current datum (at specified tuple and depth) is null
+ * Note that the input x means a specified tuple provided by caller but not
+ * a tuple array, so tupleIndex is unnecessary
+ */
+static inline bool
+check_datum_null(SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum;
+ bool isNull;
+
+ Assert(depth < state->base.nKeys);
+
+ /* Since we have a specified tuple, the tupleIndex is always 0 */
+ state->base.mkqsGetDatumFunc(x, 0, depth, state, &datum, &isNull, false);
+
+ /*
+ * Note: for "abbreviated key", we don't need to handle more here because
+ * if "abbreviated key" of a datum is null, the "full" datum must be null.
+ */
+
+ return isNull;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * If "abbreviated key" is disabled:
+ * get specified datums and compare them by ApplySortComparator().
+ * If "abbreviated key" is enabled:
+ * Only first datum may be abbr key according to the design (see the comments
+ * of struct SortTuple), so different operations are needed for different
+ * datum.
+ * For first datum (depth == 0): get first datums ("abbr key" version) and
+ * compare them by ApplySortComparator(). If they are equal, get "full"
+ * version and compare again by ApplySortAbbrevFullComparator().
+ * For other datums: get specified datums and compare them by
+ * ApplySortComparator() as regular routine does.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mkqs_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum1, datum2;
+ bool isNull1, isNull2;
+ SortSupport sortKey;
+ int ret = 0;
+
+ Assert(state->base.mkqsGetDatumFunc);
+ Assert(depth < state->base.nKeys);
+
+ sortKey = state->base.sortKeys + depth;
+ state->base.mkqsGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, false);
+ state->base.mkqsGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, false);
+
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+ /*
+ * If "abbreviated key" is enabled, and we are in the first depth, it means
+ * only "abbreviated keys" are compared. If the two datums are determined to
+ * be equal by ApplySortComparator(), we need to perform an extra "full"
+ * comparing by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter &&
+ depth == 0 &&
+ ret == 0)
+ {
+ /* Fetch "full" datum by setting useFullKey = true */
+ state->base.mkqsGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, true);
+ state->base.mkqsGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, true);
+
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ }
+
+ return ret;
+}
+
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Verify whether the SortTuple list is ordered or not at specified depth
+ */
+static void
+mkqs_verify(SortTuple *x,
+ int n,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret;
+
+ for (int i = 0;i < n - 1;i++)
+ {
+ ret = mkqs_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ Assert(ret <= 0);
+ }
+}
+#endif
+
+/*
+ * Main routine of multi-key quick sort
+ *
+ * seenNull indicates whether we have seen NULL in any datum we checked
+ */
+static void
+mk_qsort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
+{
+ /*
+ * In the process, the tuple array consists of five parts:
+ * left equal, less, not-processed, greater, right equal
+ *
+ * lessStart indicates the first position of less part
+ * lessEnd indicates the next position after less part
+ * greaterStart indicates the prior position before greater part
+ * greaterEnd indicates the latest position of greater part
+ * the range between lessEnd and greaterStart (inclusive) is not-processed
+ */
+ int lessStart, lessEnd, greaterStart, greaterEnd, tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+ bool strictOrdered = true;
+
+ Assert(depth <= state->base.nKeys);
+ Assert(state->base.sortKeys);
+ Assert(state->base.mkqsGetDatumFunc);
+
+ if (n <= 1)
+ return;
+
+ /* If we have exceeded the max depth, return immediately */
+ if (depth == state->base.nKeys)
+ return;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Check if the array is ordered already. If yes, return immediately.
+ * Different from qsort_tuple(), the array must be strict ordered (no
+ * equal datums). If there are equal datums, we must continue the mk
+ * qsort process to check datums on lower depth.
+ */
+ for (int i = 0;i < n - 1;i++)
+ {
+ int ret;
+
+ CHECK_FOR_INTERRUPTS();
+ ret = mkqs_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ if (ret >= 0)
+ {
+ strictOrdered = false;
+ break;
+ }
+ }
+
+ if (strictOrdered)
+ return;
+
+ /* Select the middle item as pivot and move it to the first position */
+ lessStart = n / 2;
+ mkqs_swap(0, lessStart, x);
+ pivot = x;
+
+ lessStart = 1;
+ lessEnd = 1;
+ greaterStart = n - 1;
+ greaterEnd = n - 1;
+
+ /* Sort the array to three parts: lesser, equal, greater */
+ while (true)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Compare the left end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare lessEnd and pivot at current depth */
+ dist = mkqs_compare_datum(x + lessEnd,
+ pivot,
+ depth,
+ state);
+
+ if (dist > 0)
+ break;
+
+ /* If lessEnd is equal to pivot, move it to lessStart */
+ if (dist == 0)
+ {
+ mkqs_swap(lessEnd, lessStart, x);
+ lessStart++;
+ }
+ lessEnd++;
+ }
+
+ /* Compare the right end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare greaterStart and pivot at current depth */
+ dist = mkqs_compare_datum(x + greaterStart,
+ pivot,
+ depth,
+ state);
+
+ if (dist < 0)
+ break;
+
+ /* If greaterStart is equal to pivot, move it to greaterEnd */
+ if (dist == 0)
+ {
+ mkqs_swap(greaterStart, greaterEnd, x);
+ greaterEnd--;
+ }
+ greaterStart--;
+ }
+
+ if (lessEnd > greaterStart)
+ break;
+ mkqs_swap(lessEnd, greaterStart, x);
+ lessEnd++;
+ greaterStart--;
+ }
+
+ /*
+ * Now the array has four parts:
+ * left equal, lesser, greater, right equal
+ * Note greaterStart is less than lessEnd now
+ */
+
+ /* Move the left equal part to middle */
+ dist = Min(lessStart, lessEnd - lessStart);
+ mkqs_vec_swap(0, lessEnd - dist, dist, x);
+
+ /* Move the right equal part to middle */
+ dist = Min(greaterEnd - greaterStart, n - greaterEnd - 1);
+ mkqs_vec_swap(lessEnd, n - dist, dist, x);
+
+ /*
+ * Now the array has three parts:
+ * lesser, equal, greater
+ * Note that one or two parts may have no element at all.
+ */
+
+ /* Recursively sort the lesser part */
+
+ /* dist means the size of less part */
+ dist = lessEnd - lessStart;
+ mk_qsort_tuple(x,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+ /* Recursively sort the equal part */
+
+ /*
+ * (x + dist) means the first tuple in the equal part
+ * Since all tuples have equal datums at current depth, we just check any one
+ * of them to determine whether we have seen null datum.
+ */
+ isDatumNull = check_datum_null(x + dist, depth, state);
+
+ /* (lessStart + n - greaterEnd - 1) means the size of equal part */
+ tupCount = lessStart + n - greaterEnd - 1;
+
+ if (depth < state->base.nKeys - 1)
+ {
+ mk_qsort_tuple(x + dist,
+ tupCount,
+ depth + 1,
+ state,
+ seenNull || isDatumNull);
+ } else {
+ /*
+ * We have reached the max depth: Call mkqsHandleDupFunc to handle
+ * duplicated tuples if necessary, e.g. checking uniqueness or extra
+ * comparing
+ */
+
+ /*
+ * Call mkqsHandleDupFunc if:
+ * 1. mkqsHandleDupFunc is filled
+ * 2. the size of equal part > 1
+ */
+ if (state->base.mkqsHandleDupFunc &&
+ (tupCount > 1))
+ {
+ state->base.mkqsHandleDupFunc(x + dist,
+ tupCount,
+ seenNull || isDatumNull,
+ state);
+ }
+ }
+
+ /* Recursively sort the greater part */
+
+ /* dist means the size of greater part */
+ dist = greaterEnd - greaterStart;
+ mk_qsort_tuple(x + n - dist,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+#ifdef USE_ASSERT_CHECKING
+ mkqs_verify(x,
+ n,
+ depth,
+ state);
+#endif
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 7c4d6dc106..5718911eb9 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -128,6 +128,7 @@ bool trace_sort = false;
bool optimize_bounded_sort = true;
#endif
+bool enable_mk_sort = true;
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
@@ -337,6 +338,9 @@ struct Tuplesortstate
#ifdef TRACE_SORT
PGRUsage ru_start;
#endif
+
+ /* Whether multi-key quick sort is used */
+ bool mkqsUsed;
};
/*
@@ -622,6 +626,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#define ST_DEFINE
#include "lib/sort_template.h"
+#include "mk_qsort_tuple.c"
+
/*
* tuplesort_begin_xxx
*
@@ -690,6 +696,7 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
state->base.sortopt = sortopt;
state->base.tuples = true;
state->abbrevNext = 10;
+ state->mkqsUsed = false;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -2559,6 +2566,8 @@ tuplesort_get_stats(Tuplesortstate *state,
case TSS_SORTEDINMEM:
if (state->boundUsed)
stats->sortMethod = SORT_TYPE_TOP_N_HEAPSORT;
+ else if (state->mkqsUsed)
+ stats->sortMethod = SORT_TYPE_MK_QSORT;
else
stats->sortMethod = SORT_TYPE_QUICKSORT;
break;
@@ -2592,6 +2601,8 @@ tuplesort_method_name(TuplesortMethod m)
return "external sort";
case SORT_TYPE_EXTERNAL_MERGE:
return "external merge";
+ case SORT_TYPE_MK_QSORT:
+ return "multi-key quick sort";
}
return "unknown";
@@ -2717,6 +2728,39 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
+ /*
+ * Apply multi-key quick sort when:
+ * 1. enable_mk_sort is set
+ * 2. There are multiple keys available
+ * 3. mkqsGetDatumFunc is filled, which implies that current tuple
+ * type is supported by mk qsort. (By now only Heap tuple and Btree
+ * Index tuple are supported, and more types may be supported in
+ * future.)
+ *
+ * A summary of tuple types supported by mk qsort:
+ *
+ * HeapTuple: supported
+ * IndexTuple(btree): supported
+ * IndexTuple(hash): not supported because there is only one key
+ * DatumTuple: not supported because there is only one key
+ * HeapTuple(for cluster): not supported yet
+ * IndexTuple(gist): not supported yet
+ * IndexTuple(brin): not supported yet
+ */
+ if (enable_mk_sort &&
+ state->base.nKeys > 1 &&
+ state->base.mkqsGetDatumFunc != NULL)
+ {
+ state->mkqsUsed = true;
+ mk_qsort_tuple(state->memtuples,
+ state->memtupcount,
+ 0,
+ state,
+ false);
+
+ return;
+ }
+
/*
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 05a853caa3..ddcffa5094 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -30,6 +30,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/tuplesort.h"
+#include "miscadmin.h"
/* sort-type codes for sort__start probes */
@@ -92,6 +93,41 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
+static Datum mkqs_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static Datum mkqs_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static void
+mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
+static int
+mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
+
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
+
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
+
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
* the tuplesort_begin_cluster.
@@ -163,6 +199,14 @@ typedef struct BrinSortTuple
/* Size of the BrinSortTuple, given length of the BrinTuple. */
#define BRINSORTTUPLE_SIZE(len) (offsetof(BrinSortTuple, tuple) + (len))
+#define ST_SORT qsort_tuple_by_itempointer
+#define ST_ELEMENT_TYPE SortTuple
+#define ST_COMPARE(a, b, state) mkqs_compare_equal_index_btree(a, b, state)
+#define ST_COMPARE_ARG_TYPE Tuplesortstate
+#define ST_CHECK_FOR_INTERRUPTS
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
Tuplesortstate *
tuplesort_begin_heap(TupleDesc tupDesc,
@@ -200,6 +244,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
base->removeabbrev = removeabbrev_heap;
base->comparetup = comparetup_heap;
base->comparetup_tiebreak = comparetup_heap_tiebreak;
+ base->mkqsGetDatumFunc = mkqs_get_datum_heap;
base->writetup = writetup_heap;
base->readtup = readtup_heap;
base->haveDatum1 = true;
@@ -388,6 +433,8 @@ tuplesort_begin_index_btree(Relation heapRel,
base->removeabbrev = removeabbrev_index;
base->comparetup = comparetup_index_btree;
base->comparetup_tiebreak = comparetup_index_btree_tiebreak;
+ base->mkqsGetDatumFunc = mkqs_get_datum_index_btree;
+ base->mkqsHandleDupFunc = mkqs_handle_dup_index_btree;
base->writetup = writetup_index;
base->readtup = readtup_index;
base->haveDatum1 = true;
@@ -1531,10 +1578,6 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
/*
* Some rather brain-dead implementations of qsort (such as the one in
* QNX 4) will sometimes call the comparison routine to compare a
@@ -1543,18 +1586,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
Assert(tuple1 != tuple2);
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ raise_error_of_dup_index(tuple1, state);
}
/*
@@ -1563,25 +1595,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
* attribute in order to ensure that all keys in the index are physically
* unique.
*/
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
}
static int
@@ -1888,3 +1902,232 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
+
+/*
+ * Get specified datum from SortTuple (HeapTuple) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_heap() for details.
+ */
+static Datum
+mkqs_get_datum_heap(SortTuple *x,
+ int tupleIndex,
+ int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc = NULL;
+ HeapTupleData heapTuple;
+ AttrNumber attno;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys + depth;
+
+ Assert(state);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * may be involved to "abbreviated key", so only the first datum need to
+ * be checked with "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ tupDesc = (TupleDesc)base->arg;
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *datum = sortTuple->datum1;
+ *isNull = sortTuple->isnull1;
+ return *datum;
+ }
+
+ /* Otherwise (depth > 0, or the "full" first key), extract it from sortTuple->tuple */
+ heapTuple.t_len = ((MinimalTuple) sortTuple->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ heapTuple.t_data = (HeapTupleHeader) ((char *) sortTuple->tuple - MINIMAL_TUPLE_OFFSET);
+ attno = sortKey->ssup_attno;
+ *datum = heap_getattr(&heapTuple, attno, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Get specified datum from SortTuple (IndexTuple for btree index) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_index_btree() for details.
+ */
+static Datum
+mkqs_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc;
+ IndexTuple indexTuple;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ Assert(state);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * may be involved to "abbreviated key", so only the first datum need to
+ * be checked with "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *isNull = sortTuple->isnull1;
+ *datum = sortTuple->datum1;
+ return *datum;
+ }
+
+ indexTuple = (IndexTuple) sortTuple->tuple;
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+
+ /*
+ * Set parameter attnum = depth + 1 because attnum starts from 1 but depth
+ * starts from 0
+ */
+ *datum = index_getattr(indexTuple, depth + 1, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Handle duplicated SortTuples (IndexTuple for btree index during mk qsort)
+ * x: the duplicated tuple list
+ * tupleCount: count of the tuples
+ */
+static void
+mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ /* If enforceUnique is enabled and we never saw NULL, raise error */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
+ {
+ /*
+ * x means the first tuple of duplicated tuple list
+ * Since they are duplicated, simply pick up the first one
+ * to raise error
+ */
+ raise_error_of_dup_index((IndexTuple)(x->tuple), state);
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ qsort_tuple_by_itempointer(x,
+ tupleCount,
+ state);
+}
+
+/*
+ * Compare two btree index tuples by ItemPointer
+ * It is a callback function for qsort_tuple() called by
+ * mkqs_handle_dup_index_btree()
+ */
+static int
+mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state)
+{
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
+}
+
+/* Compare two index tuples by ItemPointer */
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2)
+{
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+/* Raise error for duplicated tuple when creating unique index */
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state)
+{
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ index_deform_tuple(x, tupDesc, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+}
diff --git a/src/include/c.h b/src/include/c.h
index dc1841346c..f7c368cd16 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -857,12 +857,14 @@ typedef NameData *Name;
#define Assert(condition) ((void)true)
#define AssertMacro(condition) ((void)true)
+#define AssertImply(condition1, condition2) ((void)true)
#elif defined(FRONTEND)
#include <assert.h>
#define Assert(p) assert(p)
#define AssertMacro(p) ((void) assert(p))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
#else /* USE_ASSERT_CHECKING && !FRONTEND */
@@ -886,6 +888,8 @@ typedef NameData *Name;
((void) ((condition) || \
(ExceptionalCondition(#condition, __FILE__, __LINE__), 0)))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
+
#endif /* USE_ASSERT_CHECKING && !FRONTEND */
/*
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index e7941a1f09..74a6a5ae5c 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -29,7 +29,6 @@
#include "utils/relcache.h"
#include "utils/sortsupport.h"
-
/*
* Tuplesortstate and Sharedsort are opaque types whose details are not
* known outside tuplesort.c.
@@ -79,9 +78,10 @@ typedef enum
SORT_TYPE_QUICKSORT = 1 << 1,
SORT_TYPE_EXTERNAL_SORT = 1 << 2,
SORT_TYPE_EXTERNAL_MERGE = 1 << 3,
+ SORT_TYPE_MK_QSORT = 1 << 4,
} TuplesortMethod;
-#define NUM_TUPLESORTMETHODS 4
+#define NUM_TUPLESORTMETHODS 5
typedef enum
{
@@ -155,6 +155,23 @@ typedef struct
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
+/* Multi-key quick sort */
+
+typedef Datum
+(*MkqsGetDatumFunc) (SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+typedef void
+(*MkqsHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
/*
* The public part of a Tuple sort operation state. This data structure
* contains the definition of sort-variant-specific interface methods and
@@ -249,6 +266,21 @@ typedef struct
bool tuples; /* Can SortTuple.tuple ever be set? */
void *arg; /* Specific information for the sort variant */
+
+ /*
+ * Function pointer, referencing a function to get the specified datum
+ * from a SortTuple list with multiple keys.
+ * Used by mk_qsort_tuple().
+ */
+ MkqsGetDatumFunc mkqsGetDatumFunc;
+
+ /*
+ * Function pointer, referencing a function to handle duplicated tuples
+ * in a SortTuple list with multiple keys.
+ * Used by mk_qsort_tuple().
+ * For now, the function pointer is set only for btree index tuples.
+ */
+ MkqsHandleDupFunc mkqsHandleDupFunc;
} TuplesortPublic;
/* Sort parallel code from state for sort__start probes */
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46b..094d22861c 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -4273,7 +4273,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
circle | point | distance
----------------+-------------------+---------------
<(1,2),3> | (-3,4) | 1.472135955
@@ -4310,8 +4310,8 @@ SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
<(3,5),0> | (Infinity,1e+300) | Infinity
<(1,2),3> | (1e+300,Infinity) | Infinity
<(5,1),3> | (1e+300,Infinity) | Infinity
- <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,2),3> | (Infinity,1e+300) | Infinity
+ <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,3),5> | (1e+300,Infinity) | Infinity
<(1,3),5> | (Infinity,1e+300) | Infinity
<(100,200),10> | (1e+300,Infinity) | Infinity
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..a26f8f100a 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -520,13 +520,13 @@ select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 55');
- explain_analyze_without_memory
----------------------------------------------------------------------------------------------------------------
+ explain_analyze_without_memory
+--------------------------------------------------------------------------------------------------------------------------
Limit (actual rows=55 loops=1)
-> Incremental Sort (actual rows=55 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 2 Sort Methods: top-N heapsort, multi-key quick sort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
@@ -554,7 +554,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Group Count": 2, +
"Sort Methods Used": [ +
"top-N heapsort", +
- "quicksort" +
+ "multi-key quick sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
@@ -728,7 +728,7 @@ select explain_analyze_without_memory('select * from (select * from t order by a
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 1 Sort Method: multi-key quick sort Average Memory: NNkB Peak Memory: NNkB
Pre-sorted Groups: 5 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=1000 loops=1)
Sort Key: t.a
@@ -756,7 +756,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Full-sort Groups": { +
"Group Count": 1, +
"Sort Methods Used": [ +
- "quicksort" +
+ "multi-key quick sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 2f3eb4e7f1..44840e7e5c 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -146,6 +146,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_material | on
enable_memoize | on
enable_mergejoin | on
+ enable_mk_sort | on
enable_nestloop | on
enable_parallel_append | on
enable_parallel_hash | on
@@ -157,7 +158,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(23 rows)
+(24 rows)
-- There are always wait event descriptions for various types.
select type, count(*) > 0 as ok FROM pg_wait_events
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..ad9e56c254 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -703,3 +703,379 @@ EXPLAIN (COSTS OFF) :qry;
(10 rows)
COMMIT;
+-- Test cases for multi-key quick sort
+set work_mem='100MB';
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+---+----+------
+ 0 | 5 | 98f1
+ 0 | 10 | d3d9
+ 1 | 1 | c4ca
+ 1 | 11 | 6512
+ 2 | 2 | c81e
+ 2 | 12 | c20a
+ 3 | 3 | eccb
+ 3 | 13 | c51c
+ 4 | 4 | a87f
+ 4 | 14 | aab3
+ 5 | 0 | 9bf3
+ 5 | 5 | e4da
+ 6 | 1 | c74d
+ 6 | 6 | 1679
+ 7 | 2 | 70ef
+ 7 | 7 | 8f14
+ 8 | 3 | 6f49
+ 8 | 8 | c9f0
+ 9 | 4 | 1f0e
+ 9 | 9 | 45c4
+(20 rows)
+
+-- test sorting on distinct values, in which mk qsort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+----+----+----
+ 0 | 20 | 20
+ 1 | 19 | 19
+ 2 | 18 | 18
+ 3 | 17 | 17
+ 4 | 16 | 16
+ 5 | 15 | 15
+ 6 | 14 | 14
+ 7 | 13 | 13
+ 8 | 12 | 12
+ 9 | 11 | 11
+ 10 | 10 | 10
+ 11 | 9 | 9
+ 12 | 8 | 8
+ 13 | 7 | 7
+ 14 | 6 | 6
+ 15 | 5 | 5
+ 16 | 4 | 4
+ 17 | 3 | 3
+ 18 | 2 | 2
+ 19 | 1 | 1
+(20 rows)
+
+drop table mksort_simple_tbl;
+-- test table with abbr keys
+create table abbr_tbl (a int, b varchar(100), c uuid);
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+select c, b, a from abbr_tbl order by c, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+(50 rows)
+
+select c, b, a from abbr_tbl order by c desc, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+(50 rows)
+
+select c, b, a from abbr_tbl order by c, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+NOTICE: index "idx_abbr_tbl" does not exist, skipping
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+---
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+(1 row)
+
+-- Uniqueness check of CREATE INDEX
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+ERROR: could not create unique index "idx_abbr_tbl"
+DETAIL: Key (c, b, a)=(00000000-0000-0000-0000-000000000001, aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1, 1) is duplicated.
+drop table abbr_tbl;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ae4e8851f8..2de20ca1d0 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -18,13 +18,13 @@ INSERT INTO empsalary VALUES
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 25100
develop | 9 | 4500 | 25100
- develop | 11 | 5200 | 25100
develop | 10 | 5200 | 25100
+ develop | 11 | 5200 | 25100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 7400
personnel | 2 | 3900 | 7400
@@ -33,13 +33,13 @@ SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM emps
sales | 1 | 5000 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 7 | 4200 | 1
develop | 9 | 4500 | 2
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
personnel | 5 | 3500 | 1
personnel | 2 | 3900 | 2
@@ -90,18 +90,18 @@ SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PA
sales | 4 | 4800 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
depname | empno | salary | rank
-----------+-------+--------+------
- develop | 7 | 4200 | 1
- personnel | 5 | 3500 | 1
sales | 3 | 4800 | 1
sales | 4 | 4800 | 1
+ personnel | 5 | 3500 | 1
+ develop | 7 | 4200 | 1
personnel | 2 | 3900 | 2
develop | 9 | 4500 | 2
sales | 1 | 5000 | 3
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
(10 rows)
@@ -3749,23 +3749,24 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
empno | depname | rn | rnk | cnt
-------+-----------+----+-----+-----
- 8 | develop | 1 | 1 | 1
- 10 | develop | 2 | 2 | 1
- 11 | develop | 3 | 3 | 1
- 9 | develop | 4 | 4 | 2
- 7 | develop | 5 | 4 | 2
- 2 | personnel | 1 | 1 | 1
- 5 | personnel | 2 | 2 | 1
1 | sales | 1 | 1 | 1
+ 2 | personnel | 1 | 1 | 1
3 | sales | 2 | 2 | 1
4 | sales | 3 | 3 | 1
+ 5 | personnel | 2 | 2 | 1
+ 7 | develop | 4 | 4 | 1
+ 8 | develop | 1 | 1 | 1
+ 9 | develop | 5 | 5 | 1
+ 10 | develop | 2 | 2 | 1
+ 11 | develop | 3 | 3 | 1
(10 rows)
-- Test pushdown of quals into a subquery containing window functions
@@ -4106,17 +4107,17 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
empno | depname | salary | c
-------+-----------+--------+---
+ 1 | sales | 5000 | 1
+ 2 | personnel | 3900 | 1
+ 3 | sales | 4800 | 3
+ 4 | sales | 4800 | 3
+ 5 | personnel | 3500 | 2
8 | develop | 6000 | 1
10 | develop | 5200 | 3
11 | develop | 5200 | 3
- 2 | personnel | 3900 | 1
- 5 | personnel | 3500 | 2
- 1 | sales | 5000 | 1
- 4 | sales | 4800 | 3
- 3 | sales | 4800 | 3
(8 rows)
-- Ensure we get the correct run condition when the window function is both
@@ -4468,14 +4469,15 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
depname | empno | salary | enroll_date | first_emp | last_emp
-----------+-------+--------+-------------+-----------+----------
+ develop | 7 | 4200 | 01-01-2008 | 4 | 1
develop | 8 | 6000 | 10-01-2006 | 1 | 5
- develop | 7 | 4200 | 01-01-2008 | 5 | 1
personnel | 2 | 3900 | 12-23-2006 | 1 | 2
personnel | 5 | 3500 | 12-10-2007 | 2 | 1
sales | 1 | 5000 | 10-01-2006 | 1 | 3
diff --git a/src/test/regress/sql/geometry.sql b/src/test/regress/sql/geometry.sql
index c3ea368da5..1f47f07f31 100644
--- a/src/test/regress/sql/geometry.sql
+++ b/src/test/regress/sql/geometry.sql
@@ -403,7 +403,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
-- To polygon
SELECT f1, f1::polygon FROM CIRCLE_TBL WHERE f1 >= '<(0,0),1>';
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..a7d11a146f 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -305,3 +305,69 @@ EXPLAIN (COSTS OFF) :qry;
:qry;
COMMIT;
+
+-- Test cases for multi-key quick sort
+
+set work_mem='100MB';
+
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+-- test sorting on distinct values, in which mk qsort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+drop table mksort_simple_tbl;
+
+-- test table with abbr keys
+
+create table abbr_tbl (a int, b varchar(100), c uuid);
+
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+
+select c, b, a from abbr_tbl order by c, b, a;
+select c, b, a from abbr_tbl order by c desc, b, a;
+select c, b, a from abbr_tbl order by c, b desc, a;
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+
+-- Uniqueness check of CREATE INDEX
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop table abbr_tbl;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6de5493b05..46359cb796 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -21,9 +21,9 @@ INSERT INTO empsalary VALUES
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
@@ -31,7 +31,7 @@ GROUP BY four, ten ORDER BY four, ten;
SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
@@ -1146,11 +1146,12 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
-- Test pushdown of quals into a subquery containing window functions
@@ -1332,7 +1333,7 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
@@ -1510,10 +1511,11 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
-- cleanup
DROP TABLE empsalary;
--
2.25.1
Hello Yao,
I was interested in the patch, considering the promise of significant
speedups of sorting, so I took a quick look and did some basic perf
testing today. Unfortunately, my benchmarks don't really confirm any
performance benefits, so I haven't looked at the code very much and only
have some very basic feedback:
1) The new GUC is missing from the .sample config, triggering a failure
of "make check-world". Fixed by 0002.
2) There's a place mixing tabs/spaces in indentation. Fixed by 0003.
3) I tried running pgindent, mostly to see how that would affect the
comments, and for most it's probably fine, but a couple are mangled
(usually those with a numbered list of items). Might need some changes
to use formatting that's not reformatted like this. The changes from
pgindent are in 0004, but this is not a fix - it just shows the changes
after running pgindent.
Now, regarding the performance tests - I decided to do the usual black
box testing, i.e. generate tables with varying numbers of columns, data
types, different data distribution (random, correlated, ...) and so on.
And then run simple ORDER BY queries on that, measuring timing with and
without mk-sort, and checking the effect.
So I wrote a simple bash script (attached) that does exactly that - it
generates a table with 1k - 10M rows, fills it with data (with some
basic simple data distributions), and then runs the queries.
The raw results are too large to attach, I'm only attaching a PDF
showing the summary with a "speedup heatmap" - it's a pivot with the
parameters on the left, and then the GUC and number of columns on top.
So the first group of columns is with enable_mk_sort=off, the second
group with enable_mk_sort=on, and finally the heatmap with relative
timing (enable_mk_sort=on / enable_mk_sort=off).
So values <100% mean it got faster (green color - good), and values >100%
mean it got slower (red - bad). And the thing is - pretty much
everything is red, often in the 200%-300% range, meaning it got 2x-3x
slower. There's only very few combinations where it got faster. That
does not seem very promising ... but maybe I did something wrong?
After seeing this, I took a look at your example again, which showed
some nice speedups. But it seems very dependent on the order of keys in
the ORDER BY clause. For example consider this:
set enable_mk_sort = off;
explain (analyze, timing off)
select * from t1 order by c6, c5, c4, c3, c2, c1;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c6, c5, c4, c3, c2, c1
Sort Method: quicksort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.054 ms
Execution Time: 1095.183 ms
(6 rows)
set enable_mk_sort = on;
explain (analyze, timing off)
select * from t1 order by c6, c5, c4, c3, c2, c1;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c6, c5, c4, c3, c2, c1
Sort Method: multi-key quick sort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.130 ms
Execution Time: 633.635 ms
(6 rows)
Which seems great, but let's reverse the sort keys:
set enable_mk_sort = off;
explain (analyze, timing off)
select * from t1 order by c1, c2, c3, c4, c5, c6;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c1, c2, c3, c4, c5, c6
Sort Method: quicksort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.146 ms
Execution Time: 170.085 ms
(6 rows)
set enable_mk_sort = on;
explain (analyze, timing off)
select * from t1 order by c1, c2, c3, c4, c5, c6;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c1, c2, c3, c4, c5, c6
Sort Method: multi-key quick sort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.127 ms
Execution Time: 367.263 ms
(6 rows)
I believe this is the case Heikki was asking about. I see the response
was that it's OK and the overhead is very low, but without too much
detail so I don't know what case you measured.
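For reference, the relative-timing metric used in the heatmap (time with
enable_mk_sort=on divided by time with enable_mk_sort=off) can be applied
directly to the four execution times quoted above. A minimal sketch of that
arithmetic in C follows; the function name relative_timing_pct is only
illustrative and is not part of the patch or the benchmark script.

```c
#include <assert.h>

/*
 * Relative timing in percent: execution time with enable_mk_sort=on divided
 * by execution time with enable_mk_sort=off. Values below 100 mean mk-sort
 * was faster, values above 100 mean it was slower.
 * (Hypothetical helper, for illustration only.)
 */
double
relative_timing_pct(double on_ms, double off_ms)
{
	assert(off_ms > 0.0);
	return 100.0 * on_ms / off_ms;
}
```

Called as relative_timing_pct(633.635, 1095.183) for the c6..c1 ordering
this gives roughly 58% (faster), and relative_timing_pct(367.263, 170.085)
for the c1..c6 ordering gives roughly 216% (slower), which is in the same
range as the red cells in the heatmap.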
Anyway, I think it seems to be very sensitive to the exact data set.
Which is not entirely surprising, I guess - most optimizations have a
mix of improved/regressed cases, yielding a heatmap with a mix of green
and red areas, and we have to either optimize the code (or heuristics to
enable the feature), or convince ourselves the "red" cases are less
important / unlikely etc.
But here the results are almost universally "red", so it's going to be
very hard to convince ourselves this is a good trade off. Of course, you
may argue the cases I've tested are wrong and not representative. I
don't think that's the case, though.
It's also interesting (and perhaps a little bit bizarre) that almost all
the cases that got better are for a single-column sort. Which is exactly
the case the patch should not affect. But it seems pretty consistent, so
maybe this is something worth investigating.
FWIW I'm not familiar with the various quicksort variants, but I noticed
that the Bentley & Sedgewick paper mentioned as the basis for the patch
is from 1997, and apparently implements stuff originally proposed by
Hoare in 1961. So maybe this is just an example of an algorithm that was
good for the hardware of that time, but the changes since then (e.g. the
growing importance of on-CPU caches) made it less relevant?
Another thing I noticed while skimming [1] is this:
The algorithm is designed to exploit the property that in many
problems, strings tend to have shared prefixes.
If that's the case, isn't it wrong to apply this to all sorts, including
sorts with non-string keys? It might explain why your example works OK,
as it involves key c6 which is string with all values sharing the same
(fairly long) prefix. But then maybe we should be careful and restrict
this to only those cases?
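To make the shared-prefix point concrete, here is a minimal sketch of the
string form of multi-key quicksort as I understand it from the paper. This
is not the patch's code: the names swap_str and mkqsort_str are made up for
illustration, it sorts plain NUL-terminated C strings by byte value, and it
uses a simple three-way partition rather than the four-part (left equal,
less, greater, right equal) partitioning in mk_qsort_tuple().

```c
#include <assert.h>
#include <stddef.h>

/* Swap two string pointers in the array. */
void
swap_str(const char **a, size_t i, size_t j)
{
	const char *t = a[i];

	a[i] = a[j];
	a[j] = t;
}

/*
 * Sort a[0..n-1], assuming all strings already agree on their first
 * "depth" bytes. Only the byte at position "depth" is compared here, so a
 * shared prefix is never re-read once it has been found equal.
 */
void
mkqsort_str(const char **a, size_t n, size_t depth)
{
	size_t		lt, gt, i;
	unsigned char pivot;

	assert(a != NULL || n == 0);

	if (n <= 1)
		return;

	/* Use the middle element as pivot and park it at position 0. */
	swap_str(a, 0, n / 2);
	pivot = (unsigned char) a[0][depth];

	/* Three-way partition on the byte at "depth". */
	lt = 0;
	gt = n - 1;
	i = 1;
	while (i <= gt)
	{
		unsigned char c = (unsigned char) a[i][depth];

		if (c < pivot)
			swap_str(a, lt++, i++);
		else if (c > pivot)
			swap_str(a, i, gt--);
		else
			i++;
	}

	/* Now a[0..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..n-1] > pivot. */
	mkqsort_str(a, lt, depth);
	/* Equal part: move on to the next byte unless the strings have ended. */
	if (pivot != '\0')
		mkqsort_str(a + lt, gt - lt + 1, depth + 1);
	mkqsort_str(a + gt + 1, n - gt - 1, depth);
}
```

The initial call is mkqsort_str(array, n, 0). With keys that share a long
prefix, every element lands in the "equal" part for each prefix byte, so the
prefix is walked once per partition instead of once per comparison. With
keys that already differ in the first byte (or, by analogy, in the first
sort column) the equal part is tiny and that advantage presumably mostly
disappears.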
regards
[1]: https://en.wikipedia.org/wiki/Multi-key_quicksort
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v20240609-0001-patch-2024-06-07.patch (text/x-patch; charset=UTF-8)
From 592500255df863baaf2afade60c6801411ab8eca Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 9 Jun 2024 13:26:15 +0200
Subject: [PATCH v20240609 1/4] patch 2024/06/07
---
src/backend/utils/misc/guc_tables.c | 11 +
src/backend/utils/sort/mk_qsort_tuple.c | 388 ++++++++++++++++++
src/backend/utils/sort/tuplesort.c | 44 ++
src/backend/utils/sort/tuplesortvariants.c | 313 ++++++++++++--
src/include/c.h | 4 +
src/include/utils/tuplesort.h | 36 +-
src/test/regress/expected/geometry.out | 4 +-
.../regress/expected/incremental_sort.out | 12 +-
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/expected/tuplesort.out | 376 +++++++++++++++++
src/test/regress/expected/window.out | 58 +--
src/test/regress/sql/geometry.sql | 2 +-
src/test/regress/sql/tuplesort.sql | 66 +++
src/test/regress/sql/window.sql | 22 +-
14 files changed, 1254 insertions(+), 85 deletions(-)
create mode 100644 src/backend/utils/sort/mk_qsort_tuple.c
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be282..a5f8b3798cc 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -103,6 +103,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool ignore_invalid_pages;
+extern bool enable_mk_sort;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -839,6 +840,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_mk_sort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables multi-key sort"),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_mk_sort,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
diff --git a/src/backend/utils/sort/mk_qsort_tuple.c b/src/backend/utils/sort/mk_qsort_tuple.c
new file mode 100644
index 00000000000..9c5715380aa
--- /dev/null
+++ b/src/backend/utils/sort/mk_qsort_tuple.c
@@ -0,0 +1,388 @@
+/*
+ * MK qsort (multi-key quick sort) is an alternative to the standard qsort
+ * algorithm, which has better performance for particular sort scenarios, i.e.
+ * the data set has multiple keys to be sorted.
+ *
+ * The sorting algorithm blends Quicksort and radix sort; Like regular
+ * Quicksort, it partitions its input into sets less than and greater than a
+ * given value; like radix sort, it moves on to the next field once the current
+ * input is known to be equal in the given field.
+ *
+ * The implementation is based on the paper:
+ * Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
+ * Searching Strings", Jan 1997
+ *
+ * Some improvements which are related to additional handling for equal tuples
+ * have been adapted to keep consistency with the implementations of postgres
+ * qsort.
+ *
+ * For now, mk_qsort_tuple() is called in tuplesort_sort_memtuples() as a
+ * replacement of qsort_tuple() when specific conditions are satisfied.
+ */
+
+/* Swap two tuples in sort tuple array */
+static inline void
+mkqs_swap(int a,
+ int b,
+ SortTuple *x)
+{
+ SortTuple t;
+
+ if (a == b)
+ return;
+ t = x[a];
+ x[a] = x[b];
+ x[b] = t;
+}
+
+/* Swap tuples by batch in sort tuple array */
+static inline void
+mkqs_vec_swap(int a,
+ int b,
+ int size,
+ SortTuple *x)
+{
+ while (size-- > 0)
+ {
+ mkqs_swap(a, b, x);
+ a++;
+ b++;
+ }
+}
+
+/*
+ * Check whether current datum (at specified tuple and depth) is null
+ * Note that the input x means a specified tuple provided by caller but not
+ * a tuple array, so tupleIndex is unnecessary
+ */
+static inline bool
+check_datum_null(SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum;
+ bool isNull;
+
+ Assert(depth < state->base.nKeys);
+
+ /* Since we have a specified tuple, the tupleIndex is always 0 */
+ state->base.mkqsGetDatumFunc(x, 0, depth, state, &datum, &isNull, false);
+
+ /*
+ * Note: for "abbreviated key", we don't need to handle more here because
+ * if "abbreviated key" of a datum is null, the "full" datum must be null.
+ */
+
+ return isNull;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * If "abbreviated key" is disabled:
+ * get specified datums and compare them by ApplySortComparator().
+ * If "abbreviated key" is enabled:
+ * Only first datum may be abbr key according to the design (see the comments
+ * of struct SortTuple), so different operations are needed for different
+ * datum.
+ * For first datum (depth == 0): get first datums ("abbr key" version) and
+ * compare them by ApplySortComparator(). If they are equal, get "full"
+ * version and compare again by ApplySortAbbrevFullComparator().
+ * For other datums: get specified datums and compare them by
+ * ApplySortComparator() as regular routine does.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mkqs_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum1, datum2;
+ bool isNull1, isNull2;
+ SortSupport sortKey;
+ int ret = 0;
+
+ Assert(state->base.mkqsGetDatumFunc);
+ Assert(depth < state->base.nKeys);
+
+ sortKey = state->base.sortKeys + depth;
+ state->base.mkqsGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, false);
+ state->base.mkqsGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, false);
+
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+ /*
+ * If "abbreviated key" is enabled, and we are in the first depth, it means
+ * only "abbreviated keys" are compared. If the two datums are determined to
+ * be equal by ApplySortComparator(), we need to perform an extra "full"
+ * comparing by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter &&
+ depth == 0 &&
+ ret == 0)
+ {
+ /* Fetch "full" datum by setting useFullKey = true */
+ state->base.mkqsGetDatumFunc(tuple1, 0, depth, state,
+ &datum1, &isNull1, true);
+ state->base.mkqsGetDatumFunc(tuple2, 0, depth, state,
+ &datum2, &isNull2, true);
+
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ }
+
+ return ret;
+}
+
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Verify whether the SortTuple list is ordered or not at specified depth
+ */
+static void
+mkqs_verify(SortTuple *x,
+ int n,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret;
+
+ for (int i = 0;i < n - 1;i++)
+ {
+ ret = mkqs_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ Assert(ret <= 0);
+ }
+}
+#endif
+
+/*
+ * Main routine of multi-key quick sort
+ *
+ * seenNull indicates whether we have seen NULL in any datum we checked
+ */
+static void
+mk_qsort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
+{
+ /*
+ * In the process, the tuple array consists of five parts:
+ * left equal, less, not-processed, greater, right equal
+ *
+ * lessStart indicates the first position of less part
+ * lessEnd indicates the next position after less part
+ * greaterStart indicates the prior position before greater part
+ * greaterEnd indicates the latest position of greater part
+ * the range between lessEnd and greaterStart (inclusive) is not-processed
+ */
+ int lessStart, lessEnd, greaterStart, greaterEnd, tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+ bool strictOrdered = true;
+
+ Assert(depth <= state->base.nKeys);
+ Assert(state->base.sortKeys);
+ Assert(state->base.mkqsGetDatumFunc);
+
+ if (n <= 1)
+ return;
+
+ /* If we have exceeded the max depth, return immediately */
+ if (depth == state->base.nKeys)
+ return;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Check if the array is ordered already. If yes, return immediately.
+ * Different from qsort_tuple(), the array must be strict ordered (no
+ * equal datums). If there are equal datums, we must continue the mk
+ * qsort process to check datums on lower depth.
+ */
+ for (int i = 0;i < n - 1;i++)
+ {
+ int ret;
+
+ CHECK_FOR_INTERRUPTS();
+ ret = mkqs_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ if (ret >= 0)
+ {
+ strictOrdered = false;
+ break;
+ }
+ }
+
+ if (strictOrdered)
+ return;
+
+ /* Select the middle element as pivot and move it to the first position */
+ lessStart = n / 2;
+ mkqs_swap(0, lessStart, x);
+ pivot = x;
+
+ lessStart = 1;
+ lessEnd = 1;
+ greaterStart = n - 1;
+ greaterEnd = n - 1;
+
+ /* Sort the array to three parts: lesser, equal, greater */
+ while (true)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Compare the left end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare lessEnd and pivot at current depth */
+ dist = mkqs_compare_datum(x + lessEnd,
+ pivot,
+ depth,
+ state);
+
+ if (dist > 0)
+ break;
+
+ /* If lessEnd is equal to pivot, move it to lessStart */
+ if (dist == 0)
+ {
+ mkqs_swap(lessEnd, lessStart, x);
+ lessStart++;
+ }
+ lessEnd++;
+ }
+
+ /* Compare the right end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare greaterStart and pivot at current depth */
+ dist = mkqs_compare_datum(x + greaterStart,
+ pivot,
+ depth,
+ state);
+
+ if (dist < 0)
+ break;
+
+ /* If greaterStart is equal to pivot, move it to greaterEnd */
+ if (dist == 0)
+ {
+ mkqs_swap(greaterStart, greaterEnd, x);
+ greaterEnd--;
+ }
+ greaterStart--;
+ }
+
+ if (lessEnd > greaterStart)
+ break;
+ mkqs_swap(lessEnd, greaterStart, x);
+ lessEnd++;
+ greaterStart--;
+ }
+
+ /*
+ * Now the array has four parts:
+ * left equal, lesser, greater, right equal
+ * Note greaterStart is less than lessEnd now
+ */
+
+ /* Move the left equal part to middle */
+ dist = Min(lessStart, lessEnd - lessStart);
+ mkqs_vec_swap(0, lessEnd - dist, dist, x);
+
+ /* Move the right equal part to middle */
+ dist = Min(greaterEnd - greaterStart, n - greaterEnd - 1);
+ mkqs_vec_swap(lessEnd, n - dist, dist, x);
+
+ /*
+ * Now the array has three parts:
+ * lesser, equal, greater
+ * Note that one or two parts may have no element at all.
+ */
+
+ /* Recursively sort the lesser part */
+
+ /* dist means the size of less part */
+ dist = lessEnd - lessStart;
+ mk_qsort_tuple(x,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+ /* Recursively sort the equal part */
+
+ /*
+ * (x + dist) means the first tuple in the equal part
+ * Since all tuples have equal datums at current depth, we just check any one
+ * of them to determine whether we have seen null datum.
+ */
+ isDatumNull = check_datum_null(x + dist, depth, state);
+
+ /* (lessStart + n - greaterEnd - 1) means the size of equal part */
+ tupCount = lessStart + n - greaterEnd - 1;
+
+ if (depth < state->base.nKeys - 1)
+ {
+ mk_qsort_tuple(x + dist,
+ tupCount,
+ depth + 1,
+ state,
+ seenNull || isDatumNull);
+ } else {
+ /*
+ * We have reached the max depth: call mkqsHandleDupFunc to handle
+ * duplicated tuples if necessary, e.g. checking uniqueness or doing an
+ * extra comparison
+ */
+
+ /*
+ * Call mkqsHandleDupFunc if:
+ * 1. mkqsHandleDupFunc is filled
+ * 2. the size of equal part > 1
+ */
+ if (state->base.mkqsHandleDupFunc &&
+ (tupCount > 1))
+ {
+ state->base.mkqsHandleDupFunc(x + dist,
+ tupCount,
+ seenNull || isDatumNull,
+ state);
+ }
+ }
+
+ /* Recursively sort the greater part */
+
+ /* dist means the size of greater part */
+ dist = greaterEnd - greaterStart;
+ mk_qsort_tuple(x + n - dist,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+#ifdef USE_ASSERT_CHECKING
+ mkqs_verify(x,
+ n,
+ depth,
+ state);
+#endif
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 7c4d6dc106b..5718911eb9b 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -128,6 +128,7 @@ bool trace_sort = false;
bool optimize_bounded_sort = true;
#endif
+bool enable_mk_sort = true;
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
@@ -337,6 +338,9 @@ struct Tuplesortstate
#ifdef TRACE_SORT
PGRUsage ru_start;
#endif
+
+ /* Whether multi-key quick sort is used */
+ bool mkqsUsed;
};
/*
@@ -622,6 +626,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#define ST_DEFINE
#include "lib/sort_template.h"
+#include "mk_qsort_tuple.c"
+
/*
* tuplesort_begin_xxx
*
@@ -690,6 +696,7 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
state->base.sortopt = sortopt;
state->base.tuples = true;
state->abbrevNext = 10;
+ state->mkqsUsed = false;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -2559,6 +2566,8 @@ tuplesort_get_stats(Tuplesortstate *state,
case TSS_SORTEDINMEM:
if (state->boundUsed)
stats->sortMethod = SORT_TYPE_TOP_N_HEAPSORT;
+ else if (state->mkqsUsed)
+ stats->sortMethod = SORT_TYPE_MK_QSORT;
else
stats->sortMethod = SORT_TYPE_QUICKSORT;
break;
@@ -2592,6 +2601,8 @@ tuplesort_method_name(TuplesortMethod m)
return "external sort";
case SORT_TYPE_EXTERNAL_MERGE:
return "external merge";
+ case SORT_TYPE_MK_QSORT:
+ return "multi-key quick sort";
}
return "unknown";
@@ -2717,6 +2728,39 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
+ /*
+ * Apply multi-key quick sort when:
+ * 1. enable_mk_sort is set
+ * 2. There are multiple keys available
+ * 3. mkqsGetDatumFunc is filled, which implies that current tuple
+ * type is supported by mk qsort. (For now, only heap tuples and btree
+ * index tuples are supported; more types may be supported in the
+ * future.)
+ *
+ * A summary of tuple types supported by mk qsort:
+ *
+ * HeapTuple: supported
+ * IndexTuple(btree): supported
+ * IndexTuple(hash): not supported because there is only one key
+ * DatumTuple: not supported because there is only one key
+ * HeapTuple(for cluster): not supported yet
+ * IndexTuple(gist): not supported yet
+ * IndexTuple(brin): not supported yet
+ */
+ if (enable_mk_sort &&
+ state->base.nKeys > 1 &&
+ state->base.mkqsGetDatumFunc != NULL)
+ {
+ state->mkqsUsed = true;
+ mk_qsort_tuple(state->memtuples,
+ state->memtupcount,
+ 0,
+ state,
+ false);
+
+ return;
+ }
+
/*
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 05a853caa36..ddcffa5094d 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -30,6 +30,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/tuplesort.h"
+#include "miscadmin.h"
/* sort-type codes for sort__start probes */
@@ -92,6 +93,41 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
+static Datum mkqs_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static Datum mkqs_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+static void
+mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
+static int
+mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
+
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
+
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
+
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
* the tuplesort_begin_cluster.
@@ -163,6 +199,14 @@ typedef struct BrinSortTuple
/* Size of the BrinSortTuple, given length of the BrinTuple. */
#define BRINSORTTUPLE_SIZE(len) (offsetof(BrinSortTuple, tuple) + (len))
+#define ST_SORT qsort_tuple_by_itempointer
+#define ST_ELEMENT_TYPE SortTuple
+#define ST_COMPARE(a, b, state) mkqs_compare_equal_index_btree(a, b, state)
+#define ST_COMPARE_ARG_TYPE Tuplesortstate
+#define ST_CHECK_FOR_INTERRUPTS
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
Tuplesortstate *
tuplesort_begin_heap(TupleDesc tupDesc,
@@ -200,6 +244,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
base->removeabbrev = removeabbrev_heap;
base->comparetup = comparetup_heap;
base->comparetup_tiebreak = comparetup_heap_tiebreak;
+ base->mkqsGetDatumFunc = mkqs_get_datum_heap;
base->writetup = writetup_heap;
base->readtup = readtup_heap;
base->haveDatum1 = true;
@@ -388,6 +433,8 @@ tuplesort_begin_index_btree(Relation heapRel,
base->removeabbrev = removeabbrev_index;
base->comparetup = comparetup_index_btree;
base->comparetup_tiebreak = comparetup_index_btree_tiebreak;
+ base->mkqsGetDatumFunc = mkqs_get_datum_index_btree;
+ base->mkqsHandleDupFunc = mkqs_handle_dup_index_btree;
base->writetup = writetup_index;
base->readtup = readtup_index;
base->haveDatum1 = true;
@@ -1531,10 +1578,6 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
/*
* Some rather brain-dead implementations of qsort (such as the one in
* QNX 4) will sometimes call the comparison routine to compare a
@@ -1543,18 +1586,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
Assert(tuple1 != tuple2);
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ raise_error_of_dup_index(tuple1, state);
}
/*
@@ -1563,25 +1595,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
* attribute in order to ensure that all keys in the index are physically
* unique.
*/
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
}
static int
@@ -1888,3 +1902,232 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
+
+/*
+ * Get specified datum from SortTuple (HeapTuple) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_heap() for details.
+ */
+static Datum
+mkqs_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc = NULL;
+ HeapTupleData heapTuple;
+ AttrNumber attno;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys + depth;
+
+ Assert(state);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * can have an abbreviated key, so only the first datum may need to be
+ * re-checked with its "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ tupDesc = (TupleDesc) base->arg;
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *datum = sortTuple->datum1;
+ *isNull = sortTuple->isnull1;
+ return *datum;
+ }
+
+ /* Otherwise (depth > 0 or full key requested), extract it from sortTuple->tuple */
+ heapTuple.t_len = ((MinimalTuple) sortTuple->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ heapTuple.t_data = (HeapTupleHeader) ((char *) sortTuple->tuple - MINIMAL_TUPLE_OFFSET);
+ attno = sortKey->ssup_attno;
+ *datum = heap_getattr(&heapTuple, attno, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Get specified datum from SortTuple (IndexTuple for btree index) list
+ *
+ * If the first datum is requested (depth == 0), sortTuple->datum1/isnull1
+ * will be returned. For other datums, relevant datum will be extracted from
+ * sortTuple->tuple.
+ *
+ * The parameter "useFullKey" is used for scenario of "abbreviated key":
+ * false - get sortTuple->datum1/isnull1 (abbreviated key)
+ * true - get the "full" datum
+ * If "abbreviated key" is disabled, useFullKey will be ignored.
+ *
+ * See comparetup_index_btree() for details.
+ */
+static Datum
+mkqs_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
+{
+ TupleDesc tupDesc;
+ IndexTuple indexTuple;
+ SortTuple *sortTuple = x + tupleIndex;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ Assert(state);
+
+ /*
+ * useFullKey is valid only when depth == 0, because only the first datum
+ * can have an abbreviated key, so only the first datum may need to be
+ * re-checked with its "full" version.
+ */
+ AssertImply(useFullKey, depth == 0);
+
+ /*
+ * When useFullKey is false, and the first datum is requested, return the
+ * leading datum
+ */
+ if (depth == 0 && !useFullKey)
+ {
+ *isNull = sortTuple->isnull1;
+ *datum = sortTuple->datum1;
+ return *datum;
+ }
+
+ indexTuple = (IndexTuple) sortTuple->tuple;
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+
+ /*
+ * Set parameter attnum = depth + 1 because attnum starts from 1 but depth
+ * starts from 0
+ */
+ *datum = index_getattr(indexTuple, depth + 1, tupDesc, isNull);
+
+ return *datum;
+}
+
+/*
+ * Handle duplicated SortTuples (btree IndexTuples) found during mk qsort
+ * x: the duplicated tuple list; tupleCount: count of the tuples
+ * seenNull: whether a NULL was seen in any datum of these tuples
+ */
+static void
+mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ /* With enforceUnique, raise error unless NULLs are distinct and we saw one */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
+ {
+ /*
+ * x means the first tuple of duplicated tuple list
+ * Since they are duplicated, simply pick up the first one
+ * to raise error
+ */
+ raise_error_of_dup_index((IndexTuple)(x->tuple), state);
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ qsort_tuple_by_itempointer(x,
+ tupleCount,
+ state);
+}
+
+/*
+ * Compare two btree index tuples by ItemPointer
+ * It is the comparator for qsort_tuple_by_itempointer(), which is called
+ * by mkqs_handle_dup_index_btree()
+ */
+static int
+mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state)
+{
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
+}
+
+/* Compare two index tuples by ItemPointer */
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2)
+{
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+/* Raise error for duplicated tuple when creating unique index */
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state)
+{
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ index_deform_tuple((IndexTuple)x, tupDesc, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+}
diff --git a/src/include/c.h b/src/include/c.h
index dc1841346cd..f7c368cd162 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -857,12 +857,14 @@ typedef NameData *Name;
#define Assert(condition) ((void)true)
#define AssertMacro(condition) ((void)true)
+#define AssertImply(condition1, condition2) ((void)true)
#elif defined(FRONTEND)
#include <assert.h>
#define Assert(p) assert(p)
#define AssertMacro(p) ((void) assert(p))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
#else /* USE_ASSERT_CHECKING && !FRONTEND */
@@ -886,6 +888,8 @@ typedef NameData *Name;
((void) ((condition) || \
(ExceptionalCondition(#condition, __FILE__, __LINE__), 0)))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
+
#endif /* USE_ASSERT_CHECKING && !FRONTEND */
/*
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index e7941a1f09f..74a6a5ae5ce 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -29,7 +29,6 @@
#include "utils/relcache.h"
#include "utils/sortsupport.h"
-
/*
* Tuplesortstate and Sharedsort are opaque types whose details are not
* known outside tuplesort.c.
@@ -79,9 +78,10 @@ typedef enum
SORT_TYPE_QUICKSORT = 1 << 1,
SORT_TYPE_EXTERNAL_SORT = 1 << 2,
SORT_TYPE_EXTERNAL_MERGE = 1 << 3,
+ SORT_TYPE_MK_QSORT = 1 << 4,
} TuplesortMethod;
-#define NUM_TUPLESORTMETHODS 4
+#define NUM_TUPLESORTMETHODS 5
typedef enum
{
@@ -155,6 +155,23 @@ typedef struct
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
+/* Multi-key quick sort */
+
+typedef Datum
+(*MkqsGetDatumFunc) (SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
+
+typedef void
+(*MkqsHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
/*
* The public part of a Tuple sort operation state. This data structure
* contains the definition of sort-variant-specific interface methods and
@@ -249,6 +266,21 @@ typedef struct
bool tuples; /* Can SortTuple.tuple ever be set? */
void *arg; /* Specific information for the sort variant */
+
+ /*
+ * Function pointer, referencing a function to get specified datum from
+ * SortTuple list with multi-key.
+ * Used by mk_qsort_tuple().
+ */
+ MkqsGetDatumFunc mkqsGetDatumFunc;
+
+ /*
+ * Function pointer, referencing a function to handle duplicated tuple
+ * from SortTuple list with multi-key.
+ * Used by mk_qsort_tuple().
+ * For now, the function pointer is filled for only btree index tuple.
+ */
+ MkqsHandleDupFunc mkqsHandleDupFunc;
} TuplesortPublic;
/* Sort parallel code from state for sort__start probes */
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46be..094d22861c1 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -4273,7 +4273,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
circle | point | distance
----------------+-------------------+---------------
<(1,2),3> | (-3,4) | 1.472135955
@@ -4310,8 +4310,8 @@ SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
<(3,5),0> | (Infinity,1e+300) | Infinity
<(1,2),3> | (1e+300,Infinity) | Infinity
<(5,1),3> | (1e+300,Infinity) | Infinity
- <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,2),3> | (Infinity,1e+300) | Infinity
+ <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,3),5> | (1e+300,Infinity) | Infinity
<(1,3),5> | (Infinity,1e+300) | Infinity
<(100,200),10> | (1e+300,Infinity) | Infinity
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1a..a26f8f100a5 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -520,13 +520,13 @@ select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 55');
- explain_analyze_without_memory
----------------------------------------------------------------------------------------------------------------
+ explain_analyze_without_memory
+--------------------------------------------------------------------------------------------------------------------------
Limit (actual rows=55 loops=1)
-> Incremental Sort (actual rows=55 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 2 Sort Methods: top-N heapsort, multi-key quick sort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
@@ -554,7 +554,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Group Count": 2, +
"Sort Methods Used": [ +
"top-N heapsort", +
- "quicksort" +
+ "multi-key quick sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
@@ -728,7 +728,7 @@ select explain_analyze_without_memory('select * from (select * from t order by a
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 1 Sort Method: multi-key quick sort Average Memory: NNkB Peak Memory: NNkB
Pre-sorted Groups: 5 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=1000 loops=1)
Sort Key: t.a
@@ -756,7 +756,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Full-sort Groups": { +
"Group Count": 1, +
"Sort Methods Used": [ +
- "quicksort" +
+ "multi-key quick sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index dbfd0c13d46..edd2cbfffd5 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -146,6 +146,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_material | on
enable_memoize | on
enable_mergejoin | on
+ enable_mk_sort | on
enable_nestloop | on
enable_parallel_append | on
enable_parallel_hash | on
@@ -156,7 +157,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(22 rows)
+(23 rows)
-- There are always wait event descriptions for various types.
select type, count(*) > 0 as ok FROM pg_wait_events
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427a..ad9e56c2548 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -703,3 +703,379 @@ EXPLAIN (COSTS OFF) :qry;
(10 rows)
COMMIT;
+-- Test cases for multi-key quick sort
+set work_mem='100MB';
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+---+----+------
+ 0 | 5 | 98f1
+ 0 | 10 | d3d9
+ 1 | 1 | c4ca
+ 1 | 11 | 6512
+ 2 | 2 | c81e
+ 2 | 12 | c20a
+ 3 | 3 | eccb
+ 3 | 13 | c51c
+ 4 | 4 | a87f
+ 4 | 14 | aab3
+ 5 | 0 | 9bf3
+ 5 | 5 | e4da
+ 6 | 1 | c74d
+ 6 | 6 | 1679
+ 7 | 2 | 70ef
+ 7 | 7 | 8f14
+ 8 | 3 | 6f49
+ 8 | 8 | c9f0
+ 9 | 4 | 1f0e
+ 9 | 9 | 45c4
+(20 rows)
+
+-- test sorting on distinct values, in which mk qsort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+----+----+----
+ 0 | 20 | 20
+ 1 | 19 | 19
+ 2 | 18 | 18
+ 3 | 17 | 17
+ 4 | 16 | 16
+ 5 | 15 | 15
+ 6 | 14 | 14
+ 7 | 13 | 13
+ 8 | 12 | 12
+ 9 | 11 | 11
+ 10 | 10 | 10
+ 11 | 9 | 9
+ 12 | 8 | 8
+ 13 | 7 | 7
+ 14 | 6 | 6
+ 15 | 5 | 5
+ 16 | 4 | 4
+ 17 | 3 | 3
+ 18 | 2 | 2
+ 19 | 1 | 1
+(20 rows)
+
+drop table mksort_simple_tbl;
+-- test table with abbr keys
+create table abbr_tbl (a int, b varchar(100), c uuid);
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+select c, b, a from abbr_tbl order by c, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+(50 rows)
+
+select c, b, a from abbr_tbl order by c desc, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+(50 rows)
+
+select c, b, a from abbr_tbl order by c, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+NOTICE: index "idx_abbr_tbl" does not exist, skipping
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+---
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+(1 row)
+
+-- Uniqueness check of CREATE INDEX
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+ERROR: could not create unique index "idx_abbr_tbl"
+DETAIL: Key (c, b, a)=(00000000-0000-0000-0000-000000000001, aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1, 1) is duplicated.
+drop table abbr_tbl;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ae4e8851f8a..2de20ca1d0c 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -18,13 +18,13 @@ INSERT INTO empsalary VALUES
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 25100
develop | 9 | 4500 | 25100
- develop | 11 | 5200 | 25100
develop | 10 | 5200 | 25100
+ develop | 11 | 5200 | 25100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 7400
personnel | 2 | 3900 | 7400
@@ -33,13 +33,13 @@ SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM emps
sales | 1 | 5000 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 7 | 4200 | 1
develop | 9 | 4500 | 2
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
personnel | 5 | 3500 | 1
personnel | 2 | 3900 | 2
@@ -90,18 +90,18 @@ SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PA
sales | 4 | 4800 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
depname | empno | salary | rank
-----------+-------+--------+------
- develop | 7 | 4200 | 1
- personnel | 5 | 3500 | 1
sales | 3 | 4800 | 1
sales | 4 | 4800 | 1
+ personnel | 5 | 3500 | 1
+ develop | 7 | 4200 | 1
personnel | 2 | 3900 | 2
develop | 9 | 4500 | 2
sales | 1 | 5000 | 3
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
(10 rows)
@@ -3749,23 +3749,24 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
empno | depname | rn | rnk | cnt
-------+-----------+----+-----+-----
- 8 | develop | 1 | 1 | 1
- 10 | develop | 2 | 2 | 1
- 11 | develop | 3 | 3 | 1
- 9 | develop | 4 | 4 | 2
- 7 | develop | 5 | 4 | 2
- 2 | personnel | 1 | 1 | 1
- 5 | personnel | 2 | 2 | 1
1 | sales | 1 | 1 | 1
+ 2 | personnel | 1 | 1 | 1
3 | sales | 2 | 2 | 1
4 | sales | 3 | 3 | 1
+ 5 | personnel | 2 | 2 | 1
+ 7 | develop | 4 | 4 | 1
+ 8 | develop | 1 | 1 | 1
+ 9 | develop | 5 | 5 | 1
+ 10 | develop | 2 | 2 | 1
+ 11 | develop | 3 | 3 | 1
(10 rows)
-- Test pushdown of quals into a subquery containing window functions
@@ -4106,17 +4107,17 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
empno | depname | salary | c
-------+-----------+--------+---
+ 1 | sales | 5000 | 1
+ 2 | personnel | 3900 | 1
+ 3 | sales | 4800 | 3
+ 4 | sales | 4800 | 3
+ 5 | personnel | 3500 | 2
8 | develop | 6000 | 1
10 | develop | 5200 | 3
11 | develop | 5200 | 3
- 2 | personnel | 3900 | 1
- 5 | personnel | 3500 | 2
- 1 | sales | 5000 | 1
- 4 | sales | 4800 | 3
- 3 | sales | 4800 | 3
(8 rows)
-- Ensure we get the correct run condition when the window function is both
@@ -4468,14 +4469,15 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
depname | empno | salary | enroll_date | first_emp | last_emp
-----------+-------+--------+-------------+-----------+----------
+ develop | 7 | 4200 | 01-01-2008 | 4 | 1
develop | 8 | 6000 | 10-01-2006 | 1 | 5
- develop | 7 | 4200 | 01-01-2008 | 5 | 1
personnel | 2 | 3900 | 12-23-2006 | 1 | 2
personnel | 5 | 3500 | 12-10-2007 | 2 | 1
sales | 1 | 5000 | 10-01-2006 | 1 | 3
diff --git a/src/test/regress/sql/geometry.sql b/src/test/regress/sql/geometry.sql
index c3ea368da5e..1f47f07f311 100644
--- a/src/test/regress/sql/geometry.sql
+++ b/src/test/regress/sql/geometry.sql
@@ -403,7 +403,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
-- To polygon
SELECT f1, f1::polygon FROM CIRCLE_TBL WHERE f1 >= '<(0,0),1>';
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6c..a7d11a146f3 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -305,3 +305,69 @@ EXPLAIN (COSTS OFF) :qry;
:qry;
COMMIT;
+
+-- Test cases for multi-key quick sort
+
+set work_mem='100MB';
+
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+-- test sorting on distinct values, in which mk qsort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+drop table mksort_simple_tbl;
+
+-- test table with abbr keys
+
+create table abbr_tbl (a int, b varchar(100), c uuid);
+
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+
+select c, b, a from abbr_tbl order by c, b, a;
+select c, b, a from abbr_tbl order by c desc, b, a;
+select c, b, a from abbr_tbl order by c, b desc, a;
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+
+-- Uniqueness check of CREATE INDEX
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop table abbr_tbl;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6de5493b05b..46359cb7968 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -21,9 +21,9 @@ INSERT INTO empsalary VALUES
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
@@ -31,7 +31,7 @@ GROUP BY four, ten ORDER BY four, ten;
SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
@@ -1146,11 +1146,12 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
-- Test pushdown of quals into a subquery containing window functions
@@ -1332,7 +1333,7 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
@@ -1510,10 +1511,11 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
-- cleanup
DROP TABLE empsalary;
--
2.45.1
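
A note on the abbr_tbl test in the patch above: its comment says that the abbreviated key of a uuid comes from the first `sizeof(Datum)` bytes, so uuids that differ only in their trailing bytes share an abbreviated key and must be separated by a full comparison. Below is a minimal standalone sketch of that idea, not the code from the patch or from uuid_abbrev_convert(). It assumes an 8-byte Datum, and the names `uuid_abbrev` and `uuid_cmp_with_abbrev` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/*
 * Hypothetical stand-in for an abbreviated key: pack the first 8 bytes of
 * the uuid big-endian, so that unsigned integer comparison orders keys the
 * same way memcmp() orders those 8 bytes.
 */
uint64_t
uuid_abbrev(const unsigned char uuid[UUID_LEN])
{
    uint64_t key = 0;

    for (int i = 0; i < 8; i++)
        key = (key << 8) | uuid[i];
    return key;
}

/*
 * Compare two uuids: abbreviated keys first, then the full 16 bytes when
 * the abbreviated keys tie. Returns <0, 0 or >0 like memcmp().
 */
int
uuid_cmp_with_abbrev(const unsigned char a[UUID_LEN],
                     const unsigned char b[UUID_LEN])
{
    uint64_t ka = uuid_abbrev(a);
    uint64_t kb = uuid_abbrev(b);

    if (ka != kb)
        return (ka < kb) ? -1 : 1;

    /* Abbreviated keys are equal: only the full datum can decide. */
    return memcmp(a, b, UUID_LEN);
}
```

With the test data, 'ffffffff-ffff-ffff-ffff-fffffffffff3' and 'ffffffff-ffff-ffff-ffff-fffffffffff4' differ only in the last byte, so the first branch never decides between them and the ordering falls through to the full comparison. That fall-through is the path the test is meant to exercise.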
v20240609-0002-fix-add-GUC-to-sample.patch (text/x-patch; charset=UTF-8)
From 79b2547a8678d7eaf80f2653f48482de57469a18 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 9 Jun 2024 13:26:45 +0200
Subject: [PATCH v20240609 2/4] fix: add GUC to sample
---
src/backend/utils/misc/postgresql.conf.sample | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e0567de2190..f6abe07f824 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -413,6 +413,7 @@
#enable_sort = on
#enable_tidscan = on
#enable_group_by_reordering = on
+#enable_mk_sort = on
# - Planner Cost Constants -
--
2.45.1
Attachment: v20240609-0003-fix-tabs-and-spaces.patch (text/x-patch; charset=UTF-8)
From f321d30fbfcfcb34ff547339de001709b86f4ad4 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 9 Jun 2024 13:28:02 +0200
Subject: [PATCH v20240609 3/4] fix: tabs and spaces
---
src/backend/utils/sort/tuplesortvariants.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index ddcffa5094d..412e5b9b588 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -2115,8 +2115,8 @@ raise_error_of_dup_index(IndexTuple x,
bool isnull[INDEX_MAX_KEYS];
TupleDesc tupDesc;
char *key_desc;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
tupDesc = RelationGetDescr(arg->index.indexRel);
index_deform_tuple((IndexTuple)x, tupDesc, values, isnull);
--
2.45.1
Attachment: v20240609-0004-fix-pgindent.patch (text/x-patch; charset=UTF-8)
From a2788b2f7dcceeeaa9229bfbf1ba2e5c413afb99 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Sun, 9 Jun 2024 13:41:54 +0200
Subject: [PATCH v20240609 4/4] fix: pgindent
---
src/backend/utils/sort/mk_qsort_tuple.c | 121 ++++++++++----------
src/backend/utils/sort/tuplesort.c | 26 ++---
src/backend/utils/sort/tuplesortvariants.c | 125 ++++++++++-----------
src/include/utils/tuplesort.h | 34 +++---
4 files changed, 152 insertions(+), 154 deletions(-)
diff --git a/src/backend/utils/sort/mk_qsort_tuple.c b/src/backend/utils/sort/mk_qsort_tuple.c
index 9c5715380aa..e44bc50d652 100644
--- a/src/backend/utils/sort/mk_qsort_tuple.c
+++ b/src/backend/utils/sort/mk_qsort_tuple.c
@@ -22,11 +22,11 @@
/* Swap two tuples in sort tuple array */
static inline void
-mkqs_swap(int a,
- int b,
+mkqs_swap(int a,
+ int b,
SortTuple *x)
{
- SortTuple t;
+ SortTuple t;
if (a == b)
return;
@@ -37,9 +37,9 @@ mkqs_swap(int a,
/* Swap tuples by batch in sort tuple array */
static inline void
-mkqs_vec_swap(int a,
- int b,
- int size,
+mkqs_vec_swap(int a,
+ int b,
+ int size,
SortTuple *x)
{
while (size-- > 0)
@@ -56,12 +56,12 @@ mkqs_vec_swap(int a,
* a tuple array, so tupleIndex is unnecessary
*/
static inline bool
-check_datum_null(SortTuple *x,
- int depth,
+check_datum_null(SortTuple *x,
+ int depth,
Tuplesortstate *state)
{
- Datum datum;
- bool isNull;
+ Datum datum;
+ bool isNull;
Assert(depth < state->base.nKeys);
@@ -94,15 +94,17 @@ check_datum_null(SortTuple *x,
* See comparetup_heap() for details.
*/
static inline int
-mkqs_compare_datum(SortTuple *tuple1,
- SortTuple *tuple2,
- int depth,
+mkqs_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
Tuplesortstate *state)
{
- Datum datum1, datum2;
- bool isNull1, isNull2;
+ Datum datum1,
+ datum2;
+ bool isNull1,
+ isNull2;
SortSupport sortKey;
- int ret = 0;
+ int ret = 0;
Assert(state->base.mkqsGetDatumFunc);
Assert(depth < state->base.nKeys);
@@ -120,10 +122,10 @@ mkqs_compare_datum(SortTuple *tuple1,
sortKey);
/*
- * If "abbreviated key" is enabled, and we are in the first depth, it means
- * only "abbreviated keys" are compared. If the two datums are determined to
- * be equal by ApplySortComparator(), we need to perform an extra "full"
- * comparing by ApplySortAbbrevFullComparator().
+ * If "abbreviated key" is enabled, and we are in the first depth, it
+ * means only "abbreviated keys" are compared. If the two datums are
+ * determined to be equal by ApplySortComparator(), we need to perform an
+ * extra "full" comparing by ApplySortAbbrevFullComparator().
*/
if (sortKey->abbrev_converter &&
depth == 0 &&
@@ -150,14 +152,14 @@ mkqs_compare_datum(SortTuple *tuple1,
* Verify whether the SortTuple list is ordered or not at specified depth
*/
static void
-mkqs_verify(SortTuple *x,
- int n,
- int depth,
+mkqs_verify(SortTuple *x,
+ int n,
+ int depth,
Tuplesortstate *state)
{
- int ret;
+ int ret;
- for (int i = 0;i < n - 1;i++)
+ for (int i = 0; i < n - 1; i++)
{
ret = mkqs_compare_datum(x + i,
x + i + 1,
@@ -174,27 +176,31 @@ mkqs_verify(SortTuple *x,
* seenNull indicates whether we have seen NULL in any datum we checked
*/
static void
-mk_qsort_tuple(SortTuple *x,
- size_t n,
- int depth,
- Tuplesortstate *state,
- bool seenNull)
+mk_qsort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
{
/*
- * In the process, the tuple array consists of five parts:
- * left equal, less, not-processed, greater, right equal
+ * In the process, the tuple array consists of five parts: left equal,
+ * less, not-processed, greater, right equal
*
- * lessStart indicates the first position of less part
- * lessEnd indicates the next position after less part
- * greaterStart indicates the prior position before greater part
- * greaterEnd indicates the latest position of greater part
- * the range between lessEnd and greaterStart (inclusive) is not-processed
+ * lessStart indicates the first position of less part lessEnd indicates
+ * the next position after less part greaterStart indicates the prior
+ * position before greater part greaterEnd indicates the latest position
+ * of greater part the range between lessEnd and greaterStart (inclusive)
+ * is not-processed
*/
- int lessStart, lessEnd, greaterStart, greaterEnd, tupCount;
- int32 dist;
- SortTuple *pivot;
- bool isDatumNull;
- bool strictOrdered = true;
+ int lessStart,
+ lessEnd,
+ greaterStart,
+ greaterEnd,
+ tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+ bool strictOrdered = true;
Assert(depth <= state->base.nKeys);
Assert(state->base.sortKeys);
@@ -212,12 +218,12 @@ mk_qsort_tuple(SortTuple *x,
/*
* Check if the array is ordered already. If yes, return immediately.
* Different from qsort_tuple(), the array must be strict ordered (no
- * equal datums). If there are equal datums, we must continue the mk
- * qsort process to check datums on lower depth.
+ * equal datums). If there are equal datums, we must continue the mk qsort
+ * process to check datums on lower depth.
*/
- for (int i = 0;i < n - 1;i++)
+ for (int i = 0; i < n - 1; i++)
{
- int ret;
+ int ret;
CHECK_FOR_INTERRUPTS();
ret = mkqs_compare_datum(x + i,
@@ -299,8 +305,7 @@ mk_qsort_tuple(SortTuple *x,
}
/*
- * Now the array has four parts:
- * left equal, lesser, greater, right equal
+ * Now the array has four parts: left equal, lesser, greater, right equal
* Note greaterStart is less than lessEnd now
*/
@@ -313,9 +318,8 @@ mk_qsort_tuple(SortTuple *x,
mkqs_vec_swap(lessEnd, n - dist, dist, x);
/*
- * Now the array has three parts:
- * lesser, equal, greater
- * Note that one or two parts may have no element at all.
+ * Now the array has three parts: lesser, equal, greater Note that one or
+ * two parts may have no element at all.
*/
/* Recursively sort the lesser part */
@@ -331,9 +335,9 @@ mk_qsort_tuple(SortTuple *x,
/* Recursively sort the equal part */
/*
- * (x + dist) means the first tuple in the equal part
- * Since all tuples have equal datums at current depth, we just check any one
- * of them to determine whether we have seen null datum.
+ * (x + dist) means the first tuple in the equal part Since all tuples
+ * have equal datums at current depth, we just check any one of them to
+ * determine whether we have seen null datum.
*/
isDatumNull = check_datum_null(x + dist, depth, state);
@@ -347,7 +351,9 @@ mk_qsort_tuple(SortTuple *x,
depth + 1,
state,
seenNull || isDatumNull);
- } else {
+ }
+ else
+ {
/*
* We have reach the max depth: Call mkqsHandleDupFunc to handle
* duplicated tuples if necessary, e.g. checking uniqueness or extra
@@ -355,9 +361,8 @@ mk_qsort_tuple(SortTuple *x,
*/
/*
- * Call mkqsHandleDupFunc if:
- * 1. mkqsHandleDupFunc is filled
- * 2. the size of equal part > 1
+ * Call mkqsHandleDupFunc if: 1. mkqsHandleDupFunc is filled 2. the
+ * size of equal part > 1
*/
if (state->base.mkqsHandleDupFunc &&
(tupCount > 1))
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 5718911eb9b..d51d97b1136 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -340,7 +340,7 @@ struct Tuplesortstate
#endif
/* Whether multi-key quick sort is used */
- bool mkqsUsed;
+ bool mkqsUsed;
};
/*
@@ -2729,23 +2729,19 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
/*
- * Apply multi-key quick sort when:
- * 1. enable_mk_sort is set
- * 2. There are multiple keys available
- * 3. mkqsGetDatumFunc is filled, which implies that current tuple
- * type is supported by mk qsort. (By now only Heap tuple and Btree
- * Index tuple are supported, and more types may be supported in
- * future.)
+ * Apply multi-key quick sort when: 1. enable_mk_sort is set 2. There
+ * are multiple keys available 3. mkqsGetDatumFunc is filled, which
+ * implies that current tuple type is supported by mk qsort. (By now
+ * only Heap tuple and Btree Index tuple are supported, and more types
+ * may be supported in future.)
*
* A summary of tuple types supported by mk qsort:
*
- * HeapTuple: supported
- * IndexTuple(btree): supported
- * IndexTuple(hash): not supported because there is only one key
- * DatumTuple: not supported because there is only one key
- * HeapTuple(for cluster): not supported yet
- * IndexTuple(gist): not supported yet
- * IndexTuple(brin): not supported yet
+ * HeapTuple: supported IndexTuple(btree): supported IndexTuple(hash):
+ * not supported because there is only one key DatumTuple: not
+ * supported because there is only one key HeapTuple(for cluster): not
+ * supported yet IndexTuple(gist): not supported yet IndexTuple(brin):
+ * not supported yet
*/
if (enable_mk_sort &&
state->base.nKeys > 1 &&
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 412e5b9b588..a41d4daa4b3 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -93,40 +93,40 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
-static Datum mkqs_get_datum_heap(SortTuple *x,
- const int tupleIndex,
- const int depth,
+static Datum mkqs_get_datum_heap(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
Tuplesortstate *state,
- Datum *datum,
- bool *isNull,
- bool useFullKey);
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
-static Datum mkqs_get_datum_index_btree(SortTuple *x,
- const int tupleIndex,
- const int depth,
+static Datum mkqs_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
Tuplesortstate *state,
- Datum *datum,
- bool *isNull,
- bool useFullKey);
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
static void
-mkqs_handle_dup_index_btree(SortTuple *x,
- const int tupleCount,
- const bool seenNull,
- Tuplesortstate *state);
+ mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
static int
-mkqs_compare_equal_index_btree(const SortTuple *a,
- const SortTuple *b,
- Tuplesortstate *state);
+ mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
static inline int
-tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
- const IndexTuple tuple2);
+ tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
static inline void
-raise_error_of_dup_index(IndexTuple x,
- Tuplesortstate *state);
+ raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
@@ -1918,18 +1918,18 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
* See comparetup_heap() for details.
*/
static Datum
-mkqs_get_datum_heap(SortTuple *x,
- int tupleIndex,
- int depth,
+mkqs_get_datum_heap(SortTuple *x,
+ int tupleIndex,
+ int depth,
Tuplesortstate *state,
- Datum *datum,
- bool *isNull,
- bool useFullKey)
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
{
- TupleDesc tupDesc = NULL;
+ TupleDesc tupDesc = NULL;
HeapTupleData heapTuple;
- AttrNumber attno;
- SortTuple *sortTuple = x + tupleIndex;
+ AttrNumber attno;
+ SortTuple *sortTuple = x + tupleIndex;
TuplesortPublic *base = TuplesortstateGetPublic(state);
SortSupport sortKey = base->sortKeys + depth;;
@@ -1942,7 +1942,7 @@ mkqs_get_datum_heap(SortTuple *x,
*/
AssertImply(useFullKey, depth == 0);
- tupDesc = (TupleDesc)base->arg;
+ tupDesc = (TupleDesc) base->arg;
/*
* When useFullKey is false, and the first datum is requested, return the
@@ -1979,16 +1979,16 @@ mkqs_get_datum_heap(SortTuple *x,
* See comparetup_index_btree() for details.
*/
static Datum
-mkqs_get_datum_index_btree(SortTuple *x,
- const int tupleIndex,
- const int depth,
+mkqs_get_datum_index_btree(SortTuple *x,
+ const int tupleIndex,
+ const int depth,
Tuplesortstate *state,
- Datum *datum,
- bool *isNull,
- bool useFullKey)
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey)
{
- TupleDesc tupDesc;
- IndexTuple indexTuple;
+ TupleDesc tupDesc;
+ IndexTuple indexTuple;
SortTuple *sortTuple = x + tupleIndex;
TuplesortPublic *base = TuplesortstateGetPublic(state);
TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
@@ -2031,9 +2031,9 @@ mkqs_get_datum_index_btree(SortTuple *x,
* tupleCount: count of the tuples
*/
static void
-mkqs_handle_dup_index_btree(SortTuple *x,
- const int tupleCount,
- const bool seenNull,
+mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
Tuplesortstate *state)
{
TuplesortPublic *base = TuplesortstateGetPublic(state);
@@ -2043,11 +2043,10 @@ mkqs_handle_dup_index_btree(SortTuple *x,
if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
{
/*
- * x means the first tuple of duplicated tuple list
- * Since they are duplicated, simply pick up the first one
- * to raise error
+ * x means the first tuple of duplicated tuple list Since they are
+ * duplicated, simply pick up the first one to raise error
*/
- raise_error_of_dup_index((IndexTuple)(x->tuple), state);
+ raise_error_of_dup_index((IndexTuple) (x->tuple), state);
}
/*
@@ -2069,10 +2068,10 @@ mkqs_handle_dup_index_btree(SortTuple *x,
static int
mkqs_compare_equal_index_btree(const SortTuple *a,
const SortTuple *b,
- Tuplesortstate *state)
+ Tuplesortstate *state)
{
- IndexTuple tuple1;
- IndexTuple tuple2;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
@@ -2108,26 +2107,26 @@ tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
/* Raise error for duplicated tuple when creating unique index */
static inline void
-raise_error_of_dup_index(IndexTuple x,
+raise_error_of_dup_index(IndexTuple x,
Tuplesortstate *state)
{
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- TupleDesc tupDesc;
- char *key_desc;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
TuplesortPublic *base = TuplesortstateGetPublic(state);
TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
tupDesc = RelationGetDescr(arg->index.indexRel);
- index_deform_tuple((IndexTuple)x, tupDesc, values, isnull);
+ index_deform_tuple((IndexTuple) x, tupDesc, values, isnull);
key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
}
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 74a6a5ae5ce..380a106789c 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -158,19 +158,19 @@ typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
/* Multi-key quick sort */
typedef Datum
-(*MkqsGetDatumFunc) (SortTuple *x,
- const int tupleIndex,
- const int depth,
- Tuplesortstate *state,
- Datum *datum,
- bool *isNull,
- bool useFullKey);
+ (*MkqsGetDatumFunc) (SortTuple *x,
+ const int tupleIndex,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum,
+ bool *isNull,
+ bool useFullKey);
typedef void
-(*MkqsHandleDupFunc) (SortTuple *x,
- const int tupleCount,
- const bool seenNull,
- Tuplesortstate *state);
+ (*MkqsHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
/*
* The public part of a Tuple sort operation state. This data structure
@@ -269,17 +269,15 @@ typedef struct
/*
* Function pointer, referencing a function to get specified datum from
- * SortTuple list with multi-key.
- * Used by mk_qsort_tuple().
- */
+ * SortTuple list with multi-key. Used by mk_qsort_tuple().
+ */
MkqsGetDatumFunc mkqsGetDatumFunc;
/*
* Function pointer, referencing a function to handle duplicated tuple
- * from SortTuple list with multi-key.
- * Used by mk_qsort_tuple().
- * For now, the function pointer is filled for only btree index tuple.
- */
+ * from SortTuple list with multi-key. Used by mk_qsort_tuple(). For now,
+ * the function pointer is filled for only btree index tuple.
+ */
MkqsHandleDupFunc mkqsHandleDupFunc;
} TuplesortPublic;
--
2.45.1
hi Tomas,
So many thanks for your kind response and detailed report. I am working
on locating issues based on your report/script and optimizing code, and
will update later.
Could you please also send me the script to generate report pdf
from the test results (explain*.log)? I can try to make one by myself,
but I'd like to get a report exactly the same as yours. It's really
helpful.
Thanks in advance.
Yao Wang
On Mon, Jun 10, 2024 at 5:09 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Hello Yao,
I was interested in the patch, considering the promise of significant
speedups of sorting, so I took a quick look and did some basic perf
testing today. Unfortunately, my benchmarks don't really confirm any
performance benefits, so I haven't looked at the code very much and only
have some very basic feedback:
1) The new GUC is missing from the .sample config, triggering a failure
of "make check-world". Fixed by 0002.
2) There's a place mixing tabs/spaces in indentation. Fixed by 0003.
3) I tried running pgindent, mostly to see how that would affect the
comments, and for most it's probably fine, but a couple are mangled
(usually those with a numbered list of items). The patch might need some changes
to use formatting that's not reformatted like this. The changes from
pgindent are in 0004, but this is not a fix - it just shows the changes
after running pgindent.
Now, regarding the performance tests - I decided to do the usual black
box testing, i.e. generate tables with varying numbers of columns, data
types, different data distribution (random, correlated, ...) and so on.
And then run simple ORDER BY queries on that, measuring timing with and
without mk-sort, and checking the effect.
So I wrote a simple bash script (attached) that does exactly that - it
generates a table with 1k - 10M rows, fills it with data (with some
basic simple data distributions), and then runs the queries.
The raw results are too large to attach, I'm only attaching a PDF
showing the summary with a "speedup heatmap" - it's a pivot with the
parameters on the left, and then the GUC and number of columns on top.
So the first group of columns is with enable_mk_sort=off, the second
group with enable_mk_sort=on, and finally the heatmap with relative
timing (enable_mk_sort=on / enable_mk_sort=off).
So values <100% mean it got faster (green color - good), and values
>100% mean it got slower (red - bad). And the thing is - pretty much
everything is red, often in the 200%-300% range, meaning it got 2x-3x
slower. There are only very few combinations where it got faster. That
does not seem very promising ... but maybe I did something wrong?
After seeing this, I took a look at your example again, which showed
some nice speedups. But it seems very dependent on the order of keys in
the ORDER BY clause. For example consider this:
set enable_mk_sort = off;
explain (analyze, timing off)
select * from t1 order by c6, c5, c4, c3, c2, c1;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c6, c5, c4, c3, c2, c1
Sort Method: quicksort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.054 ms
Execution Time: 1095.183 ms
(6 rows)
set enable_mk_sort = on;
explain (analyze, timing off)
select * from t1 order by c6, c5, c4, c3, c2, c1;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c6, c5, c4, c3, c2, c1
Sort Method: multi-key quick sort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.130 ms
Execution Time: 633.635 ms
(6 rows)
Which seems great, but let's reverse the sort keys:
set enable_mk_sort = off;
explain (analyze, timing off)
select * from t1 order by c1, c2, c3, c4, c5, c6;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c1, c2, c3, c4, c5, c6
Sort Method: quicksort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.146 ms
Execution Time: 170.085 ms
(6 rows)
set enable_mk_sort = on;
explain (analyze, timing off)
select * from t1 order by c1, c2, c3, c4, c5, c6;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=72328.81..73578.81 rows=499999 width=76)
(actual rows=499999 loops=1)
Sort Key: c1, c2, c3, c4, c5, c6
Sort Method: multi-key quick sort Memory: 59163kB
-> Seq Scan on t1 (cost=0.00..24999.99 rows=499999 width=76)
(actual rows=499999 loops=1)
Planning Time: 0.127 ms
Execution Time: 367.263 ms
(6 rows)
I believe this is the case Heikki was asking about. I see the response
was that it's OK and the overhead is very low, but without too much
detail so I don't know what case you measured.
Anyway, I think it seems to be very sensitive to the exact data set.
Which is not entirely surprising, I guess - most optimizations have a
mix of improved/regressed cases, yielding a heatmap with a mix of green
and red areas, and we have to either optimize the code (or heuristics to
enable the feature), or convince ourselves the "red" cases are less
important / unlikely etc.
But here the results are almost universally "red", so it's going to be
very hard to convince ourselves this is a good trade off. Of course, you
may argue the cases I've tested are wrong and not representative. I
don't think that's the case, though.
It's also interesting (and perhaps a little bit bizarre) that almost all
the cases that got better are for a single-column sort. Which is exactly
the case the patch should not affect. But it seems pretty consistent, so
maybe this is something worth investigating.
FWIW I'm not familiar with the various quicksort variants, but I noticed
that the Bentley & Sedgewick paper mentioned as the basis for the patch
is from 1997, and apparently implements stuff originally proposed by
Hoare in 1961. So maybe this is just an example of an algorithm that was
good for the hardware of that time, but the changes (e.g. the growing
importance of on-CPU caches) made it less relevant?
Another thing I noticed while skimming [1] is this:
  The algorithm is designed to exploit the property that in many
  problems, strings tend to have shared prefixes.
If that's the case, isn't it wrong to apply this to all sorts, including
sorts with non-string keys? It might explain why your example works OK,
as it involves key c6 which is string with all values sharing the same
(fairly long) prefix. But then maybe we should be careful and restrict
this to only those cases?
regards
[1] https://en.wikipedia.org/wiki/Multi-key_quicksort
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 6/14/24 13:20, Yao Wang wrote:
hi Tomas,
So many thanks for your kind response and detailed report. I am working
on locating issues based on your report/script and optimizing code, and
will update later.
Could you please also send me the script to generate report pdf
from the test results (explain*.log)? I can try to make one by myself,
but I'd like to get a report exactly the same as yours. It's really
helpful.
I don't have a script for that. I simply load the results into a
spreadsheet, do a pivot table to "aggregate and reshuffle" it a bit, and
then add a heatmap. I use google sheets for this, but any other
spreadsheet should handle this too, I think.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jun 14, 2024 at 6:20 PM Yao Wang <yao-yw.wang@broadcom.com> wrote:
hi Tomas,
So many thanks for your kind response and detailed report. I am working
on locating issues based on your report/script and optimizing code, and
will update later.
Hi,
This is an interesting proof-of-concept!
Given the above, I've set this CF entry to "waiting on author".
Also, I see you've added Heikki as a reviewer. I'm not sure how others
think, but I consider a "reviewer" in the CF app to be someone who has
volunteered to be responsible to help move this patch forward. If
there is a name in the reviewer column, it may discourage others from
doing review. It can also happen that people ping reviewers to ask
"There's been no review for X months -- are you planning on looking at
this?", and it's not great if that message is a surprise.
Note that we prefer not to top-post in emails since it makes our web
archive more difficult to read.
Thanks,
John
Hi John,
Thanks for your kind message. I talked to Heikki before getting Tomas's
response, and he said "no promise but I will take a look". That's why I
added his email. I have updated the CF entry and added Tomas as reviewer.
Hi Tomas,
Again, I'd like to say a big thank you. The report and script are really, really
helpful, and your ideas are very valuable.
Firstly, the expectation of mksort performance:
1. When mksort works well, it should be faster than qsort because it saves
the cost of comparing duplicated values every time.
2. When all values are distinct in a particular column, the comparison
will finish immediately, and mksort will effectively fall back to qsort. In
that case, mksort should be equal to or a bit slower than qsort because it needs
to maintain more complex state.
Generally, the benefit of mksort comes mainly from duplicated values and sort
keys: the more duplicated values and sort keys there are, the bigger the
benefit.
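To illustrate the expectation above, here is a minimal sketch of multi-key
quicksort on rows of integer keys. It is not the patch's mk_qsort_tuple():
the Row type, NKEYS and mksort_rows() are made up for this example, and it
uses a plain three-way partition. The point it shows is that rows tied on the
current key are only compared on the next key from then on, so the tied key
is never compared again.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3					/* hypothetical: number of sort keys per row */

typedef struct Row
{
	int			key[NKEYS];
} Row;

static void
swap_rows(Row *x, size_t a, size_t b)
{
	Row			t = x[a];

	x[a] = x[b];
	x[b] = t;
}

/*
 * Sort x[0..n-1] by key[depth], key[depth + 1], ...  All rows passed in are
 * assumed to be equal on every key before "depth".
 */
void
mksort_rows(Row *x, size_t n, int depth)
{
	size_t		lt = 0;			/* x[0..lt) is less than the pivot */
	size_t		i = 0;			/* x[lt..i) is equal to the pivot */
	size_t		gt = n;			/* x[gt..n) is greater than the pivot */
	int			pivot;

	assert(depth <= NKEYS);
	if (n < 2 || depth == NKEYS)
		return;

	pivot = x[n / 2].key[depth];

	/* Three-way partition that looks at the current key only */
	while (i < gt)
	{
		int			v = x[i].key[depth];

		if (v < pivot)
			swap_rows(x, lt++, i++);
		else if (v > pivot)
			swap_rows(x, i, --gt);
		else
			i++;
	}

	mksort_rows(x, lt, depth);					/* less part: same key */
	mksort_rows(x + lt, gt - lt, depth + 1);	/* equal part: next key */
	mksort_rows(x + gt, n - gt, depth);			/* greater part: same key */
}
```

When a key has no duplicates, the equal part always holds a single row, so
the recursion behaves like an ordinary quicksort on that key. That is the
"falls back to qsort" case, with the extra bookkeeping as overhead.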
Analysis on the report in your previous mail
--------------------------------------------
1. It seems the script uses $count to control the number of duplicated values:
number of repetitions for each value (ndistinct = nrows/count)
However, that does not always hold. For type text, the script generates
values like this:
expr="md5(((i / $count) + random())::text)"
But because random() is added before hashing, md5() produces effectively
distinct values for every row regardless of $count. Some cases
of timestamptz have the same problem.
When all values are distinct, the sort finishes at the first depth and
effectively falls back to qsort.
2. Even for the types with a correct duplication setting, the duplicated ratio
is very small: e.g. with $nrows = 10000 and $count = 100, only 1% duplicated
rows can go to depth 2, and only 0.01% of them can go to depth 3. So the sort
still works on nearly all-distinct values.
3. qsort in PG17 uses a kind of specialization for the tuple comparator, i.e. it
uses specialized functions for different types, e.g. qsort_tuple_unsigned()
for unsigned int. The specialized comparators avoid all type-related checks
and are much faster than the regular comparator. That is why we saw regressions
of 200% or more for those cases.
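As a rough illustration of the numbers in item 2, one way to read them is as
the share of rows in a single group of ties at each depth. The sketch below
assumes every sort column is independent and has nrows / count distinct
values; that independence is an assumption made for this example, and
expected_tied_rows() is a made-up helper, not part of the script or patch.

```c
#include <assert.h>

/*
 * Expected number of rows still tied with a given row after comparing
 * "depth" leading keys, assuming each column is independent and has
 * nrows / count distinct values (hypothetical model).
 */
long
expected_tied_rows(long nrows, long count, int depth)
{
	long		ndistinct;
	long		tied = nrows;

	assert(nrows > 0 && count > 0 && count <= nrows && depth >= 0);

	ndistinct = nrows / count;	/* as in the script's comment */

	/* Each extra key splits a group of ties into ndistinct smaller groups */
	for (int d = 0; d < depth && tied > 0; d++)
		tied /= ndistinct;

	return tied;
}
```

With nrows = 10000 and count = 100, one key leaves groups of 100 rows (1% of
the table) and two keys leave groups of a single row (0.01%), so under this
model there is almost nothing left for the deeper keys to sort.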
Code optimizations I did for mk qsort
-------------------------------------
1. Adapted specialization for the tuple comparator.
2. Use a kind of "hybrid" sort: when we actually adopt bubble sort due to
limited sort items, use bubble sort to check datums from the specified depth.
3. Other optimizations such as a pre-ordered check.
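For the pre-ordered check, the comment in the patch says the array must be
strictly ordered (no equal datums) before the sort can return early. A
minimal sketch of that rule on a plain int array follows; is_strictly_ordered()
is a made-up name and the real code compares datums at the current depth
through the sort key's comparator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true only if keys[0..n-1] is strictly ascending.  A tie returns
 * false, because tied rows still have to be ordered by the next sort key.
 */
bool
is_strictly_ordered(const int *keys, size_t n)
{
	assert(keys != NULL || n == 0);

	/* "i + 1 < n" avoids unsigned underflow when n is 0 */
	for (size_t i = 0; i + 1 < n; i++)
	{
		if (keys[i] >= keys[i + 1])
			return false;
	}
	return true;
}
```

An ordinary quicksort can accept "less than or equal" here, but a multi-key
sort cannot: {1, 2, 2, 3} is ordered on this key, yet the two rows with 2
may still be in the wrong order on the following key.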
Analysis on the new report
--------------------------
I also made some modifications to your script to address the data type issues,
plus an output of the distinct value count/distinct ratio, and an indicator
for improvement/regression. I attached the new script and a report on a
data set with 100,000 rows and 2, 5, 8 columns.
1. Generally, the results match the expectation: "When mksort works well, it
should be faster than qsort; when mksort falls back to qsort, it should be equal
or a bit slower than qsort."
2. For all values of "sequential" (except text type), mksort is a bit slower
than qsort because no actual sort is performed due to the "pre-ordered"
check.
3. For int and bigint type, mksort became faster and faster when
there were more and more duplicated values and sort keys. Improvement of
the best cases is about 58% (line 333) and 57% (line 711).
4. For timestamptz type, mksort is a bit slower than qsort because the
distinct ratio is always 1 for almost all cases. I think more benefit is
available by increasing the duplicated values.
5. For text type, mksort is faster than qsort for all cases, and
improvement of the best case is about 160% (line 1510). It is the only
tested type in which specialization comparators are disabled.
Obviously, text shows a much bigger improvement than the others. I suppose the
cause is the specialization comparators: for the types that have them,
comparing is so fast that the cost saved by mksort is not significant. Only
when the saved cost becomes big enough can mksort beat qsort.
For other types without specialization comparators, mksort can beat
qsort completely. That is the "real" performance of mksort.
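A simplified sketch of why a specialized comparator is cheaper, assuming the
main saving is that the comparison is inlined instead of going through a
per-type function pointer. None of these names are PostgreSQL's; the structs
and functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-key sort information */
typedef int (*sketch_cmp_fn) (uint64_t a, uint64_t b);

typedef struct SketchSortInfo
{
	sketch_cmp_fn cmp;			/* type-specific comparator */
	bool		reverse;		/* true for descending order */
} SketchSortInfo;

int
sketch_cmp_uint64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Generic path: one indirect call per comparison */
int
generic_compare(uint64_t a, uint64_t b, const SketchSortInfo *info)
{
	int			r;

	assert(info != NULL && info->cmp != NULL);
	r = info->cmp(a, b);
	return info->reverse ? -r : r;
}

/* Specialized path: the type is known, so the comparison is inlined */
static inline int
specialized_compare_unsigned(uint64_t a, uint64_t b, bool reverse)
{
	int			r = (a > b) - (a < b);

	return reverse ? -r : r;
}
```

Both paths return the same answer. When a single comparison is this cheap,
skipping a repeated comparison of an already-tied key saves little, which
fits the smaller gains seen for int and bigint compared with text.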
Answers for some other questions you mentioned
----------------------------------------------
Q1: Why are almost all the cases that got better single-column sorts?
A: mksort is enabled only for multi-column sorts. When there is only one
column, qsort is used, so we can simply ignore those cases.
Q2: Why did the perf become worse by just reversing the sort keys?
A: In the example we used, the sort keys are ordered from more duplicated
to less. Please see the SQL:
update t1 set c2 = c1 % 100, c3 = c1 % 50, c4 = c1 % 10, c5 = c1 % 3;
update t1 set c6 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|| (c1 % 5)::text;
So c6 has the most duplicated values, c5 has fewer, and so on. With the order
"c6, c5, c4, ...", mksort can take effect on every sort key.
With the reverse order "c2, c3, c4...", mksort almost finished on the first
sort key (c2) because it has only 1% duplicated values, and actually fell
back to qsort.
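The effect of key order can be seen in a minimal multi-key quicksort sketch.
It assumes rows of int keys and uses a plain three-way partition rather than
the partitioning scheme in the patch; Row, NKEYS, mk_sort() and the
datum_compares counter are hypothetical names for illustration. Rows that land
in the "equal" part are never compared on that key again, so a leading key
with many duplicates produces large equal groups that are handed to the next
key, while a nearly unique leading key leaves almost nothing for the deeper
levels to do.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 2                 /* hypothetical number of sort keys */

typedef struct Row
{
    int key[NKEYS];
} Row;

static long datum_compares = 0; /* counts single-key comparisons */

static void
swap_rows(Row *x, size_t a, size_t b)
{
    Row t = x[a];

    x[a] = x[b];
    x[b] = t;
}

/* Partition on key[depth]; only the equal part advances to depth + 1. */
static void
mk_sort(Row *x, size_t n, int depth)
{
    size_t lt = 0, i = 1, gt = n;
    int pivot;

    assert(depth >= 0 && depth <= NKEYS);
    if (n <= 1 || depth == NKEYS)
        return;

    swap_rows(x, 0, n / 2);     /* middle element as pivot, kept simple */
    pivot = x[0].key[depth];

    /* Invariant: x[0..lt) < pivot, x[lt..i) == pivot, x[gt..n) > pivot */
    while (i < gt)
    {
        int v = x[i].key[depth];

        datum_compares++;
        if (v < pivot)
            swap_rows(x, lt++, i++);
        else if (v > pivot)
            swap_rows(x, i, --gt);
        else
            i++;
    }

    mk_sort(x, lt, depth);                  /* less part: same key */
    mk_sort(x + lt, gt - lt, depth + 1);    /* equal part: next key */
    mk_sort(x + gt, n - gt, depth);         /* greater part: same key */
}
```

Resetting datum_compares and sorting the same rows with the key columns
swapped is one way to compare the two key orders on a small data set.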
Based on the new code, I reran the example, and got about 141% improvement
for order "c6, c5, c4...", and about 4% regression for order
"c2, c3, c4...".
Q3: Does mksort work effectively only for a particular type, e.g. string?
A: No, the implementation of mksort does not distinguish data types for
special handling. It just calls the existing comparators, which are also
used by qsort. I used a long prefix for the strings just to enlarge the time
cost of comparing and amplify the result. The new report shows mksort
can work effectively on non-string types and on strings without a long prefix.
Q4: Was the algorithm good for the hardware at that time, but the changes
(e.g. the growing importance of on-CPU caches) made it less relevant?
A: As I understand it, the answer is no, because the benefit of mksort
comes from saving the cost of duplicated comparisons, which is not related to
hardware. I suppose the new report can prove it.
However, hardware differences definitely affect the perf, especially
considering that the perf difference between mksort and qsort is not so
big when mksort falls back to qsort. I am not able to test on a wide
range of hardware, so any findings are appreciated.
Potential room for improvement
------------------------------
I tried some other optimizations but didn't add the code in the end because
the benefit is uncertain and/or the implementation is complex. I just
raise them here for more discussion if necessary:
1. Use distinct stats info of table to enable mksort
It's a kind of heuristic: in the optimizer, check
Form_pg_statistic->stadistinct of a table via pg_statistic. Enable mksort
only when it is less than a threshold.
The hacked code works, but it needs to modify a couple of optimizer
interfaces. In addition, a complete solution should consider the types and
distinct values of all columns, which might be too complex, and the benefit
seems not so big.
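A minimal sketch of such a heuristic is below. It relies on the documented
meaning of stadistinct (a positive value is a distinct count, a negative value
is the negated fraction of the row count, zero means unknown). The function
names and the threshold parameter are assumptions for illustration, not code
from the hacked version.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Convert a stadistinct-style value into an estimated distinct ratio in
 * (0, 1]. Returns a negative value when the estimate is unknown.
 */
static double
distinct_ratio(double stadistinct, double ntuples)
{
    assert(ntuples > 0);
    if (stadistinct < 0)
        return -stadistinct;    /* already a fraction of the row count */
    if (stadistinct > 0)
        return (stadistinct < ntuples) ? stadistinct / ntuples : 1.0;
    return -1.0;                /* 0 means unknown */
}

/* Hypothetical decision: prefer mksort only when duplicates are common. */
static bool
mksort_looks_worthwhile(double stadistinct, double ntuples, double threshold)
{
    double ratio = distinct_ratio(stadistinct, ntuples);

    return ratio >= 0 && ratio < threshold;
}
```

This only looks at one column. As noted above, a complete solution would have
to combine the estimates of all sort columns.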
2. Cache of datum positions
e.g. for heap tuples, we need to extract the datum position from SortTuple by
extract_heaptuple_from_sorttuple() for comparing, which is executed
for each datum. By comparison, qsort does it once for each tuple.
Theoretically we can create a cache to remember the datum positions to
avoid duplicated extraction.
The hacked code works, but the improvement seems limited. Not sure if there
is more room for improvement.
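The general idea of such a cache can be sketched on a toy row format. This is
not the PostgreSQL tuple layout and not the hacked code: the length-prefixed
format, OffsetCache, field_offset_walk() and field_offset_cached() are all
assumptions made up for this example. The point is only that a position found
once can be remembered and reused when the next key of the same row is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FIELDS 8            /* hypothetical upper bound on fields */

/* Toy format: each field is one length byte followed by its payload. */
static size_t
field_offset_walk(const uint8_t *row, int fieldno)
{
    size_t off = 0;

    for (int i = 0; i < fieldno; i++)
        off += 1 + (size_t) row[off];
    return off;
}

typedef struct OffsetCache
{
    int    nvalid;              /* offsets[0..nvalid) are filled */
    size_t offsets[MAX_FIELDS];
} OffsetCache;

/* Same result as field_offset_walk(), but resumes from the last known field. */
static size_t
field_offset_cached(const uint8_t *row, int fieldno, OffsetCache *cache)
{
    assert(fieldno >= 0 && fieldno < MAX_FIELDS);
    if (cache->nvalid == 0)
    {
        cache->offsets[0] = 0;
        cache->nvalid = 1;
    }
    while (cache->nvalid <= fieldno)
    {
        size_t prev = cache->offsets[cache->nvalid - 1];

        cache->offsets[cache->nvalid] = prev + 1 + (size_t) row[prev];
        cache->nvalid++;
    }
    return cache->offsets[fieldno];
}
```

The cache costs extra memory per row, which may be one reason the measured
improvement was limited.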
3. Template mechanism
Qsort uses a kind of template mechanism by macro (see sort_template.h), which
avoids the cost of runtime type checks. Theoretically the template mechanism
can be applied to mksort, but I am hesitant because it would add more
complexity and make the code difficult to maintain.
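To show what "template by macro" means in plain C, here is a small sketch.
DEFINE_INSERTION_SORT is a hypothetical macro written for this example and is
not the interface of sort_template.h. It expands to one sort function per
element type and comparison expression, so the comparison is compiled inline
instead of being called through a function pointer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical template macro: generates a function "name" that sorts an
 * array of "type". cmp_expr sees the two elements as pointers a and b.
 */
#define DEFINE_INSERTION_SORT(name, type, cmp_expr)     \
static void                                             \
name(type *x, size_t n)                                 \
{                                                       \
    assert(x != NULL || n == 0);                        \
    for (size_t m = 1; m < n; m++)                      \
    {                                                   \
        for (size_t l = m; l > 0; l--)                  \
        {                                               \
            type *a = &x[l - 1];                        \
            type *b = &x[l];                            \
            type tmp;                                   \
                                                        \
            if ((cmp_expr) <= 0)                        \
                break;                                  \
            tmp = *a;                                   \
            *a = *b;                                    \
            *b = tmp;                                   \
        }                                               \
    }                                                   \
}

/* Two specializations generated from the same template. */
DEFINE_INSERTION_SORT(sort_int32, int, (*a > *b) - (*a < *b))
DEFINE_INSERTION_SORT(sort_double, double, (*a > *b) - (*a < *b))
```

Each specialization is a separate copy of the code, which is the maintenance
cost mentioned above: every variant of mksort would need its own expansion.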
Please let me know your opinion, thanks!
Yao Wang
Attachments:
v5-Implement-multi-key-quick-sort.patch (application/octet-stream)
From 49e4c3804e42e524358545d9409f5bd905a60c6c Mon Sep 17 00:00:00 2001
From: Yao Wang <yaowangm@outlook.com>
Date: Tue, 7 May 2024 08:11:13 +0000
Subject: [PATCH] Implement multi-key quick sort
MK qsort (multi-key quick sort) is an alternative to the standard qsort algorithm,
which has better performance for particular sort scenarios, i.e. the data set
has multiple keys to be sorted. Compared to classic quick sort, it can get
significant performance improvement once multiple keys are available.
Author: Yao Wang <yao-yw.wang@broadcom.com>
Co-author: Hongxu Ma <hongxu.ma@broadcom.com>
---
src/backend/utils/misc/guc_tables.c | 11 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/backend/utils/sort/mk_qsort_tuple.c | 711 ++++++++++++++++++
src/backend/utils/sort/tuplesort.c | 62 ++
src/backend/utils/sort/tuplesortvariants.c | 297 +++++++-
src/include/c.h | 4 +
src/include/utils/tuplesort.h | 46 +-
src/test/regress/expected/geometry.out | 4 +-
.../regress/expected/incremental_sort.out | 12 +-
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/expected/tuplesort.out | 409 ++++++++++
src/test/regress/expected/window.out | 58 +-
src/test/regress/sql/geometry.sql | 2 +-
src/test/regress/sql/tuplesort.sql | 85 +++
src/test/regress/sql/window.sql | 22 +-
15 files changed, 1641 insertions(+), 86 deletions(-)
create mode 100644 src/backend/utils/sort/mk_qsort_tuple.c
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3fd0b14dd8..5aee20f422 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -103,6 +103,7 @@ extern char *default_tablespace;
extern char *temp_tablespaces;
extern bool ignore_checksum_failure;
extern bool ignore_invalid_pages;
+extern bool enable_mk_sort;
#ifdef TRACE_SYNCSCAN
extern bool trace_syncscan;
@@ -839,6 +840,16 @@ struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_mk_sort", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables multi-key sort"),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_mk_sort,
+ true,
+ NULL, NULL, NULL
+ },
{
{"enable_hashagg", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of hashed aggregation plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2166ea4a87..e1bf50370e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -413,6 +413,7 @@
#enable_sort = on
#enable_tidscan = on
#enable_group_by_reordering = on
+#enable_mk_sort = on
# - Planner Cost Constants -
diff --git a/src/backend/utils/sort/mk_qsort_tuple.c b/src/backend/utils/sort/mk_qsort_tuple.c
new file mode 100644
index 0000000000..032c77794f
--- /dev/null
+++ b/src/backend/utils/sort/mk_qsort_tuple.c
@@ -0,0 +1,711 @@
+/*
+ * MK qsort (multi-key quick sort) is an alternative of standard qsort
+ * algorithm, which has better performance for particular sort scenarios, i.e.
+ * the data set has multiple keys to be sorted.
+ *
+ * The sorting algorithm blends Quicksort and radix sort; Like regular
+ * Quicksort, it partitions its input into sets less than and greater than a
+ * given value; like radix sort, it moves on to the next field once the current
+ * input is known to be equal in the given field.
+ *
+ * The implementation is based on the paper:
+ * Jon L. Bentley and Robert Sedgewick, "Fast Algorithms for Sorting and
+ * Searching Strings", Jan 1997
+ *
+ * Some improvements related to additional handling for equal tuples have
+ * been adopted to keep consistency with the implementation of postgres
+ * qsort.
+ *
+ * For now, mk_qsort_tuple() is called in tuplesort_sort_memtuples() as a
+ * replacement of qsort_tuple() when specific conditions are satisfied.
+ */
+
+/* Swap two tuples in sort tuple array */
+static inline void
+mkqs_swap(int a,
+ int b,
+ SortTuple *x)
+{
+ SortTuple t;
+
+ if (a == b)
+ return;
+ t = x[a];
+ x[a] = x[b];
+ x[b] = t;
+}
+
+/* Swap tuples by batch in sort tuple array */
+static inline void
+mkqs_vec_swap(int a,
+ int b,
+ int size,
+ SortTuple *x)
+{
+ while (size-- > 0)
+ {
+ mkqs_swap(a, b, x);
+ a++;
+ b++;
+ }
+}
+
+/*
+ * Check whether current datum (at specified tuple and depth) is null
+ * Note that the input x means a specified tuple provided by caller but not
+ * a tuple array, so tupleIndex is unnecessary.
+ */
+static inline bool
+check_datum_null(SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum;
+ bool isNull;
+
+ Assert(depth < state->base.nKeys);
+
+ if (depth == 0)
+ return x->isnull1;
+
+ state->base.mkqsGetDatumFunc(x, NULL, depth, state,
+ &datum, &isNull, NULL, NULL);
+
+ return isNull;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * If "abbreviated key" is disabled:
+ * get specified datums and compare them by ApplySortComparator().
+ * If "abbreviated key" is enabled:
+ * Only first datum may be abbr key according to the design (see the comments
+ * of struct SortTuple), so different operations are needed for different
+ * datum.
+ * For first datum (depth == 0): get first datums ("abbr key" version) and
+ * compare them by ApplySortComparator(). If they are equal, get "full"
+ * version and compare again by ApplySortAbbrevFullComparator().
+ * For other datums: get specified datums and compare them by
+ * ApplySortComparator() as regular routine does.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mkqs_compare_datum_tiebreak(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ Datum datum1,
+ datum2;
+ bool isNull1,
+ isNull2;
+ SortSupport sortKey;
+ int ret = 0;
+
+ Assert(state->base.mkqsGetDatumFunc);
+ Assert(depth < state->base.nKeys);
+
+ sortKey = state->base.sortKeys + depth;
+ state->base.mkqsGetDatumFunc(tuple1,
+ tuple2,
+ depth,
+ state,
+ &datum1,
+ &isNull1,
+ &datum2,
+ &isNull2);
+
+ /*
+ * If "abbreviated key" is enabled, and we are in the first depth, it
+ * means only "abbreviated keys" was compared. If the two datums were
+ * determined to be equal by ApplySortComparator() in
+ * mkqs_compare_datum(), we need to perform an extra "full" comparing
+ * by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter &&
+ depth == 0)
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ else
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+
+ return ret;
+}
+
+/*
+ * Compare two tuples at first depth by some shortcuts
+ *
+ * The reason to use MkqsCompFuncType but not compare function pointers
+ * directly is just for performance.
+ */
+static inline int
+mkqs_compare_datum_by_shortcut(SortTuple *tuple1,
+ SortTuple *tuple2,
+ Tuplesortstate *state)
+{
+ int ret = 0;
+ MkqsCompFuncType compFuncType = state->base.mkqsCompFuncType;
+ SortSupport sortKey = &state->base.sortKeys[0];
+
+ if (compFuncType == MKQS_COMP_FUNC_UNSIGNED)
+ ret = ApplyUnsignedSortComparator(tuple1->datum1,
+ tuple1->isnull1,
+ tuple2->datum1,
+ tuple2->isnull1,
+ sortKey);
+ else if (compFuncType == MKQS_COMP_FUNC_SIGNED)
+ ret = ApplySignedSortComparator(tuple1->datum1,
+ tuple1->isnull1,
+ tuple2->datum1,
+ tuple2->isnull1,
+ sortKey);
+ else if (compFuncType == MKQS_COMP_FUNC_INT32)
+ ret = ApplyInt32SortComparator(tuple1->datum1,
+ tuple1->isnull1,
+ tuple2->datum1,
+ tuple2->isnull1,
+ sortKey);
+ else
+ {
+ Assert(compFuncType == MKQS_COMP_FUNC_GENERIC);
+ ret = ApplySortComparator(tuple1->datum1,
+ tuple1->isnull1,
+ tuple2->datum1,
+ tuple2->isnull1,
+ sortKey);
+ }
+
+ return ret;
+}
+
+/*
+ * Compare two tuples at specified depth
+ *
+ * Firstly try to call some shortcuts by mkqs_compare_datum_by_shortcut(),
+ * which are much faster because they just compare leading sort keys; if they
+ * are equal, call mkqs_compare_datum_tiebreak().
+ *
+ * The reason to use MkqsCompFuncType but not compare function pointers
+ * directly is just for performance.
+ *
+ * See comparetup_heap() for details.
+ */
+static inline int
+mkqs_compare_datum(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret = 0;
+
+ if (depth == 0)
+ {
+ ret = mkqs_compare_datum_by_shortcut(tuple1, tuple2, state);
+
+ if (ret != 0)
+ return ret;
+
+ /*
+ * If they are equal and it is not an abbr key, no need to
+ * continue.
+ */
+ if (!state->base.sortKeys->abbrev_converter)
+ return ret;
+ }
+
+ ret = mkqs_compare_datum_tiebreak(tuple1,
+ tuple2,
+ depth,
+ state);
+
+ return ret;
+}
+
+/* Find the median of three values */
+static inline int
+get_median_from_three(int a,
+ int b,
+ int c,
+ SortTuple *x,
+ int depth,
+ Tuplesortstate *state)
+{
+ return mkqs_compare_datum(x + a, x + b, depth, state) < 0 ?
+ (mkqs_compare_datum(x + b, x + c, depth, state) < 0 ?
+ b : (mkqs_compare_datum(x + a, x + c, depth, state) < 0 ? c : a))
+ : (mkqs_compare_datum(x + b, x + c, depth, state) > 0 ?
+ b : (mkqs_compare_datum(x + a, x + c, depth, state) < 0 ? a : c));
+}
+
+/*
+ * Compare two tuples from the specified depth through the last depth
+ *
+ * Caller should guarantee that all datums before specified depth
+ * are equal. The function is used by bubble sort in the middle of
+ * mk qsort.
+ */
+static inline int
+mkqs_compare_tuple_by_range(SortTuple *tuple1,
+ SortTuple *tuple2,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret = 0;
+ Datum datum1,
+ datum2;
+ bool isNull1,
+ isNull2;
+ SortSupport sortKey;
+ TuplesortPublic *base = NULL;
+
+ if (depth == 0)
+ {
+ ret = mkqs_compare_datum_by_shortcut(tuple1, tuple2, state);
+
+ if (ret != 0)
+ return ret;
+
+ base = TuplesortstateGetPublic(state);
+ sortKey = base->sortKeys + depth;
+
+ Assert(base->mkqsGetDatumFunc);
+ Assert(depth < base->nKeys);
+
+ /*
+ * If "abbreviated key" is enabled, and we are in the first depth, it
+ * means only "abbreviated keys" was compared. If the two datums were
+ * determined to be equal by ApplySortComparator() in
+ * mkqs_compare_datum(), we need to perform an extra "full" comparing
+ * by ApplySortAbbrevFullComparator().
+ */
+ if (sortKey->abbrev_converter)
+ {
+ base->mkqsGetDatumFunc(tuple1,
+ tuple2,
+ depth,
+ state,
+ &datum1,
+ &isNull1,
+ &datum2,
+ &isNull2);
+ ret = ApplySortAbbrevFullComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+ if (ret != 0)
+ return ret;
+ }
+
+ /*
+ * By now, all work for the first depth has been done. Move the
+ * depth and sortKey to the next level.
+ */
+ depth++;
+ sortKey++;
+ }
+
+ /*
+ * Init base/sortKey because they have not been initialized
+ * if the initial depth > 0
+ */
+ if (base == NULL) {
+ base = TuplesortstateGetPublic(state);
+ sortKey = base->sortKeys + depth;
+
+ Assert(base->mkqsGetDatumFunc);
+ Assert(depth < base->nKeys);
+ }
+
+ while (depth < base->nKeys)
+ {
+ base->mkqsGetDatumFunc(tuple1,
+ tuple2,
+ depth,
+ state,
+ &datum1,
+ &isNull1,
+ &datum2,
+ &isNull2);
+
+ ret = ApplySortComparator(datum1,
+ isNull1,
+ datum2,
+ isNull2,
+ sortKey);
+
+ if (ret != 0)
+ return ret;
+
+ depth++;
+ sortKey++;
+ }
+
+ Assert(ret == 0);
+ return 0;
+}
+
+/*
+ * Compare two tuples by using interfaces of qsort()
+ */
+static inline int
+mkqs_compare_tuple(SortTuple *a, SortTuple *b, Tuplesortstate *state)
+{
+ int ret = 0;
+ MkqsCompFuncType compFuncType = state->base.mkqsCompFuncType;
+
+ /*
+ * The function should never be called with
+ * MKQS_COMP_FUNC_GENERIC
+ */
+ Assert(compFuncType != MKQS_COMP_FUNC_GENERIC);
+
+ if (compFuncType == MKQS_COMP_FUNC_UNSIGNED)
+ ret = qsort_tuple_unsigned_compare(a, b, state);
+ else if (compFuncType == MKQS_COMP_FUNC_SIGNED)
+ ret = qsort_tuple_signed_compare(a, b, state);
+ else if (compFuncType == MKQS_COMP_FUNC_INT32)
+ ret = qsort_tuple_int32_compare(a, b, state);
+ else
+ Assert(false);
+
+ return ret;
+}
+
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Verify whether the SortTuple list is ordered or not at specified depth
+ */
+static void
+mkqs_verify(SortTuple *x,
+ int n,
+ int depth,
+ Tuplesortstate *state)
+{
+ int ret;
+
+ for (int i = 0; i < n - 1; i++)
+ {
+ ret = mkqs_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ Assert(ret <= 0);
+ }
+}
+#endif
+
+/*
+ * Main routine of multi-key quick sort
+ *
+ * seenNull indicates whether we have seen NULL in any datum we checked
+ */
+static void
+mk_qsort_tuple(SortTuple *x,
+ size_t n,
+ int depth,
+ Tuplesortstate *state,
+ bool seenNull)
+{
+ /*
+ * In the process, the tuple array consists of five parts: left equal,
+ * less, not-processed, greater, right equal
+ *
+ * lessStart: the first position of the less part; lessEnd: the next
+ * position after the less part; greaterStart: the prior position before
+ * the greater part; greaterEnd: the last position of the greater part.
+ * The range between lessEnd and greaterStart (inclusive) is
+ * not-processed.
+ */
+ int lessStart,
+ lessEnd,
+ greaterStart,
+ greaterEnd,
+ tupCount;
+ int32 dist;
+ SortTuple *pivot;
+ bool isDatumNull;
+
+ Assert(depth <= state->base.nKeys);
+ Assert(state->base.sortKeys);
+ Assert(state->base.mkqsGetDatumFunc);
+
+ if (n <= 1)
+ return;
+
+ /* If we have exceeded the max depth, return immediately */
+ if (depth == state->base.nKeys)
+ return;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Pre-ordered check */
+ if (state->base.mkqsCompFuncType != MKQS_COMP_FUNC_GENERIC)
+ {
+ /*
+ * If there is specialized comparator for the type, use classic
+ * pre-ordered check by comparing the entire tuples.
+ * The check is performed only for first depth since we compare
+ * entire tuples but not datums.
+ */
+ if (depth == 0)
+ {
+ int ret;
+ bool preOrdered = true;
+
+ for (int i = 0; i < n - 1; i++)
+ {
+
+ CHECK_FOR_INTERRUPTS();
+ ret = mkqs_compare_tuple(x + i, x + i + 1, state);
+ if (ret > 0)
+ {
+ preOrdered = false;
+ break;
+ }
+ }
+
+ if (preOrdered)
+ return;
+ }
+ }
+ else
+ {
+ /*
+ * If there is no specialized comparator for the type, perform
+ * pre-ordered check by comparing datums at each depth.
+ *
+ * Different from qsort_tuple(), the array must be strict ordered (no
+ * equal datums). If there are equal datums, we must continue the mk
+ * qsort process to check datums on lower depth.
+ *
+ * Note uniqueness check is unnecessary here because strict ordered
+ * array guarantees no duplicate.
+ */
+ int ret;
+ bool strictOrdered = true;
+
+ for (int i = 0; i < n - 1; i++)
+ {
+ CHECK_FOR_INTERRUPTS();
+ ret = mkqs_compare_datum(x + i,
+ x + i + 1,
+ depth,
+ state);
+ if (ret >= 0)
+ {
+ strictOrdered = false;
+ break;
+ }
+ }
+
+ if (strictOrdered)
+ return;
+ }
+
+ /*
+ * When the count < 16 and no need to handle duplicated tuples, use
+ * bubble sort.
+ *
+ * Use 16 instead of 7 which is used in standard qsort, because mk qsort
+ * need more cost to maintain more complex state.
+ *
+ * Bubble sort is not applicable for scenario of handle duplicated tuples
+ * because it is difficult to check NULL effectively.
+ *
+ * No need to check for interrupts since the data size is pretty small.
+ *
+ * TODO: Can we check NULL for bubble sort with minimal cost?
+ */
+ if (n < 16 && !state->base.mkqsHandleDupFunc)
+ {
+ for (int m = 0;m < n;m++)
+ for (int l = m; l > 0; l--)
+ {
+ if (mkqs_compare_tuple_by_range(x + l - 1, x + l, depth, state)
+ <= 0)
+ break;
+ mkqs_swap(l, l - 1, x);
+ }
+ return;
+ }
+
+ /* Select pivot by median of three (of three medians if n > 40) and move it to the first position */
+ if (n > 7)
+ {
+ int m, l, r, d;
+ m = n / 2;
+ l = 0;
+ r = n - 1;
+ if (n > 40)
+ {
+ d = n / 8;
+ l = get_median_from_three(l, l + d, l + 2 * d, x, depth, state);
+ m = get_median_from_three(m - d, m, m + d, x, depth, state);
+ r = get_median_from_three(r - 2 * d, r - d, r, x, depth, state);
+ }
+ lessStart = get_median_from_three(l, m, r, x, depth, state);
+ }
+ else
+ lessStart = n / 2;
+
+ mkqs_swap(0, lessStart, x);
+ pivot = x;
+
+ lessStart = 1;
+ lessEnd = 1;
+ greaterStart = n - 1;
+ greaterEnd = n - 1;
+
+ /* Sort the array to three parts: lesser, equal, greater */
+ while (true)
+ {
+ CHECK_FOR_INTERRUPTS();
+
+ /* Compare the left end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare lessEnd and pivot at current depth */
+ dist = mkqs_compare_datum(x + lessEnd,
+ pivot,
+ depth,
+ state);
+
+ if (dist > 0)
+ break;
+
+ /* If lessEnd is equal to pivot, move it to lessStart */
+ if (dist == 0)
+ {
+ mkqs_swap(lessEnd, lessStart, x);
+ lessStart++;
+ }
+ lessEnd++;
+ }
+
+ /* Compare the right end of the array */
+ while (lessEnd <= greaterStart)
+ {
+ /* Compare greaterStart and pivot at current depth */
+ dist = mkqs_compare_datum(x + greaterStart,
+ pivot,
+ depth,
+ state);
+
+ if (dist < 0)
+ break;
+
+ /* If greaterStart is equal to pivot, move it to greaterEnd */
+ if (dist == 0)
+ {
+ mkqs_swap(greaterStart, greaterEnd, x);
+ greaterEnd--;
+ }
+ greaterStart--;
+ }
+
+ if (lessEnd > greaterStart)
+ break;
+ mkqs_swap(lessEnd, greaterStart, x);
+ lessEnd++;
+ greaterStart--;
+ }
+
+ /*
+ * Now the array has four parts: left equal, lesser, greater, right equal
+ * Note greaterStart is less than lessEnd now
+ */
+
+ /* Move the left equal part to middle */
+ dist = Min(lessStart, lessEnd - lessStart);
+ mkqs_vec_swap(0, lessEnd - dist, dist, x);
+
+ /* Move the right equal part to middle */
+ dist = Min(greaterEnd - greaterStart, n - greaterEnd - 1);
+ mkqs_vec_swap(lessEnd, n - dist, dist, x);
+
+ /*
+ * Now the array has three parts: lesser, equal, greater. Note that one or
+ * two parts may have no element at all.
+ */
+
+ /* Recursively sort the lesser part */
+
+ /* dist means the size of less part */
+ dist = lessEnd - lessStart;
+ mk_qsort_tuple(x,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+ /* Recursively sort the equal part */
+
+ /*
+ * (x + dist) means the first tuple in the equal part. Since all tuples
+ * have equal datums at current depth, we just check any one of them to
+ * determine whether we have seen null datum.
+ */
+ isDatumNull = check_datum_null(x + dist, depth, state);
+
+ /* (lessStart + n - greaterEnd - 1) means the size of equal part */
+ tupCount = lessStart + n - greaterEnd - 1;
+
+ if (depth < state->base.nKeys - 1)
+ {
+ mk_qsort_tuple(x + dist,
+ tupCount,
+ depth + 1,
+ state,
+ seenNull || isDatumNull);
+ }
+ else
+ {
+ /*
+ * We have reached the max depth: Call mkqsHandleDupFunc to handle
+ * duplicated tuples if necessary, e.g. checking uniqueness or extra
+ * comparing
+ */
+
+ /*
+ * Call mkqsHandleDupFunc if:
+ * 1. mkqsHandleDupFunc is filled
+ * 2. the size of equal part > 1
+ */
+ if (state->base.mkqsHandleDupFunc &&
+ (tupCount > 1))
+ {
+ state->base.mkqsHandleDupFunc(x + dist,
+ tupCount,
+ seenNull || isDatumNull,
+ state);
+ }
+ }
+
+ /* Recursively sort the greater part */
+
+ /* dist means the size of greater part */
+ dist = greaterEnd - greaterStart;
+ mk_qsort_tuple(x + n - dist,
+ dist,
+ depth,
+ state,
+ seenNull);
+
+#ifdef USE_ASSERT_CHECKING
+ mkqs_verify(x,
+ n,
+ depth,
+ state);
+#endif
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 7c4d6dc106..6dd21a2710 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -128,6 +128,7 @@ bool trace_sort = false;
bool optimize_bounded_sort = true;
#endif
+bool enable_mk_sort = true;
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
@@ -337,6 +338,9 @@ struct Tuplesortstate
#ifdef TRACE_SORT
PGRUsage ru_start;
#endif
+
+ /* Whether multi-key quick sort is used */
+ bool mkqsUsed;
};
/*
@@ -622,6 +626,8 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
#define ST_DEFINE
#include "lib/sort_template.h"
+#include "mk_qsort_tuple.c"
+
/*
* tuplesort_begin_xxx
*
@@ -690,6 +696,7 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
state->base.sortopt = sortopt;
state->base.tuples = true;
state->abbrevNext = 10;
+ state->mkqsUsed = false;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -2559,6 +2566,8 @@ tuplesort_get_stats(Tuplesortstate *state,
case TSS_SORTEDINMEM:
if (state->boundUsed)
stats->sortMethod = SORT_TYPE_TOP_N_HEAPSORT;
+ else if (state->mkqsUsed)
+ stats->sortMethod = SORT_TYPE_MK_QSORT;
else
stats->sortMethod = SORT_TYPE_QUICKSORT;
break;
@@ -2592,6 +2601,8 @@ tuplesort_method_name(TuplesortMethod m)
return "external sort";
case SORT_TYPE_EXTERNAL_MERGE:
return "external merge";
+ case SORT_TYPE_MK_QSORT:
+ return "multi-key quick sort";
}
return "unknown";
@@ -2717,6 +2728,57 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
if (state->memtupcount > 1)
{
+ /*
+ * Apply multi-key quick sort when:
+ * 1. enable_mk_sort is set
+ * 2. There are multiple keys available
+ * 3. mkqsGetDatumFunc is filled, which implies that current tuple
+ * type is supported by mk qsort. (By now only Heap tuple and Btree
+ * Index tuple are supported, and more types may be supported in
+ * future.)
+ *
+ * A summary of tuple types supported by mk qsort:
+ *
+ * HeapTuple: supported
+ * IndexTuple(btree): supported
+ * IndexTuple(hash): not supported because there is only one key
+ * DatumTuple: not supported because there is only one key
+ * HeapTuple(for cluster): not supported yet
+ * IndexTuple(gist): not supported yet
+ * IndexTuple(brin): not supported yet
+ */
+ if (enable_mk_sort &&
+ state->base.nKeys > 1 &&
+ state->base.mkqsGetDatumFunc != NULL)
+ {
+ /*
+ * Set relevant Datum Sort Comparator according to concrete data type
+ * of the first sort key
+ */
+ state->base.mkqsCompFuncType = MKQS_COMP_FUNC_GENERIC;
+ if (state->base.haveDatum1)
+ {
+ if (state->base.sortKeys[0].comparator == ssup_datum_unsigned_cmp)
+ state->base.mkqsCompFuncType = MKQS_COMP_FUNC_UNSIGNED;
+#if SIZEOF_DATUM >= 8
+ else if (state->base.sortKeys[0].comparator == ssup_datum_signed_cmp)
+ state->base.mkqsCompFuncType = MKQS_COMP_FUNC_SIGNED;
+#endif
+ else if (state->base.sortKeys[0].comparator == ssup_datum_int32_cmp)
+ state->base.mkqsCompFuncType = MKQS_COMP_FUNC_INT32;
+ }
+
+ state->mkqsUsed = true;
+
+ mk_qsort_tuple(state->memtuples,
+ state->memtupcount,
+ 0,
+ state,
+ false);
+
+ return;
+ }
+
/*
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 05a853caa3..d1c5efe575 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -30,6 +30,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/tuplesort.h"
+#include "miscadmin.h"
/* sort-type codes for sort__start probes */
@@ -92,6 +93,48 @@ static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static void freestate_cluster(Tuplesortstate *state);
+static void mkqs_get_datum_heap(const SortTuple *x1,
+ const SortTuple *x2,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum1,
+ bool *isNull1,
+ Datum *datum2,
+ bool *isNull2);
+
+static void
+mkqs_get_datum_index_btree(const SortTuple *x1,
+ const SortTuple *x2,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum1,
+ bool *isNull1,
+ Datum *datum2,
+ bool *isNull2);
+
+static void
+ mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
+static int
+ mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state);
+
+static pg_attribute_always_inline void
+extract_heaptuple_from_sorttuple(const SortTuple *sortTuple,
+ HeapTupleData *heapTuple);
+
+static inline int
+ tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2);
+
+static inline void
+ raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state);
+
/*
* Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
* the tuplesort_begin_cluster.
@@ -163,6 +206,14 @@ typedef struct BrinSortTuple
/* Size of the BrinSortTuple, given length of the BrinTuple. */
#define BRINSORTTUPLE_SIZE(len) (offsetof(BrinSortTuple, tuple) + (len))
+#define ST_SORT qsort_tuple_by_itempointer
+#define ST_ELEMENT_TYPE SortTuple
+#define ST_COMPARE(a, b, state) mkqs_compare_equal_index_btree(a, b, state)
+#define ST_COMPARE_ARG_TYPE Tuplesortstate
+#define ST_CHECK_FOR_INTERRUPTS
+#define ST_SCOPE static
+#define ST_DEFINE
+#include "lib/sort_template.h"
Tuplesortstate *
tuplesort_begin_heap(TupleDesc tupDesc,
@@ -200,6 +251,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
base->removeabbrev = removeabbrev_heap;
base->comparetup = comparetup_heap;
base->comparetup_tiebreak = comparetup_heap_tiebreak;
+ base->mkqsGetDatumFunc = mkqs_get_datum_heap;
base->writetup = writetup_heap;
base->readtup = readtup_heap;
base->haveDatum1 = true;
@@ -388,6 +440,8 @@ tuplesort_begin_index_btree(Relation heapRel,
base->removeabbrev = removeabbrev_index;
base->comparetup = comparetup_index_btree;
base->comparetup_tiebreak = comparetup_index_btree_tiebreak;
+ base->mkqsGetDatumFunc = mkqs_get_datum_index_btree;
+ base->mkqsHandleDupFunc = mkqs_handle_dup_index_btree;
base->writetup = writetup_index;
base->readtup = readtup_index;
base->haveDatum1 = true;
@@ -1531,10 +1585,6 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
/*
* Some rather brain-dead implementations of qsort (such as the one in
* QNX 4) will sometimes call the comparison routine to compare a
@@ -1543,18 +1593,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
Assert(tuple1 != tuple2);
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ raise_error_of_dup_index(tuple1, state);
}
/*
@@ -1563,25 +1602,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
* attribute in order to ensure that all keys in the index are physically
* unique.
*/
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
}
static int
@@ -1888,3 +1909,209 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
+
+/*
+ * Get specified datums from SortTuple (HeapTuple) list
+ *
+ * When x1 and x2 are provided by caller, two datums will be returned.
+ * When x2 is NULL, only one datum will be returned.
+ *
+ * Note the function does not check leading sort key (tuple->datum1 and
+ * tuple->isnull), which should be checked in other functions (e.g.
+ * mkqs_compare_datum()).
+ *
+ * See comparetup_heap() for details.
+ */
+static void mkqs_get_datum_heap(const SortTuple *x1,
+ const SortTuple *x2,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum1,
+ bool *isNull1,
+ Datum *datum2,
+ bool *isNull2)
+{
+ TupleDesc tupDesc = NULL;
+ HeapTupleData heapTuple1, heapTuple2;
+ AttrNumber attno;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys + depth;
+
+ Assert(state);
+ Assert(x1 != NULL);
+
+ tupDesc = (TupleDesc) base->arg;
+ attno = sortKey->ssup_attno;
+
+ /* Extract datum from sortTuple->tuple */
+ extract_heaptuple_from_sorttuple(x1, &heapTuple1);
+ *datum1 = heap_getattr(&heapTuple1, attno, tupDesc, isNull1);
+
+ if (x2 != NULL)
+ {
+ extract_heaptuple_from_sorttuple(x2, &heapTuple2);
+ *datum2 = heap_getattr(&heapTuple2, attno, tupDesc, isNull2);
+ }
+}
+
+static pg_attribute_always_inline void
+extract_heaptuple_from_sorttuple(const SortTuple *sortTuple,
+ HeapTupleData *heapTuple)
+{
+ heapTuple->t_len = ((MinimalTuple) sortTuple->tuple)->t_len
+ + MINIMAL_TUPLE_OFFSET;
+ heapTuple->t_data = (HeapTupleHeader) ((char *) sortTuple->tuple
+ - MINIMAL_TUPLE_OFFSET);
+}
+
+/*
+ * Get specified datums from SortTuple (IndexTuple for btree index) list
+ *
+ * When x1 and x2 are provided by caller, two datums will be returned.
+ * When x2 is NULL, only one datum will be returned.
+ *
+ * Note that this function does not check the leading sort key
+ * (tuple->datum1 and tuple->isnull1), which should be checked in other
+ * functions (e.g. mkqs_compare_datum()).
+ *
+ * See comparetup_index_btree() for details.
+ */
+static void
+mkqs_get_datum_index_btree(const SortTuple *x1,
+ const SortTuple *x2,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum1,
+ bool *isNull1,
+ Datum *datum2,
+ bool *isNull2)
+{
+ TupleDesc tupDesc;
+ IndexTuple indexTuple1, indexTuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ Assert(state);
+ Assert(x1);
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ indexTuple1 = (IndexTuple) x1->tuple;
+
+ /*
+ * Set parameter attnum = depth + 1 because attnum starts from 1 but depth
+ * starts from 0
+ */
+ *datum1 = index_getattr(indexTuple1, depth + 1, tupDesc, isNull1);
+
+ if (x2 != NULL)
+ {
+ indexTuple2 = (IndexTuple) x2->tuple;
+ *datum2 = index_getattr(indexTuple2, depth + 1, tupDesc, isNull2);
+ }
+}
+
+/*
+ * Handle duplicated SortTuples (IndexTuple for btree index) during mk qsort
+ * x: the duplicated tuple list
+ * tupleCount: count of the tuples
+ */
+static void
+mkqs_handle_dup_index_btree(SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ /* With enforceUnique, raise error unless NULLs are distinct and one was seen */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && seenNull))
+ {
+ /*
+ * x points to the first tuple of the duplicated tuple list. Since they
+ * are all duplicates, simply pick the first one to raise the error.
+ */
+ raise_error_of_dup_index((IndexTuple) (x->tuple), state);
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ qsort_tuple_by_itempointer(x,
+ tupleCount,
+ state);
+}
+
+/*
+ * Compare two btree index tuples by ItemPointer.
+ * It is the comparator callback for qsort_tuple_by_itempointer(), which is
+ * called by mkqs_handle_dup_index_btree().
+ */
+static int
+mkqs_compare_equal_index_btree(const SortTuple *a,
+ const SortTuple *b,
+ Tuplesortstate *state)
+{
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ return tuplesort_compare_by_item_pointer(tuple1, tuple2);
+}
+
+/* Compare two index tuples by ItemPointer */
+static inline int
+tuplesort_compare_by_item_pointer(const IndexTuple tuple1,
+ const IndexTuple tuple2)
+{
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+/* Raise error for duplicated tuple when creating unique index */
+static inline void
+raise_error_of_dup_index(IndexTuple x,
+ Tuplesortstate *state)
+{
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ TupleDesc tupDesc;
+ char *key_desc;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+
+ tupDesc = RelationGetDescr(arg->index.indexRel);
+ index_deform_tuple((IndexTuple) x, tupDesc, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+}
diff --git a/src/include/c.h b/src/include/c.h
index dc1841346c..f7c368cd16 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -857,12 +857,14 @@ typedef NameData *Name;
#define Assert(condition) ((void)true)
#define AssertMacro(condition) ((void)true)
+#define AssertImply(condition1, condition2) ((void)true)
#elif defined(FRONTEND)
#include <assert.h>
#define Assert(p) assert(p)
#define AssertMacro(p) ((void) assert(p))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
#else /* USE_ASSERT_CHECKING && !FRONTEND */
@@ -886,6 +888,8 @@ typedef NameData *Name;
((void) ((condition) || \
(ExceptionalCondition(#condition, __FILE__, __LINE__), 0)))
+#define AssertImply(cond1, cond2) Assert(!(cond1) || (cond2))
+
#endif /* USE_ASSERT_CHECKING && !FRONTEND */
/*
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index e7941a1f09..60eb77ee01 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -29,7 +29,6 @@
#include "utils/relcache.h"
#include "utils/sortsupport.h"
-
/*
* Tuplesortstate and Sharedsort are opaque types whose details are not
* known outside tuplesort.c.
@@ -79,9 +78,10 @@ typedef enum
SORT_TYPE_QUICKSORT = 1 << 1,
SORT_TYPE_EXTERNAL_SORT = 1 << 2,
SORT_TYPE_EXTERNAL_MERGE = 1 << 3,
+ SORT_TYPE_MK_QSORT = 1 << 4,
} TuplesortMethod;
-#define NUM_TUPLESORTMETHODS 4
+#define NUM_TUPLESORTMETHODS 5
typedef enum
{
@@ -89,6 +89,14 @@ typedef enum
SORT_SPACE_TYPE_MEMORY,
} TuplesortSpaceType;
+typedef enum
+{
+ MKQS_COMP_FUNC_GENERIC,
+ MKQS_COMP_FUNC_UNSIGNED,
+ MKQS_COMP_FUNC_SIGNED,
+ MKQS_COMP_FUNC_INT32,
+} MkqsCompFuncType;
+
/* Bitwise option flags for tuple sorts */
#define TUPLESORT_NONE 0
@@ -155,6 +163,24 @@ typedef struct
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
+/* Multi-key quick sort */
+
+typedef void
+ (*MkqsGetDatumFunc) (const SortTuple *x1,
+ const SortTuple *x2,
+ const int depth,
+ Tuplesortstate *state,
+ Datum *datum1,
+ bool *isNull1,
+ Datum *datum2,
+ bool *isNull2);
+
+typedef void
+ (*MkqsHandleDupFunc) (SortTuple *x,
+ const int tupleCount,
+ const bool seenNull,
+ Tuplesortstate *state);
+
/*
* The public part of a Tuple sort operation state. This data structure
* contains the definition of sort-variant-specific interface methods and
@@ -249,6 +275,21 @@ typedef struct
bool tuples; /* Can SortTuple.tuple ever be set? */
void *arg; /* Specific information for the sort variant */
+
+ /*
+ * Function pointer, referencing a function to get specified datums from
+ * SortTuple list with multi-key. Used by mk_qsort_tuple().
+ */
+ MkqsGetDatumFunc mkqsGetDatumFunc;
+
+ /*
+ * Function pointer, referencing a function to handle duplicated tuples
+ * from SortTuple list with multi-key. Used by mk_qsort_tuple(). For now,
+ * the function pointer is set only for btree index tuples.
+ */
+ MkqsHandleDupFunc mkqsHandleDupFunc;
+
+ MkqsCompFuncType mkqsCompFuncType;
} TuplesortPublic;
/* Sort parallel code from state for sort__start probes */
@@ -411,7 +452,6 @@ extern void tuplesort_restorepos(Tuplesortstate *state);
extern void *tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen);
-
/* tuplesortvariants.c */
extern Tuplesortstate *tuplesort_begin_heap(TupleDesc tupDesc,
diff --git a/src/test/regress/expected/geometry.out b/src/test/regress/expected/geometry.out
index 8be694f46b..094d22861c 100644
--- a/src/test/regress/expected/geometry.out
+++ b/src/test/regress/expected/geometry.out
@@ -4273,7 +4273,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
circle | point | distance
----------------+-------------------+---------------
<(1,2),3> | (-3,4) | 1.472135955
@@ -4310,8 +4310,8 @@ SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
<(3,5),0> | (Infinity,1e+300) | Infinity
<(1,2),3> | (1e+300,Infinity) | Infinity
<(5,1),3> | (1e+300,Infinity) | Infinity
- <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,2),3> | (Infinity,1e+300) | Infinity
+ <(5,1),3> | (Infinity,1e+300) | Infinity
<(1,3),5> | (1e+300,Infinity) | Infinity
<(1,3),5> | (Infinity,1e+300) | Infinity
<(100,200),10> | (1e+300,Infinity) | Infinity
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..a26f8f100a 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -520,13 +520,13 @@ select * from (select * from t order by a) s order by a, b limit 55;
-- Test EXPLAIN ANALYZE with only a fullsort group.
select explain_analyze_without_memory('select * from (select * from t order by a) s order by a, b limit 55');
- explain_analyze_without_memory
----------------------------------------------------------------------------------------------------------------
+ explain_analyze_without_memory
+--------------------------------------------------------------------------------------------------------------------------
Limit (actual rows=55 loops=1)
-> Incremental Sort (actual rows=55 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 2 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 2 Sort Methods: top-N heapsort, multi-key quick sort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=101 loops=1)
Sort Key: t.a
Sort Method: quicksort Memory: NNkB
@@ -554,7 +554,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Group Count": 2, +
"Sort Methods Used": [ +
"top-N heapsort", +
- "quicksort" +
+ "multi-key quick sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
@@ -728,7 +728,7 @@ select explain_analyze_without_memory('select * from (select * from t order by a
-> Incremental Sort (actual rows=70 loops=1)
Sort Key: t.a, t.b
Presorted Key: t.a
- Full-sort Groups: 1 Sort Method: quicksort Average Memory: NNkB Peak Memory: NNkB
+ Full-sort Groups: 1 Sort Method: multi-key quick sort Average Memory: NNkB Peak Memory: NNkB
Pre-sorted Groups: 5 Sort Methods: top-N heapsort, quicksort Average Memory: NNkB Peak Memory: NNkB
-> Sort (actual rows=1000 loops=1)
Sort Key: t.a
@@ -756,7 +756,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
"Full-sort Groups": { +
"Group Count": 1, +
"Sort Methods Used": [ +
- "quicksort" +
+ "multi-key quick sort" +
], +
"Sort Space Memory": { +
"Peak Sort Space Used": "NN", +
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 2f3eb4e7f1..44840e7e5c 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -146,6 +146,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_material | on
enable_memoize | on
enable_mergejoin | on
+ enable_mk_sort | on
enable_nestloop | on
enable_parallel_append | on
enable_parallel_hash | on
@@ -157,7 +158,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(23 rows)
+(24 rows)
-- There are always wait event descriptions for various types.
select type, count(*) > 0 as ok FROM pg_wait_events
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..41d99793d7 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -703,3 +703,412 @@ EXPLAIN (COSTS OFF) :qry;
(10 rows)
COMMIT;
+-- Test cases for multi-key quick sort
+set work_mem='100MB';
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+---+----+------
+ 0 | 5 | 98f1
+ 0 | 10 | d3d9
+ 1 | 1 | c4ca
+ 1 | 11 | 6512
+ 2 | 2 | c81e
+ 2 | 12 | c20a
+ 3 | 3 | eccb
+ 3 | 13 | c51c
+ 4 | 4 | a87f
+ 4 | 14 | aab3
+ 5 | 0 | 9bf3
+ 5 | 5 | e4da
+ 6 | 1 | c74d
+ 6 | 6 | 1679
+ 7 | 2 | 70ef
+ 7 | 7 | 8f14
+ 8 | 3 | 6f49
+ 8 | 8 | c9f0
+ 9 | 4 | 1f0e
+ 9 | 9 | 45c4
+(20 rows)
+
+-- test sorting on distinct values, in which mk qsort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+ a | b | c
+----+----+----
+ 0 | 20 | 20
+ 1 | 19 | 19
+ 2 | 18 | 18
+ 3 | 17 | 17
+ 4 | 16 | 16
+ 5 | 15 | 15
+ 6 | 14 | 14
+ 7 | 13 | 13
+ 8 | 12 | 12
+ 9 | 11 | 11
+ 10 | 10 | 10
+ 11 | 9 | 9
+ 12 | 8 | 8
+ 13 | 7 | 7
+ 14 | 6 | 6
+ 15 | 5 | 5
+ 16 | 4 | 4
+ 17 | 3 | 3
+ 18 | 2 | 2
+ 19 | 1 | 1
+(20 rows)
+
+-- test create index
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 10 - g, g, g::text
+ from generate_series(1, 10) g;
+create unique index idx_mksort_simple on mksort_simple_tbl (a, b ,c);
+drop index idx_mksort_simple;
+-- try to create unique index on duplicated rows
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 1, g, g::text
+ from generate_series(1, 10) g;
+insert into mksort_simple_tbl
+ select * from mksort_simple_tbl order by a, b desc, c limit 1;
+select * from mksort_simple_tbl;
+ a | b | c
+---+----+----
+ 1 | 1 | 1
+ 1 | 2 | 2
+ 1 | 3 | 3
+ 1 | 4 | 4
+ 1 | 5 | 5
+ 1 | 6 | 6
+ 1 | 7 | 7
+ 1 | 8 | 8
+ 1 | 9 | 9
+ 1 | 10 | 10
+ 1 | 10 | 10
+(11 rows)
+
+create unique index idx_mksort_simple on mksort_simple_tbl (a, b ,c);
+ERROR: could not create unique index "idx_mksort_simple"
+DETAIL: Key (a, b, c)=(1, 10, 10) is duplicated.
+drop table mksort_simple_tbl;
+-- test table with abbr keys
+create table abbr_tbl (a int, b varchar(100), c uuid);
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+select c, b, a from abbr_tbl order by c, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+(50 rows)
+
+select c, b, a from abbr_tbl order by c desc, b, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+(50 rows)
+
+select c, b, a from abbr_tbl order by c, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+(50 rows)
+
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+----
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 5
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 25
+ 00000000-0000-0000-0000-000000000000 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 45
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 41
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 1
+ 00000000-0000-0000-0000-000000000001 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 21
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 17
+ 00000000-0000-0000-0000-000000000002 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 37
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 13
+ 00000000-0000-0000-0000-000000000003 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 33
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 9
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 29
+ 00000000-0000-0000-0000-000000000004 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 49
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 10
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 30
+ 11111111-1111-1111-1111-111111111110 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 50
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 6
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 26
+ 11111111-1111-1111-1111-111111111111 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 46
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 2
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 22
+ 11111111-1111-1111-1111-111111111112 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 42
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 18
+ 11111111-1111-1111-1111-111111111113 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 38
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 34
+ 11111111-1111-1111-1111-111111111114 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 14
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 20
+ ffffffff-ffff-ffff-ffff-fffffffffff0 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 40
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 16
+ ffffffff-ffff-ffff-ffff-fffffffffff1 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 36
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 12
+ ffffffff-ffff-ffff-ffff-fffffffffff2 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 32
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 48
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 28
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 4
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 24
+ ffffffff-ffff-ffff-ffff-fffffffffff4 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 44
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb6 | 27
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 19
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb5 | 47
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 11
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb4 | 39
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 3
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3 | 31
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb2 | 23
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 15
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 43
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 7
+ | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb0 | 35
+(50 rows)
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+NOTICE: index "idx_abbr_tbl" does not exist, skipping
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+ c | b | a
+--------------------------------------+---------------------------------------------------------+---
+ ffffffff-ffff-ffff-ffff-fffffffffff3 | aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1 | 8
+(1 row)
+
+-- Uniqueness check of CREATE INDEX
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+drop index if exists idx_abbr_tbl;
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+ERROR: could not create unique index "idx_abbr_tbl"
+DETAIL: Key (c, b, a)=(00000000-0000-0000-0000-000000000001, aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1, 1) is duplicated.
+drop table abbr_tbl;
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ae4e8851f8..2de20ca1d0 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -18,13 +18,13 @@ INSERT INTO empsalary VALUES
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 25100
develop | 9 | 4500 | 25100
- develop | 11 | 5200 | 25100
develop | 10 | 5200 | 25100
+ develop | 11 | 5200 | 25100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 7400
personnel | 2 | 3900 | 7400
@@ -33,13 +33,13 @@ SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM emps
sales | 1 | 5000 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 7 | 4200 | 1
develop | 9 | 4500 | 2
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
personnel | 5 | 3500 | 1
personnel | 2 | 3900 | 2
@@ -90,18 +90,18 @@ SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PA
sales | 4 | 4800 | 14600
(10 rows)
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
depname | empno | salary | rank
-----------+-------+--------+------
- develop | 7 | 4200 | 1
- personnel | 5 | 3500 | 1
sales | 3 | 4800 | 1
sales | 4 | 4800 | 1
+ personnel | 5 | 3500 | 1
+ develop | 7 | 4200 | 1
personnel | 2 | 3900 | 2
develop | 9 | 4500 | 2
sales | 1 | 5000 | 3
- develop | 11 | 5200 | 3
develop | 10 | 5200 | 3
+ develop | 11 | 5200 | 3
develop | 8 | 6000 | 5
(10 rows)
@@ -3749,23 +3749,24 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
empno | depname | rn | rnk | cnt
-------+-----------+----+-----+-----
- 8 | develop | 1 | 1 | 1
- 10 | develop | 2 | 2 | 1
- 11 | develop | 3 | 3 | 1
- 9 | develop | 4 | 4 | 2
- 7 | develop | 5 | 4 | 2
- 2 | personnel | 1 | 1 | 1
- 5 | personnel | 2 | 2 | 1
1 | sales | 1 | 1 | 1
+ 2 | personnel | 1 | 1 | 1
3 | sales | 2 | 2 | 1
4 | sales | 3 | 3 | 1
+ 5 | personnel | 2 | 2 | 1
+ 7 | develop | 4 | 4 | 1
+ 8 | develop | 1 | 1 | 1
+ 9 | develop | 5 | 5 | 1
+ 10 | develop | 2 | 2 | 1
+ 11 | develop | 3 | 3 | 1
(10 rows)
-- Test pushdown of quals into a subquery containing window functions
@@ -4106,17 +4107,17 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
empno | depname | salary | c
-------+-----------+--------+---
+ 1 | sales | 5000 | 1
+ 2 | personnel | 3900 | 1
+ 3 | sales | 4800 | 3
+ 4 | sales | 4800 | 3
+ 5 | personnel | 3500 | 2
8 | develop | 6000 | 1
10 | develop | 5200 | 3
11 | develop | 5200 | 3
- 2 | personnel | 3900 | 1
- 5 | personnel | 3500 | 2
- 1 | sales | 5000 | 1
- 4 | sales | 4800 | 3
- 3 | sales | 4800 | 3
(8 rows)
-- Ensure we get the correct run condition when the window function is both
@@ -4468,14 +4469,15 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
depname | empno | salary | enroll_date | first_emp | last_emp
-----------+-------+--------+-------------+-----------+----------
+ develop | 7 | 4200 | 01-01-2008 | 4 | 1
develop | 8 | 6000 | 10-01-2006 | 1 | 5
- develop | 7 | 4200 | 01-01-2008 | 5 | 1
personnel | 2 | 3900 | 12-23-2006 | 1 | 2
personnel | 5 | 3500 | 12-10-2007 | 2 | 1
sales | 1 | 5000 | 10-01-2006 | 1 | 3
diff --git a/src/test/regress/sql/geometry.sql b/src/test/regress/sql/geometry.sql
index c3ea368da5..1f47f07f31 100644
--- a/src/test/regress/sql/geometry.sql
+++ b/src/test/regress/sql/geometry.sql
@@ -403,7 +403,7 @@ SELECT circle(f1)
SELECT c1.f1 AS circle, p1.f1 AS point, (p1.f1 <-> c1.f1) AS distance
FROM CIRCLE_TBL c1, POINT_TBL p1
WHERE (p1.f1 <-> c1.f1) > 0
- ORDER BY distance, area(c1.f1), p1.f1[0];
+ ORDER BY distance, area(c1.f1), p1.f1[0], c1.f1::text;
-- To polygon
SELECT f1, f1::polygon FROM CIRCLE_TBL WHERE f1 >= '<(0,0),1>';
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..997c6c816a 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -305,3 +305,88 @@ EXPLAIN (COSTS OFF) :qry;
:qry;
COMMIT;
+
+-- Test cases for multi-key quick sort
+
+set work_mem='100MB';
+
+-- test simple sorting
+create table mksort_simple_tbl(a int, b int, c varchar);
+
+insert into mksort_simple_tbl
+ select g % 10, g % 15, left(md5(g::text), 4)
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+-- test sorting on distinct values, in which mk qsort is supposed to be
+-- not affective, but still can generate correct result
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 20 - g, g, g::text
+ from generate_series(1, 20) g;
+select * from mksort_simple_tbl order by a, b, c;
+
+-- test create index
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 10 - g, g, g::text
+ from generate_series(1, 10) g;
+create unique index idx_mksort_simple on mksort_simple_tbl (a, b ,c);
+drop index idx_mksort_simple;
+
+-- try to create unique index on duplicated rows
+truncate table mksort_simple_tbl;
+insert into mksort_simple_tbl
+ select 1, g, g::text
+ from generate_series(1, 10) g;
+insert into mksort_simple_tbl
+ select * from mksort_simple_tbl order by a, b desc, c limit 1;
+select * from mksort_simple_tbl;
+create unique index idx_mksort_simple on mksort_simple_tbl (a, b ,c);
+
+drop table mksort_simple_tbl;
+
+-- test table with abbr keys
+
+create table abbr_tbl (a int, b varchar(100), c uuid);
+
+-- insert data with abbr keys (uuid)
+-- abbr keys of uuid are generated from the first `sizeof(Datum)` bytes of uuid data
+-- (see uuid_abbrev_convert()), so two uuids with only different tailed values should
+-- have same abbr keys but different "full" datum.
+insert into abbr_tbl values (generate_series(1,50), 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb');
+update abbr_tbl set b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb' || (a % 7)::text;
+update abbr_tbl set c = ('fffffffffffffffffffffffffffffff' || (a % 5)::text)::uuid where a % 4 = 0;
+update abbr_tbl set c = ('0000000000000000000000000000000' || (a % 5)::text)::uuid where a % 4 = 1;
+update abbr_tbl set c = ('1111111111111111111111111111111' || (a % 5)::text)::uuid where a % 4 = 2;
+update abbr_tbl set c = null where a % 4 = 3;
+
+select c, b, a from abbr_tbl order by c, b, a;
+select c, b, a from abbr_tbl order by c desc, b, a;
+select c, b, a from abbr_tbl order by c, b desc, a;
+select c, b, a from abbr_tbl order by c nulls first, b desc, a;
+select c, b, a from abbr_tbl order by c nulls last, b desc, a;
+
+-- CREATE INDEX will cover the scenario of sort IndexTuple
+drop index if exists idx_abbr_tbl;
+create index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+analyze abbr_tbl;
+select c, b, a from abbr_tbl where c = 'ffffffff-ffff-ffff-ffff-fffffffffff3' and b = 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1' and a = 8;
+
+-- Uniqueness check of CREATE INDEX
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row with null
+insert into abbr_tbl (a, b, c) values (3, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb3', null);
+-- should succeed because uniquess check is not applicable for rows with null
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop index if exists idx_abbr_tbl;
+
+-- insert a duplicated row without null
+insert into abbr_tbl (a, b, c) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbb1', '00000000-0000-0000-0000-000000000001');
+-- should fail because of duplicated rows
+create unique index idx_abbr_tbl on abbr_tbl(c desc, b, a);
+
+drop table abbr_tbl;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6de5493b05..46359cb796 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -21,9 +21,9 @@ INSERT INTO empsalary VALUES
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');
-SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
+SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary, empno;
-SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary ORDER BY depname, salary, empno;
-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
@@ -31,7 +31,7 @@ GROUP BY four, ten ORDER BY four, ten;
SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);
-SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w;
+SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY rank() OVER w, empno;
-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;
@@ -1146,11 +1146,12 @@ SELECT
empno,
depname,
row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
- rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
+ rank() OVER (PARTITION BY depname ORDER BY enroll_date, empno ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
- count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
+ count(*) OVER (PARTITION BY depname ORDER BY enroll_date, empno RANGE BETWEEN
CURRENT ROW AND CURRENT ROW) cnt
-FROM empsalary;
+FROM empsalary
+ORDER BY empno, depname, rn;
-- Test pushdown of quals into a subquery containing window functions
@@ -1332,7 +1333,7 @@ SELECT * FROM
salary,
count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
FROM empsalary) emp
-WHERE c <= 3;
+WHERE c <= 3 ORDER BY empno, depname, salary, c;
-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
@@ -1510,10 +1511,11 @@ SELECT * FROM
empno,
salary,
enroll_date,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
- row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date, empno) AS first_emp,
+ row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC, empno) AS last_emp
FROM empsalary) emp
-WHERE first_emp = 1 OR last_emp = 1;
+WHERE first_emp = 1 OR last_emp = 1
+ORDER BY depname, empno, salary, enroll_date, first_emp, last_emp;
-- cleanup
DROP TABLE empsalary;
--
2.25.1
On 04/07/2024 3:45 pm, Yao Wang wrote:
Generally, the benefit of mksort comes mainly from duplicated values and sort
keys: the more duplicated values and sort keys there are, the bigger the
benefit.
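To make the point about duplicates concrete, here is a minimal sketch of the Bentley-Sedgewick multi-key quicksort idea in plain C. It is not the patch's mksort_tuple(): the `Row` type, `NKEYS`, and the function names are hypothetical, and the keys are plain ints rather than datums. Each pass does a three-way partition on one key; rows equal to the pivot on that key move on to the next key, so a prefix already known to be equal is not compared again. More duplicates mean a larger "equal" partition and therefore fewer comparisons on the leading keys.

```c
#include <assert.h>
#include <stddef.h>

#define NKEYS 3                 /* hypothetical: number of sort keys per row */

/* Hypothetical row type: a fixed number of integer sort keys. */
typedef struct Row
{
    int key[NKEYS];
} Row;

static void
swap_rows(Row *a, Row *b)
{
    Row tmp = *a;

    *a = *b;
    *b = tmp;
}

/*
 * Sort rows[lo, hi), assuming every row in the range already agrees on
 * keys 0 .. depth-1.  Only key[depth] is compared in this pass.
 */
static void
mk_sort_range(Row *rows, ptrdiff_t lo, ptrdiff_t hi, int depth)
{
    if (hi - lo < 2 || depth >= NKEYS)
        return;

    int       pivot = rows[lo + (hi - lo) / 2].key[depth];
    ptrdiff_t lt = lo;          /* rows[lo, lt) are < pivot  */
    ptrdiff_t i = lo;           /* rows[lt, i)  are == pivot */
    ptrdiff_t gt = hi;          /* rows[gt, hi) are > pivot  */

    while (i < gt)
    {
        int v = rows[i].key[depth];

        if (v < pivot)
            swap_rows(&rows[lt++], &rows[i++]);
        else if (v > pivot)
            swap_rows(&rows[i], &rows[--gt]);
        else
            i++;
    }

    mk_sort_range(rows, lo, lt, depth);      /* smaller: same key again   */
    mk_sort_range(rows, lt, gt, depth + 1);  /* equal: move to next key   */
    mk_sort_range(rows, gt, hi, depth);      /* larger: same key again    */
}

/* Entry point: sort n rows by key[0], then key[1], and so on. */
static void
mk_sort_rows(Row *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    mk_sort_range(rows, 0, (ptrdiff_t) n, 0);
}
```

With all-distinct leading keys the "equal" partition holds a single row at each step, so the sketch degenerates into an ordinary quicksort on the first key, which matches the observation that mksort gains little there.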
...
1. Use distinct stats info of table to enable mksort
It's kind of a heuristic: in the optimizer, check Form_pg_statistic->stadistinct
of a table via pg_statistic. Enable mksort only when it is less than a
threshold. The hacked code works, but it needs to modify a couple of optimizer
interfaces. In addition, a complete solution should consider the types and
distinct values of all columns, which might be too complex, and the benefit
seems not so big.
If mksort really provides an advantage only when there are a lot of
duplicates (for prefix keys?), and with a small fraction of duplicates there
is even some (small) regression,
then IMHO taking the estimated number of distinct values into account in
the planner seems to be really important. What was the
problem with accessing these statistics, and why does it require modification
of optimizer interfaces? There is a `get_variable_numdistinct` function,
which is defined and used only in selfuncs.c.
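For illustration only, the threshold check being discussed could look roughly like the sketch below. It is standalone C and does not call any planner function; `estimate_ndistinct`, `should_use_mksort`, and the `max_distinct_ratio` parameter are hypothetical names. The only fact borrowed from PostgreSQL is how stadistinct is encoded: a positive value is an absolute count of distinct values, a negative value is the negated fraction of the row count, and zero means unknown.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Turn a stadistinct-style value into an absolute estimate of the number
 * of distinct values.  Returns a negative number when the statistic is
 * unknown (stadistinct == 0).
 */
static double
estimate_ndistinct(double stadistinct, double ntuples)
{
    assert(ntuples >= 0.0);

    if (stadistinct > 0.0)
        return stadistinct;             /* absolute count */
    if (stadistinct < 0.0)
        return -stadistinct * ntuples;  /* fraction of the row count */
    return -1.0;                        /* unknown */
}

/*
 * Hypothetical policy: use mksort only when the leading sort key has few
 * distinct values relative to the number of rows.  Falls back to "no"
 * when there are no statistics or no rows.
 */
static bool
should_use_mksort(double stadistinct, double ntuples,
                  double max_distinct_ratio)
{
    double ndistinct = estimate_ndistinct(stadistinct, ntuples);

    if (ndistinct < 0.0 || ntuples <= 0.0)
        return false;
    return ndistinct / ntuples < max_distinct_ratio;
}
```

A real version would have to combine the estimates for all sort keys and their data types, which is the complexity the quoted message worries about; this sketch looks at a single column only.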
Information about the distribution of values seems to be quite useful for
choosing the optimal sort algorithm, and not only for the multi-key sort
optimization. For example, if we know the min/max values of the sort key and
the range is small, we can use an O(N) algorithm for sorting. It can also help to
estimate when TOP-N search is preferable.
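As an example of the O(N) case, a counting sort over a known small key range can be sketched as follows. This is plain C sorting bare int keys, not SortTuples, and `counting_sort` is a hypothetical helper rather than anything in tuplesort.c; the `min` and `max` arguments stand in for bounds that would come from statistics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Sort n ints that are known to lie in [min, max], in O(n + range) time.
 * Returns false (leaving vals untouched) if memory cannot be allocated or
 * if a value falls outside the promised range, so that a caller could
 * fall back to a comparison sort.
 */
static bool
counting_sort(int *vals, size_t n, int min, int max)
{
    assert(min <= max);

    size_t  range = (size_t) ((long long) max - (long long) min) + 1;
    size_t *counts = calloc(range, sizeof *counts);

    if (counts == NULL)
        return false;

    /* Pass 1: count occurrences of each value, checking the bounds. */
    for (size_t i = 0; i < n; i++)
    {
        if (vals[i] < min || vals[i] > max)
        {
            free(counts);
            return false;
        }
        counts[(size_t) ((long long) vals[i] - min)]++;
    }

    /* Pass 2: rewrite the array in ascending order from the counts. */
    size_t out = 0;

    for (size_t k = 0; k < range; k++)
        for (size_t c = counts[k]; c > 0; c--)
            vals[out++] = (int) ((long long) min + (long long) k);

    free(counts);
    return true;
}
```

The bounds check matters because statistics are only estimates: if a value falls outside the expected range, the function reports failure instead of writing out of bounds.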
Right now Postgres creates a special path for incremental sort. I am not
sure if we also need a separate path for mk-sort.
But IMHO if we need to change some optimizer interfaces to be able to
take statistics into account and choose the preferred sort algorithm at
planning time, then it should be done.
If mksort can speed up sorting more than two times (for a large number of
duplicates), it would be nice to take that into account when choosing the
optimal plan.
Also, in this case we do not need an extra GUC for explicitly enabling
mksort. There are already too many parameters for the optimizer, and adding
one more will make tuning more complex. So I prefer that the decision is taken
by the optimizer itself based on the available information, especially if the
criteria seem to be obvious.
Best regards,
Konstantin
Hello,
Thanks for posting a new version of the patch, and for reporting a bunch
of issues in the bash scripts I used for testing. I decided to repeat
those fixed tests on both the old and new version of the patches, and I
finally have the results from three machines (the i5/xeon I usually use,
and also rpi5 for fun).
The complete scripts, raw results (CSV), and various reports (ODS and
PDF) are available in my github:
https://github.com/tvondra/mksort-tests
I'm not going to attach all of it to this message, because the raw CSV
results alone are ~3MB for each of the three machines.
You can do your own analysis on the raw CSV results, of course - see the
'csv' directory, there are data for the clean branch and the two patch
versions.
But I've also prepared PDF reports comparing how the patches work on
each of the machines - see the 'pdf' directory. There are two types of
reports, depending on what's compared to what.
The general report structure is the same - columns with results for
different combinations of parameters, followed by comparison of the
results and a heatmap (red - bad/regression, green - good/speedup).
The "patch comparison" reports compare v5/v4, so it's essentially
(timing with v5) / (timing with v4)
with the mksort enabled or disabled. And the charts are pretty green,
which means v5 is much faster than v4 - so seems like a step in the
right direction.
The "patch impact" reports compare v4/master and v5/master, i.e. this is
what the users would see after an upgrade. Attached is a small example
from the i5 machine, but the other machines behave in almost exactly the
same way (including the tiny rpi5).
For v4, the results were not great - almost everything regressed (red
color), except for the "text" data type (green).
You can immediately see v5 does much better - it still regresses, but
the regressions are way smaller. And the speedup for "text" is actually
a bit more significant (there's more/darker green).
So as I said before, I think v5 is definitely moving in the right
direction, but the regressions still seem far too significant. If you're
sorting a lot of text data, then sure - this will help a lot. But if
you're sorting int data, and it happens to be random/correlated, you're
going to pay 10-20% more. That's not great.
I haven't analyzed the code very closely, and I don't have a great idea
on how to fix this. But I think to make this patch committable, this
needs to be solved.
Considering the benefits seem to be pretty specific to "text" (and
perhaps some other data types), maybe the best solution would be to only
enable this for those cases. Yes, there are some cases where this helps
for the other data types too, but that also comes with the regressions.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
mksort-v4-vs-v5.pdf (application/pdf)
%PDF-1.7
%����
4 0 obj
<< /Length 5 0 R
/Filter /FlateDecode
>>
stream
x���M�5Kr�7���f}L
x �(hJ-R�eS$E����]�V����]�� �~k��"�>��v>YY���5|�q���s}M������������������_����_������_�k_�}Zn�������O_��:�c��?��������������?����~���/���_�����o�����/���?�_�����ul�:_}����������W_����������o����������_�0�����������������O������_���/��������_���?�����X��|�_�����������m�f���8���6��4��}Y��kx�2���:�u�7~����x���u��s���<o~��+��.WO�6�a<��kx���l����/E���u����u�S����k�yD��u���:������6�����m��_���l��5�����������8%��ZF
Po�E���<-��F���5g��}0�{�&�\�����}�6�� ���������}�6�|.�a>����9��P6��b��m�p���g��m�q{����z����
��CL��6{Ol_�s���A� �3�q�e ��>%Ep�! �;���o��r��������V/�11O9&x$��u~���;���u~����yJO �a ��d;^��=)���v���A�����F� HO��5����l[����)
����@`���y��C�N�������a[�a��l��������������4.�SS��C<w?(�] ��d^�dg���<������@`��������/l[���:��-�0G�6[B���_�����5�r���l��������m�����nQ�1lO ��9����7{R���sz
{��~����-!�m��<��}�;3���%,z�C�~����� A��L�k�k���4��a����<�'�0�Az2�U�ul[���� HO�Sz�
#H�'��:W9;e�~?+Q�����F� HO��5�vg��5���Q����<�'�0�A�f_�]7���|,�u��e�-�=9��~��dD����h��$��m;��:�)
���������2��m���m6| ����yJO �a ��d_�b����<������@`������l[�����)
����@`���6��l[������-�0G�6[B����}M����m����_A�~����E� HO����v���5|N�E����<�'�0�A�j>�����u� ��v�����_�\�����i~����� p� A� �F��:7;���������Azb��lA� =Y�G������7��-�0G�6[B�����|���?����z�?� �0O��� A���k��������s����<�'�0��z�
����l��KX��m���
h!�m�����������z�?� �0O��� A���k�$��y|M���P~P�����k�d�B#F%�����/)e�����A� H?h)�l��qZ 93��:693e��]��1S��)���$��m�_��a������Q�����F� HO�UFK���5�/��� =1O� 6� A���k�RS6��cy
��-�e��"D����93e��=��8�Q�~��tF� ���0�����m�{�^�\�e��e��E�z[��O����q�pZ�7���8����PB �v#B%��D-]�~X�M8�C:B�"�{V�������@P����U
� �A����o����m
��k��(HO�Sz�
#H�'��GK<l[��@�yDAzb��lA� =Y��4�)�����}�� =1O� 6� A��l�k^�:)�����.`��yJO �a ��d�_�.g�l[���Q��yJO �a ����^�)g�l[���M HO�Sz�
#H�'�����~����'v�VG�����TC���K�a��T���n�}2�H�I�k�����z���V�}�=x�� ����# �$
� �Z)&�/� Q =��H18��R���i�A�J��
"@�6 ���T�(��pC%H�2�� $�b��$��������+A��r�z��{%Xn#��& ����iF���b���J�(Tfv.&����aF�!�b�A�J�(�e�-&����OF���$C��`Q8t�(2�YL6�Y ���b��dC�� Q0d�(6�XL6�X ���1G���+Ab`��Al8��l��$
�������*��p��7~+wz&����)F��b�!�J�(et�A8�T,6<U CbCG�d�K� Q0��(6�SL60T c2bc6���
�T���p
���K1��L%��@;��<f��?|U"�.��H�_��_�x^����vO����WLL^������?~
_�����_����5�����k������5������o�%K�C8��m� �����#���qf�w�2�?c���?�����v��U�;�����4�v���>Q���Q�"��|����^���q���?������?������Z���������9U��(�6g�5�v,��X�P���w����b�K���y��!G,�d���s|m��W9'�bO��=����c,}��K�~����2zj?��c�ZO����=`e;��u�z^s=M�����2-�O�=�_7��u���W�������c��������};�m�Z�����8��������?��_����e�{��o��6�����������_u������0
���?��=�w:��]���w������M������������.�k;�y8J����M�\�W���';�����r=�o�q[��,�m����������������������������_������s�~�����������};��������-�yr8��������<��\�a�mz�����nz������.����g��3^{�����������\�li��Z�l+����(6�?��6�?[����?������������������W��������^�F�m7:m�r��{���u]��.�������{�.�����!���_t{�T���Z��������n�xx�������}Z�%N�/�+e�r��p�~"ig9������V{]����A����~�T��zl���|����8K;|�����:,_�{4�1�7��s�����9���p��(���A���@���}�c��0�
���ml|8��p��qL�&�}Z��y:��k���v6
���2��z�����p��6�������������N���\��U_�hS{'lrI� �������\vm�s�f�=��v��������6�Q�8���:3�d��M; ������
���+V����}L�-�x�/����v��w������0���; �J�-��w}�Fw��!�=�h���w���<�vk��|�9�&<�h��]��S����v6����n�5�I�o����U$�u�F!������D����Y%Z��[yq�����q����e�o�Q�.�����]����w�w��
_o��[vZ�f����`���K�.�5w90���s�; nK�3������U�7]W�[�&����rC�|������#w�����
��O%s5k�M���zBD�&��� ��c�n�}]��UO�������q�!Y{�x������."Gi���
��[M!+��t�|�.��l��{�dR�'�N��/��=d��������a_�'�d�J��St ��!f��z��.q�J��Ho���1��qj�9<6l�<N�����z?�#g��u��{��?�yue�c����c��#���q�z�>�y�v��}^5��]�5��>��~G��m��z��9�������:%�u�����ms���;�(%�zg�s� �:=���Q3wr��
v��������N���q�T��HI����>�������Io�?��{�@��b�a���]�I|��$������k����M����S�)������������K�1L���]���{��e����G
�Ho}�9x�gm>�3����1;Zva#{�]��d��w.��)o��<�dJ�L��i2��^*#t����tS�^��z��^�u��/�y���d���F�u;�����e��z���=�;����s������-���a������o���w�����m������s�cz���Z�����_�bf���5������q�����FBfn$�9T ���Y�������fF.�)U2�����hF��nV����5��9^�H�����HH�������M�r�Z���.��e�D���a3o�AJeg+����|P�I��c�����KA��D�Y�c&���i�%�-i����������"���P�ix��5y�����������y�����i��<�Re4 �
#���N�����^����s8-���������%dnv�c�~����S7ww~�!�;����9;.��C�p�d���~����v7JF�HY�:�����_�i�$zq\?���q�<���L������q7�������Q�����{����Kj>j|?��x�cX�Us�G����dU�$=����7�Y���Ij���m=w�����nNQ�����3��We�F_KO`��A�����?6��xm{�l���Y�����]�f<X���[���)o�2��g�U>�khs�I��I�9N��`0���a��^
3�:��c��7�
�(i`��t$�S��B����x�����w�����2����w�����<��e_o]�=F������$��:y�;�����������&_�lu��H����R�k�S-�U=?"��FV}l)��!�����x0%��o��q�_�09��
��H�����3 ;k��v����:h;+�`�ag��&�X759������U������&��� v�9�p��5v���6�p��j�������U� ��S�m4�b]���� ;+�`�ag����}2agM~�@^�gO&��H\��M��EpLfd��
�`�����*�d�YmgM>�<^�Y�`��*�d�Y�`������vV�2V���H����uQ�{O&��"p�+�d�Y��d���� ;��M�X��<{4agE�bl2�����&�9� v�9�p��5v��c�M�XW59z6cg�b�k;���M�X5y�l���X�*W�v�9�p�.jr������U��������uS�[�&���]�sM^=���"q�
6v�����w����Ej'\�rM��Ep'\���=���"�����EjN'\���<z:agE�bl2�,���e��6�W�v���6�p�+�d�Y�tF������v���
���EpN'\��M��EpNgd��
<k:agE�b�k2�,�s:�b]��������U� �,Rs:�b]���� ;+�`�ag��w�g\�����"8�.V�&��"8�32�Mn����&�m+������&v�
^�[�'���w��x�z�����*�d�Y�z������� ;+b�\�Y����.�EMhg�� �`�ag�����ZO��E�bW���������uU�{�'���o��+�����5��U���z=agM~s,/���vV$.V�&��"8�.�UM�^O�Y�X�
��"����g��+B;������f�"c�����z��M��Z�V��b��Mb�4�@D����eY^V�vu���������H�3�?��&��g����ad����]��>6��c�z�}=3`c�Y+ob�������X�����)b�Y�M�]US`i��c+��b�`�ck ����9�X����* ����W�V�X�u�H�
D��T�W����4�j
,�:zl]M�X+�����r�`]��=�k���K���V���+ ����V�U�X�u�������XYV�X����e�bEd}_,��R$X��j��;v��l��v�+�?C�"2�
"U-n�o��&��h`YU�D�����~��+�j���v��t��"��+V@6�=��V��)p�{�*q�R$R��j����������3]�r���B5y�y� ��;�S�����iO��o�W���Z<�S���+ q�{SiR����}�`MM~s�.����T$�t�
H�����@zT-����l�H��J�+eOT��g��X�t�|���]�����$5y�q� �Z;R�����G��{����;��E�B�X�Q����<���3�(2s�`A+�����]�R���5����g�Pd���~�X�P�D��|��]��6/&T �s�
��"47�S5�y��"v�+�?S�"37�S5��y��"��+V@��=��&��q�|����4��T���6�S$�s�
�z�'���;F���<�����b$<��i��#w�Z=�N������:��������=xN�X�+ ���HN�������yo��y�
�o�z��5v��o�{��&�9s`�����vs�
�k�'Z��;v����{��"Q�+V@J�=1�&�9s�������\���2��{P��)�������\������4y��� Kb����|0�+V@�=�&��t�|����4���� K^�'����r��j� .M�c���Y�m)I�b$-�gi��Cw�Z>K������+����d�I�X�r�|����\�2����4y����{-4�"��g��%%z�(M^s�.������7���j�`'M�"W����j��7�����|�"��+V@Z�=��&v��)p��AI��@���L!���&�9t`��=�����&�b��<~6��L���V���*)#����b W� �dB�I
�A�� ��B���9re
�K&��l<�J
�L*d3)`�P������*r���N"����2�=MxV�%(E,$�+@E����jb#��&<+��-�H���M�YQ_����tQ��J��*Eb&k2��"����r�UMxV�����I��@Y"���`(]��gE\�nbI�`�E��"W �R�&P��=�S��&<+����Hd��M�YQ��5��tU�u�S$�R�&��(����[[��@D&B��]�xK������Q��tU�Ep,c��b 4����&<+����k�������
FkHc����3 �i��v������v�"Q�
6gE��&�L759������T��8+����k��y`9��� ��\�qV���6����j�������T� m�S�m4!7]���� �)��`�qV�1�q�.*p�'����
�y�dB|��s*�d�EpLf\��
�`�~�Dv*�d�u�������YQ�`By*�d�u��
&����<z0aBEl>�+@���L�O5��dB�.���M�YQ��c���� '��M(P��<{4�EEbAl2��:h3j��#y�F��� �\�qV��c�M�PW59z6cH"D�kG���M(Q5y�lB��X�*W���9���.jr���+�U��8+�����Q7��lB����:���� g*E�`�qV�����w���8Ej'<�rM�YQ�pB������ }*b]�\
T��t�������U$�T�&��(���d��6�W�gE|o� q��M�YQ�tF������:���
��YQ�tB�*�d�EpNg$��
<k:�UEbQ�k2��"8�"�UM��N�U�T�
P�"5�6�EM=�0�"�
6gE��wr8UW5�YQ�zB�*�d�Ep�g���
�j=![M~�V^�U0���zU�r���r5���� �ZOHW�8V��������fuU����U��U���H}�;9\�����(�s=�[l2��"8�3��M���������\�yV��������jr����5�m[yE^����&��
^�[�'���o��x�zB����*�d�Ep�'$������"����Ej�Y����W�gE����D���=��� ��A��2V%bg�Y?,�2hgZvV�2V-`g��]���+S v6��xe#cUR vV!�Y�X�
��&t~h^�p�+�v���&v��iB;����"�����Ej}f6��uQ�YmgE�bl2�,���z[���@��C��"q��5v���&v��jB;�����@\�rM`g��dp�.jB;���|�X���"x~�,�\�Y�����uQ�YmgE�bl2�,�������� �����"q�
6v��x��k��Eh}�,�X��,�s�b]��v��������Ehn#\�{���"�>-���A�"��+;k �b���$���w��+B;����H\��M��E�3��b����'vV$.V�&��"�>(�|�B^ �Y��d��*�d�Y�d������'vV�2V���N�����uQ�{�&��H\��M��Ep�fd��
�����5�myE�=���"q�
6v�1���.*p�� ;+�X�ag��5�l�xEhg�� �`�ag|l� ��&�L�Y�X�
��"5'.�EM�=�������`�ig\�M�s"/ �,Rs4�b�k�������U�������|�H^ �Y��l��*�d�Y�m6�b]��������U� �,bs6�b]���� ;+b�\�Y��l������gvV$.V�&��"8g32�Mn5���&v��5y�l�����*�d�Y���&�9� v�9�p��5v�9�p��jr�p���X�*W�v�9�p�.j�������U�������M^�P^�Y��t���`�ig������N�Y��6�W�v�9�p�
6v�9���n*�������U��������uU���vV�2V���H����uQ�GO'��H\��M��E�3���b]��v���p�
6v�����n*p����5�m[yEhg��5��U���z=agM�s,/���vV$.V�&��"8�.�UM�^O�Y�X�
��"���p�.jB;��\O�X�;��\��X7��z��.�rM��u���'\����{=agM~�V^�W�'���]����� ;k��cy�����"q��5v���p��jr�z���X�*W�v�egM>�V^�����{+|�������n�oC�����T���
-���<�}������YdyW7���������5��z^���y�T��\�`���x-`S�N��X=K�:�jt��t������,������\�+"_�;)��-a�kE��+��L��D5;Mkjd^�|�����g2[n45�� /�4��Ab%K��Y7�&��s��M�=)�y�3�["�t���$�����������i� ��������S��G~��I��-v�P�H��}A�����hy��X�<�k��s���L�e&�{�:��6�����4�
����[�e-�c� ��Ntd���D�R�x>o�cBw �-a�L�9��6�=G �`� 4��R�i�s8o,bFw���D8.`O��a��\��d�l���V����h��i��m�f�+���`ta���"�0�����-��;F��C3��f�<`z,
t�l2�g6�=�!V��X��>h��
05R�~�o��{f�^s"�:��Y�%��c0�V/����d�@<�m�=�a��
� sx�����R#w�����b���K������M&7��}�6���m�
�f��������mo�
{�m����w@�7�S��F��sxCl���P#Y0��7��I��'������>����>D��}���fK���n�@F,�8���X�5<�����f����Z����w��}�B��|f�g_����������a���i s!���Nw��4<�����l���X�}��
{�}����k���v�.��|�}���f�r�������}�a�������l ���x�����f��J���B�.��SH�F����-�R5[6*=�S ��q��F���\��)����KUL�=S s!(��^c!l��|���n���������T�Nk�Y$<�����~7�l)���!�}��a��H�m(��N[��#<�}��a�*B����g���j�<Qz,� >�=�w����B�����f��I�<���J����a��A�,��o*uw���F'=V<