From 8b1343903a68a1315a3b3334bcebc42cb722a470 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <peter.geoghegan86@gmail.com>
Date: Thu, 9 Jul 2015 15:59:07 -0700
Subject: [PATCH 1/2] Tighten SortSupport abbreviated key contract

Revise the contract that SortSupport has with its clients to forbid an
abbreviated comparator ever returning 0 ("indeterminate") in the event
of two abbreviated keys not being bitwise equal.  This gives certain
cases the leeway to inexpensively determine tuple inequality from afar,
by comparing abbreviated keys using simple unsigned integer (Datum)
comparisons, and trusting that bitwise inequality amounts to leading
attribute inequality (and, therefore, tuple inequality).  Direct access
to SortSupport state is consequently not required, which will be
convenient for tuplesort clients (that only have an opaque state
pointer), while also saving us a few additional cycles.

Cheap inequality checks of this nature might otherwise give wrong
answers due (for example) to some opclass abbreviated comparator
interpreting non-identical abbreviated keys with respect to some state
built during memtuple copying (although that is probably still safe,
depending on the exact details).  Pass-by-reference abbreviated keys
could similarly cause problems, and so are now forbidden.  No existing
opclass with abbreviation support does anything like this, so no change
in any opclass SortSupport routine.

Backpatch to 9.5, where abbreviated keys were introduced.
---
 src/include/utils/sortsupport.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h
index 787404e..3d78c3a 100644
--- a/src/include/utils/sortsupport.h
+++ b/src/include/utils/sortsupport.h
@@ -116,7 +116,7 @@ typedef struct SortSupportData
 	 *
 	 * This allows opclass authors to supply a conversion routine, used to
 	 * create an alternative representation of the underlying type (an
-	 * "abbreviated key").  Typically, this representation is an ad-hoc,
+	 * "abbreviated key").  Invariably, this representation is an ad-hoc,
 	 * pass-by-value Datum format that only the opclass has knowledge of.  An
 	 * alternative comparator, used only with this alternative representation
 	 * must also be provided (which is assigned to "comparator").  This
@@ -135,6 +135,18 @@ typedef struct SortSupportData
 	 * cache miss penalties are expensive; to get good overall performance,
 	 * sort infrastructure must heavily weigh cache performance.
 	 *
+	 * There is one limited sense in which the core code has knowledge of the
+	 * abbreviated key format:  inequality.  The core system makes a generic
+	 * assumption that two non-bitwise-identical abbreviated keys could never
+	 * have the abbreviated comparator return a 0 or "inconclusive" result for
+	 * two non-identical abbreviated keys (this enables cheap inequality tests
+	 * made from a distance, without opclass involvement.  Furthermore, it
+	 * implies that abbreviated keys must always be pass-by-value).  Note that
+	 * it is still permitted for the abbreviated comparator to use state
+	 * generated during calls to abbrev_converter -- for example, a lookup
+	 * table could be used, provided any identifier baked into the abbreviated
+	 * representation is canonical.
+	 *
 	 * Opclass authors must consider the final cardinality of abbreviated keys
 	 * when devising an encoding scheme.  It's possible for a strategy to work
 	 * better than an alternative strategy with one usage pattern, while the
-- 
1.9.1

