Custom tuplesorts for extensions
Hackers,
Some PostgreSQL extensions need to sort their own kinds of data, so it's
worth letting them re-use our tuplesort. But despite our tuplesort having
extensibility built in, that extensibility is hidden inside tuplesort.c.
There are at least a couple of examples of how extensions deal with that
today.
1. RUM index access method: https://github.com/postgrespro/rum
The RUM repository contains a copy of tuplesort.c for each major
PostgreSQL release. A reliable solution, but this is not how things
are intended to work, right?
2. OrioleDB table access method: https://github.com/orioledb/orioledb
OrioleDB runs on a patched PostgreSQL. It carries a patch that just
exposes all the guts of tuplesort.c in tuplesort.h:
https://github.com/orioledb/postgres/commit/d42755f52c
I think we need a proper way to let extensions re-use our core
tuplesort facility. The attached patchset is intended to do this the
right way. The patches don't yet revise all the comments and lack code
beautification; the intention behind publishing this revision is to
verify the direction and get some feedback for further work.
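To illustrate the goal, here is a rough sketch of how an extension could
plug its own tuple format into tuplesort once the format-specific
callbacks are exposed (patch 0005 below collects them into TuplesortOps).
The tuplesort_begin_custom() entry point is made up for illustration only
and is not part of this patchset; the callback set follows the attached
patches.

/*
 * Hypothetical extension code, for illustration only.  It assumes the
 * TuplesortOps callbacks from the attached patches become visible in
 * tuplesort.h; tuplesort_begin_custom() is not part of this patchset.
 */
static void my_getdatum1(Tuplesortstate *state, SortTuple *stup);
static int  my_comparetup(const SortTuple *a, const SortTuple *b,
                          Tuplesortstate *state);
static void my_writetup(Tuplesortstate *state, LogicalTape *tape,
                        SortTuple *stup);
static void my_readtup(Tuplesortstate *state, SortTuple *stup,
                       LogicalTape *tape, unsigned int len);

Tuplesortstate *
my_begin_sort(int workMem, SortCoordinate coordinate, int sortopt)
{
    TuplesortOps ops;

    memset(&ops, 0, sizeof(ops));
    ops.getdatum1 = my_getdatum1;    /* derive leading key from a tuple */
    ops.comparetup = my_comparetup;  /* full comparison / tiebreak */
    ops.writetup = my_writetup;      /* serialize a tuple to tape */
    ops.readtup = my_readtup;        /* read a tuple back from tape */
    ops.haveDatum1 = true;

    /* hypothetical entry point, named here only for the example */
    return tuplesort_begin_custom(&ops, workMem, coordinate, sortopt);
}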
0001-Remove-Tuplesortstate.copytup-v1.patch
It's unclear to me how functionality is currently split between the
Tuplesortstate.copytup() function and the tuplesort_put*() functions.
For instance, copytup_index() and copytup_datum() just throw an error,
while tuplesort_putindextuplevalues() and tuplesort_putdatum() do the
work themselves. This patch removes Tuplesortstate.copytup() altogether,
moving its functionality into the tuplesort_put*() functions.
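For illustration, here is a condensed view of tuplesort_puttupleslot()
after this change, with the abbreviated-key handling elided (the full
version is in the attached patch): the put function now copies the
slot's tuple into the sort context itself instead of calling a COPYTUP()
callback.

void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
    MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
    SortTuple   stup;
    Datum       original;
    MinimalTuple tuple;
    HeapTupleData htup;

    /* copy the tuple into sort storage; formerly copytup_heap()'s job */
    tuple = ExecCopySlotMinimalTuple(slot);
    stup.tuple = (void *) tuple;
    USEMEM(state, GetMemoryChunkSpace(tuple));

    /* set up the first-column key value */
    htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
    htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
    original = heap_getattr(&htup, state->sortKeys[0].ssup_attno,
                            state->tupDesc, &stup.isnull1);

    MemoryContextSwitchTo(state->sortcontext);

    /* abbreviated-key handling elided; in the simplest case: */
    stup.datum1 = original;

    puttuple_common(state, &stup);

    MemoryContextSwitchTo(oldcontext);
}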
0002-Tuplesortstate.getdatum1-method-v1.patch
0003-Put-abbreviation-logic-into-puttuple_common-v1.patch
The tuplesort_put*() functions contain a common part dealing with
abbreviated keys. Patch 0002 extracts the logic for computing
SortTuple.datum1 into a new Tuplesortstate.getdatum1() function. Thanks
to this new interface function, 0003 moves the abbreviation logic into
puttuple_common().
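Condensed, a per-format getdatum1() callback just re-derives datum1 from
the stored tuple, and puttuple_common() then owns the shared abbreviation
handling. The fragments below are lightly trimmed from patches 0002 and
0003:

/* 0002: per-format callback, heap case */
static void
getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
{
    HeapTupleData htup;

    htup.t_len = ((MinimalTuple) stup->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
    htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
                                     MINIMAL_TUPLE_OFFSET);
    stup->datum1 = heap_getattr(&htup, state->sortKeys[0].ssup_attno,
                                state->tupDesc, &stup->isnull1);
}

/* 0003: shared abbreviation handling at the top of puttuple_common() */
if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
    !state->sortKeys->abbrev_converter || tuple->isnull1)
{
    /* keep the ordinary Datum representation (or NULL) in datum1 */
}
else if (!consider_abort_common(state))
{
    /* store the abbreviated key representation */
    tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
                                                      state->sortKeys);
}
else
{
    /* abbreviation aborted: re-derive datum1 of already-copied tuples */
    for (i = 0; i < state->memtupcount; i++)
        GETDATUM1(state, &state->memtuples[i]);
}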
0004-Move-freeing-memory-away-from-writetup-v1.patch
Assuming that SortTuple.tuple is always just a single chunk of memory,
we can put memory counting logic away from Tuplesortstate.writetup().
This makes Tuplesortstate.getdatum1() easier to implement without
knowledge of tuplesort.c guts.
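Concretely, puttuple_common() now charges availMem for the
caller-supplied tuple, and a new writetuple_common() wrapper frees it
after the format-specific writetup() has dumped it to tape; the snippets
below are taken nearly verbatim from patch 0004:

/* in puttuple_common(): account for the caller-supplied tuple */
if (tuple->tuple != NULL)
    USEMEM(state, GetMemoryChunkSpace(tuple->tuple));

/* new wrapper around the format-specific writetup() callback */
static void
writetuple_common(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
    WRITETUP(state, tape, stup);

    if (!state->slabAllocatorUsed && stup->tuple)
    {
        FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
        pfree(stup->tuple);
    }
}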
0005-Reorganize-data-structures-v1.patch
This commit splits the "public" part of Tuplesortstate into
TuplesortOps, which is intended to be exposed outside tuplesort.c.
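Roughly, the split looks like this (abridged from the attached patch):
the format-specific callbacks, sort keys and an opaque per-format
argument move into TuplesortOps, and Tuplesortstate embeds it as its
first member.

struct TuplesortOps
{
    MemoryContext maincontext;
    MemoryContext sortcontext;
    MemoryContext tuplecontext;

    /* format-specific callbacks */
    void        (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
    SortTupleComparator comparetup;
    void        (*writetup) (Tuplesortstate *state, LogicalTape *tape,
                             SortTuple *stup);
    void        (*readtup) (Tuplesortstate *state, SortTuple *stup,
                            LogicalTape *tape, unsigned int len);
    void        (*freestate) (Tuplesortstate *state);

    bool        haveDatum1;
    int         nKeys;
    SortSupport sortKeys;
    SortSupport onlyKey;
    int         sortopt;
    bool        tuples;
    void       *arg;        /* per-format state, e.g. TupleSortClusterArg */
};

struct Tuplesortstate
{
    TuplesortOps ops;       /* "public" part, embedded as first member */
    TupSortStatus status;
    /* ... the rest of the private sort state stays here ... */
};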
0006-Split-tuplesortops.c-v1.patch
This patch finally splits tuplesortops.c out of tuplesort.c. tuplesort.c
is left with the generic routines for tuple sorting, while tuplesortops.c
provides the implementations for the particular tuple formats.
------
Regards,
Alexander Korotkov
Attachments:
0001-Remove-Tuplesortstate.copytup-v1.patch
From 1c935fb310dd7e439ea3eeead746e1e198ade939 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 13:28:27 +0300
Subject: [PATCH 1/6] Remove Tuplesortstate.copytup
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 330 ++++++++++++-----------------
1 file changed, 132 insertions(+), 198 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 31554fd867d..0114855c83c 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,14 +279,6 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
- /*
- * Function to copy a supplied input tuple into palloc'd space and set up
- * its SortTuple representation (ie, set tuple/datum1/isnull1). Also,
- * state->availMem must be decreased by the amount of space used for the
- * tuple copy (note the SortTuple struct itself is not counted).
- */
- void (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup);
-
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -549,7 +541,6 @@ struct Sharedsort
} while(0)
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define COPYTUP(state,stup,tup) ((*(state)->copytup) (state, stup, tup))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
@@ -600,10 +591,7 @@ struct Sharedsort
* a lot better than what we were doing before 7.3. As of 9.6, a
* separate memory context is used for caller passed tuples. Resetting
* it at certain key increments significantly ameliorates fragmentation.
- * Note that this places a responsibility on copytup routines to use the
- * correct memory context for these tuples (and to not use the reset
- * context for anything whose lifetime needs to span multiple external
- * sort runs). readtup routines use the slab allocator (they cannot use
+ * readtup routines use the slab allocator (they cannot use
* the reset context because it gets deleted at the point that merging
* begins).
*/
@@ -643,14 +631,12 @@ static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
@@ -659,14 +645,12 @@ static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_datum(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
@@ -1059,7 +1043,6 @@ tuplesort_begin_heap(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_heap;
- state->copytup = copytup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
state->haveDatum1 = true;
@@ -1135,7 +1118,6 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_cluster;
- state->copytup = copytup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
state->abbrevNext = 10;
@@ -1240,7 +1222,6 @@ tuplesort_begin_index_btree(Relation heapRel,
PARALLEL_SORT(state));
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->abbrevNext = 10;
@@ -1317,7 +1298,6 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
state->comparetup = comparetup_index_hash;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1358,7 +1338,6 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1422,7 +1401,6 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
PARALLEL_SORT(state));
state->comparetup = comparetup_datum;
- state->copytup = copytup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
state->abbrevNext = 10;
@@ -1839,14 +1817,75 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
+ Datum original;
+ MinimalTuple tuple;
+ HeapTupleData htup;
- /*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
- */
- COPYTUP(state, &stup, (void *) slot);
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ USEMEM(state, GetMemoryChunkSpace(tuple));
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ original = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
+
+ MemoryContextSwitchTo(state->sortcontext);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ mtup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
puttuple_common(state, &stup);
@@ -1861,14 +1900,74 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
+ Datum original;
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+ USEMEM(state, GetMemoryChunkSpace(tup));
+
+ MemoryContextSwitchTo(state->sortcontext);
/*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
*/
- COPYTUP(state, &stup, (void *) tup);
+ if (state->haveDatum1)
+ {
+ original = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ tup = (HeapTuple) mtup->tuple;
+ mtup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
+ }
puttuple_common(state, &stup);
@@ -3946,84 +4045,6 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return 0;
}
-static void
-copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /*
- * We expect the passed "tup" to be a TupleTableSlot, and form a
- * MinimalTuple using the exported interface for that.
- */
- TupleTableSlot *slot = (TupleTableSlot *) tup;
- Datum original;
- MinimalTuple tuple;
- HeapTupleData htup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup->isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4192,79 +4213,6 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- HeapTuple tuple = (HeapTuple) tup;
- Datum original;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = heap_copytuple(tuple);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
-
- MemoryContextSwitchTo(oldcontext);
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (!state->haveDatum1)
- return;
-
- original = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup->isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4511,13 +4459,6 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_index() should not be called");
-}
-
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4582,13 +4523,6 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return compare;
}
-static void
-copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_datum() should not be called");
-}
-
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
--
2.24.3 (Apple Git-128)
0002-Tuplesortstate.getdatum1-method-v1.patch
From 47168e9ec865bad74e58cf714152d14cd7865d23 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH 2/6] Tuplesortstate.getdatum1 method
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 100 ++++++++++++++++++-----------
1 file changed, 64 insertions(+), 36 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 0114855c83c..c649043fbb0 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,6 +279,8 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
+ void (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
+
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -540,6 +542,7 @@ struct Sharedsort
pfree(buf); \
} while(0)
+#define GETDATUM1(state,stup) ((*(state)->getdatum1) (state, stup))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
@@ -629,6 +632,10 @@ static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
+static void getdatum1_heap(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_cluster(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_index(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_datum(Tuplesortstate *state, SortTuple *stup);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
@@ -1042,6 +1049,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_heap;
state->comparetup = comparetup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
@@ -1117,6 +1125,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_cluster;
state->comparetup = comparetup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
@@ -1221,6 +1230,7 @@ tuplesort_begin_index_btree(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1297,6 +1307,7 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
+ state->getdatum1 = getdatum1_index;
state->comparetup = comparetup_index_hash;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1337,6 +1348,7 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ state->getdatum1 = getdatum1_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1400,6 +1412,7 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_datum;
state->comparetup = comparetup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
@@ -1872,19 +1885,7 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
puttuple_common(state, &stup);
@@ -1957,15 +1958,7 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tup = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
}
@@ -2035,15 +2028,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = mtup->tuple;
- mtup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &mtup->isnull1);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
puttuple_common(state, &stup);
@@ -2122,11 +2107,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- mtup->datum1 = PointerGetDatum(mtup->tuple);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
}
@@ -3983,6 +3964,23 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
* Routines specialized for HeapTuple (actually MinimalTuple) case
*/
+static void
+getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
+{
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ stup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup->isnull1);
+
+}
+
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
@@ -4101,6 +4099,18 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
* comparisons per a btree index definition)
*/
+static void
+getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
+{
+ HeapTuple tup;
+
+ tup = (HeapTuple) stup->tuple;
+ stup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup->isnull1);
+}
+
static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4270,6 +4280,18 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
* functions can be shared.
*/
+static void
+getdatum1_index(Tuplesortstate *state, SortTuple *stup)
+{
+ IndexTuple tuple;
+
+ tuple = stup->tuple;
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup->isnull1);
+}
+
static int
comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4502,6 +4524,12 @@ readtup_index(Tuplesortstate *state, SortTuple *stup,
* Routines specialized for DatumTuple case
*/
+static void
+getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
+{
+ stup->datum1 = PointerGetDatum(stup->tuple);
+}
+
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
--
2.24.3 (Apple Git-128)
0003-Put-abbreviation-logic-into-puttuple_common-v1.patch
From f8c73bbbceba4ab014348da20fec317bfbd64c86 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH 3/6] Put abbreviation logic into puttuple_common()
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 213 +++++++----------------------
1 file changed, 50 insertions(+), 163 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c649043fbb0..c4d8c183f62 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1832,7 +1832,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1843,51 +1842,13 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup.isnull1);
+ stup.datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -1902,7 +1863,6 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- Datum original;
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
/* copy the tuple into sort storage */
@@ -1918,48 +1878,10 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
*/
if (state->haveDatum1)
{
- original = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup.isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
+ stup.datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
}
puttuple_common(state, &stup);
@@ -1978,7 +1900,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
IndexTuple tuple;
stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
@@ -1986,51 +1907,13 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
tuple->t_tid = *self;
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
- original = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &stup.isnull1);
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys || !state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -2072,43 +1955,11 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
}
else
{
- Datum original = datumCopy(val, false, state->datumTypeLen);
-
stup.isnull1 = false;
- stup.tuple = DatumGetPointer(original);
+ stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
MemoryContextSwitchTo(state->sortcontext);
-
- if (!state->sortKeys->abbrev_converter)
- {
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
}
puttuple_common(state, &stup);
@@ -2124,6 +1975,42 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
Assert(!LEADER(state));
+ if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
+ !state->sortKeys->abbrev_converter || tuple->isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ GETDATUM1(state, &state->memtuples[i]);
+ }
+
switch (state->status)
{
case TSS_INITIAL:
--
2.24.3 (Apple Git-128)
0004-Move-freeing-memory-away-from-writetup-v1.patch
From f00a42cc9c08f8b9b6a54df1bb2ebaf6bf8597c8 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH 4/6] Move freeing memory away from writetup()
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 64 ++++++++++++------------------
1 file changed, 26 insertions(+), 38 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c4d8c183f62..3bf990a1b34 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -612,6 +612,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+static void writetuple_common(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1838,7 +1840,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* copy the tuple into sort storage */
tuple = ExecCopySlotMinimalTuple(slot);
stup.tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
@@ -1847,8 +1848,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
state->tupDesc,
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -1868,9 +1867,6 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
stup.tuple = (void *) tup;
- USEMEM(state, GetMemoryChunkSpace(tup));
-
- MemoryContextSwitchTo(state->sortcontext);
/*
* set up first-column key value, and potentially abbreviate, if it's a
@@ -1905,15 +1901,12 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
RelationGetDescr(state->indexRel),
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -1951,15 +1944,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
stup.datum1 = !isNull ? val : (Datum) 0;
stup.isnull1 = isNull;
stup.tuple = NULL; /* no separate storage */
- MemoryContextSwitchTo(state->sortcontext);
}
else
{
stup.isnull1 = false;
stup.datum1 = datumCopy(val, false, state->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
- MemoryContextSwitchTo(state->sortcontext);
}
puttuple_common(state, &stup);
@@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
Assert(!LEADER(state));
+ if (tuple->tuple != NULL)
+ USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+
if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
!state->sortKeys->abbrev_converter || tuple->isnull1)
{
@@ -2052,6 +2047,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
pg_rusage_show(&state->ru_start));
#endif
make_bounded_heap(state);
+ MemoryContextSwitchTo(oldcontext);
return;
}
@@ -2059,7 +2055,10 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
* Done if we still fit in available memory and have array slots.
*/
if (state->memtupcount < state->memtupsize && !LACKMEM(state))
+ {
+ MemoryContextSwitchTo(oldcontext);
return;
+ }
/*
* Nope; time to switch to tape-based operation.
@@ -2113,6 +2112,19 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
elog(ERROR, "invalid tuplesort state");
break;
}
+ MemoryContextSwitchTo(oldcontext);
+}
+
+static void
+writetuple_common(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ WRITETUP(state, tape, stup);
+
+ if (!state->slabAllocatorUsed && stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }
}
static bool
@@ -3170,7 +3182,7 @@ mergeonerun(Tuplesortstate *state)
/* write the tuple to destTape */
srcTapeIndex = state->memtuples[0].srctape;
srcTape = state->inputTapes[srcTapeIndex];
- WRITETUP(state, state->destTape, &state->memtuples[0]);
+ writetuple_common(state, state->destTape, &state->memtuples[0]);
/* recycle the slot of the tuple we just wrote out, for the next read */
if (state->memtuples[0].tuple)
@@ -3316,7 +3328,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
memtupwrite = state->memtupcount;
for (i = 0; i < memtupwrite; i++)
{
- WRITETUP(state, state->destTape, &state->memtuples[i]);
+ writetuple_common(state, state->destTape, &state->memtuples[i]);
state->memtupcount--;
}
@@ -3947,12 +3959,6 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_free_minimal_tuple(tuple);
- }
}
static void
@@ -4123,12 +4129,6 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_freetuple(tuple);
- }
}
static void
@@ -4380,12 +4380,6 @@ writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- pfree(tuple);
- }
}
static void
@@ -4469,12 +4463,6 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-
- if (!state->slabAllocatorUsed && stup->tuple)
- {
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
- }
}
static void
--
2.24.3 (Apple Git-128)
0005-Reorganize-data-structures-v1.patch
From 1a519dd20fa40ff5ee850724f0da31a67ee555ee Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH 5/6] Reorganize data structures
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 762 ++++++++++++++++-------------
1 file changed, 432 insertions(+), 330 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 3bf990a1b34..e106e1ff9e2 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -126,8 +126,8 @@
#define CLUSTER_SORT 3
/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(state) ((state)->shared == NULL ? 0 : \
- (state)->worker >= 0 ? 1 : 2)
+#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker >= 0 ? 1 : 2)
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -236,37 +236,17 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
+typedef struct TuplesortOps TuplesortOps;
+
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-/*
- * Private state of a Tuplesort operation.
- */
-struct Tuplesortstate
+struct TuplesortOps
{
- TupSortStatus status; /* enumerated value as shown above */
- int nKeys; /* number of columns in sort key */
- int sortopt; /* Bitmask of flags used to setup sort */
- bool bounded; /* did caller specify a maximum number of
- * tuples to return? */
- bool boundUsed; /* true if we made use of a bounded heap */
- int bound; /* if bounded, the maximum number of tuples */
- bool tuples; /* Can SortTuple.tuple ever be set? */
- int64 availMem; /* remaining memory available, in bytes */
- int64 allowedMem; /* total memory allowed, in bytes */
- int maxTapes; /* max number of input tapes to merge in each
- * pass */
- int64 maxSpace; /* maximum amount of space occupied among sort
- * of groups, either in-memory or on-disk */
- bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
- * space, false when it's value for in-memory
- * space */
- TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
MemoryContext maincontext; /* memory context for tuple sort metadata that
* persists across multiple batches */
MemoryContext sortcontext; /* memory context holding most sort data */
MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
- LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
/*
* These function pointers decouple the routines that must know what kind
@@ -300,12 +280,116 @@ struct Tuplesortstate
void (*readtup) (Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+ void (*freestate) (Tuplesortstate *state);
+
/*
* Whether SortTuple's datum1 and isnull1 members are maintained by the
* above routines. If not, some sort specializations are disabled.
*/
bool haveDatum1;
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg;
+};
+
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ /*
+ * These variables are specific to the CLUSTER case; they are set by
+ * tuplesort_begin_cluster.
+ */
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TupleSortClusterArg;
+
+typedef struct
+{
+ /*
+ * These variables are specific to the IndexTuple case; they are set by
+ * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TupleSortIndexArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_btree subcase: */
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TupleSortIndexBTreeArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_hash subcase: */
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TupleSortIndexHashArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /*
+ * These variables are specific to the Datum case; they are set by
+ * tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TupleSortDatumArg;
+
+/*
+ * Private state of a Tuplesort operation.
+ */
+struct Tuplesortstate
+{
+ TuplesortOps ops;
+ TupSortStatus status; /* enumerated value as shown above */
+ bool bounded; /* did caller specify a maximum number of
+ * tuples to return? */
+ bool boundUsed; /* true if we made use of a bounded heap */
+ int bound; /* if bounded, the maximum number of tuples */
+ int64 availMem; /* remaining memory available, in bytes */
+ int64 allowedMem; /* total memory allowed, in bytes */
+ int maxTapes; /* max number of input tapes to merge in each
+ * pass */
+ int64 maxSpace; /* maximum amount of space occupied among sort
+ * of groups, either in-memory or on-disk */
+ bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
+ * space, false when it's value for in-memory
+ * space */
+ TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
+ LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
+
/*
* This array holds the tuples now in sort memory. If we are in state
* INITIAL, the tuples are in no particular order; if we are in state
@@ -420,24 +504,6 @@ struct Tuplesortstate
Sharedsort *shared;
int nParticipants;
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- TupleDesc tupDesc;
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
/*
* Additional state for managing "abbreviated key" sortsupport routines
* (which currently may be used by all cases except the hash index case).
@@ -447,37 +513,6 @@ struct Tuplesortstate
int64 abbrevNext; /* Tuple # at which to next check
* applicability */
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-
/*
* Resource snapshot for time of sort start.
*/
@@ -542,10 +577,13 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define GETDATUM1(state,stup) ((*(state)->getdatum1) (state, stup))
-#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
-#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
+#define TuplesortstateGetOps(state) ((TuplesortOps *) state);
+
+#define GETDATUM1(state,stup) ((*(state)->ops.getdatum1) (state, stup))
+#define COMPARETUP(state,a,b) ((*(state)->ops.comparetup) (a, b, state))
+#define WRITETUP(state,tape,stup) ((*(state)->ops.writetup) (state, tape, stup))
+#define READTUP(state,stup,tape,len) ((*(state)->ops.readtup) (state, stup, tape, len))
+#define FREESTATE(state) ((state)->ops.freestate ? (*(state)->ops.freestate) (state) : (void) 0)
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
#define FREEMEM(state,amt) ((state)->availMem += (amt))
@@ -664,6 +702,7 @@ static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -694,7 +733,7 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyUnsignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->ops.sortKeys[0]);
if (compare != 0)
return compare;
@@ -702,10 +741,10 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->ops.comparetup(a, b, state);
}
#if SIZEOF_DATUM >= 8
@@ -717,7 +756,7 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplySignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->ops.sortKeys[0]);
if (compare != 0)
return compare;
@@ -726,10 +765,10 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->ops.comparetup(a, b, state);
}
#endif
@@ -741,7 +780,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyInt32SortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->ops.sortKeys[0]);
if (compare != 0)
return compare;
@@ -750,10 +789,10 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->ops.comparetup(a, b, state);
}
/*
@@ -880,8 +919,9 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
pg_rusage_init(&state->ru_start);
#endif
- state->sortopt = sortopt;
- state->tuples = true;
+ state->ops.sortopt = sortopt;
+ state->ops.tuples = true;
+ state->abbrevNext = 10;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -890,8 +930,8 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
* with very little memory.
*/
state->allowedMem = Max(workMem, 64) * (int64) 1024;
- state->sortcontext = sortcontext;
- state->maincontext = maincontext;
+ state->ops.sortcontext = sortcontext;
+ state->ops.maincontext = maincontext;
/*
* Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
@@ -950,7 +990,7 @@ tuplesort_begin_batch(Tuplesortstate *state)
{
MemoryContext oldcontext;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
/*
* Caller tuple (e.g. IndexTuple) memory context.
@@ -965,12 +1005,12 @@ tuplesort_begin_batch(Tuplesortstate *state)
* generation.c context as this keeps allocations more compact with less
* wastage. Allocations are also slightly more CPU efficient.
*/
- if (state->sortopt & TUPLESORT_ALLOWBOUNDED)
- state->tuplecontext = AllocSetContextCreate(state->sortcontext,
+ if (state->ops.sortopt & TUPLESORT_ALLOWBOUNDED)
+ state->ops.tuplecontext = AllocSetContextCreate(state->ops.sortcontext,
"Caller tuples",
ALLOCSET_DEFAULT_SIZES);
else
- state->tuplecontext = GenerationContextCreate(state->sortcontext,
+ state->ops.tuplecontext = GenerationContextCreate(state->ops.sortcontext,
"Caller tuples",
ALLOCSET_DEFAULT_SIZES);
@@ -1028,10 +1068,11 @@ tuplesort_begin_heap(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
AssertArg(nkeys > 0);
@@ -1042,30 +1083,28 @@ tuplesort_begin_heap(TupleDesc tupDesc,
nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = nkeys;
+ ops->nKeys = nkeys;
TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
false, /* no unique check */
nkeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_heap;
- state->comparetup = comparetup_heap;
- state->writetup = writetup_heap;
- state->readtup = readtup_heap;
- state->haveDatum1 = true;
-
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
- state->abbrevNext = 10;
+ ops->getdatum1 = getdatum1_heap;
+ ops->comparetup = comparetup_heap;
+ ops->writetup = writetup_heap;
+ ops->readtup = readtup_heap;
+ ops->haveDatum1 = true;
+ ops->arg = tupDesc; /* assume we need not copy tupDesc */
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
for (i = 0; i < nkeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
AssertArg(attNums[i] != 0);
AssertArg(sortOperators[i] != 0);
@@ -1075,7 +1114,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortKey->ssup_nulls_first = nullsFirstFlags[i];
sortKey->ssup_attno = attNums[i];
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
}
@@ -1086,8 +1125,8 @@ tuplesort_begin_heap(TupleDesc tupDesc,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (nkeys == 1 && !state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (nkeys == 1 && !ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1102,13 +1141,16 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
BTScanInsert indexScanKey;
MemoryContext oldcontext;
+ TupleSortClusterArg *arg;
int i;
Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortClusterArg *) palloc0(sizeof(TupleSortClusterArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1118,37 +1160,38 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
false, /* no unique check */
- state->nKeys,
+ ops->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_cluster;
- state->comparetup = comparetup_cluster;
- state->writetup = writetup_cluster;
- state->readtup = readtup_cluster;
- state->abbrevNext = 10;
+ ops->getdatum1 = getdatum1_cluster;
+ ops->comparetup = comparetup_cluster;
+ ops->writetup = writetup_cluster;
+ ops->readtup = readtup_cluster;
+ ops->freestate = freestate_cluster;
+ ops->arg = arg;
- state->indexInfo = BuildIndexInfo(indexRel);
+ arg->indexInfo = BuildIndexInfo(indexRel);
/*
* If we don't have a simple leading attribute, we don't currently
* initialize datum1, so disable optimizations that require it.
*/
- if (state->indexInfo->ii_IndexAttrNumbers[0] == 0)
- state->haveDatum1 = false;
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ ops->haveDatum1 = false;
else
- state->haveDatum1 = true;
+ ops->haveDatum1 = true;
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
indexScanKey = _bt_mkscankey(indexRel, NULL);
- if (state->indexInfo->ii_Expressions != NULL)
+ if (arg->indexInfo->ii_Expressions != NULL)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -1159,19 +1202,19 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
* TupleTableSlot to put the table tuples into. The econtext's
* scantuple has to point to that slot, too.
*/
- state->estate = CreateExecutorState();
+ arg->estate = CreateExecutorState();
slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(state->estate);
+ econtext = GetPerTupleExprContext(arg->estate);
econtext->ecxt_scantuple = slot;
}
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < ops->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1181,7 +1224,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1209,11 +1252,14 @@ tuplesort_begin_index_btree(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
BTScanInsert indexScanKey;
+ TupleSortIndexBTreeArg *arg;
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1223,36 +1269,36 @@ tuplesort_begin_index_btree(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
enforceUnique,
state->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->abbrevNext = 10;
- state->haveDatum1 = true;
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
- state->enforceUnique = enforceUnique;
- state->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
indexScanKey = _bt_mkscankey(indexRel, NULL);
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < ops->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1262,7 +1308,7 @@ tuplesort_begin_index_btree(Relation heapRel,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1291,9 +1337,12 @@ tuplesort_begin_index_hash(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MemoryContext oldcontext;
+ TupleSortIndexHashArg *arg;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortIndexHashArg *) palloc(sizeof(TupleSortIndexHashArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1307,20 +1356,21 @@ tuplesort_begin_index_hash(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* Only one sort column, the hash code */
+ ops->nKeys = 1; /* Only one sort column, the hash code */
- state->getdatum1 = getdatum1_index;
- state->comparetup = comparetup_index_hash;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_hash;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
- state->high_mask = high_mask;
- state->low_mask = low_mask;
- state->max_buckets = max_buckets;
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
MemoryContextSwitchTo(oldcontext);
@@ -1336,10 +1386,13 @@ tuplesort_begin_index_gist(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MemoryContext oldcontext;
+ TupleSortIndexBTreeArg *arg;
int i;
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1348,31 +1401,34 @@ tuplesort_begin_index_gist(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
- state->getdatum1 = getdatum1_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < ops->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
sortKey->ssup_cxt = CurrentMemoryContext;
sortKey->ssup_collation = indexRel->rd_indcollation[i];
sortKey->ssup_nulls_first = false;
sortKey->ssup_attno = i + 1;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1392,11 +1448,14 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg;
MemoryContext oldcontext;
int16 typlen;
bool typbyval;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortDatumArg *) palloc(sizeof(TupleSortDatumArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1405,35 +1464,36 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* always a one-column sort */
+ ops->nKeys = 1; /* always a one-column sort */
TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
false, /* no unique check */
1,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_datum;
- state->comparetup = comparetup_datum;
- state->writetup = writetup_datum;
- state->readtup = readtup_datum;
+ ops->getdatum1 = getdatum1_datum;
+ ops->comparetup = comparetup_datum;
+ ops->writetup = writetup_datum;
+ ops->readtup = readtup_datum;
state->abbrevNext = 10;
- state->haveDatum1 = true;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->datumType = datumType;
+ arg->datumType = datumType;
/* lookup necessary attributes of the datum type */
get_typlenbyval(datumType, &typlen, &typbyval);
- state->datumTypeLen = typlen;
- state->tuples = !typbyval;
+ arg->datumTypeLen = typlen;
+ ops->tuples = !typbyval;
/* Prepare SortSupport data */
- state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
- state->sortKeys->ssup_cxt = CurrentMemoryContext;
- state->sortKeys->ssup_collation = sortCollation;
- state->sortKeys->ssup_nulls_first = nullsFirstFlag;
+ ops->sortKeys->ssup_cxt = CurrentMemoryContext;
+ ops->sortKeys->ssup_collation = sortCollation;
+ ops->sortKeys->ssup_nulls_first = nullsFirstFlag;
/*
* Abbreviation is possible here only for by-reference types. In theory,
@@ -1443,9 +1503,9 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* can't, because a datum sort only stores a single copy of the datum; the
* "tuple" field of each SortTuple is NULL.
*/
- state->sortKeys->abbreviate = !typbyval;
+ ops->sortKeys->abbreviate = !typbyval;
- PrepareSortSupportFromOrderingOp(sortOperator, state->sortKeys);
+ PrepareSortSupportFromOrderingOp(sortOperator, ops->sortKeys);
/*
* The "onlyKey" optimization cannot be used with abbreviated keys, since
@@ -1453,8 +1513,8 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (!state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (!ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1479,7 +1539,7 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
/* Assert we're called before loading any tuples */
Assert(state->status == TSS_INITIAL && state->memtupcount == 0);
/* Assert we allow bounded sorts */
- Assert(state->sortopt & TUPLESORT_ALLOWBOUNDED);
+ Assert(state->ops.sortopt & TUPLESORT_ALLOWBOUNDED);
/* Can't set the bound twice, either */
Assert(!state->bounded);
/* Also, this shouldn't be called in a parallel worker */
@@ -1507,13 +1567,13 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
* optimization. Disable by setting state to be consistent with no
* abbreviation support.
*/
- state->sortKeys->abbrev_converter = NULL;
- if (state->sortKeys->abbrev_full_comparator)
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->ops.sortKeys->abbrev_converter = NULL;
+ if (state->ops.sortKeys->abbrev_full_comparator)
+ state->ops.sortKeys->comparator = state->ops.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->ops.sortKeys->abbrev_abort = NULL;
+ state->ops.sortKeys->abbrev_full_comparator = NULL;
}
/*
@@ -1536,7 +1596,7 @@ static void
tuplesort_free(Tuplesortstate *state)
{
/* context swap probably not needed, but let's be safe */
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
#ifdef TRACE_SORT
long spaceUsed;
@@ -1583,21 +1643,13 @@ tuplesort_free(Tuplesortstate *state)
TRACE_POSTGRESQL_SORT_DONE(state->tapeset != NULL, 0L);
#endif
- /* Free any execution state created for CLUSTER case */
- if (state->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(state->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(state->estate);
- }
-
+ FREESTATE(state);
MemoryContextSwitchTo(oldcontext);
/*
* Free the per-sort memory context, thereby releasing all working memory.
*/
- MemoryContextReset(state->sortcontext);
+ MemoryContextReset(state->ops.sortcontext);
}
/*
@@ -1618,7 +1670,7 @@ tuplesort_end(Tuplesortstate *state)
* Free the main memory context, including the Tuplesortstate struct
* itself.
*/
- MemoryContextDelete(state->maincontext);
+ MemoryContextDelete(state->ops.maincontext);
}
/*
@@ -1832,7 +1884,9 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleDesc tupDesc = (TupleDesc) ops->arg;
SortTuple stup;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1844,8 +1898,8 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup.datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ ops->sortKeys[0].ssup_attno,
+ tupDesc,
&stup.isnull1);
puttuple_common(state, &stup);
@@ -1862,7 +1916,9 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
@@ -1872,11 +1928,11 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* set up first-column key value, and potentially abbreviate, if it's a
* simple column
*/
- if (state->haveDatum1)
+ if (ops->haveDatum1)
{
stup.datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup.isnull1);
}
@@ -1894,9 +1950,11 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
ItemPointer self, Datum *values,
bool *isnull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
SortTuple stup;
IndexTuple tuple;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
tuple = ((IndexTuple) stup.tuple);
@@ -1904,7 +1962,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup.isnull1);
puttuple_common(state, &stup);
@@ -1920,7 +1978,9 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
void
tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
SortTuple stup;
/*
@@ -1935,7 +1995,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* identical to stup.tuple.
*/
- if (isNull || !state->tuples)
+ if (isNull || !state->ops.tuples)
{
/*
* Set datum1 to zeroed representation for NULLs (to be consistent,
@@ -1948,7 +2008,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
else
{
stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
}
@@ -1963,15 +2023,15 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
Assert(!LEADER(state));
if (tuple->tuple != NULL)
USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
- if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
- !state->sortKeys->abbrev_converter || tuple->isnull1)
+ if (!state->ops.sortKeys || !state->ops.haveDatum1 || !state->ops.tuples ||
+ !state->ops.sortKeys->abbrev_converter || tuple->isnull1)
{
/*
* Store ordinary Datum representation, or NULL value. If there is a
@@ -1985,8 +2045,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
else if (!consider_abort_common(state))
{
/* Store abbreviated key representation */
- tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
- state->sortKeys);
+ tuple->datum1 = state->ops.sortKeys->abbrev_converter(tuple->datum1,
+ state->ops.sortKeys);
}
else
{
@@ -2130,9 +2190,9 @@ writetuple_common(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
static bool
consider_abort_common(Tuplesortstate *state)
{
- Assert(state->sortKeys[0].abbrev_converter != NULL);
- Assert(state->sortKeys[0].abbrev_abort != NULL);
- Assert(state->sortKeys[0].abbrev_full_comparator != NULL);
+ Assert(state->ops.sortKeys[0].abbrev_converter != NULL);
+ Assert(state->ops.sortKeys[0].abbrev_abort != NULL);
+ Assert(state->ops.sortKeys[0].abbrev_full_comparator != NULL);
/*
* Check effectiveness of abbreviation optimization. Consider aborting
@@ -2147,19 +2207,19 @@ consider_abort_common(Tuplesortstate *state)
* Check opclass-supplied abbreviation abort routine. It may indicate
* that abbreviation should not proceed.
*/
- if (!state->sortKeys->abbrev_abort(state->memtupcount,
- state->sortKeys))
+ if (!state->ops.sortKeys->abbrev_abort(state->memtupcount,
+ state->ops.sortKeys))
return false;
/*
* Finally, restore authoritative comparator, and indicate that
* abbreviation is not in play by setting abbrev_converter to NULL
*/
- state->sortKeys[0].comparator = state->sortKeys[0].abbrev_full_comparator;
- state->sortKeys[0].abbrev_converter = NULL;
+ state->ops.sortKeys[0].comparator = state->ops.sortKeys[0].abbrev_full_comparator;
+ state->ops.sortKeys[0].abbrev_converter = NULL;
/* Not strictly necessary, but be tidy */
- state->sortKeys[0].abbrev_abort = NULL;
- state->sortKeys[0].abbrev_full_comparator = NULL;
+ state->ops.sortKeys[0].abbrev_abort = NULL;
+ state->ops.sortKeys[0].abbrev_full_comparator = NULL;
/* Give up - expect original pass-by-value representation */
return true;
@@ -2174,7 +2234,7 @@ consider_abort_common(Tuplesortstate *state)
void
tuplesort_performsort(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
#ifdef TRACE_SORT
if (trace_sort)
@@ -2294,7 +2354,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
switch (state->status)
{
case TSS_SORTEDINMEM:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->ops.sortopt & TUPLESORT_RANDOMACCESS);
Assert(!state->slabAllocatorUsed);
if (forward)
{
@@ -2338,7 +2398,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
break;
case TSS_SORTEDONTAPE:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->ops.sortopt & TUPLESORT_RANDOMACCESS);
Assert(state->slabAllocatorUsed);
/*
@@ -2540,7 +2600,7 @@ bool
tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2551,7 +2611,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
if (stup.tuple)
{
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (state->ops.sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
if (copy)
@@ -2576,7 +2636,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2596,7 +2656,7 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2626,7 +2686,9 @@ bool
tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2639,10 +2701,10 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
MemoryContextSwitchTo(oldcontext);
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (ops->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (stup.isnull1 || !state->tuples)
+ if (stup.isnull1 || !state->ops.tuples)
{
*val = stup.datum1;
*isNull = stup.isnull1;
@@ -2650,7 +2712,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
*isNull = false;
}
@@ -2703,7 +2765,7 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
* We could probably optimize these cases better, but for now it's
* not worth the trouble.
*/
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
while (ntuples-- > 0)
{
SortTuple stup;
@@ -2979,7 +3041,7 @@ mergeruns(Tuplesortstate *state)
Assert(state->status == TSS_BUILDRUNS);
Assert(state->memtupcount == 0);
- if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
+ if (state->ops.sortKeys != NULL && state->ops.sortKeys->abbrev_converter != NULL)
{
/*
* If there are multiple runs to be merged, when we go to read back
@@ -2987,19 +3049,19 @@ mergeruns(Tuplesortstate *state)
* we don't care to regenerate them. Disable abbreviation from this
* point on.
*/
- state->sortKeys->abbrev_converter = NULL;
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->ops.sortKeys->abbrev_converter = NULL;
+ state->ops.sortKeys->comparator = state->ops.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->ops.sortKeys->abbrev_abort = NULL;
+ state->ops.sortKeys->abbrev_full_comparator = NULL;
}
/*
* Reset tuple memory. We've freed all the tuples that we previously
* allocated. We will use the slab allocator from now on.
*/
- MemoryContextResetOnly(state->tuplecontext);
+ MemoryContextResetOnly(state->ops.tuplecontext);
/*
* We no longer need a large memtuples array. (We will allocate a smaller
@@ -3022,7 +3084,7 @@ mergeruns(Tuplesortstate *state)
* From this point on, we no longer use the USEMEM()/LACKMEM() mechanism
* to track memory usage of individual tuples.
*/
- if (state->tuples)
+ if (state->ops.tuples)
init_slab_allocator(state, state->nOutputTapes + 1);
else
init_slab_allocator(state, 0);
@@ -3036,7 +3098,7 @@ mergeruns(Tuplesortstate *state)
* number of input tapes will not increase between passes.)
*/
state->memtupsize = state->nOutputTapes;
- state->memtuples = (SortTuple *) MemoryContextAlloc(state->maincontext,
+ state->memtuples = (SortTuple *) MemoryContextAlloc(state->ops.maincontext,
state->nOutputTapes * sizeof(SortTuple));
USEMEM(state, GetMemoryChunkSpace(state->memtuples));
@@ -3113,7 +3175,7 @@ mergeruns(Tuplesortstate *state)
* sorted tape, we can stop at this point and do the final merge
* on-the-fly.
*/
- if ((state->sortopt & TUPLESORT_RANDOMACCESS) == 0
+ if ((state->ops.sortopt & TUPLESORT_RANDOMACCESS) == 0
&& state->nInputRuns <= state->nInputTapes
&& !WORKER(state))
{
@@ -3339,7 +3401,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
* AllocSetFree's bucketing by size class might be particularly bad if
* this step wasn't taken.
*/
- MemoryContextReset(state->tuplecontext);
+ MemoryContextReset(state->ops.tuplecontext);
markrunend(state->destTape);
@@ -3357,9 +3419,9 @@ dumptuples(Tuplesortstate *state, bool alltuples)
void
tuplesort_rescan(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->ops.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3390,9 +3452,9 @@ tuplesort_rescan(Tuplesortstate *state)
void
tuplesort_markpos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->ops.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3421,9 +3483,9 @@ tuplesort_markpos(Tuplesortstate *state)
void
tuplesort_restorepos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->ops.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3639,9 +3701,9 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
*/
- if (state->haveDatum1 && state->sortKeys)
+ if (state->ops.haveDatum1 && state->ops.sortKeys)
{
- if (state->sortKeys[0].comparator == ssup_datum_unsigned_cmp)
+ if (state->ops.sortKeys[0].comparator == ssup_datum_unsigned_cmp)
{
qsort_tuple_unsigned(state->memtuples,
state->memtupcount,
@@ -3649,7 +3711,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#if SIZEOF_DATUM >= 8
- else if (state->sortKeys[0].comparator == ssup_datum_signed_cmp)
+ else if (state->ops.sortKeys[0].comparator == ssup_datum_signed_cmp)
{
qsort_tuple_signed(state->memtuples,
state->memtupcount,
@@ -3657,7 +3719,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#endif
- else if (state->sortKeys[0].comparator == ssup_datum_int32_cmp)
+ else if (state->ops.sortKeys[0].comparator == ssup_datum_int32_cmp)
{
qsort_tuple_int32(state->memtuples,
state->memtupcount,
@@ -3667,16 +3729,16 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
}
/* Can we use the single-key sort function? */
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
{
qsort_ssup(state->memtuples, state->memtupcount,
- state->onlyKey);
+ state->ops.onlyKey);
}
else
{
qsort_tuple(state->memtuples,
state->memtupcount,
- state->comparetup,
+ state->ops.comparetup,
state);
}
}
@@ -3793,10 +3855,10 @@ tuplesort_heap_replace_top(Tuplesortstate *state, SortTuple *tuple)
static void
reversedirection(Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ SortSupport sortKey = state->ops.sortKeys;
int nkey;
- for (nkey = 0; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 0; nkey < state->ops.nKeys; nkey++, sortKey++)
{
sortKey->ssup_reverse = !sortKey->ssup_reverse;
sortKey->ssup_nulls_first = !sortKey->ssup_nulls_first;
@@ -3847,7 +3909,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
Assert(state->slabFreeHead);
if (tuplen > SLAB_SLOT_SIZE || !state->slabFreeHead)
- return MemoryContextAlloc(state->sortcontext, tuplen);
+ return MemoryContextAlloc(state->ops.sortcontext, tuplen);
else
{
buf = state->slabFreeHead;
@@ -3866,6 +3928,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
static void
getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
HeapTupleData htup;
htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
@@ -3874,8 +3937,8 @@ getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
&stup->isnull1);
}
@@ -3883,7 +3946,8 @@ getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ SortSupport sortKey = ops->sortKeys;
HeapTupleData ltup;
HeapTupleData rtup;
TupleDesc tupDesc;
@@ -3908,7 +3972,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = state->tupDesc;
+ tupDesc = (TupleDesc) ops->arg;
if (sortKey->abbrev_converter)
{
@@ -3925,7 +3989,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
}
sortKey++;
- for (nkey = 1; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 1; nkey < ops->nKeys; nkey++, sortKey++)
{
attno = sortKey->ssup_attno;
@@ -3945,6 +4009,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MinimalTuple tuple = (MinimalTuple) stup->tuple;
/* the part of the MinimalTuple we'll write: */
@@ -3956,7 +4021,7 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -3969,12 +4034,13 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
HeapTupleData htup;
/* read in the tuple proper */
tuple->t_len = tuplen;
LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
@@ -3982,8 +4048,8 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
&stup->isnull1);
}
@@ -3995,12 +4061,14 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
static void
getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
HeapTuple tup;
tup = (HeapTuple) stup->tuple;
stup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
@@ -4008,7 +4076,9 @@ static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
HeapTuple ltup;
HeapTuple rtup;
TupleDesc tupDesc;
@@ -4022,10 +4092,10 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
/* Be prepared to compare additional sort keys */
ltup = (HeapTuple) a->tuple;
rtup = (HeapTuple) b->tuple;
- tupDesc = state->tupDesc;
+ tupDesc = arg->tupDesc;
/* Compare the leading sort key, if it's simple */
- if (state->haveDatum1)
+ if (ops->haveDatum1)
{
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -4035,7 +4105,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
if (sortKey->abbrev_converter)
{
- AttrNumber leading = state->indexInfo->ii_IndexAttrNumbers[0];
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
@@ -4044,7 +4114,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
datum2, isnull2,
sortKey);
}
- if (compare != 0 || state->nKeys == 1)
+ if (compare != 0 || ops->nKeys == 1)
return compare;
/* Compare additional columns the hard way */
sortKey++;
@@ -4056,13 +4126,13 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
nkey = 0;
}
- if (state->indexInfo->ii_Expressions == NULL)
+ if (arg->indexInfo->ii_Expressions == NULL)
{
/* If not expression index, just compare the proper heap attrs */
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
{
- AttrNumber attno = state->indexInfo->ii_IndexAttrNumbers[nkey];
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
@@ -4089,19 +4159,19 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
TupleTableSlot *ecxt_scantuple;
/* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(state->estate);
+ ResetPerTupleExprContext(arg->estate);
- ecxt_scantuple = GetPerTupleExprContext(state->estate)->ecxt_scantuple;
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
l_index_values, l_index_isnull);
ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
r_index_values, r_index_isnull);
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
{
compare = ApplySortComparator(l_index_values[nkey],
l_index_isnull[nkey],
@@ -4119,6 +4189,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
HeapTuple tuple = (HeapTuple) stup->tuple;
unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
@@ -4126,7 +4197,7 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
}
@@ -4135,6 +4206,8 @@ static void
readtup_cluster(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int tuplen)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
HeapTuple tuple = (HeapTuple) readtup_alloc(state,
t_len + HEAPTUPLESIZE);
@@ -4147,18 +4220,34 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
tuple->t_tableOid = InvalidOid;
/* Read in the tuple body */
LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value, if it's a simple column */
- if (state->haveDatum1)
+ if (ops->haveDatum1)
stup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
/*
* Routines specialized for IndexTuple case
*
@@ -4170,12 +4259,14 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
static void
getdatum1_index(Tuplesortstate *state, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
IndexTuple tuple;
tuple = stup->tuple;
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4188,7 +4279,9 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* is also special handling for enforcing uniqueness, and special
* treatment for equal keys at the end.
*/
- SortSupport sortKey = state->sortKeys;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexBTreeArg *arg = (TupleSortIndexBTreeArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
IndexTuple tuple1;
IndexTuple tuple2;
int keysz;
@@ -4212,8 +4305,8 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
/* Compare additional sort keys */
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
- keysz = state->nKeys;
- tupDes = RelationGetDescr(state->indexRel);
+ keysz = ops->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
if (sortKey->abbrev_converter)
{
@@ -4258,7 +4351,7 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* sort algorithm wouldn't have checked whether one must appear before the
* other.
*/
- if (state->enforceUnique && !(!state->uniqueNullsNotDistinct && equal_hasnull))
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
@@ -4274,16 +4367,16 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
index_deform_tuple(tuple1, tupDes, values, isnull);
- key_desc = BuildIndexValueDescription(state->indexRel, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(state->indexRel)),
+ RelationGetRelationName(arg->index.indexRel)),
key_desc ? errdetail("Key %s is duplicated.", key_desc) :
errdetail("Duplicate keys exist."),
- errtableconstraint(state->heapRel,
- RelationGetRelationName(state->indexRel))));
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
}
/*
@@ -4321,6 +4414,8 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Bucket bucket2;
IndexTuple tuple1;
IndexTuple tuple2;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexHashArg *arg = (TupleSortIndexHashArg *) ops->arg;
/*
* Fetch hash keys and mask off bits we don't want to sort by. We know
@@ -4328,12 +4423,12 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
*/
Assert(!a->isnull1);
bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
Assert(!b->isnull1);
bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
if (bucket1 > bucket2)
return 1;
else if (bucket1 < bucket2)
@@ -4371,13 +4466,14 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
IndexTuple tuple = (IndexTuple) stup->tuple;
unsigned int tuplen;
tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -4386,18 +4482,20 @@ static void
readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
unsigned int tuplen = len - sizeof(unsigned int);
IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, tuple, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4414,20 +4512,21 @@ getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
int compare;
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- state->sortKeys);
+ ops->sortKeys);
if (compare != 0)
return compare;
/* if we have abbreviations, then "tuple" has the original value */
- if (state->sortKeys->abbrev_converter)
+ if (ops->sortKeys->abbrev_converter)
compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
PointerGetDatum(b->tuple), b->isnull1,
- state->sortKeys);
+ ops->sortKeys);
return compare;
}
@@ -4435,6 +4534,8 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
void *waddr;
unsigned int tuplen;
unsigned int writtenlen;
@@ -4444,7 +4545,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
waddr = NULL;
tuplen = 0;
}
- else if (!state->tuples)
+ else if (!state->ops.tuples)
{
waddr = &stup->datum1;
tuplen = sizeof(Datum);
@@ -4452,7 +4553,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
else
{
waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, state->datumTypeLen);
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
Assert(tuplen != 0);
}
@@ -4460,7 +4561,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
LogicalTapeWrite(tape, waddr, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
}
@@ -4469,6 +4570,7 @@ static void
readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
unsigned int tuplen = len - sizeof(unsigned int);
if (tuplen == 0)
@@ -4478,7 +4580,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->isnull1 = true;
stup->tuple = NULL;
}
- else if (!state->tuples)
+ else if (!state->ops.tuples)
{
Assert(tuplen == sizeof(Datum));
LogicalTapeReadExact(tape, &stup->datum1, tuplen);
@@ -4495,7 +4597,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->tuple = raddr;
}
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
--
2.24.3 (Apple Git-128)
0006-Split-tuplesortops.c-v1.patch (application/x-patch)
From d33cd3bed1fa4f22313042a5d9821a973c5f8a88 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH 6/6] Split tuplesortops.c
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/Makefile | 1 +
src/backend/utils/sort/tuplesort.c | 1710 +------------------------
src/backend/utils/sort/tuplesortops.c | 1550 ++++++++++++++++++++++
src/include/utils/tuplesort.h | 203 ++-
4 files changed, 1746 insertions(+), 1718 deletions(-)
create mode 100644 src/backend/utils/sort/tuplesortops.c
diff --git a/src/backend/utils/sort/Makefile b/src/backend/utils/sort/Makefile
index 26f65fcaf7a..bfe86c02f67 100644
--- a/src/backend/utils/sort/Makefile
+++ b/src/backend/utils/sort/Makefile
@@ -19,6 +19,7 @@ OBJS = \
sharedtuplestore.o \
sortsupport.o \
tuplesort.o \
+ tuplesortops.o \
tuplestore.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index e106e1ff9e2..6e681ca8afa 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -100,35 +100,17 @@
#include <limits.h>
-#include "access/hash.h"
-#include "access/htup_details.h"
-#include "access/nbtree.h"
-#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablespace.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "pg_trace.h"
-#include "utils/datum.h"
-#include "utils/logtape.h"
-#include "utils/lsyscache.h"
+#include "storage/shmem.h"
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/rel.h"
-#include "utils/sortsupport.h"
#include "utils/tuplesort.h"
-
-/* sort-type codes for sort__start probes */
-#define HEAP_SORT 0
-#define INDEX_SORT 1
-#define DATUM_SORT 2
-#define CLUSTER_SORT 3
-
-/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
- (coordinate)->isWorker >= 0 ? 1 : 2)
-
/*
* Initial size of memtuples array. We're trying to select this size so that
* array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
@@ -149,43 +131,6 @@ bool optimize_bounded_sort = true;
#endif
-/*
- * The objects we actually sort are SortTuple structs. These contain
- * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
- * which is a separate palloc chunk --- we assume it is just one chunk and
- * can be freed by a simple pfree() (except during merge, when we use a
- * simple slab allocator). SortTuples also contain the tuple's first key
- * column in Datum/nullflag format, and a source/input tape number that
- * tracks which tape each heap element/slot belongs to during merging.
- *
- * Storing the first key column lets us save heap_getattr or index_getattr
- * calls during tuple comparisons. We could extract and save all the key
- * columns not just the first, but this would increase code complexity and
- * overhead, and wouldn't actually save any comparison cycles in the common
- * case where the first key determines the comparison result. Note that
- * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
- *
- * There is one special case: when the sort support infrastructure provides an
- * "abbreviated key" representation, where the key is (typically) a pass by
- * value proxy for a pass by reference type. In this case, the abbreviated key
- * is stored in datum1 in place of the actual first key column.
- *
- * When sorting single Datums, the data value is represented directly by
- * datum1/isnull1 for pass by value types (or null values). If the datatype is
- * pass-by-reference and isnull1 is false, then "tuple" points to a separately
- * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
- * either the same pointer as "tuple", or is an abbreviated key value as
- * described above. Accordingly, "tuple" is always used in preference to
- * datum1 as the authoritative value for pass-by-reference cases.
- */
-typedef struct
-{
- void *tuple; /* the tuple itself */
- Datum datum1; /* value of first key column */
- bool isnull1; /* is first key column NULL? */
- int srctape; /* source tape number */
-} SortTuple;
-
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
* tuples. To avoid palloc/pfree overhead.
@@ -236,136 +181,6 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
-typedef struct TuplesortOps TuplesortOps;
-
-typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-
-struct TuplesortOps
-{
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-
- /*
- * These function pointers decouple the routines that must know what kind
- * of tuple we are sorting from the routines that don't need to know it.
- * They are set up by the tuplesort_begin_xxx routines.
- *
- * Function to compare two tuples; result is per qsort() convention, ie:
- * <0, 0, >0 according as a<b, a=b, a>b. The API must match
- * qsort_arg_comparator.
- */
- SortTupleComparator comparetup;
-
- void (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
-
- /*
- * Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory; requirements on
- * the tape representation are given below. Unless the slab allocator is
- * used, after writing the tuple, pfree() the out-of-line data (not the
- * SortTuple struct!), and increase state->availMem by the amount of
- * memory space thereby released.
- */
- void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-
- /*
- * Function to read a stored tuple from tape back into memory. 'len' is
- * the already-read length of the stored tuple. The tuple is allocated
- * from the slab memory arena, or is palloc'd, see readtup_alloc().
- */
- void (*readtup) (Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-
- void (*freestate) (Tuplesortstate *state);
-
- /*
- * Whether SortTuple's datum1 and isnull1 members are maintained by the
- * above routines. If not, some sort specializations are disabled.
- */
- bool haveDatum1;
-
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- int nKeys; /* number of columns in sort key */
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
- int sortopt; /* Bitmask of flags used to setup sort */
-
- bool tuples; /* Can SortTuple.tuple ever be set? */
-
- void *arg;
-};
-
-typedef struct
-{
- TupleDesc tupDesc;
-
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-} TupleSortClusterArg;
-
-typedef struct
-{
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-} TupleSortIndexArg;
-
-typedef struct
-{
- TupleSortIndexArg index;
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-} TupleSortIndexBTreeArg;
-
-typedef struct
-{
- TupleSortIndexArg index;
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-} TupleSortIndexHashArg;
-
-typedef struct
-{
- TupleSortIndexArg index;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-} TupleSortDatumArg;
/*
* Private state of a Tuplesort operation.
@@ -577,8 +392,6 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define TuplesortstateGetOps(state) ((TuplesortOps *) state);
-
#define GETDATUM1(state,stup) ((*(state)->ops.getdatum1) (state, stup))
#define COMPARETUP(state,a,b) ((*(state)->ops.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->ops.writetup) (state, tape, stup))
@@ -637,19 +450,8 @@ struct Sharedsort
* begins).
*/
-/* When using this macro, beware of double evaluation of len */
-#define LogicalTapeReadExact(tape, ptr, len) \
- do { \
- if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
- elog(ERROR, "unexpected end of data"); \
- } while(0)
-
-static Tuplesortstate *tuplesort_begin_common(int workMem,
- SortCoordinate coordinate,
- int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
static void writetuple_common(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
@@ -671,38 +473,6 @@ static void tuplesort_heap_delete_top(Tuplesortstate *state);
static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
-static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
-static void getdatum1_heap(Tuplesortstate *state, SortTuple *stup);
-static void getdatum1_cluster(Tuplesortstate *state, SortTuple *stup);
-static void getdatum1_index(Tuplesortstate *state, SortTuple *stup);
-static void getdatum1_datum(Tuplesortstate *state, SortTuple *stup);
-static int comparetup_heap(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_datum(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -873,7 +643,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* sort options. See TUPLESORT_* definitions in tuplesort.h
*/
-static Tuplesortstate *
+Tuplesortstate *
tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
{
Tuplesortstate *state;
@@ -1059,468 +829,6 @@ tuplesort_begin_batch(Tuplesortstate *state)
MemoryContextSwitchTo(oldcontext);
}
-Tuplesortstate *
-tuplesort_begin_heap(TupleDesc tupDesc,
- int nkeys, AttrNumber *attNums,
- Oid *sortOperators, Oid *sortCollations,
- bool *nullsFirstFlags,
- int workMem, SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
-
- AssertArg(nkeys > 0);
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = nkeys;
-
- TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
- false, /* no unique check */
- nkeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_heap;
- ops->comparetup = comparetup_heap;
- ops->writetup = writetup_heap;
- ops->readtup = readtup_heap;
- ops->haveDatum1 = true;
- ops->arg = tupDesc; /* assume we need not copy tupDesc */
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
-
- for (i = 0; i < nkeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
-
- AssertArg(attNums[i] != 0);
- AssertArg(sortOperators[i] != 0);
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = sortCollations[i];
- sortKey->ssup_nulls_first = nullsFirstFlags[i];
- sortKey->ssup_attno = attNums[i];
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
- }
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (nkeys == 1 && !ops->sortKeys->abbrev_converter)
- ops->onlyKey = ops->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_cluster(TupleDesc tupDesc,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- BTScanInsert indexScanKey;
- MemoryContext oldcontext;
- TupleSortClusterArg *arg;
- int i;
-
- Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortClusterArg *) palloc0(sizeof(TupleSortClusterArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- RelationGetNumberOfAttributes(indexRel),
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
- false, /* no unique check */
- ops->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_cluster;
- ops->comparetup = comparetup_cluster;
- ops->writetup = writetup_cluster;
- ops->readtup = readtup_cluster;
- ops->freestate = freestate_cluster;
- ops->arg = arg;
-
- arg->indexInfo = BuildIndexInfo(indexRel);
-
- /*
- * If we don't have a simple leading attribute, we don't currently
- * initialize datum1, so disable optimizations that require it.
- */
- if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
- ops->haveDatum1 = false;
- else
- ops->haveDatum1 = true;
-
- arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- if (arg->indexInfo->ii_Expressions != NULL)
- {
- TupleTableSlot *slot;
- ExprContext *econtext;
-
- /*
- * We will need to use FormIndexDatum to evaluate the index
- * expressions. To do that, we need an EState, as well as a
- * TupleTableSlot to put the table tuples into. The econtext's
- * scantuple has to point to that slot, too.
- */
- arg->estate = CreateExecutorState();
- slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(arg->estate);
- econtext->ecxt_scantuple = slot;
- }
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < ops->nKeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_btree(Relation heapRel,
- Relation indexRel,
- bool enforceUnique,
- bool uniqueNullsNotDistinct,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- BTScanInsert indexScanKey;
- TupleSortIndexBTreeArg *arg;
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
- enforceUnique ? 't' : 'f',
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
- enforceUnique,
- state->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_index;
- ops->comparetup = comparetup_index_btree;
- ops->writetup = writetup_index;
- ops->readtup = readtup_index;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = enforceUnique;
- arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < ops->nKeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_hash(Relation heapRel,
- Relation indexRel,
- uint32 high_mask,
- uint32 low_mask,
- uint32 max_buckets,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MemoryContext oldcontext;
- TupleSortIndexHashArg *arg;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortIndexHashArg *) palloc(sizeof(TupleSortIndexHashArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
- "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
- high_mask,
- low_mask,
- max_buckets,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = 1; /* Only one sort column, the hash code */
-
- ops->getdatum1 = getdatum1_index;
- ops->comparetup = comparetup_index_hash;
- ops->writetup = writetup_index;
- ops->readtup = readtup_index;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- arg->high_mask = high_mask;
- arg->low_mask = low_mask;
- arg->max_buckets = max_buckets;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_gist(Relation heapRel,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MemoryContext oldcontext;
- TupleSortIndexBTreeArg *arg;
- int i;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- ops->getdatum1 = getdatum1_index;
- ops->comparetup = comparetup_index_btree;
- ops->writetup = writetup_index;
- ops->readtup = readtup_index;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = false;
- arg->uniqueNullsNotDistinct = false;
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < ops->nKeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = indexRel->rd_indcollation[i];
- sortKey->ssup_nulls_first = false;
- sortKey->ssup_attno = i + 1;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- /* Look for a sort support function */
- PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
- }
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
- bool nullsFirstFlag, int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg;
- MemoryContext oldcontext;
- int16 typlen;
- bool typbyval;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortDatumArg *) palloc(sizeof(TupleSortDatumArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin datum sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = 1; /* always a one-column sort */
-
- TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
- false, /* no unique check */
- 1,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_datum;
- ops->comparetup = comparetup_datum;
- ops->writetup = writetup_datum;
- ops->readtup = readtup_datum;
- state->abbrevNext = 10;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->datumType = datumType;
-
- /* lookup necessary attributes of the datum type */
- get_typlenbyval(datumType, &typlen, &typbyval);
- arg->datumTypeLen = typlen;
- ops->tuples = !typbyval;
-
- /* Prepare SortSupport data */
- ops->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
-
- ops->sortKeys->ssup_cxt = CurrentMemoryContext;
- ops->sortKeys->ssup_collation = sortCollation;
- ops->sortKeys->ssup_nulls_first = nullsFirstFlag;
-
- /*
- * Abbreviation is possible here only for by-reference types. In theory,
- * a pass-by-value datatype could have an abbreviated form that is cheaper
- * to compare. In a tuple sort, we could support that, because we can
- * always extract the original datum from the tuple as needed. Here, we
- * can't, because a datum sort only stores a single copy of the datum; the
- * "tuple" field of each SortTuple is NULL.
- */
- ops->sortKeys->abbreviate = !typbyval;
-
- PrepareSortSupportFromOrderingOp(sortOperator, ops->sortKeys);
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (!ops->sortKeys->abbrev_converter)
- ops->onlyKey = ops->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
/*
* tuplesort_set_bound
*
@@ -1876,152 +1184,11 @@ noalloc:
return false;
}
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleDesc tupDesc = (TupleDesc) ops->arg;
- SortTuple stup;
- MinimalTuple tuple;
- HeapTupleData htup;
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup.tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup.datum1 = heap_getattr(&htup,
- ops->sortKeys[0].ssup_attno,
- tupDesc,
- &stup.isnull1);
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
-{
- SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
-
- /* copy the tuple into sort storage */
- tup = heap_copytuple(tup);
- stup.tuple = (void *) tup;
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (ops->haveDatum1)
- {
- stup.datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup.isnull1);
- }
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Collect one index tuple while collecting input data for sort, building
- * it from caller-supplied values.
- */
-void
-tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
- ItemPointer self, Datum *values,
- bool *isnull)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- SortTuple stup;
- IndexTuple tuple;
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
-
- stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
- tuple = ((IndexTuple) stup.tuple);
- tuple->t_tid = *self;
- /* set up first-column key value */
- stup.datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup.isnull1);
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one Datum while collecting input data for sort.
- *
- * If the Datum is pass-by-ref type, the value will be copied.
- */
-void
-tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
- SortTuple stup;
-
- /*
- * Pass-by-value types or null values are just stored directly in
- * stup.datum1 (and stup.tuple is not used and set to NULL).
- *
- * Non-null pass-by-reference values need to be copied into memory we
- * control, and possibly abbreviated. The copied value is pointed to by
- * stup.tuple and is treated as the canonical copy (e.g. to return via
- * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
- * abbreviated value if abbreviation is happening, otherwise it's
- * identical to stup.tuple.
- */
-
- if (isNull || !state->ops.tuples)
- {
- /*
- * Set datum1 to zeroed representation for NULLs (to be consistent,
- * and to support cheap inequality tests for NULL abbreviated keys).
- */
- stup.datum1 = !isNull ? val : (Datum) 0;
- stup.isnull1 = isNull;
- stup.tuple = NULL; /* no separate storage */
- }
- else
- {
- stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
- stup.tuple = DatumGetPointer(stup.datum1);
- }
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
/*
* Shared code for tuple and datum cases.
*/
-static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple)
+void
+tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
@@ -2342,7 +1509,7 @@ tuplesort_performsort(Tuplesortstate *state)
* by caller. Note that fetched tuple is stored in memory that may be
* recycled by any future fetch.
*/
-static bool
+bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
SortTuple *stup)
{
@@ -2552,171 +1719,28 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
{
/*
* If no more data, we've reached end of run on this tape.
- * Remove the top node from the heap.
- */
- tuplesort_heap_delete_top(state);
- state->nInputRuns--;
-
- /*
- * Close the tape. It'd go away at the end of the sort
- * anyway, but better to release the memory early.
- */
- LogicalTapeClose(srcTape);
- return true;
- }
- newtup.srctape = srcTapeIndex;
- tuplesort_heap_replace_top(state, &newtup);
- return true;
- }
- return false;
-
- default:
- elog(ERROR, "invalid tuplesort state");
- return false; /* keep compiler quiet */
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * If successful, put tuple in slot and return true; else, clear the slot
- * and return false.
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value in leading attribute will set abbreviated value to zeroed
- * representation, which caller may rely on in abbreviated inequality check.
- *
- * If copy is true, the slot receives a tuple that's been copied into the
- * caller's memory context, so that it will stay valid regardless of future
- * manipulations of the tuplesort's state (up to and including deleting the
- * tuplesort). If copy is false, the slot will just receive a pointer to a
- * tuple held within the tuplesort, which is more efficient, but only safe for
- * callers that are prepared to have any subsequent manipulation of the
- * tuplesort's state invalidate slot contents.
- */
-bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
- TupleTableSlot *slot, Datum *abbrev)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- if (stup.tuple)
- {
- /* Record abbreviated key for caller */
- if (state->ops.sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
-
- if (copy)
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
-
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
- return true;
- }
- else
- {
- ExecClearTuple(slot);
- return false;
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return stup.tuple;
-}
-
-/*
- * Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return (IndexTuple) stup.tuple;
-}
-
-/*
- * Fetch the next Datum in either forward or back direction.
- * Returns false if no more datums.
- *
- * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * in caller's context, and is now owned by the caller (this differs from
- * similar routines for other types of tuplesorts).
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value will have a zeroed abbreviated value representation, which caller
- * may rely on in abbreviated inequality check.
- */
-bool
-tuplesort_getdatum(Tuplesortstate *state, bool forward,
- Datum *val, bool *isNull, Datum *abbrev)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- {
- MemoryContextSwitchTo(oldcontext);
- return false;
- }
-
- /* Ensure we copy into caller's memory context */
- MemoryContextSwitchTo(oldcontext);
+ * Remove the top node from the heap.
+ */
+ tuplesort_heap_delete_top(state);
+ state->nInputRuns--;
- /* Record abbreviated key for caller */
- if (ops->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
+ /*
+ * Close the tape. It'd go away at the end of the sort
+ * anyway, but better to release the memory early.
+ */
+ LogicalTapeClose(srcTape);
+ return true;
+ }
+ newtup.srctape = srcTapeIndex;
+ tuplesort_heap_replace_top(state, &newtup);
+ return true;
+ }
+ return false;
- if (stup.isnull1 || !state->ops.tuples)
- {
- *val = stup.datum1;
- *isNull = stup.isnull1;
- }
- else
- {
- /* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
- *isNull = false;
+ default:
+ elog(ERROR, "invalid tuplesort state");
+ return false; /* keep compiler quiet */
}
-
- return true;
}
/*
@@ -3897,8 +2921,8 @@ markrunend(LogicalTape *tape)
* We use next free slot from the slab allocator, or palloc() if the tuple
* is too large for that.
*/
-static void *
-readtup_alloc(Tuplesortstate *state, Size tuplen)
+void *
+tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen)
{
SlabSlot *buf;
@@ -3920,688 +2944,6 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
}
}
-
-/*
- * Routines specialized for HeapTuple (actually MinimalTuple) case
- */
-
-static void
-getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- HeapTupleData htup;
-
- htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- stup->datum1 = heap_getattr(&htup,
- ops->sortKeys[0].ssup_attno,
- (TupleDesc) ops->arg,
- &stup->isnull1);
-
-}
-
-static int
-comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- SortSupport sortKey = ops->sortKeys;
- HeapTupleData ltup;
- HeapTupleData rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- AttrNumber attno;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
- rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = (TupleDesc) ops->arg;
-
- if (sortKey->abbrev_converter)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- sortKey++;
- for (nkey = 1; nkey < ops->nKeys; nkey++, sortKey++)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- return 0;
-}
-
-static void
-writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MinimalTuple tuple = (MinimalTuple) stup->tuple;
-
- /* the part of the MinimalTuple we'll write: */
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
-
- /* total on-disk footprint: */
- unsigned int tuplen = tupbodylen + sizeof(int);
-
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- unsigned int tupbodylen = len - sizeof(int);
- unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
- MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- TuplesortOps *ops = TuplesortstateGetOps(state);
- HeapTupleData htup;
-
- /* read in the tuple proper */
- tuple->t_len = tuplen;
- LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup->datum1 = heap_getattr(&htup,
- ops->sortKeys[0].ssup_attno,
- (TupleDesc) ops->arg,
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for the CLUSTER case (HeapTuple data, with
- * comparisons per a btree index definition)
- */
-
-static void
-getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
- HeapTuple tup;
-
- tup = (HeapTuple) stup->tuple;
- stup->datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static int
-comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
- SortSupport sortKey = ops->sortKeys;
- HeapTuple ltup;
- HeapTuple rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
- /* Be prepared to compare additional sort keys */
- ltup = (HeapTuple) a->tuple;
- rtup = (HeapTuple) b->tuple;
- tupDesc = arg->tupDesc;
-
- /* Compare the leading sort key, if it's simple */
- if (ops->haveDatum1)
- {
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- if (sortKey->abbrev_converter)
- {
- AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
-
- datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- }
- if (compare != 0 || ops->nKeys == 1)
- return compare;
- /* Compare additional columns the hard way */
- sortKey++;
- nkey = 1;
- }
- else
- {
- /* Must compare all keys the hard way */
- nkey = 0;
- }
-
- if (arg->indexInfo->ii_Expressions == NULL)
- {
- /* If not expression index, just compare the proper heap attrs */
-
- for (; nkey < ops->nKeys; nkey++, sortKey++)
- {
- AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
-
- datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
- else
- {
- /*
- * In the expression index case, compute the whole index tuple and
- * then compare values. It would perhaps be faster to compute only as
- * many columns as we need to compare, but that would require
- * duplicating all the logic in FormIndexDatum.
- */
- Datum l_index_values[INDEX_MAX_KEYS];
- bool l_index_isnull[INDEX_MAX_KEYS];
- Datum r_index_values[INDEX_MAX_KEYS];
- bool r_index_isnull[INDEX_MAX_KEYS];
- TupleTableSlot *ecxt_scantuple;
-
- /* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(arg->estate);
-
- ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
-
- ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- l_index_values, l_index_isnull);
-
- ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- r_index_values, r_index_isnull);
-
- for (; nkey < ops->nKeys; nkey++, sortKey++)
- {
- compare = ApplySortComparator(l_index_values[nkey],
- l_index_isnull[nkey],
- r_index_values[nkey],
- r_index_isnull[nkey],
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
-
- return 0;
-}
-
-static void
-writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- HeapTuple tuple = (HeapTuple) stup->tuple;
- unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
-
- /* We need to store t_self, but not other fields of HeapTupleData */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
- LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int tuplen)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
- unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
- HeapTuple tuple = (HeapTuple) readtup_alloc(state,
- t_len + HEAPTUPLESIZE);
-
- /* Reconstruct the HeapTupleData header */
- tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
- tuple->t_len = t_len;
- LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
- /* We don't currently bother to reconstruct t_tableOid */
- tuple->t_tableOid = InvalidOid;
- /* Read in the tuple body */
- LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value, if it's a simple column */
- if (ops->haveDatum1)
- stup->datum1 = heap_getattr(tuple,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static void
-freestate_cluster(Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
-
- /* Free any execution state created for CLUSTER case */
- if (arg->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(arg->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(arg->estate);
- }
-}
-
-/*
- * Routines specialized for IndexTuple case
- *
- * The btree and hash cases require separate comparison functions, but the
- * IndexTuple representation is the same so the copy/write/read support
- * functions can be shared.
- */
-
-static void
-getdatum1_index(Tuplesortstate *state, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
- IndexTuple tuple;
-
- tuple = stup->tuple;
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-static int
-comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- /*
- * This is similar to comparetup_heap(), but expects index tuples. There
- * is also special handling for enforcing uniqueness, and special
- * treatment for equal keys at the end.
- */
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexBTreeArg *arg = (TupleSortIndexBTreeArg *) ops->arg;
- SortSupport sortKey = ops->sortKeys;
- IndexTuple tuple1;
- IndexTuple tuple2;
- int keysz;
- TupleDesc tupDes;
- bool equal_hasnull = false;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
- keysz = ops->nKeys;
- tupDes = RelationGetDescr(arg->index.indexRel);
-
- if (sortKey->abbrev_converter)
- {
- datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- /* they are equal, so we only need to examine one null flag */
- if (a->isnull1)
- equal_hasnull = true;
-
- sortKey++;
- for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
- {
- datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare; /* done when we find unequal attributes */
-
- /* they are equal, so we only need to examine one null flag */
- if (isnull1)
- equal_hasnull = true;
- }
-
- /*
- * If btree has asked us to enforce uniqueness, complain if two equal
- * tuples are detected (unless there was at least one NULL field and NULLS
- * NOT DISTINCT was not set).
- *
- * It is sufficient to make the test here, because if two tuples are equal
- * they *must* get compared at some stage of the sort --- otherwise the
- * sort algorithm wouldn't have checked whether one must appear before the
- * other.
- */
- if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
- {
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
- /*
- * Some rather brain-dead implementations of qsort (such as the one in
- * QNX 4) will sometimes call the comparison routine to compare a
- * value to itself, but we always use our own implementation, which
- * does not.
- */
- Assert(tuple1 != tuple2);
-
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
- }
-
- /*
- * If key values are equal, we sort on ItemPointer. This is required for
- * btree indexes, since heap TID is treated as an implicit last key
- * attribute in order to ensure that all keys in the index are physically
- * unique.
- */
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static int
-comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- Bucket bucket1;
- Bucket bucket2;
- IndexTuple tuple1;
- IndexTuple tuple2;
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexHashArg *arg = (TupleSortIndexHashArg *) ops->arg;
-
- /*
- * Fetch hash keys and mask off bits we don't want to sort by. We know
- * that the first column of the index tuple is the hash key.
- */
- Assert(!a->isnull1);
- bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- Assert(!b->isnull1);
- bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- if (bucket1 > bucket2)
- return 1;
- else if (bucket1 < bucket2)
- return -1;
-
- /*
- * If hash values are equal, we sort on ItemPointer. This does not affect
- * validity of the finished index, but it may be useful to have index
- * scans in physical order.
- */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
-
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static void
-writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- IndexTuple tuple = (IndexTuple) stup->tuple;
- unsigned int tuplen;
-
- tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
- unsigned int tuplen = len - sizeof(unsigned int);
- IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, tuple, tuplen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for DatumTuple case
- */
-
-static void
-getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
-{
- stup->datum1 = PointerGetDatum(stup->tuple);
-}
-
-static int
-comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- int compare;
-
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- ops->sortKeys);
- if (compare != 0)
- return compare;
-
- /* if we have abbreviations, then "tuple" has the original value */
-
- if (ops->sortKeys->abbrev_converter)
- compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
- PointerGetDatum(b->tuple), b->isnull1,
- ops->sortKeys);
-
- return compare;
-}
-
-static void
-writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
- void *waddr;
- unsigned int tuplen;
- unsigned int writtenlen;
-
- if (stup->isnull1)
- {
- waddr = NULL;
- tuplen = 0;
- }
- else if (!state->ops.tuples)
- {
- waddr = &stup->datum1;
- tuplen = sizeof(Datum);
- }
- else
- {
- waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
- Assert(tuplen != 0);
- }
-
- writtenlen = tuplen + sizeof(unsigned int);
-
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
- LogicalTapeWrite(tape, waddr, tuplen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-}
-
-static void
-readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- unsigned int tuplen = len - sizeof(unsigned int);
-
- if (tuplen == 0)
- {
- /* it's NULL */
- stup->datum1 = (Datum) 0;
- stup->isnull1 = true;
- stup->tuple = NULL;
- }
- else if (!state->ops.tuples)
- {
- Assert(tuplen == sizeof(Datum));
- LogicalTapeReadExact(tape, &stup->datum1, tuplen);
- stup->isnull1 = false;
- stup->tuple = NULL;
- }
- else
- {
- void *raddr = readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, raddr, tuplen);
- stup->datum1 = PointerGetDatum(raddr);
- stup->isnull1 = false;
- stup->tuple = raddr;
- }
-
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
-}
-
/*
* Parallel sort routines
*/
diff --git a/src/backend/utils/sort/tuplesortops.c b/src/backend/utils/sort/tuplesortops.c
new file mode 100644
index 00000000000..8f7a8704e76
--- /dev/null
+++ b/src/backend/utils/sort/tuplesortops.c
@@ -0,0 +1,1550 @@
+/*-------------------------------------------------------------------------
+ *
+ * tuplesortops.c
+ * Implementation of tuple sorting.
+ *
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/utils/sort/tuplesortops.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "access/htup_details.h"
+#include "access/nbtree.h"
+#include "catalog/index.h"
+#include "executor/executor.h"
+#include "pg_trace.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
+#include "utils/guc.h"
+#include "utils/tuplesort.h"
+
+
+/* sort-type codes for sort__start probes */
+#define HEAP_SORT 0
+#define INDEX_SORT 1
+#define DATUM_SORT 2
+#define CLUSTER_SORT 3
+
+static void getdatum1_heap(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_cluster(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_index(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_datum(Tuplesortstate *state, SortTuple *stup);
+static int comparetup_heap(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_datum(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
+
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ /*
+ * These variables are specific to the CLUSTER case; they are set by
+ * tuplesort_begin_cluster.
+ */
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TupleSortClusterArg;
+
+typedef struct
+{
+ /*
+ * These variables are specific to the IndexTuple case; they are set by
+ * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TupleSortIndexArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_btree subcase: */
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TupleSortIndexBTreeArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_hash subcase: */
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TupleSortIndexHashArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /*
+ * These variables are specific to the Datum case; they are set by
+ * tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TupleSortDatumArg;
+
+Tuplesortstate *
+tuplesort_begin_heap(TupleDesc tupDesc,
+ int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags,
+ int workMem, SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+
+ AssertArg(nkeys > 0);
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = nkeys;
+
+ TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
+ false, /* no unique check */
+ nkeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_heap;
+ ops->comparetup = comparetup_heap;
+ ops->writetup = writetup_heap;
+ ops->readtup = readtup_heap;
+ ops->haveDatum1 = true;
+ ops->arg = tupDesc; /* assume we need not copy tupDesc */
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+
+ for (i = 0; i < nkeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+
+ AssertArg(attNums[i] != 0);
+ AssertArg(sortOperators[i] != 0);
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = sortCollations[i];
+ sortKey->ssup_nulls_first = nullsFirstFlags[i];
+ sortKey->ssup_attno = attNums[i];
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
+ }
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (nkeys == 1 && !ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_cluster(TupleDesc tupDesc,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ BTScanInsert indexScanKey;
+ MemoryContext oldcontext;
+ TupleSortClusterArg *arg;
+ int i;
+
+ Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortClusterArg *) palloc0(sizeof(TupleSortClusterArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ RelationGetNumberOfAttributes(indexRel),
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
+ false, /* no unique check */
+ ops->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_cluster;
+ ops->comparetup = comparetup_cluster;
+ ops->writetup = writetup_cluster;
+ ops->readtup = readtup_cluster;
+ ops->freestate = freestate_cluster;
+ ops->arg = arg;
+
+ arg->indexInfo = BuildIndexInfo(indexRel);
+
+ /*
+ * If we don't have a simple leading attribute, we don't currently
+ * initialize datum1, so disable optimizations that require it.
+ */
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ ops->haveDatum1 = false;
+ else
+ ops->haveDatum1 = true;
+
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ if (arg->indexInfo->ii_Expressions != NULL)
+ {
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * We will need to use FormIndexDatum to evaluate the index
+ * expressions. To do that, we need an EState, as well as a
+ * TupleTableSlot to put the table tuples into. The econtext's
+ * scantuple has to point to that slot, too.
+ */
+ arg->estate = CreateExecutorState();
+ slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
+ econtext = GetPerTupleExprContext(arg->estate);
+ econtext->ecxt_scantuple = slot;
+ }
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < ops->nKeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_btree(Relation heapRel,
+ Relation indexRel,
+ bool enforceUnique,
+ bool uniqueNullsNotDistinct,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ BTScanInsert indexScanKey;
+ TupleSortIndexBTreeArg *arg;
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
+ enforceUnique ? 't' : 'f',
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
+ enforceUnique,
+ ops->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < ops->nKeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_hash(Relation heapRel,
+ Relation indexRel,
+ uint32 high_mask,
+ uint32 low_mask,
+ uint32 max_buckets,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext;
+ TupleSortIndexHashArg *arg;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortIndexHashArg *) palloc(sizeof(TupleSortIndexHashArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
+ "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
+ high_mask,
+ low_mask,
+ max_buckets,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = 1; /* Only one sort column, the hash code */
+
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_hash;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_gist(Relation heapRel,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext;
+ TupleSortIndexBTreeArg *arg;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < ops->nKeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = indexRel->rd_indcollation[i];
+ sortKey->ssup_nulls_first = false;
+ sortKey->ssup_attno = i + 1;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ /* Look for a sort support function */
+ PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
+ }
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
+ bool nullsFirstFlag, int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg;
+ MemoryContext oldcontext;
+ int16 typlen;
+ bool typbyval;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortDatumArg *) palloc(sizeof(TupleSortDatumArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin datum sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = 1; /* always a one-column sort */
+
+ TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
+ false, /* no unique check */
+ 1,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_datum;
+ ops->comparetup = comparetup_datum;
+ ops->writetup = writetup_datum;
+ ops->readtup = readtup_datum;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->datumType = datumType;
+
+ /* lookup necessary attributes of the datum type */
+ get_typlenbyval(datumType, &typlen, &typbyval);
+ arg->datumTypeLen = typlen;
+ ops->tuples = !typbyval;
+
+ /* Prepare SortSupport data */
+ ops->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+ ops->sortKeys->ssup_cxt = CurrentMemoryContext;
+ ops->sortKeys->ssup_collation = sortCollation;
+ ops->sortKeys->ssup_nulls_first = nullsFirstFlag;
+
+ /*
+ * Abbreviation is possible here only for by-reference types. In theory,
+ * a pass-by-value datatype could have an abbreviated form that is cheaper
+ * to compare. In a tuple sort, we could support that, because we can
+ * always extract the original datum from the tuple as needed. Here, we
+ * can't, because a datum sort only stores a single copy of the datum; the
+ * "tuple" field of each SortTuple is NULL.
+ */
+ ops->sortKeys->abbreviate = !typbyval;
+
+ PrepareSortSupportFromOrderingOp(sortOperator, ops->sortKeys);
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (!ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) ops->arg;
+ SortTuple stup;
+ MinimalTuple tuple;
+ HeapTupleData htup;
+
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup.datum1 = heap_getattr(&htup,
+ ops->sortKeys[0].ssup_attno,
+ tupDesc,
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
+{
+ SortTuple stup;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+
+ /*
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
+ */
+ if (ops->haveDatum1)
+ {
+ stup.datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup.isnull1);
+ }
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Collect one index tuple while collecting input data for sort, building
+ * it from caller-supplied values.
+ */
+void
+tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
+ ItemPointer self, Datum *values,
+ bool *isnull)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ SortTuple stup;
+ IndexTuple tuple;
+
+ stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
+ tuple = ((IndexTuple) stup.tuple);
+ tuple->t_tid = *self;
+ /* set up first-column key value */
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one Datum while collecting input data for sort.
+ *
+ * If the Datum is pass-by-ref type, the value will be copied.
+ */
+void
+tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ SortTuple stup;
+
+ /*
+ * Pass-by-value types or null values are just stored directly in
+ * stup.datum1 (and stup.tuple is not used and set to NULL).
+ *
+ * Non-null pass-by-reference values need to be copied into memory we
+ * control, and possibly abbreviated. The copied value is pointed to by
+ * stup.tuple and is treated as the canonical copy (e.g. to return via
+ * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
+ * abbreviated value if abbreviation is happening, otherwise it's
+ * identical to stup.tuple.
+ */
+
+ if (isNull || !ops->tuples)
+ {
+ /*
+ * Set datum1 to zeroed representation for NULLs (to be consistent,
+ * and to support cheap inequality tests for NULL abbreviated keys).
+ */
+ stup.datum1 = !isNull ? val : (Datum) 0;
+ stup.isnull1 = isNull;
+ stup.tuple = NULL; /* no separate storage */
+ }
+ else
+ {
+ stup.isnull1 = false;
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
+ }
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * If successful, put tuple in slot and return true; else, clear the slot
+ * and return false.
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value in leading attribute will set abbreviated value to zeroed
+ * representation, which caller may rely on in abbreviated inequality check.
+ *
+ * If copy is true, the slot receives a tuple that's been copied into the
+ * caller's memory context, so that it will stay valid regardless of future
+ * manipulations of the tuplesort's state (up to and including deleting the
+ * tuplesort). If copy is false, the slot will just receive a pointer to a
+ * tuple held within the tuplesort, which is more efficient, but only safe for
+ * callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
+ */
+bool
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
+ TupleTableSlot *slot, Datum *abbrev)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (stup.tuple)
+ {
+ /* Record abbreviated key for caller */
+ if (ops->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
+ return true;
+ }
+ else
+ {
+ ExecClearTuple(slot);
+ return false;
+ }
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+HeapTuple
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return stup.tuple;
+}
+
+/*
+ * Fetch the next index tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+IndexTuple
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return (IndexTuple) stup.tuple;
+}
+
+/*
+ * Fetch the next Datum in either forward or back direction.
+ * Returns false if no more datums.
+ *
+ * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value will have a zeroed abbreviated value representation, which caller
+ * may rely on in abbreviated inequality check.
+ */
+bool
+tuplesort_getdatum(Tuplesortstate *state, bool forward,
+ Datum *val, bool *isNull, Datum *abbrev)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ {
+ MemoryContextSwitchTo(oldcontext);
+ return false;
+ }
+
+ /* Ensure we copy into caller's memory context */
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Record abbreviated key for caller */
+ if (ops->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (stup.isnull1 || !ops->tuples)
+ {
+ *val = stup.datum1;
+ *isNull = stup.isnull1;
+ }
+ else
+ {
+ /* use stup.tuple because stup.datum1 may be an abbreviation */
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
+ *isNull = false;
+ }
+
+ return true;
+}
+
+
+/*
+ * Routines specialized for HeapTuple (actually MinimalTuple) case
+ */
+
+static void
+getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ stup->datum1 = heap_getattr(&htup,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
+ &stup->isnull1);
+
+}
+
+static int
+comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ SortSupport sortKey = ops->sortKeys;
+ HeapTupleData ltup;
+ HeapTupleData rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ AttrNumber attno;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
+ rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
+ tupDesc = (TupleDesc) ops->arg;
+
+ if (sortKey->abbrev_converter)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ sortKey++;
+ for (nkey = 1; nkey < ops->nKeys; nkey++, sortKey++)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ return 0;
+}
+
+static void
+writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MinimalTuple tuple = (MinimalTuple) stup->tuple;
+
+ /* the part of the MinimalTuple we'll write: */
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
+
+ /* total on-disk footprint: */
+ unsigned int tuplen = tupbodylen + sizeof(int);
+
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ unsigned int tupbodylen = len - sizeof(int);
+ unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
+ MinimalTuple tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ HeapTupleData htup;
+
+ /* read in the tuple proper */
+ tuple->t_len = tuplen;
+ LogicalTapeReadExact(tape, tupbody, tupbodylen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup->datum1 = heap_getattr(&htup,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for the CLUSTER case (HeapTuple data, with
+ * comparisons per a btree index definition)
+ */
+
+static void
+getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ HeapTuple tup;
+
+ tup = (HeapTuple) stup->tuple;
+ stup->datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static int
+comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
+ HeapTuple ltup;
+ HeapTuple rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ /* Be prepared to compare additional sort keys */
+ ltup = (HeapTuple) a->tuple;
+ rtup = (HeapTuple) b->tuple;
+ tupDesc = arg->tupDesc;
+
+ /* Compare the leading sort key, if it's simple */
+ if (ops->haveDatum1)
+ {
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ if (sortKey->abbrev_converter)
+ {
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
+
+ datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ }
+ if (compare != 0 || ops->nKeys == 1)
+ return compare;
+ /* Compare additional columns the hard way */
+ sortKey++;
+ nkey = 1;
+ }
+ else
+ {
+ /* Must compare all keys the hard way */
+ nkey = 0;
+ }
+
+ if (arg->indexInfo->ii_Expressions == NULL)
+ {
+ /* If not expression index, just compare the proper heap attrs */
+
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
+ {
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
+
+ datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+ else
+ {
+ /*
+ * In the expression index case, compute the whole index tuple and
+ * then compare values. It would perhaps be faster to compute only as
+ * many columns as we need to compare, but that would require
+ * duplicating all the logic in FormIndexDatum.
+ */
+ Datum l_index_values[INDEX_MAX_KEYS];
+ bool l_index_isnull[INDEX_MAX_KEYS];
+ Datum r_index_values[INDEX_MAX_KEYS];
+ bool r_index_isnull[INDEX_MAX_KEYS];
+ TupleTableSlot *ecxt_scantuple;
+
+ /* Reset context each time to prevent memory leakage */
+ ResetPerTupleExprContext(arg->estate);
+
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
+
+ ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ l_index_values, l_index_isnull);
+
+ ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ r_index_values, r_index_isnull);
+
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
+ {
+ compare = ApplySortComparator(l_index_values[nkey],
+ l_index_isnull[nkey],
+ r_index_values[nkey],
+ r_index_isnull[nkey],
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+
+ return 0;
+}
+
+static void
+writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ HeapTuple tuple = (HeapTuple) stup->tuple;
+ unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
+
+ /* We need to store t_self, but not other fields of HeapTupleData */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
+ LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int tuplen)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
+ HeapTuple tuple = (HeapTuple) tuplesort_readtup_alloc(state,
+ t_len + HEAPTUPLESIZE);
+
+ /* Reconstruct the HeapTupleData header */
+ tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
+ tuple->t_len = t_len;
+ LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
+ /* We don't currently bother to reconstruct t_tableOid */
+ tuple->t_tableOid = InvalidOid;
+ /* Read in the tuple body */
+ LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value, if it's a simple column */
+ if (ops->haveDatum1)
+ stup->datum1 = heap_getattr(tuple,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
+/*
+ * Routines specialized for IndexTuple case
+ *
+ * The btree and hash cases require separate comparison functions, but the
+ * IndexTuple representation is the same so the copy/write/read support
+ * functions can be shared.
+ */
+
+static void
+getdatum1_index(Tuplesortstate *state, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
+ IndexTuple tuple;
+
+ tuple = stup->tuple;
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+static int
+comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ /*
+ * This is similar to comparetup_heap(), but expects index tuples. There
+ * is also special handling for enforcing uniqueness, and special
+ * treatment for equal keys at the end.
+ */
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexBTreeArg *arg = (TupleSortIndexBTreeArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ int keysz;
+ TupleDesc tupDes;
+ bool equal_hasnull = false;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+ keysz = ops->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
+
+ if (sortKey->abbrev_converter)
+ {
+ datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ /* they are equal, so we only need to examine one null flag */
+ if (a->isnull1)
+ equal_hasnull = true;
+
+ sortKey++;
+ for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
+ {
+ datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare; /* done when we find unequal attributes */
+
+ /* they are equal, so we only need to examine one null flag */
+ if (isnull1)
+ equal_hasnull = true;
+ }
+
+ /*
+ * If btree has asked us to enforce uniqueness, complain if two equal
+ * tuples are detected (unless there was at least one NULL field and NULLS
+ * NOT DISTINCT was not set).
+ *
+ * It is sufficient to make the test here, because if two tuples are equal
+ * they *must* get compared at some stage of the sort --- otherwise the
+ * sort algorithm wouldn't have checked whether one must appear before the
+ * other.
+ */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
+ {
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ char *key_desc;
+
+ /*
+ * Some rather brain-dead implementations of qsort (such as the one in
+ * QNX 4) will sometimes call the comparison routine to compare a
+ * value to itself, but we always use our own implementation, which
+ * does not.
+ */
+ Assert(tuple1 != tuple2);
+
+ index_deform_tuple(tuple1, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static int
+comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ Bucket bucket1;
+ Bucket bucket2;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexHashArg *arg = (TupleSortIndexHashArg *) ops->arg;
+
+ /*
+ * Fetch hash keys and mask off bits we don't want to sort by. We know
+ * that the first column of the index tuple is the hash key.
+ */
+ Assert(!a->isnull1);
+ bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ Assert(!b->isnull1);
+ bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ if (bucket1 > bucket2)
+ return 1;
+ else if (bucket1 < bucket2)
+ return -1;
+
+ /*
+ * If hash values are equal, we sort on ItemPointer. This does not affect
+ * validity of the finished index, but it may be useful to have index
+ * scans in physical order.
+ */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static void
+writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ IndexTuple tuple = (IndexTuple) stup->tuple;
+ unsigned int tuplen;
+
+ tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
+ unsigned int tuplen = len - sizeof(unsigned int);
+ IndexTuple tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, tuple, tuplen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for DatumTuple case
+ */
+
+static void
+getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
+{
+ stup->datum1 = PointerGetDatum(stup->tuple);
+}
+
+static int
+comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ int compare;
+
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ ops->sortKeys);
+ if (compare != 0)
+ return compare;
+
+ /* if we have abbreviations, then "tuple" has the original value */
+
+ if (ops->sortKeys->abbrev_converter)
+ compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
+ PointerGetDatum(b->tuple), b->isnull1,
+ ops->sortKeys);
+
+ return compare;
+}
+
+static void
+writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
+ void *waddr;
+ unsigned int tuplen;
+ unsigned int writtenlen;
+
+ if (stup->isnull1)
+ {
+ waddr = NULL;
+ tuplen = 0;
+ }
+ else if (!ops->tuples)
+ {
+ waddr = &stup->datum1;
+ tuplen = sizeof(Datum);
+ }
+ else
+ {
+ waddr = stup->tuple;
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
+ Assert(tuplen != 0);
+ }
+
+ writtenlen = tuplen + sizeof(unsigned int);
+
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+ LogicalTapeWrite(tape, waddr, tuplen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+}
+
+static void
+readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ unsigned int tuplen = len - sizeof(unsigned int);
+
+ if (tuplen == 0)
+ {
+ /* it's NULL */
+ stup->datum1 = (Datum) 0;
+ stup->isnull1 = true;
+ stup->tuple = NULL;
+ }
+ else if (!ops->tuples)
+ {
+ Assert(tuplen == sizeof(Datum));
+ LogicalTapeReadExact(tape, &stup->datum1, tuplen);
+ stup->isnull1 = false;
+ stup->tuple = NULL;
+ }
+ else
+ {
+ void *raddr = tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, raddr, tuplen);
+ stup->datum1 = PointerGetDatum(raddr);
+ stup->isnull1 = false;
+ stup->tuple = raddr;
+ }
+
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+}
+
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 364cf132fcb..1c617565de9 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -24,7 +24,9 @@
#include "access/itup.h"
#include "executor/tuptable.h"
#include "storage/dsm.h"
+#include "utils/logtape.h"
#include "utils/relcache.h"
+#include "utils/sortsupport.h"
/*
@@ -102,6 +104,130 @@ typedef struct TuplesortInstrumentation
int64 spaceUsed; /* space consumption, in kB */
} TuplesortInstrumentation;
+/*
+ * The objects we actually sort are SortTuple structs. These contain
+ * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
+ * which is a separate palloc chunk --- we assume it is just one chunk and
+ * can be freed by a simple pfree() (except during merge, when we use a
+ * simple slab allocator). SortTuples also contain the tuple's first key
+ * column in Datum/nullflag format, and a source/input tape number that
+ * tracks which tape each heap element/slot belongs to during merging.
+ *
+ * Storing the first key column lets us save heap_getattr or index_getattr
+ * calls during tuple comparisons. We could extract and save all the key
+ * columns not just the first, but this would increase code complexity and
+ * overhead, and wouldn't actually save any comparison cycles in the common
+ * case where the first key determines the comparison result. Note that
+ * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
+ *
+ * There is one special case: when the sort support infrastructure provides an
+ * "abbreviated key" representation, where the key is (typically) a pass by
+ * value proxy for a pass by reference type. In this case, the abbreviated key
+ * is stored in datum1 in place of the actual first key column.
+ *
+ * When sorting single Datums, the data value is represented directly by
+ * datum1/isnull1 for pass by value types (or null values). If the datatype is
+ * pass-by-reference and isnull1 is false, then "tuple" points to a separately
+ * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
+ * either the same pointer as "tuple", or is an abbreviated key value as
+ * described above. Accordingly, "tuple" is always used in preference to
+ * datum1 as the authoritative value for pass-by-reference cases.
+ */
+typedef struct
+{
+ void *tuple; /* the tuple itself */
+ Datum datum1; /* value of first key column */
+ bool isnull1; /* is first key column NULL? */
+ int srctape; /* source tape number */
+} SortTuple;
+
+typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+
+typedef struct
+{
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
+ /*
+ * These function pointers decouple the routines that must know what kind
+ * of tuple we are sorting from the routines that don't need to know it.
+ * They are set up by the tuplesort_begin_xxx routines.
+ *
+ * Function to compare two tuples; result is per qsort() convention, ie:
+ * <0, 0, >0 according as a<b, a=b, a>b. The API must match
+ * qsort_arg_comparator.
+ */
+ SortTupleComparator comparetup;
+
+ void (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
+
+ /*
+ * Function to write a stored tuple onto tape. The representation of the
+ * tuple on tape need not be the same as it is in memory; requirements on
+ * the tape representation are given below. Unless the slab allocator is
+ * used, after writing the tuple, pfree() the out-of-line data (not the
+ * SortTuple struct!), and increase state->availMem by the amount of
+ * memory space thereby released.
+ */
+ void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+
+ /*
+ * Function to read a stored tuple from tape back into memory. 'len' is
+ * the already-read length of the stored tuple. The tuple is allocated
+ * from the slab memory arena, or is palloc'd, see tuplesort_readtup_alloc().
+ */
+ void (*readtup) (Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * Whether SortTuple's datum1 and isnull1 members are maintained by the
+ * above routines. If not, some sort specializations are disabled.
+ */
+ bool haveDatum1;
+
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg;
+} TuplesortOps;
+
+/* Sort parallel code from state for sort__start probes */
+#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker >= 0 ? 1 : 2)
+
+#define TuplesortstateGetOps(state) ((TuplesortOps *) state);
+
+/* When using this macro, beware of double evaluation of len */
+#define LogicalTapeReadExact(tape, ptr, len) \
+ do { \
+ if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
+ elog(ERROR, "unexpected end of data"); \
+ } while(0)
/*
* We provide multiple interfaces to what is essentially the same code,
@@ -205,6 +331,49 @@ typedef struct TuplesortInstrumentation
* generated (typically, caller uses a parallel heap scan).
*/
+
+extern Tuplesortstate *tuplesort_begin_common(int workMem,
+ SortCoordinate coordinate,
+ int sortopt);
+extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
+extern bool tuplesort_used_bound(Tuplesortstate *state);
+extern void tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+extern void tuplesort_performsort(Tuplesortstate *state);
+extern bool tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
+ SortTuple *stup);
+extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
+ bool forward);
+extern void tuplesort_end(Tuplesortstate *state);
+extern void tuplesort_reset(Tuplesortstate *state);
+
+extern void tuplesort_get_stats(Tuplesortstate *state,
+ TuplesortInstrumentation *stats);
+extern const char *tuplesort_method_name(TuplesortMethod m);
+extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
+
+extern int tuplesort_merge_order(int64 allowedMem);
+
+extern Size tuplesort_estimate_shared(int nworkers);
+extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
+ dsm_segment *seg);
+extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
+
+/*
+ * These routines may only be called if randomAccess was specified 'true'.
+ * Likewise, backwards scan in gettuple/getdatum is only allowed if
+ * randomAccess was specified. Note that parallel sorts do not support
+ * randomAccess.
+ */
+
+extern void tuplesort_rescan(Tuplesortstate *state);
+extern void tuplesort_markpos(Tuplesortstate *state);
+extern void tuplesort_restorepos(Tuplesortstate *state);
+
+extern void *tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen);
+
+
+/* tuplesortops.c */
+
extern Tuplesortstate *tuplesort_begin_heap(TupleDesc tupDesc,
int nkeys, AttrNumber *attNums,
Oid *sortOperators, Oid *sortCollations,
@@ -238,9 +407,6 @@ extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
-extern bool tuplesort_used_bound(Tuplesortstate *state);
-
extern void tuplesort_puttupleslot(Tuplesortstate *state,
TupleTableSlot *slot);
extern void tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup);
@@ -250,8 +416,6 @@ extern void tuplesort_putindextuplevalues(Tuplesortstate *state,
extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
bool isNull);
-extern void tuplesort_performsort(Tuplesortstate *state);
-
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
@@ -259,34 +423,5 @@ extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
-extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
- bool forward);
-
-extern void tuplesort_end(Tuplesortstate *state);
-
-extern void tuplesort_reset(Tuplesortstate *state);
-
-extern void tuplesort_get_stats(Tuplesortstate *state,
- TuplesortInstrumentation *stats);
-extern const char *tuplesort_method_name(TuplesortMethod m);
-extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
-
-extern int tuplesort_merge_order(int64 allowedMem);
-
-extern Size tuplesort_estimate_shared(int nworkers);
-extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
- dsm_segment *seg);
-extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
-
-/*
- * These routines may only be called if randomAccess was specified 'true'.
- * Likewise, backwards scan in gettuple/getdatum is only allowed if
- * randomAccess was specified. Note that parallel sorts do not support
- * randomAccess.
- */
-
-extern void tuplesort_rescan(Tuplesortstate *state);
-extern void tuplesort_markpos(Tuplesortstate *state);
-extern void tuplesort_restorepos(Tuplesortstate *state);
#endif /* TUPLESORT_H */
--
2.24.3 (Apple Git-128)
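To illustrate the intended usage (a rough sketch only, not part of the attached patches; all my_* names are hypothetical), an extension would register its own tuple format much like the built-in tuplesort_begin_* functions above do:

#include "postgres.h"
#include "utils/tuplesort.h"

/* callbacks the extension implements for its own tuple format */
static int  my_comparetup(const SortTuple *a, const SortTuple *b,
                          Tuplesortstate *state);
static void my_getdatum1(Tuplesortstate *state, SortTuple *stup);
static void my_writetup(Tuplesortstate *state, LogicalTape *tape,
                        SortTuple *stup);
static void my_readtup(Tuplesortstate *state, SortTuple *stup,
                       LogicalTape *tape, unsigned int len);

Tuplesortstate *
my_tuplesort_begin(SortSupport sortKeys, int nKeys, int workMem,
                   SortCoordinate coordinate, int sortopt)
{
    Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
                                                   sortopt);
    TuplesortOps *ops = TuplesortstateGetOps(state);
    MemoryContext oldcontext = MemoryContextSwitchTo(ops->maincontext);

    /* caller-prepared SortSupport array; the built-ins prepare it here */
    ops->nKeys = nKeys;
    ops->sortKeys = sortKeys;
    ops->haveDatum1 = true;
    ops->comparetup = my_comparetup;
    ops->getdatum1 = my_getdatum1;
    ops->writetup = my_writetup;
    ops->readtup = my_readtup;
    ops->tuples = true;
    ops->arg = NULL;

    MemoryContextSwitchTo(oldcontext);
    return state;
}

static int
my_comparetup(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
    TuplesortOps *ops = TuplesortstateGetOps(state);

    /* compare the leading key, as the built-in cases do */
    return ApplySortComparator(a->datum1, a->isnull1,
                               b->datum1, b->isnull1,
                               ops->sortKeys);
}

/*
 * my_getdatum1/my_writetup/my_readtup are omitted here; they would mirror
 * the *_heap/*_index routines above for whatever on-disk representation
 * the extension uses.
 */

The put/get side then goes through tuplesort_puttuple_common() and tuplesort_gettuple_common(), which this patchset exports as well.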
I've bumped into this case in the RUM extension. The need to build it
against tuplesort changes in different PG versions led me to reluctantly
include different tuplesort.c versions in the extension code. So I totally
support the intention of this patch, and I'm planning to invest some time
in reviewing it.
--
Best regards,
Pavel Borisov
Postgres Professional: http://postgrespro.com
I still have one doubt about this: compatibility with previous PG versions
requires me to keep the code paths I have already added to the RUM
extension, and I won't be able to drop them from the extension for quite a
long time. That could be avoided if we backpatched this, which seems
doubtful to me given the volume of code changes.
If we only make this change starting from, say, v16, it will only help
extensions that don't support earlier PG versions. I still consider the
change beneficial, but I wonder whether you have a view on how existing
extensions should manage this so they can benefit (something like the
version check sketched below, perhaps)?
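A hypothetical sketch of what that conditional support might look like
(rum_tuplesort.h is a made-up name standing for the bundled copy of
tuplesort.c):

#if PG_VERSION_NUM >= 160000
/* use the in-core extensible tuplesort interface */
#include "utils/tuplesort.h"
#else
/* fall back to the extension's bundled copy of tuplesort.c */
#include "rum_tuplesort.h"
#endif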
--
Best regards,
Pavel Borisov
Postgres Professional: http://postgrespro.com
Hi!
I've reviewed the patchset and noticed some minor issues:
- an extra semicolon in a macro (leads to warnings); see the sketch below
- the comparison of the isWorker variable should be done differently
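Roughly, the corrections are along these lines (a sketch, not the exact
diff; the second assumes SortCoordinateData.isWorker is a bool, so a
">= 0" test is always true):

/* drop the stray trailing semicolon and parenthesize the argument */
#define TuplesortstateGetOps(state) ((TuplesortOps *) (state))

/* treat isWorker as the boolean it is instead of comparing it to zero */
#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
                                   (coordinate)->isWorker ? 1 : 2)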
Here is an updated version of the patchset.
Overall, I consider this patchset useful. Any opinions?
--
Best regards,
Maxim Orlov.
Attachments:
v2-0004-Move-freeing-memory-away-from-writetup.patchtext/x-patch; charset=US-ASCII; name=v2-0004-Move-freeing-memory-away-from-writetup.patchDownload
From ee2dd46b07d62e13ed66b5a38272fb5667c943f3 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH v2 4/6] Move freeing memory away from writetup()
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 64 ++++++++++++------------------
1 file changed, 26 insertions(+), 38 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c4d8c183f62..3bf990a1b34 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -612,6 +612,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+static void writetuple_common(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1838,7 +1840,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* copy the tuple into sort storage */
tuple = ExecCopySlotMinimalTuple(slot);
stup.tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
@@ -1847,8 +1848,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
state->tupDesc,
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -1868,9 +1867,6 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
stup.tuple = (void *) tup;
- USEMEM(state, GetMemoryChunkSpace(tup));
-
- MemoryContextSwitchTo(state->sortcontext);
/*
* set up first-column key value, and potentially abbreviate, if it's a
@@ -1905,15 +1901,12 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
RelationGetDescr(state->indexRel),
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -1951,15 +1944,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
stup.datum1 = !isNull ? val : (Datum) 0;
stup.isnull1 = isNull;
stup.tuple = NULL; /* no separate storage */
- MemoryContextSwitchTo(state->sortcontext);
}
else
{
stup.isnull1 = false;
stup.datum1 = datumCopy(val, false, state->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
- MemoryContextSwitchTo(state->sortcontext);
}
puttuple_common(state, &stup);
@@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
Assert(!LEADER(state));
+ if (tuple->tuple != NULL)
+ USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+
if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
!state->sortKeys->abbrev_converter || tuple->isnull1)
{
@@ -2052,6 +2047,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
pg_rusage_show(&state->ru_start));
#endif
make_bounded_heap(state);
+ MemoryContextSwitchTo(oldcontext);
return;
}
@@ -2059,7 +2055,10 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
* Done if we still fit in available memory and have array slots.
*/
if (state->memtupcount < state->memtupsize && !LACKMEM(state))
+ {
+ MemoryContextSwitchTo(oldcontext);
return;
+ }
/*
* Nope; time to switch to tape-based operation.
@@ -2113,6 +2112,19 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
elog(ERROR, "invalid tuplesort state");
break;
}
+ MemoryContextSwitchTo(oldcontext);
+}
+
+static void
+writetuple_common(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ WRITETUP(state, tape, stup);
+
+ if (!state->slabAllocatorUsed && stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }
}
static bool
@@ -3170,7 +3182,7 @@ mergeonerun(Tuplesortstate *state)
/* write the tuple to destTape */
srcTapeIndex = state->memtuples[0].srctape;
srcTape = state->inputTapes[srcTapeIndex];
- WRITETUP(state, state->destTape, &state->memtuples[0]);
+ writetuple_common(state, state->destTape, &state->memtuples[0]);
/* recycle the slot of the tuple we just wrote out, for the next read */
if (state->memtuples[0].tuple)
@@ -3316,7 +3328,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
memtupwrite = state->memtupcount;
for (i = 0; i < memtupwrite; i++)
{
- WRITETUP(state, state->destTape, &state->memtuples[i]);
+ writetuple_common(state, state->destTape, &state->memtuples[i]);
state->memtupcount--;
}
@@ -3947,12 +3959,6 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_free_minimal_tuple(tuple);
- }
}
static void
@@ -4123,12 +4129,6 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_freetuple(tuple);
- }
}
static void
@@ -4380,12 +4380,6 @@ writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- pfree(tuple);
- }
}
static void
@@ -4469,12 +4463,6 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-
- if (!state->slabAllocatorUsed && stup->tuple)
- {
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
- }
}
static void
--
2.30.2
v2-0006-Split-tuplesortops.c.patchtext/x-patch; charset=US-ASCII; name=v2-0006-Split-tuplesortops.c.patchDownload
From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH v2 6/6] Split tuplesortops.c
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/Makefile | 1 +
src/backend/utils/sort/tuplesort.c | 1710 +------------------------
src/backend/utils/sort/tuplesortops.c | 1550 ++++++++++++++++++++++
src/include/utils/tuplesort.h | 203 ++-
4 files changed, 1746 insertions(+), 1718 deletions(-)
create mode 100644 src/backend/utils/sort/tuplesortops.c
diff --git a/src/backend/utils/sort/Makefile b/src/backend/utils/sort/Makefile
index 26f65fcaf7a..bfe86c02f67 100644
--- a/src/backend/utils/sort/Makefile
+++ b/src/backend/utils/sort/Makefile
@@ -19,6 +19,7 @@ OBJS = \
sharedtuplestore.o \
sortsupport.o \
tuplesort.o \
+ tuplesortops.o \
tuplestore.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index e106e1ff9e2..6e681ca8afa 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -100,35 +100,17 @@
#include <limits.h>
-#include "access/hash.h"
-#include "access/htup_details.h"
-#include "access/nbtree.h"
-#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablespace.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "pg_trace.h"
-#include "utils/datum.h"
-#include "utils/logtape.h"
-#include "utils/lsyscache.h"
+#include "storage/shmem.h"
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/rel.h"
-#include "utils/sortsupport.h"
#include "utils/tuplesort.h"
-
-/* sort-type codes for sort__start probes */
-#define HEAP_SORT 0
-#define INDEX_SORT 1
-#define DATUM_SORT 2
-#define CLUSTER_SORT 3
-
-/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
- (coordinate)->isWorker >= 0 ? 1 : 2)
-
/*
* Initial size of memtuples array. We're trying to select this size so that
* array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
@@ -149,43 +131,6 @@ bool optimize_bounded_sort = true;
#endif
-/*
- * The objects we actually sort are SortTuple structs. These contain
- * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
- * which is a separate palloc chunk --- we assume it is just one chunk and
- * can be freed by a simple pfree() (except during merge, when we use a
- * simple slab allocator). SortTuples also contain the tuple's first key
- * column in Datum/nullflag format, and a source/input tape number that
- * tracks which tape each heap element/slot belongs to during merging.
- *
- * Storing the first key column lets us save heap_getattr or index_getattr
- * calls during tuple comparisons. We could extract and save all the key
- * columns not just the first, but this would increase code complexity and
- * overhead, and wouldn't actually save any comparison cycles in the common
- * case where the first key determines the comparison result. Note that
- * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
- *
- * There is one special case: when the sort support infrastructure provides an
- * "abbreviated key" representation, where the key is (typically) a pass by
- * value proxy for a pass by reference type. In this case, the abbreviated key
- * is stored in datum1 in place of the actual first key column.
- *
- * When sorting single Datums, the data value is represented directly by
- * datum1/isnull1 for pass by value types (or null values). If the datatype is
- * pass-by-reference and isnull1 is false, then "tuple" points to a separately
- * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
- * either the same pointer as "tuple", or is an abbreviated key value as
- * described above. Accordingly, "tuple" is always used in preference to
- * datum1 as the authoritative value for pass-by-reference cases.
- */
-typedef struct
-{
- void *tuple; /* the tuple itself */
- Datum datum1; /* value of first key column */
- bool isnull1; /* is first key column NULL? */
- int srctape; /* source tape number */
-} SortTuple;
-
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
* tuples. To avoid palloc/pfree overhead.
@@ -236,136 +181,6 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
-typedef struct TuplesortOps TuplesortOps;
-
-typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-
-struct TuplesortOps
-{
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-
- /*
- * These function pointers decouple the routines that must know what kind
- * of tuple we are sorting from the routines that don't need to know it.
- * They are set up by the tuplesort_begin_xxx routines.
- *
- * Function to compare two tuples; result is per qsort() convention, ie:
- * <0, 0, >0 according as a<b, a=b, a>b. The API must match
- * qsort_arg_comparator.
- */
- SortTupleComparator comparetup;
-
- void (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
-
- /*
- * Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory; requirements on
- * the tape representation are given below. Unless the slab allocator is
- * used, after writing the tuple, pfree() the out-of-line data (not the
- * SortTuple struct!), and increase state->availMem by the amount of
- * memory space thereby released.
- */
- void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-
- /*
- * Function to read a stored tuple from tape back into memory. 'len' is
- * the already-read length of the stored tuple. The tuple is allocated
- * from the slab memory arena, or is palloc'd, see readtup_alloc().
- */
- void (*readtup) (Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-
- void (*freestate) (Tuplesortstate *state);
-
- /*
- * Whether SortTuple's datum1 and isnull1 members are maintained by the
- * above routines. If not, some sort specializations are disabled.
- */
- bool haveDatum1;
-
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- int nKeys; /* number of columns in sort key */
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
- int sortopt; /* Bitmask of flags used to setup sort */
-
- bool tuples; /* Can SortTuple.tuple ever be set? */
-
- void *arg;
-};
-
-typedef struct
-{
- TupleDesc tupDesc;
-
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-} TupleSortClusterArg;
-
-typedef struct
-{
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-} TupleSortIndexArg;
-
-typedef struct
-{
- TupleSortIndexArg index;
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-} TupleSortIndexBTreeArg;
-
-typedef struct
-{
- TupleSortIndexArg index;
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-} TupleSortIndexHashArg;
-
-typedef struct
-{
- TupleSortIndexArg index;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-} TupleSortDatumArg;
/*
* Private state of a Tuplesort operation.
@@ -577,8 +392,6 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define TuplesortstateGetOps(state) ((TuplesortOps *) state);
-
#define GETDATUM1(state,stup) ((*(state)->ops.getdatum1) (state, stup))
#define COMPARETUP(state,a,b) ((*(state)->ops.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->ops.writetup) (state, tape, stup))
@@ -637,19 +450,8 @@ struct Sharedsort
* begins).
*/
-/* When using this macro, beware of double evaluation of len */
-#define LogicalTapeReadExact(tape, ptr, len) \
- do { \
- if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
- elog(ERROR, "unexpected end of data"); \
- } while(0)
-
-static Tuplesortstate *tuplesort_begin_common(int workMem,
- SortCoordinate coordinate,
- int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
static void writetuple_common(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
@@ -671,38 +473,6 @@ static void tuplesort_heap_delete_top(Tuplesortstate *state);
static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
-static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
-static void getdatum1_heap(Tuplesortstate *state, SortTuple *stup);
-static void getdatum1_cluster(Tuplesortstate *state, SortTuple *stup);
-static void getdatum1_index(Tuplesortstate *state, SortTuple *stup);
-static void getdatum1_datum(Tuplesortstate *state, SortTuple *stup);
-static int comparetup_heap(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_datum(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -873,7 +643,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* sort options. See TUPLESORT_* definitions in tuplesort.h
*/
-static Tuplesortstate *
+Tuplesortstate *
tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
{
Tuplesortstate *state;
@@ -1059,468 +829,6 @@ tuplesort_begin_batch(Tuplesortstate *state)
MemoryContextSwitchTo(oldcontext);
}
-Tuplesortstate *
-tuplesort_begin_heap(TupleDesc tupDesc,
- int nkeys, AttrNumber *attNums,
- Oid *sortOperators, Oid *sortCollations,
- bool *nullsFirstFlags,
- int workMem, SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
-
- AssertArg(nkeys > 0);
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = nkeys;
-
- TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
- false, /* no unique check */
- nkeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_heap;
- ops->comparetup = comparetup_heap;
- ops->writetup = writetup_heap;
- ops->readtup = readtup_heap;
- ops->haveDatum1 = true;
- ops->arg = tupDesc; /* assume we need not copy tupDesc */
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
-
- for (i = 0; i < nkeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
-
- AssertArg(attNums[i] != 0);
- AssertArg(sortOperators[i] != 0);
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = sortCollations[i];
- sortKey->ssup_nulls_first = nullsFirstFlags[i];
- sortKey->ssup_attno = attNums[i];
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
- }
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (nkeys == 1 && !ops->sortKeys->abbrev_converter)
- ops->onlyKey = ops->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_cluster(TupleDesc tupDesc,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- BTScanInsert indexScanKey;
- MemoryContext oldcontext;
- TupleSortClusterArg *arg;
- int i;
-
- Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortClusterArg *) palloc0(sizeof(TupleSortClusterArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- RelationGetNumberOfAttributes(indexRel),
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
- false, /* no unique check */
- ops->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_cluster;
- ops->comparetup = comparetup_cluster;
- ops->writetup = writetup_cluster;
- ops->readtup = readtup_cluster;
- ops->freestate = freestate_cluster;
- ops->arg = arg;
-
- arg->indexInfo = BuildIndexInfo(indexRel);
-
- /*
- * If we don't have a simple leading attribute, we don't currently
- * initialize datum1, so disable optimizations that require it.
- */
- if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
- ops->haveDatum1 = false;
- else
- ops->haveDatum1 = true;
-
- arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- if (arg->indexInfo->ii_Expressions != NULL)
- {
- TupleTableSlot *slot;
- ExprContext *econtext;
-
- /*
- * We will need to use FormIndexDatum to evaluate the index
- * expressions. To do that, we need an EState, as well as a
- * TupleTableSlot to put the table tuples into. The econtext's
- * scantuple has to point to that slot, too.
- */
- arg->estate = CreateExecutorState();
- slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(arg->estate);
- econtext->ecxt_scantuple = slot;
- }
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < ops->nKeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_btree(Relation heapRel,
- Relation indexRel,
- bool enforceUnique,
- bool uniqueNullsNotDistinct,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- BTScanInsert indexScanKey;
- TupleSortIndexBTreeArg *arg;
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
- enforceUnique ? 't' : 'f',
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
- enforceUnique,
- state->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_index;
- ops->comparetup = comparetup_index_btree;
- ops->writetup = writetup_index;
- ops->readtup = readtup_index;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = enforceUnique;
- arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < ops->nKeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_hash(Relation heapRel,
- Relation indexRel,
- uint32 high_mask,
- uint32 low_mask,
- uint32 max_buckets,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MemoryContext oldcontext;
- TupleSortIndexHashArg *arg;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortIndexHashArg *) palloc(sizeof(TupleSortIndexHashArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
- "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
- high_mask,
- low_mask,
- max_buckets,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = 1; /* Only one sort column, the hash code */
-
- ops->getdatum1 = getdatum1_index;
- ops->comparetup = comparetup_index_hash;
- ops->writetup = writetup_index;
- ops->readtup = readtup_index;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- arg->high_mask = high_mask;
- arg->low_mask = low_mask;
- arg->max_buckets = max_buckets;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_gist(Relation heapRel,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MemoryContext oldcontext;
- TupleSortIndexBTreeArg *arg;
- int i;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- ops->getdatum1 = getdatum1_index;
- ops->comparetup = comparetup_index_btree;
- ops->writetup = writetup_index;
- ops->readtup = readtup_index;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = false;
- arg->uniqueNullsNotDistinct = false;
-
- /* Prepare SortSupport data for each column */
- ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < ops->nKeys; i++)
- {
- SortSupport sortKey = ops->sortKeys + i;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = indexRel->rd_indcollation[i];
- sortKey->ssup_nulls_first = false;
- sortKey->ssup_attno = i + 1;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && ops->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- /* Look for a sort support function */
- PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
- }
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
- bool nullsFirstFlag, int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg;
- MemoryContext oldcontext;
- int16 typlen;
- bool typbyval;
-
- oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
- arg = (TupleSortDatumArg *) palloc(sizeof(TupleSortDatumArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin datum sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- ops->nKeys = 1; /* always a one-column sort */
-
- TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
- false, /* no unique check */
- 1,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- ops->getdatum1 = getdatum1_datum;
- ops->comparetup = comparetup_datum;
- ops->writetup = writetup_datum;
- ops->readtup = readtup_datum;
- state->abbrevNext = 10;
- ops->haveDatum1 = true;
- ops->arg = arg;
-
- arg->datumType = datumType;
-
- /* lookup necessary attributes of the datum type */
- get_typlenbyval(datumType, &typlen, &typbyval);
- arg->datumTypeLen = typlen;
- ops->tuples = !typbyval;
-
- /* Prepare SortSupport data */
- ops->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
-
- ops->sortKeys->ssup_cxt = CurrentMemoryContext;
- ops->sortKeys->ssup_collation = sortCollation;
- ops->sortKeys->ssup_nulls_first = nullsFirstFlag;
-
- /*
- * Abbreviation is possible here only for by-reference types. In theory,
- * a pass-by-value datatype could have an abbreviated form that is cheaper
- * to compare. In a tuple sort, we could support that, because we can
- * always extract the original datum from the tuple as needed. Here, we
- * can't, because a datum sort only stores a single copy of the datum; the
- * "tuple" field of each SortTuple is NULL.
- */
- ops->sortKeys->abbreviate = !typbyval;
-
- PrepareSortSupportFromOrderingOp(sortOperator, ops->sortKeys);
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (!ops->sortKeys->abbrev_converter)
- ops->onlyKey = ops->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
/*
* tuplesort_set_bound
*
@@ -1876,152 +1184,11 @@ noalloc:
return false;
}
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleDesc tupDesc = (TupleDesc) ops->arg;
- SortTuple stup;
- MinimalTuple tuple;
- HeapTupleData htup;
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup.tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup.datum1 = heap_getattr(&htup,
- ops->sortKeys[0].ssup_attno,
- tupDesc,
- &stup.isnull1);
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
-{
- SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
-
- /* copy the tuple into sort storage */
- tup = heap_copytuple(tup);
- stup.tuple = (void *) tup;
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (ops->haveDatum1)
- {
- stup.datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup.isnull1);
- }
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Collect one index tuple while collecting input data for sort, building
- * it from caller-supplied values.
- */
-void
-tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
- ItemPointer self, Datum *values,
- bool *isnull)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- SortTuple stup;
- IndexTuple tuple;
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
-
- stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
- tuple = ((IndexTuple) stup.tuple);
- tuple->t_tid = *self;
- /* set up first-column key value */
- stup.datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup.isnull1);
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one Datum while collecting input data for sort.
- *
- * If the Datum is pass-by-ref type, the value will be copied.
- */
-void
-tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
- SortTuple stup;
-
- /*
- * Pass-by-value types or null values are just stored directly in
- * stup.datum1 (and stup.tuple is not used and set to NULL).
- *
- * Non-null pass-by-reference values need to be copied into memory we
- * control, and possibly abbreviated. The copied value is pointed to by
- * stup.tuple and is treated as the canonical copy (e.g. to return via
- * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
- * abbreviated value if abbreviation is happening, otherwise it's
- * identical to stup.tuple.
- */
-
- if (isNull || !state->ops.tuples)
- {
- /*
- * Set datum1 to zeroed representation for NULLs (to be consistent,
- * and to support cheap inequality tests for NULL abbreviated keys).
- */
- stup.datum1 = !isNull ? val : (Datum) 0;
- stup.isnull1 = isNull;
- stup.tuple = NULL; /* no separate storage */
- }
- else
- {
- stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
- stup.tuple = DatumGetPointer(stup.datum1);
- }
-
- puttuple_common(state, &stup);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
/*
* Shared code for tuple and datum cases.
*/
-static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple)
+void
+tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
@@ -2342,7 +1509,7 @@ tuplesort_performsort(Tuplesortstate *state)
* by caller. Note that fetched tuple is stored in memory that may be
* recycled by any future fetch.
*/
-static bool
+bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
SortTuple *stup)
{
@@ -2552,171 +1719,28 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
{
/*
* If no more data, we've reached end of run on this tape.
- * Remove the top node from the heap.
- */
- tuplesort_heap_delete_top(state);
- state->nInputRuns--;
-
- /*
- * Close the tape. It'd go away at the end of the sort
- * anyway, but better to release the memory early.
- */
- LogicalTapeClose(srcTape);
- return true;
- }
- newtup.srctape = srcTapeIndex;
- tuplesort_heap_replace_top(state, &newtup);
- return true;
- }
- return false;
-
- default:
- elog(ERROR, "invalid tuplesort state");
- return false; /* keep compiler quiet */
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * If successful, put tuple in slot and return true; else, clear the slot
- * and return false.
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value in leading attribute will set abbreviated value to zeroed
- * representation, which caller may rely on in abbreviated inequality check.
- *
- * If copy is true, the slot receives a tuple that's been copied into the
- * caller's memory context, so that it will stay valid regardless of future
- * manipulations of the tuplesort's state (up to and including deleting the
- * tuplesort). If copy is false, the slot will just receive a pointer to a
- * tuple held within the tuplesort, which is more efficient, but only safe for
- * callers that are prepared to have any subsequent manipulation of the
- * tuplesort's state invalidate slot contents.
- */
-bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
- TupleTableSlot *slot, Datum *abbrev)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- if (stup.tuple)
- {
- /* Record abbreviated key for caller */
- if (state->ops.sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
-
- if (copy)
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
-
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
- return true;
- }
- else
- {
- ExecClearTuple(slot);
- return false;
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return stup.tuple;
-}
-
-/*
- * Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return (IndexTuple) stup.tuple;
-}
-
-/*
- * Fetch the next Datum in either forward or back direction.
- * Returns false if no more datums.
- *
- * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * in caller's context, and is now owned by the caller (this differs from
- * similar routines for other types of tuplesorts).
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value will have a zeroed abbreviated value representation, which caller
- * may rely on in abbreviated inequality check.
- */
-bool
-tuplesort_getdatum(Tuplesortstate *state, bool forward,
- Datum *val, bool *isNull, Datum *abbrev)
-{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- {
- MemoryContextSwitchTo(oldcontext);
- return false;
- }
-
- /* Ensure we copy into caller's memory context */
- MemoryContextSwitchTo(oldcontext);
+ * Remove the top node from the heap.
+ */
+ tuplesort_heap_delete_top(state);
+ state->nInputRuns--;
- /* Record abbreviated key for caller */
- if (ops->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
+ /*
+ * Close the tape. It'd go away at the end of the sort
+ * anyway, but better to release the memory early.
+ */
+ LogicalTapeClose(srcTape);
+ return true;
+ }
+ newtup.srctape = srcTapeIndex;
+ tuplesort_heap_replace_top(state, &newtup);
+ return true;
+ }
+ return false;
- if (stup.isnull1 || !state->ops.tuples)
- {
- *val = stup.datum1;
- *isNull = stup.isnull1;
- }
- else
- {
- /* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
- *isNull = false;
+ default:
+ elog(ERROR, "invalid tuplesort state");
+ return false; /* keep compiler quiet */
}
-
- return true;
}
/*
@@ -3897,8 +2921,8 @@ markrunend(LogicalTape *tape)
* We use next free slot from the slab allocator, or palloc() if the tuple
* is too large for that.
*/
-static void *
-readtup_alloc(Tuplesortstate *state, Size tuplen)
+void *
+tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen)
{
SlabSlot *buf;
@@ -3920,688 +2944,6 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
}
}
-
-/*
- * Routines specialized for HeapTuple (actually MinimalTuple) case
- */
-
-static void
-getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- HeapTupleData htup;
-
- htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- stup->datum1 = heap_getattr(&htup,
- ops->sortKeys[0].ssup_attno,
- (TupleDesc) ops->arg,
- &stup->isnull1);
-
-}
-
-static int
-comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- SortSupport sortKey = ops->sortKeys;
- HeapTupleData ltup;
- HeapTupleData rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- AttrNumber attno;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
- rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = (TupleDesc) ops->arg;
-
- if (sortKey->abbrev_converter)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- sortKey++;
- for (nkey = 1; nkey < ops->nKeys; nkey++, sortKey++)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- return 0;
-}
-
-static void
-writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- MinimalTuple tuple = (MinimalTuple) stup->tuple;
-
- /* the part of the MinimalTuple we'll write: */
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
-
- /* total on-disk footprint: */
- unsigned int tuplen = tupbodylen + sizeof(int);
-
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- unsigned int tupbodylen = len - sizeof(int);
- unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
- MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- TuplesortOps *ops = TuplesortstateGetOps(state);
- HeapTupleData htup;
-
- /* read in the tuple proper */
- tuple->t_len = tuplen;
- LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup->datum1 = heap_getattr(&htup,
- ops->sortKeys[0].ssup_attno,
- (TupleDesc) ops->arg,
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for the CLUSTER case (HeapTuple data, with
- * comparisons per a btree index definition)
- */
-
-static void
-getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
- HeapTuple tup;
-
- tup = (HeapTuple) stup->tuple;
- stup->datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static int
-comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
- SortSupport sortKey = ops->sortKeys;
- HeapTuple ltup;
- HeapTuple rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
- /* Be prepared to compare additional sort keys */
- ltup = (HeapTuple) a->tuple;
- rtup = (HeapTuple) b->tuple;
- tupDesc = arg->tupDesc;
-
- /* Compare the leading sort key, if it's simple */
- if (ops->haveDatum1)
- {
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- if (sortKey->abbrev_converter)
- {
- AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
-
- datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- }
- if (compare != 0 || ops->nKeys == 1)
- return compare;
- /* Compare additional columns the hard way */
- sortKey++;
- nkey = 1;
- }
- else
- {
- /* Must compare all keys the hard way */
- nkey = 0;
- }
-
- if (arg->indexInfo->ii_Expressions == NULL)
- {
- /* If not expression index, just compare the proper heap attrs */
-
- for (; nkey < ops->nKeys; nkey++, sortKey++)
- {
- AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
-
- datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
- else
- {
- /*
- * In the expression index case, compute the whole index tuple and
- * then compare values. It would perhaps be faster to compute only as
- * many columns as we need to compare, but that would require
- * duplicating all the logic in FormIndexDatum.
- */
- Datum l_index_values[INDEX_MAX_KEYS];
- bool l_index_isnull[INDEX_MAX_KEYS];
- Datum r_index_values[INDEX_MAX_KEYS];
- bool r_index_isnull[INDEX_MAX_KEYS];
- TupleTableSlot *ecxt_scantuple;
-
- /* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(arg->estate);
-
- ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
-
- ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- l_index_values, l_index_isnull);
-
- ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- r_index_values, r_index_isnull);
-
- for (; nkey < ops->nKeys; nkey++, sortKey++)
- {
- compare = ApplySortComparator(l_index_values[nkey],
- l_index_isnull[nkey],
- r_index_values[nkey],
- r_index_isnull[nkey],
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
-
- return 0;
-}
-
-static void
-writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- HeapTuple tuple = (HeapTuple) stup->tuple;
- unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
-
- /* We need to store t_self, but not other fields of HeapTupleData */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
- LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int tuplen)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
- unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
- HeapTuple tuple = (HeapTuple) readtup_alloc(state,
- t_len + HEAPTUPLESIZE);
-
- /* Reconstruct the HeapTupleData header */
- tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
- tuple->t_len = t_len;
- LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
- /* We don't currently bother to reconstruct t_tableOid */
- tuple->t_tableOid = InvalidOid;
- /* Read in the tuple body */
- LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value, if it's a simple column */
- if (ops->haveDatum1)
- stup->datum1 = heap_getattr(tuple,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static void
-freestate_cluster(Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
-
- /* Free any execution state created for CLUSTER case */
- if (arg->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(arg->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(arg->estate);
- }
-}
-
-/*
- * Routines specialized for IndexTuple case
- *
- * The btree and hash cases require separate comparison functions, but the
- * IndexTuple representation is the same so the copy/write/read support
- * functions can be shared.
- */
-
-static void
-getdatum1_index(Tuplesortstate *state, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
- IndexTuple tuple;
-
- tuple = stup->tuple;
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-static int
-comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- /*
- * This is similar to comparetup_heap(), but expects index tuples. There
- * is also special handling for enforcing uniqueness, and special
- * treatment for equal keys at the end.
- */
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexBTreeArg *arg = (TupleSortIndexBTreeArg *) ops->arg;
- SortSupport sortKey = ops->sortKeys;
- IndexTuple tuple1;
- IndexTuple tuple2;
- int keysz;
- TupleDesc tupDes;
- bool equal_hasnull = false;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
- keysz = ops->nKeys;
- tupDes = RelationGetDescr(arg->index.indexRel);
-
- if (sortKey->abbrev_converter)
- {
- datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- /* they are equal, so we only need to examine one null flag */
- if (a->isnull1)
- equal_hasnull = true;
-
- sortKey++;
- for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
- {
- datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare; /* done when we find unequal attributes */
-
- /* they are equal, so we only need to examine one null flag */
- if (isnull1)
- equal_hasnull = true;
- }
-
- /*
- * If btree has asked us to enforce uniqueness, complain if two equal
- * tuples are detected (unless there was at least one NULL field and NULLS
- * NOT DISTINCT was not set).
- *
- * It is sufficient to make the test here, because if two tuples are equal
- * they *must* get compared at some stage of the sort --- otherwise the
- * sort algorithm wouldn't have checked whether one must appear before the
- * other.
- */
- if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
- {
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
- /*
- * Some rather brain-dead implementations of qsort (such as the one in
- * QNX 4) will sometimes call the comparison routine to compare a
- * value to itself, but we always use our own implementation, which
- * does not.
- */
- Assert(tuple1 != tuple2);
-
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
- }
-
- /*
- * If key values are equal, we sort on ItemPointer. This is required for
- * btree indexes, since heap TID is treated as an implicit last key
- * attribute in order to ensure that all keys in the index are physically
- * unique.
- */
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static int
-comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- Bucket bucket1;
- Bucket bucket2;
- IndexTuple tuple1;
- IndexTuple tuple2;
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexHashArg *arg = (TupleSortIndexHashArg *) ops->arg;
-
- /*
- * Fetch hash keys and mask off bits we don't want to sort by. We know
- * that the first column of the index tuple is the hash key.
- */
- Assert(!a->isnull1);
- bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- Assert(!b->isnull1);
- bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- if (bucket1 > bucket2)
- return 1;
- else if (bucket1 < bucket2)
- return -1;
-
- /*
- * If hash values are equal, we sort on ItemPointer. This does not affect
- * validity of the finished index, but it may be useful to have index
- * scans in physical order.
- */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
-
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static void
-writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- IndexTuple tuple = (IndexTuple) stup->tuple;
- unsigned int tuplen;
-
- tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
- unsigned int tuplen = len - sizeof(unsigned int);
- IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, tuple, tuplen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for DatumTuple case
- */
-
-static void
-getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
-{
- stup->datum1 = PointerGetDatum(stup->tuple);
-}
-
-static int
-comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- int compare;
-
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- ops->sortKeys);
- if (compare != 0)
- return compare;
-
- /* if we have abbreviations, then "tuple" has the original value */
-
- if (ops->sortKeys->abbrev_converter)
- compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
- PointerGetDatum(b->tuple), b->isnull1,
- ops->sortKeys);
-
- return compare;
-}
-
-static void
-writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
- void *waddr;
- unsigned int tuplen;
- unsigned int writtenlen;
-
- if (stup->isnull1)
- {
- waddr = NULL;
- tuplen = 0;
- }
- else if (!state->ops.tuples)
- {
- waddr = &stup->datum1;
- tuplen = sizeof(Datum);
- }
- else
- {
- waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
- Assert(tuplen != 0);
- }
-
- writtenlen = tuplen + sizeof(unsigned int);
-
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
- LogicalTapeWrite(tape, waddr, tuplen);
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-}
-
-static void
-readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortOps *ops = TuplesortstateGetOps(state);
- unsigned int tuplen = len - sizeof(unsigned int);
-
- if (tuplen == 0)
- {
- /* it's NULL */
- stup->datum1 = (Datum) 0;
- stup->isnull1 = true;
- stup->tuple = NULL;
- }
- else if (!state->ops.tuples)
- {
- Assert(tuplen == sizeof(Datum));
- LogicalTapeReadExact(tape, &stup->datum1, tuplen);
- stup->isnull1 = false;
- stup->tuple = NULL;
- }
- else
- {
- void *raddr = readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, raddr, tuplen);
- stup->datum1 = PointerGetDatum(raddr);
- stup->isnull1 = false;
- stup->tuple = raddr;
- }
-
- if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
-}
-
/*
* Parallel sort routines
*/
diff --git a/src/backend/utils/sort/tuplesortops.c b/src/backend/utils/sort/tuplesortops.c
new file mode 100644
index 00000000000..8f7a8704e76
--- /dev/null
+++ b/src/backend/utils/sort/tuplesortops.c
@@ -0,0 +1,1550 @@
+/*-------------------------------------------------------------------------
+ *
+ * tuplesortops.c
+ * Implementation of tuple sorting.
+ *
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/utils/sort/tuplesortops.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "access/htup_details.h"
+#include "access/nbtree.h"
+#include "catalog/index.h"
+#include "executor/executor.h"
+#include "pg_trace.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
+#include "utils/guc.h"
+#include "utils/tuplesort.h"
+
+
+/* sort-type codes for sort__start probes */
+#define HEAP_SORT 0
+#define INDEX_SORT 1
+#define DATUM_SORT 2
+#define CLUSTER_SORT 3
+
+static void getdatum1_heap(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_cluster(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_index(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_datum(Tuplesortstate *state, SortTuple *stup);
+static int comparetup_heap(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_datum(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
+
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ /*
+ * These variables are specific to the CLUSTER case; they are set by
+ * tuplesort_begin_cluster.
+ */
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TupleSortClusterArg;
+
+typedef struct
+{
+ /*
+ * These variables are specific to the IndexTuple case; they are set by
+ * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TupleSortIndexArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_btree subcase: */
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TupleSortIndexBTreeArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_hash subcase: */
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TupleSortIndexHashArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /*
+ * These variables are specific to the Datum case; they are set by
+ * tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TupleSortDatumArg;
+
+Tuplesortstate *
+tuplesort_begin_heap(TupleDesc tupDesc,
+ int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags,
+ int workMem, SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+
+ AssertArg(nkeys > 0);
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = nkeys;
+
+ TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
+ false, /* no unique check */
+ nkeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_heap;
+ ops->comparetup = comparetup_heap;
+ ops->writetup = writetup_heap;
+ ops->readtup = readtup_heap;
+ ops->haveDatum1 = true;
+ ops->arg = tupDesc; /* assume we need not copy tupDesc */
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+
+ for (i = 0; i < nkeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+
+ AssertArg(attNums[i] != 0);
+ AssertArg(sortOperators[i] != 0);
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = sortCollations[i];
+ sortKey->ssup_nulls_first = nullsFirstFlags[i];
+ sortKey->ssup_attno = attNums[i];
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
+ }
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (nkeys == 1 && !ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_cluster(TupleDesc tupDesc,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ BTScanInsert indexScanKey;
+ MemoryContext oldcontext;
+ TupleSortClusterArg *arg;
+ int i;
+
+ Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortClusterArg *) palloc0(sizeof(TupleSortClusterArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ RelationGetNumberOfAttributes(indexRel),
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
+ false, /* no unique check */
+ ops->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_cluster;
+ ops->comparetup = comparetup_cluster;
+ ops->writetup = writetup_cluster;
+ ops->readtup = readtup_cluster;
+ ops->freestate = freestate_cluster;
+ ops->arg = arg;
+
+ arg->indexInfo = BuildIndexInfo(indexRel);
+
+ /*
+ * If we don't have a simple leading attribute, we don't currently
+ * initialize datum1, so disable optimizations that require it.
+ */
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ ops->haveDatum1 = false;
+ else
+ ops->haveDatum1 = true;
+
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ if (arg->indexInfo->ii_Expressions != NULL)
+ {
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * We will need to use FormIndexDatum to evaluate the index
+ * expressions. To do that, we need an EState, as well as a
+ * TupleTableSlot to put the table tuples into. The econtext's
+ * scantuple has to point to that slot, too.
+ */
+ arg->estate = CreateExecutorState();
+ slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
+ econtext = GetPerTupleExprContext(arg->estate);
+ econtext->ecxt_scantuple = slot;
+ }
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < ops->nKeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_btree(Relation heapRel,
+ Relation indexRel,
+ bool enforceUnique,
+ bool uniqueNullsNotDistinct,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ BTScanInsert indexScanKey;
+ TupleSortIndexBTreeArg *arg;
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
+ enforceUnique ? 't' : 'f',
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
+ enforceUnique,
+ ops->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < ops->nKeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_hash(Relation heapRel,
+ Relation indexRel,
+ uint32 high_mask,
+ uint32 low_mask,
+ uint32 max_buckets,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext;
+ TupleSortIndexHashArg *arg;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortIndexHashArg *) palloc(sizeof(TupleSortIndexHashArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
+ "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
+ high_mask,
+ low_mask,
+ max_buckets,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = 1; /* Only one sort column, the hash code */
+
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_hash;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_gist(Relation heapRel,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext;
+ TupleSortIndexBTreeArg *arg;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
+
+ /* Prepare SortSupport data for each column */
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < ops->nKeys; i++)
+ {
+ SortSupport sortKey = ops->sortKeys + i;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = indexRel->rd_indcollation[i];
+ sortKey->ssup_nulls_first = false;
+ sortKey->ssup_attno = i + 1;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ /* Look for a sort support function */
+ PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
+ }
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
+ bool nullsFirstFlag, int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg;
+ MemoryContext oldcontext;
+ int16 typlen;
+ bool typbyval;
+
+ oldcontext = MemoryContextSwitchTo(ops->maincontext);
+ arg = (TupleSortDatumArg *) palloc(sizeof(TupleSortDatumArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin datum sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ ops->nKeys = 1; /* always a one-column sort */
+
+ TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
+ false, /* no unique check */
+ 1,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ ops->getdatum1 = getdatum1_datum;
+ ops->comparetup = comparetup_datum;
+ ops->writetup = writetup_datum;
+ ops->readtup = readtup_datum;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
+
+ arg->datumType = datumType;
+
+ /* lookup necessary attributes of the datum type */
+ get_typlenbyval(datumType, &typlen, &typbyval);
+ arg->datumTypeLen = typlen;
+ ops->tuples = !typbyval;
+
+ /* Prepare SortSupport data */
+ ops->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+ ops->sortKeys->ssup_cxt = CurrentMemoryContext;
+ ops->sortKeys->ssup_collation = sortCollation;
+ ops->sortKeys->ssup_nulls_first = nullsFirstFlag;
+
+ /*
+ * Abbreviation is possible here only for by-reference types. In theory,
+ * a pass-by-value datatype could have an abbreviated form that is cheaper
+ * to compare. In a tuple sort, we could support that, because we can
+ * always extract the original datum from the tuple as needed. Here, we
+ * can't, because a datum sort only stores a single copy of the datum; the
+ * "tuple" field of each SortTuple is NULL.
+ */
+ ops->sortKeys->abbreviate = !typbyval;
+
+ PrepareSortSupportFromOrderingOp(sortOperator, ops->sortKeys);
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (!ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) ops->arg;
+ SortTuple stup;
+ MinimalTuple tuple;
+ HeapTupleData htup;
+
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup.datum1 = heap_getattr(&htup,
+ ops->sortKeys[0].ssup_attno,
+ tupDesc,
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
+{
+ SortTuple stup;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+
+ /*
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
+ */
+ if (ops->haveDatum1)
+ {
+ stup.datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup.isnull1);
+ }
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Collect one index tuple while collecting input data for sort, building
+ * it from caller-supplied values.
+ */
+void
+tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
+ ItemPointer self, Datum *values,
+ bool *isnull)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ SortTuple stup;
+ IndexTuple tuple;
+
+ stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
+ tuple = ((IndexTuple) stup.tuple);
+ tuple->t_tid = *self;
+ /* set up first-column key value */
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one Datum while collecting input data for sort.
+ *
+ * If the Datum is pass-by-ref type, the value will be copied.
+ */
+void
+tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
+ SortTuple stup;
+
+ /*
+ * Pass-by-value types or null values are just stored directly in
+ * stup.datum1 (and stup.tuple is not used and set to NULL).
+ *
+ * Non-null pass-by-reference values need to be copied into memory we
+ * control, and possibly abbreviated. The copied value is pointed to by
+ * stup.tuple and is treated as the canonical copy (e.g. to return via
+ * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
+ * abbreviated value if abbreviation is happening, otherwise it's
+ * identical to stup.tuple.
+ */
+
+ if (isNull || !ops->tuples)
+ {
+ /*
+ * Set datum1 to zeroed representation for NULLs (to be consistent,
+ * and to support cheap inequality tests for NULL abbreviated keys).
+ */
+ stup.datum1 = !isNull ? val : (Datum) 0;
+ stup.isnull1 = isNull;
+ stup.tuple = NULL; /* no separate storage */
+ }
+ else
+ {
+ stup.isnull1 = false;
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
+ }
+
+ tuplesort_puttuple_common(state, &stup);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * If successful, put tuple in slot and return true; else, clear the slot
+ * and return false.
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value in leading attribute will set abbreviated value to zeroed
+ * representation, which caller may rely on in abbreviated inequality check.
+ *
+ * If copy is true, the slot receives a tuple that's been copied into the
+ * caller's memory context, so that it will stay valid regardless of future
+ * manipulations of the tuplesort's state (up to and including deleting the
+ * tuplesort). If copy is false, the slot will just receive a pointer to a
+ * tuple held within the tuplesort, which is more efficient, but only safe for
+ * callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
+ */
+bool
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
+ TupleTableSlot *slot, Datum *abbrev)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (stup.tuple)
+ {
+ /* Record abbreviated key for caller */
+ if (ops->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
+ return true;
+ }
+ else
+ {
+ ExecClearTuple(slot);
+ return false;
+ }
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+HeapTuple
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return stup.tuple;
+}
+
+/*
+ * Fetch the next index tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+IndexTuple
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return (IndexTuple) stup.tuple;
+}
+
+/*
+ * Fetch the next Datum in either forward or back direction.
+ * Returns false if no more datums.
+ *
+ * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value will have a zeroed abbreviated value representation, which caller
+ * may rely on in abbreviated inequality check.
+ */
+bool
+tuplesort_getdatum(Tuplesortstate *state, bool forward,
+ Datum *val, bool *isNull, Datum *abbrev)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
+ MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ {
+ MemoryContextSwitchTo(oldcontext);
+ return false;
+ }
+
+ /* Ensure we copy into caller's memory context */
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Record abbreviated key for caller */
+ if (ops->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (stup.isnull1 || !ops->tuples)
+ {
+ *val = stup.datum1;
+ *isNull = stup.isnull1;
+ }
+ else
+ {
+ /* use stup.tuple because stup.datum1 may be an abbreviation */
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
+ *isNull = false;
+ }
+
+ return true;
+}
+
+
+/*
+ * Routines specialized for HeapTuple (actually MinimalTuple) case
+ */
+
+static void
+getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ stup->datum1 = heap_getattr(&htup,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
+ &stup->isnull1);
+
+}
+
+static int
+comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ SortSupport sortKey = ops->sortKeys;
+ HeapTupleData ltup;
+ HeapTupleData rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ AttrNumber attno;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
+ rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
+ tupDesc = (TupleDesc) ops->arg;
+
+ if (sortKey->abbrev_converter)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ sortKey++;
+ for (nkey = 1; nkey < ops->nKeys; nkey++, sortKey++)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ return 0;
+}
+
+static void
+writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ MinimalTuple tuple = (MinimalTuple) stup->tuple;
+
+ /* the part of the MinimalTuple we'll write: */
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
+
+ /* total on-disk footprint: */
+ unsigned int tuplen = tupbodylen + sizeof(int);
+
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ unsigned int tupbodylen = len - sizeof(int);
+ unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
+ MinimalTuple tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ HeapTupleData htup;
+
+ /* read in the tuple proper */
+ tuple->t_len = tuplen;
+ LogicalTapeReadExact(tape, tupbody, tupbodylen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup->datum1 = heap_getattr(&htup,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for the CLUSTER case (HeapTuple data, with
+ * comparisons per a btree index definition)
+ */
+
+static void
+getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ HeapTuple tup;
+
+ tup = (HeapTuple) stup->tuple;
+ stup->datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static int
+comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
+ HeapTuple ltup;
+ HeapTuple rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ /* Be prepared to compare additional sort keys */
+ ltup = (HeapTuple) a->tuple;
+ rtup = (HeapTuple) b->tuple;
+ tupDesc = arg->tupDesc;
+
+ /* Compare the leading sort key, if it's simple */
+ if (ops->haveDatum1)
+ {
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ if (sortKey->abbrev_converter)
+ {
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
+
+ datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ }
+ if (compare != 0 || ops->nKeys == 1)
+ return compare;
+ /* Compare additional columns the hard way */
+ sortKey++;
+ nkey = 1;
+ }
+ else
+ {
+ /* Must compare all keys the hard way */
+ nkey = 0;
+ }
+
+ if (arg->indexInfo->ii_Expressions == NULL)
+ {
+ /* If not expression index, just compare the proper heap attrs */
+
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
+ {
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
+
+ datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+ else
+ {
+ /*
+ * In the expression index case, compute the whole index tuple and
+ * then compare values. It would perhaps be faster to compute only as
+ * many columns as we need to compare, but that would require
+ * duplicating all the logic in FormIndexDatum.
+ */
+ Datum l_index_values[INDEX_MAX_KEYS];
+ bool l_index_isnull[INDEX_MAX_KEYS];
+ Datum r_index_values[INDEX_MAX_KEYS];
+ bool r_index_isnull[INDEX_MAX_KEYS];
+ TupleTableSlot *ecxt_scantuple;
+
+ /* Reset context each time to prevent memory leakage */
+ ResetPerTupleExprContext(arg->estate);
+
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
+
+ ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ l_index_values, l_index_isnull);
+
+ ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ r_index_values, r_index_isnull);
+
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
+ {
+ compare = ApplySortComparator(l_index_values[nkey],
+ l_index_isnull[nkey],
+ r_index_values[nkey],
+ r_index_isnull[nkey],
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+
+ return 0;
+}
+
+static void
+writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ HeapTuple tuple = (HeapTuple) stup->tuple;
+ unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
+
+ /* We need to store t_self, but not other fields of HeapTupleData */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
+ LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int tuplen)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
+ HeapTuple tuple = (HeapTuple) tuplesort_readtup_alloc(state,
+ t_len + HEAPTUPLESIZE);
+
+ /* Reconstruct the HeapTupleData header */
+ tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
+ tuple->t_len = t_len;
+ LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
+ /* We don't currently bother to reconstruct t_tableOid */
+ tuple->t_tableOid = InvalidOid;
+ /* Read in the tuple body */
+ LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value, if it's a simple column */
+ if (ops->haveDatum1)
+ stup->datum1 = heap_getattr(tuple,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
+/*
+ * Routines specialized for IndexTuple case
+ *
+ * The btree and hash cases require separate comparison functions, but the
+ * IndexTuple representation is the same so the copy/write/read support
+ * functions can be shared.
+ */
+
+static void
+getdatum1_index(Tuplesortstate *state, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
+ IndexTuple tuple;
+
+ tuple = stup->tuple;
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+static int
+comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ /*
+ * This is similar to comparetup_heap(), but expects index tuples. There
+ * is also special handling for enforcing uniqueness, and special
+ * treatment for equal keys at the end.
+ */
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexBTreeArg *arg = (TupleSortIndexBTreeArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ int keysz;
+ TupleDesc tupDes;
+ bool equal_hasnull = false;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+ keysz = ops->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
+
+ if (sortKey->abbrev_converter)
+ {
+ datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ /* they are equal, so we only need to examine one null flag */
+ if (a->isnull1)
+ equal_hasnull = true;
+
+ sortKey++;
+ for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
+ {
+ datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare; /* done when we find unequal attributes */
+
+ /* they are equal, so we only need to examine one null flag */
+ if (isnull1)
+ equal_hasnull = true;
+ }
+
+ /*
+ * If btree has asked us to enforce uniqueness, complain if two equal
+ * tuples are detected (unless there was at least one NULL field and NULLS
+ * NOT DISTINCT was not set).
+ *
+ * It is sufficient to make the test here, because if two tuples are equal
+ * they *must* get compared at some stage of the sort --- otherwise the
+ * sort algorithm wouldn't have checked whether one must appear before the
+ * other.
+ */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
+ {
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ char *key_desc;
+
+ /*
+ * Some rather brain-dead implementations of qsort (such as the one in
+ * QNX 4) will sometimes call the comparison routine to compare a
+ * value to itself, but we always use our own implementation, which
+ * does not.
+ */
+ Assert(tuple1 != tuple2);
+
+ index_deform_tuple(tuple1, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static int
+comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ Bucket bucket1;
+ Bucket bucket2;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexHashArg *arg = (TupleSortIndexHashArg *) ops->arg;
+
+ /*
+ * Fetch hash keys and mask off bits we don't want to sort by. We know
+ * that the first column of the index tuple is the hash key.
+ */
+ Assert(!a->isnull1);
+ bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ Assert(!b->isnull1);
+ bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ if (bucket1 > bucket2)
+ return 1;
+ else if (bucket1 < bucket2)
+ return -1;
+
+ /*
+ * If hash values are equal, we sort on ItemPointer. This does not affect
+ * validity of the finished index, but it may be useful to have index
+ * scans in physical order.
+ */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static void
+writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ IndexTuple tuple = (IndexTuple) stup->tuple;
+ unsigned int tuplen;
+
+ tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
+ unsigned int tuplen = len - sizeof(unsigned int);
+ IndexTuple tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, tuple, tuplen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for DatumTuple case
+ */
+
+static void
+getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
+{
+ stup->datum1 = PointerGetDatum(stup->tuple);
+}
+
+static int
+comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ int compare;
+
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ ops->sortKeys);
+ if (compare != 0)
+ return compare;
+
+ /* if we have abbreviations, then "tuple" has the original value */
+
+ if (ops->sortKeys->abbrev_converter)
+ compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
+ PointerGetDatum(b->tuple), b->isnull1,
+ ops->sortKeys);
+
+ return compare;
+}
+
+static void
+writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
+ void *waddr;
+ unsigned int tuplen;
+ unsigned int writtenlen;
+
+ if (stup->isnull1)
+ {
+ waddr = NULL;
+ tuplen = 0;
+ }
+ else if (!ops->tuples)
+ {
+ waddr = &stup->datum1;
+ tuplen = sizeof(Datum);
+ }
+ else
+ {
+ waddr = stup->tuple;
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
+ Assert(tuplen != 0);
+ }
+
+ writtenlen = tuplen + sizeof(unsigned int);
+
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+ LogicalTapeWrite(tape, waddr, tuplen);
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+}
+
+static void
+readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ unsigned int tuplen = len - sizeof(unsigned int);
+
+ if (tuplen == 0)
+ {
+ /* it's NULL */
+ stup->datum1 = (Datum) 0;
+ stup->isnull1 = true;
+ stup->tuple = NULL;
+ }
+ else if (!ops->tuples)
+ {
+ Assert(tuplen == sizeof(Datum));
+ LogicalTapeReadExact(tape, &stup->datum1, tuplen);
+ stup->isnull1 = false;
+ stup->tuple = NULL;
+ }
+ else
+ {
+ void *raddr = tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, raddr, tuplen);
+ stup->datum1 = PointerGetDatum(raddr);
+ stup->isnull1 = false;
+ stup->tuple = raddr;
+ }
+
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ * word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+}
+
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 364cf132fcb..fabb13c4463 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -24,7 +24,9 @@
#include "access/itup.h"
#include "executor/tuptable.h"
#include "storage/dsm.h"
+#include "utils/logtape.h"
#include "utils/relcache.h"
+#include "utils/sortsupport.h"
/*
@@ -102,6 +104,130 @@ typedef struct TuplesortInstrumentation
int64 spaceUsed; /* space consumption, in kB */
} TuplesortInstrumentation;
+/*
+ * The objects we actually sort are SortTuple structs. These contain
+ * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
+ * which is a separate palloc chunk --- we assume it is just one chunk and
+ * can be freed by a simple pfree() (except during merge, when we use a
+ * simple slab allocator). SortTuples also contain the tuple's first key
+ * column in Datum/nullflag format, and a source/input tape number that
+ * tracks which tape each heap element/slot belongs to during merging.
+ *
+ * Storing the first key column lets us save heap_getattr or index_getattr
+ * calls during tuple comparisons. We could extract and save all the key
+ * columns not just the first, but this would increase code complexity and
+ * overhead, and wouldn't actually save any comparison cycles in the common
+ * case where the first key determines the comparison result. Note that
+ * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
+ *
+ * There is one special case: when the sort support infrastructure provides an
+ * "abbreviated key" representation, where the key is (typically) a pass by
+ * value proxy for a pass by reference type. In this case, the abbreviated key
+ * is stored in datum1 in place of the actual first key column.
+ *
+ * When sorting single Datums, the data value is represented directly by
+ * datum1/isnull1 for pass by value types (or null values). If the datatype is
+ * pass-by-reference and isnull1 is false, then "tuple" points to a separately
+ * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
+ * either the same pointer as "tuple", or is an abbreviated key value as
+ * described above. Accordingly, "tuple" is always used in preference to
+ * datum1 as the authoritative value for pass-by-reference cases.
+ */
+typedef struct
+{
+ void *tuple; /* the tuple itself */
+ Datum datum1; /* value of first key column */
+ bool isnull1; /* is first key column NULL? */
+ int srctape; /* source tape number */
+} SortTuple;
+
+typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+
+typedef struct
+{
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
+ /*
+ * These function pointers decouple the routines that must know what kind
+ * of tuple we are sorting from the routines that don't need to know it.
+ * They are set up by the tuplesort_begin_xxx routines.
+ *
+ * Function to compare two tuples; result is per qsort() convention, ie:
+ * <0, 0, >0 according as a<b, a=b, a>b. The API must match
+ * qsort_arg_comparator.
+ */
+ SortTupleComparator comparetup;
+
+ void (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
+
+ /*
+ * Function to write a stored tuple onto tape. The representation of the
+ * tuple on tape need not be the same as it is in memory; requirements on
+ * the tape representation are given below. Unless the slab allocator is
+ * used, after writing the tuple, pfree() the out-of-line data (not the
+ * SortTuple struct!), and increase state->availMem by the amount of
+ * memory space thereby released.
+ */
+ void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+
+ /*
+ * Function to read a stored tuple from tape back into memory. 'len' is
+ * the already-read length of the stored tuple. The tuple is allocated
+ * from the slab memory arena, or is palloc'd, see tuplesort_readtup_alloc().
+ */
+ void (*readtup) (Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * Whether SortTuple's datum1 and isnull1 members are maintained by the
+ * above routines. If not, some sort specializations are disabled.
+ */
+ bool haveDatum1;
+
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg;
+} TuplesortOps;
+
+/* Sort parallel code from state for sort__start probes */
+#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
+ !(coordinate)->isWorker ? 1 : 2)
+
+#define TuplesortstateGetOps(state) ((TuplesortOps *) state)
+
+/* When using this macro, beware of double evaluation of len */
+#define LogicalTapeReadExact(tape, ptr, len) \
+ do { \
+ if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
+ elog(ERROR, "unexpected end of data"); \
+ } while(0)
/*
* We provide multiple interfaces to what is essentially the same code,
@@ -205,6 +331,49 @@ typedef struct TuplesortInstrumentation
* generated (typically, caller uses a parallel heap scan).
*/
+
+extern Tuplesortstate *tuplesort_begin_common(int workMem,
+ SortCoordinate coordinate,
+ int sortopt);
+extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
+extern bool tuplesort_used_bound(Tuplesortstate *state);
+extern void tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+extern void tuplesort_performsort(Tuplesortstate *state);
+extern bool tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
+ SortTuple *stup);
+extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
+ bool forward);
+extern void tuplesort_end(Tuplesortstate *state);
+extern void tuplesort_reset(Tuplesortstate *state);
+
+extern void tuplesort_get_stats(Tuplesortstate *state,
+ TuplesortInstrumentation *stats);
+extern const char *tuplesort_method_name(TuplesortMethod m);
+extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
+
+extern int tuplesort_merge_order(int64 allowedMem);
+
+extern Size tuplesort_estimate_shared(int nworkers);
+extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
+ dsm_segment *seg);
+extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
+
+/*
+ * These routines may only be called if randomAccess was specified 'true'.
+ * Likewise, backwards scan in gettuple/getdatum is only allowed if
+ * randomAccess was specified. Note that parallel sorts do not support
+ * randomAccess.
+ */
+
+extern void tuplesort_rescan(Tuplesortstate *state);
+extern void tuplesort_markpos(Tuplesortstate *state);
+extern void tuplesort_restorepos(Tuplesortstate *state);
+
+extern void *tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen);
+
+
+/* tuplesortops.c */
+
extern Tuplesortstate *tuplesort_begin_heap(TupleDesc tupDesc,
int nkeys, AttrNumber *attNums,
Oid *sortOperators, Oid *sortCollations,
@@ -238,9 +407,6 @@ extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
-extern bool tuplesort_used_bound(Tuplesortstate *state);
-
extern void tuplesort_puttupleslot(Tuplesortstate *state,
TupleTableSlot *slot);
extern void tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup);
@@ -250,8 +416,6 @@ extern void tuplesort_putindextuplevalues(Tuplesortstate *state,
extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
bool isNull);
-extern void tuplesort_performsort(Tuplesortstate *state);
-
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
@@ -259,34 +423,5 @@ extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
-extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
- bool forward);
-
-extern void tuplesort_end(Tuplesortstate *state);
-
-extern void tuplesort_reset(Tuplesortstate *state);
-
-extern void tuplesort_get_stats(Tuplesortstate *state,
- TuplesortInstrumentation *stats);
-extern const char *tuplesort_method_name(TuplesortMethod m);
-extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
-
-extern int tuplesort_merge_order(int64 allowedMem);
-
-extern Size tuplesort_estimate_shared(int nworkers);
-extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
- dsm_segment *seg);
-extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
-
-/*
- * These routines may only be called if randomAccess was specified 'true'.
- * Likewise, backwards scan in gettuple/getdatum is only allowed if
- * randomAccess was specified. Note that parallel sorts do not support
- * randomAccess.
- */
-
-extern void tuplesort_rescan(Tuplesortstate *state);
-extern void tuplesort_markpos(Tuplesortstate *state);
-extern void tuplesort_restorepos(Tuplesortstate *state);
#endif /* TUPLESORT_H */
--
2.30.2
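
For what it's worth, here is roughly what an extension-side sort could look like once the pieces above are in place. This is only a sketch against the interfaces as they appear in the attached patches (TuplesortOps, tuplesort_begin_common(), tuplesort_puttuple_common() and friends); the myext_* names and the plain-uint32 payload are invented for illustration, and the memory-context handling and the leading on-tape length word simply mimic the datum case.

/* Hypothetical extension-side sort of bare uint32 keys (sketch only). */
#include "postgres.h"

#include "utils/tuplesort.h"

static int
comparetup_myext(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
    uint32      x = DatumGetUInt32(a->datum1);
    uint32      y = DatumGetUInt32(b->datum1);

    return (x > y) - (x < y);
}

static void
writetup_myext(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
    uint32      val = DatumGetUInt32(stup->datum1);
    unsigned int writtenlen = sizeof(val) + sizeof(unsigned int);

    /* leading length word includes itself, as in writetup_datum() */
    LogicalTapeWrite(tape, &writtenlen, sizeof(writtenlen));
    LogicalTapeWrite(tape, &val, sizeof(val));
    /* no trailing length word: we never request TUPLESORT_RANDOMACCESS */
}

static void
readtup_myext(Tuplesortstate *state, SortTuple *stup,
              LogicalTape *tape, unsigned int len)
{
    uint32      val;

    Assert(len == sizeof(unsigned int) + sizeof(uint32));
    LogicalTapeReadExact(tape, &val, sizeof(val));
    stup->datum1 = UInt32GetDatum(val);
    stup->isnull1 = false;
    stup->tuple = NULL;         /* nothing out-of-line to manage */
}

Tuplesortstate *
myext_begin_sort(int workMem)
{
    Tuplesortstate *state = tuplesort_begin_common(workMem, NULL,
                                                   TUPLESORT_NONE);
    TuplesortOps *ops = TuplesortstateGetOps(state);

    ops->nKeys = 1;
    ops->comparetup = comparetup_myext;
    ops->writetup = writetup_myext;
    ops->readtup = readtup_myext;
    ops->haveDatum1 = true;
    ops->tuples = false;        /* keys live entirely in datum1 */
    /* no sortKeys/SortSupport: abbreviation is not used here */

    return state;
}

void
myext_put(Tuplesortstate *state, uint32 key)
{
    TuplesortOps *ops = TuplesortstateGetOps(state);
    MemoryContext oldcontext = MemoryContextSwitchTo(ops->tuplecontext);
    SortTuple   stup;

    stup.tuple = NULL;
    stup.datum1 = UInt32GetDatum(key);
    stup.isnull1 = false;

    tuplesort_puttuple_common(state, &stup);

    MemoryContextSwitchTo(oldcontext);
}

bool
myext_get(Tuplesortstate *state, uint32 *key)
{
    TuplesortOps *ops = TuplesortstateGetOps(state);
    MemoryContext oldcontext = MemoryContextSwitchTo(ops->sortcontext);
    SortTuple   stup;
    bool        found = tuplesort_gettuple_common(state, true, &stup);

    MemoryContextSwitchTo(oldcontext);

    if (found)
        *key = DatumGetUInt32(stup.datum1);
    return found;
}

The extension supplies only the format-specific callbacks; run building, merging, spilling to tape and memory accounting stay inside tuplesort.c. After loading, the caller runs tuplesort_performsort() and drains the result with myext_get() before calling tuplesort_end().
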
v2-0002-Tuplesortstate.getdatum1-method.patch (text/x-patch; charset=US-ASCII)
From 1d78e271b22d7c6a1557defbe15ea5039ff28510 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH v2 2/6] Tuplesortstate.getdatum1 method
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 100 ++++++++++++++++++-----------
1 file changed, 64 insertions(+), 36 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 0114855c83c..c649043fbb0 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,6 +279,8 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
+ void (*getdatum1) (Tuplesortstate *state, SortTuple *stup);
+
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -540,6 +542,7 @@ struct Sharedsort
pfree(buf); \
} while(0)
+#define GETDATUM1(state,stup) ((*(state)->getdatum1) (state, stup))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
@@ -629,6 +632,10 @@ static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
+static void getdatum1_heap(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_cluster(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_index(Tuplesortstate *state, SortTuple *stup);
+static void getdatum1_datum(Tuplesortstate *state, SortTuple *stup);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
@@ -1042,6 +1049,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_heap;
state->comparetup = comparetup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
@@ -1117,6 +1125,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_cluster;
state->comparetup = comparetup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
@@ -1221,6 +1230,7 @@ tuplesort_begin_index_btree(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1297,6 +1307,7 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
+ state->getdatum1 = getdatum1_index;
state->comparetup = comparetup_index_hash;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1337,6 +1348,7 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ state->getdatum1 = getdatum1_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1400,6 +1412,7 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->getdatum1 = getdatum1_datum;
state->comparetup = comparetup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
@@ -1872,19 +1885,7 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
puttuple_common(state, &stup);
@@ -1957,15 +1958,7 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tup = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
}
@@ -2035,15 +2028,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = mtup->tuple;
- mtup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &mtup->isnull1);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
puttuple_common(state, &stup);
@@ -2122,11 +2107,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* case).
*/
for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- mtup->datum1 = PointerGetDatum(mtup->tuple);
- }
+ GETDATUM1(state, &state->memtuples[i]);
}
}
@@ -3983,6 +3964,23 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
* Routines specialized for HeapTuple (actually MinimalTuple) case
*/
+static void
+getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
+{
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ stup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup->isnull1);
+
+}
+
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
@@ -4101,6 +4099,18 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
* comparisons per a btree index definition)
*/
+static void
+getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
+{
+ HeapTuple tup;
+
+ tup = (HeapTuple) stup->tuple;
+ stup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup->isnull1);
+}
+
static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4270,6 +4280,18 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
* functions can be shared.
*/
+static void
+getdatum1_index(Tuplesortstate *state, SortTuple *stup)
+{
+ IndexTuple tuple;
+
+ tuple = stup->tuple;
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup->isnull1);
+}
+
static int
comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4502,6 +4524,12 @@ readtup_index(Tuplesortstate *state, SortTuple *stup,
* Routines specialized for DatumTuple case
*/
+static void
+getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
+{
+ stup->datum1 = PointerGetDatum(stup->tuple);
+}
+
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
--
2.30.2
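
[Illustration only, not part of the attached patches: assuming the SortTuple
struct and the getdatum1() callback signature introduced in v2-0002 end up
visible to extensions (as the later patches in the series intend), an
extension supplying its own tuple format would implement the callback along
the following lines. The "MyTuple" format and every My*-named field below are
invented for this sketch; only the signature comes from the patch.]

#include "postgres.h"
#include "utils/tuplesort.h"

/* Hypothetical extension-side tuple format (illustrative only) */
typedef struct MyTuple
{
	Datum		key;			/* leading sort column */
	bool		keyisnull;
	/* ... payload would follow ... */
} MyTuple;

static void
getdatum1_mytuple(Tuplesortstate *state, SortTuple *stup)
{
	MyTuple    *tup = (MyTuple *) stup->tuple;

	/*
	 * Recompute datum1/isnull1 from the stored tuple, mirroring what
	 * getdatum1_heap() does for MinimalTuples in v2-0002.
	 */
	stup->datum1 = tup->key;
	stup->isnull1 = tup->keyisnull;
}
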
v2-0005-Reorganize-data-structures.patch (text/x-patch; charset=US-ASCII)
From 3a0e1fa7c7e4da46a86f7d5b9dd0392549f3b460 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH v2 5/6] Reorganize data structures
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 762 ++++++++++++++++-------------
1 file changed, 432 insertions(+), 330 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 3bf990a1b34..e106e1ff9e2 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -126,8 +126,8 @@
#define CLUSTER_SORT 3
/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(state) ((state)->shared == NULL ? 0 : \
- (state)->worker >= 0 ? 1 : 2)
+#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker >= 0 ? 1 : 2)
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -236,37 +236,17 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
+typedef struct TuplesortOps TuplesortOps;
+
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-/*
- * Private state of a Tuplesort operation.
- */
-struct Tuplesortstate
+struct TuplesortOps
{
- TupSortStatus status; /* enumerated value as shown above */
- int nKeys; /* number of columns in sort key */
- int sortopt; /* Bitmask of flags used to setup sort */
- bool bounded; /* did caller specify a maximum number of
- * tuples to return? */
- bool boundUsed; /* true if we made use of a bounded heap */
- int bound; /* if bounded, the maximum number of tuples */
- bool tuples; /* Can SortTuple.tuple ever be set? */
- int64 availMem; /* remaining memory available, in bytes */
- int64 allowedMem; /* total memory allowed, in bytes */
- int maxTapes; /* max number of input tapes to merge in each
- * pass */
- int64 maxSpace; /* maximum amount of space occupied among sort
- * of groups, either in-memory or on-disk */
- bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
- * space, false when it's value for in-memory
- * space */
- TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
MemoryContext maincontext; /* memory context for tuple sort metadata that
* persists across multiple batches */
MemoryContext sortcontext; /* memory context holding most sort data */
MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
- LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
/*
* These function pointers decouple the routines that must know what kind
@@ -300,12 +280,116 @@ struct Tuplesortstate
void (*readtup) (Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+ void (*freestate) (Tuplesortstate *state);
+
/*
* Whether SortTuple's datum1 and isnull1 members are maintained by the
* above routines. If not, some sort specializations are disabled.
*/
bool haveDatum1;
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg;
+};
+
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ /*
+ * These variables are specific to the CLUSTER case; they are set by
+ * tuplesort_begin_cluster.
+ */
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TupleSortClusterArg;
+
+typedef struct
+{
+ /*
+ * These variables are specific to the IndexTuple case; they are set by
+ * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TupleSortIndexArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_btree subcase: */
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TupleSortIndexBTreeArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /* These are specific to the index_hash subcase: */
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TupleSortIndexHashArg;
+
+typedef struct
+{
+ TupleSortIndexArg index;
+
+ /*
+ * These variables are specific to the Datum case; they are set by
+ * tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TupleSortDatumArg;
+
+/*
+ * Private state of a Tuplesort operation.
+ */
+struct Tuplesortstate
+{
+ TuplesortOps ops;
+ TupSortStatus status; /* enumerated value as shown above */
+ bool bounded; /* did caller specify a maximum number of
+ * tuples to return? */
+ bool boundUsed; /* true if we made use of a bounded heap */
+ int bound; /* if bounded, the maximum number of tuples */
+ int64 availMem; /* remaining memory available, in bytes */
+ int64 allowedMem; /* total memory allowed, in bytes */
+ int maxTapes; /* max number of input tapes to merge in each
+ * pass */
+ int64 maxSpace; /* maximum amount of space occupied among sort
+ * of groups, either in-memory or on-disk */
+ bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
+ * space, false when it's value for in-memory
+ * space */
+ TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
+ LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
+
/*
* This array holds the tuples now in sort memory. If we are in state
* INITIAL, the tuples are in no particular order; if we are in state
@@ -420,24 +504,6 @@ struct Tuplesortstate
Sharedsort *shared;
int nParticipants;
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- TupleDesc tupDesc;
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
/*
* Additional state for managing "abbreviated key" sortsupport routines
* (which currently may be used by all cases except the hash index case).
@@ -447,37 +513,6 @@ struct Tuplesortstate
int64 abbrevNext; /* Tuple # at which to next check
* applicability */
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-
/*
* Resource snapshot for time of sort start.
*/
@@ -542,10 +577,13 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define GETDATUM1(state,stup) ((*(state)->getdatum1) (state, stup))
-#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
-#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
+#define TuplesortstateGetOps(state) ((TuplesortOps *) (state))
+
+#define GETDATUM1(state,stup) ((*(state)->ops.getdatum1) (state, stup))
+#define COMPARETUP(state,a,b) ((*(state)->ops.comparetup) (a, b, state))
+#define WRITETUP(state,tape,stup) ((*(state)->ops.writetup) (state, tape, stup))
+#define READTUP(state,stup,tape,len) ((*(state)->ops.readtup) (state, stup, tape, len))
+#define FREESTATE(state) ((state)->ops.freestate ? (*(state)->ops.freestate) (state) : (void) 0)
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
#define FREEMEM(state,amt) ((state)->availMem += (amt))
@@ -664,6 +702,7 @@ static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -694,7 +733,7 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyUnsignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->ops.sortKeys[0]);
if (compare != 0)
return compare;
@@ -702,10 +741,10 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->ops.comparetup(a, b, state);
}
#if SIZEOF_DATUM >= 8
@@ -717,7 +756,7 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplySignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->ops.sortKeys[0]);
if (compare != 0)
return compare;
@@ -726,10 +765,10 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->ops.comparetup(a, b, state);
}
#endif
@@ -741,7 +780,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyInt32SortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->ops.sortKeys[0]);
if (compare != 0)
return compare;
@@ -750,10 +789,10 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->ops.comparetup(a, b, state);
}
/*
@@ -880,8 +919,9 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
pg_rusage_init(&state->ru_start);
#endif
- state->sortopt = sortopt;
- state->tuples = true;
+ state->ops.sortopt = sortopt;
+ state->ops.tuples = true;
+ state->abbrevNext = 10;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -890,8 +930,8 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
* with very little memory.
*/
state->allowedMem = Max(workMem, 64) * (int64) 1024;
- state->sortcontext = sortcontext;
- state->maincontext = maincontext;
+ state->ops.sortcontext = sortcontext;
+ state->ops.maincontext = maincontext;
/*
* Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
@@ -950,7 +990,7 @@ tuplesort_begin_batch(Tuplesortstate *state)
{
MemoryContext oldcontext;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
/*
* Caller tuple (e.g. IndexTuple) memory context.
@@ -965,12 +1005,12 @@ tuplesort_begin_batch(Tuplesortstate *state)
* generation.c context as this keeps allocations more compact with less
* wastage. Allocations are also slightly more CPU efficient.
*/
- if (state->sortopt & TUPLESORT_ALLOWBOUNDED)
- state->tuplecontext = AllocSetContextCreate(state->sortcontext,
+ if (state->ops.sortopt & TUPLESORT_ALLOWBOUNDED)
+ state->ops.tuplecontext = AllocSetContextCreate(state->ops.sortcontext,
"Caller tuples",
ALLOCSET_DEFAULT_SIZES);
else
- state->tuplecontext = GenerationContextCreate(state->sortcontext,
+ state->ops.tuplecontext = GenerationContextCreate(state->ops.sortcontext,
"Caller tuples",
ALLOCSET_DEFAULT_SIZES);
@@ -1028,10 +1068,11 @@ tuplesort_begin_heap(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
AssertArg(nkeys > 0);
@@ -1042,30 +1083,28 @@ tuplesort_begin_heap(TupleDesc tupDesc,
nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = nkeys;
+ ops->nKeys = nkeys;
TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
false, /* no unique check */
nkeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_heap;
- state->comparetup = comparetup_heap;
- state->writetup = writetup_heap;
- state->readtup = readtup_heap;
- state->haveDatum1 = true;
-
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
- state->abbrevNext = 10;
+ ops->getdatum1 = getdatum1_heap;
+ ops->comparetup = comparetup_heap;
+ ops->writetup = writetup_heap;
+ ops->readtup = readtup_heap;
+ ops->haveDatum1 = true;
+ ops->arg = tupDesc; /* assume we need not copy tupDesc */
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
for (i = 0; i < nkeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
AssertArg(attNums[i] != 0);
AssertArg(sortOperators[i] != 0);
@@ -1075,7 +1114,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortKey->ssup_nulls_first = nullsFirstFlags[i];
sortKey->ssup_attno = attNums[i];
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
}
@@ -1086,8 +1125,8 @@ tuplesort_begin_heap(TupleDesc tupDesc,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (nkeys == 1 && !state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (nkeys == 1 && !ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1102,13 +1141,16 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
BTScanInsert indexScanKey;
MemoryContext oldcontext;
+ TupleSortClusterArg *arg;
int i;
Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortClusterArg *) palloc0(sizeof(TupleSortClusterArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1118,37 +1160,38 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
false, /* no unique check */
- state->nKeys,
+ ops->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_cluster;
- state->comparetup = comparetup_cluster;
- state->writetup = writetup_cluster;
- state->readtup = readtup_cluster;
- state->abbrevNext = 10;
+ ops->getdatum1 = getdatum1_cluster;
+ ops->comparetup = comparetup_cluster;
+ ops->writetup = writetup_cluster;
+ ops->readtup = readtup_cluster;
+ ops->freestate = freestate_cluster;
+ ops->arg = arg;
- state->indexInfo = BuildIndexInfo(indexRel);
+ arg->indexInfo = BuildIndexInfo(indexRel);
/*
* If we don't have a simple leading attribute, we don't currently
* initialize datum1, so disable optimizations that require it.
*/
- if (state->indexInfo->ii_IndexAttrNumbers[0] == 0)
- state->haveDatum1 = false;
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ ops->haveDatum1 = false;
else
- state->haveDatum1 = true;
+ ops->haveDatum1 = true;
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
indexScanKey = _bt_mkscankey(indexRel, NULL);
- if (state->indexInfo->ii_Expressions != NULL)
+ if (arg->indexInfo->ii_Expressions != NULL)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -1159,19 +1202,19 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
* TupleTableSlot to put the table tuples into. The econtext's
* scantuple has to point to that slot, too.
*/
- state->estate = CreateExecutorState();
+ arg->estate = CreateExecutorState();
slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(state->estate);
+ econtext = GetPerTupleExprContext(arg->estate);
econtext->ecxt_scantuple = slot;
}
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < ops->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1181,7 +1224,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1209,11 +1252,14 @@ tuplesort_begin_index_btree(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
BTScanInsert indexScanKey;
+ TupleSortIndexBTreeArg *arg;
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1223,36 +1269,36 @@ tuplesort_begin_index_btree(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
enforceUnique,
state->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->abbrevNext = 10;
- state->haveDatum1 = true;
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
- state->enforceUnique = enforceUnique;
- state->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
indexScanKey = _bt_mkscankey(indexRel, NULL);
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < ops->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1262,7 +1308,7 @@ tuplesort_begin_index_btree(Relation heapRel,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1291,9 +1337,12 @@ tuplesort_begin_index_hash(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MemoryContext oldcontext;
+ TupleSortIndexHashArg *arg;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortIndexHashArg *) palloc(sizeof(TupleSortIndexHashArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1307,20 +1356,21 @@ tuplesort_begin_index_hash(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* Only one sort column, the hash code */
+ ops->nKeys = 1; /* Only one sort column, the hash code */
- state->getdatum1 = getdatum1_index;
- state->comparetup = comparetup_index_hash;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_hash;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
- state->high_mask = high_mask;
- state->low_mask = low_mask;
- state->max_buckets = max_buckets;
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
MemoryContextSwitchTo(oldcontext);
@@ -1336,10 +1386,13 @@ tuplesort_begin_index_gist(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MemoryContext oldcontext;
+ TupleSortIndexBTreeArg *arg;
int i;
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortIndexBTreeArg *) palloc(sizeof(TupleSortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1348,31 +1401,34 @@ tuplesort_begin_index_gist(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ ops->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
- state->getdatum1 = getdatum1_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ ops->getdatum1 = getdatum1_index;
+ ops->comparetup = comparetup_index_btree;
+ ops->writetup = writetup_index;
+ ops->readtup = readtup_index;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(ops->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < ops->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = ops->sortKeys + i;
sortKey->ssup_cxt = CurrentMemoryContext;
sortKey->ssup_collation = indexRel->rd_indcollation[i];
sortKey->ssup_nulls_first = false;
sortKey->ssup_attno = i + 1;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && ops->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1392,11 +1448,14 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg;
MemoryContext oldcontext;
int16 typlen;
bool typbyval;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.maincontext);
+ arg = (TupleSortDatumArg *) palloc(sizeof(TupleSortDatumArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1405,35 +1464,36 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* always a one-column sort */
+ ops->nKeys = 1; /* always a one-column sort */
TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
false, /* no unique check */
1,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->getdatum1 = getdatum1_datum;
- state->comparetup = comparetup_datum;
- state->writetup = writetup_datum;
- state->readtup = readtup_datum;
+ ops->getdatum1 = getdatum1_datum;
+ ops->comparetup = comparetup_datum;
+ ops->writetup = writetup_datum;
+ ops->readtup = readtup_datum;
state->abbrevNext = 10;
- state->haveDatum1 = true;
+ ops->haveDatum1 = true;
+ ops->arg = arg;
- state->datumType = datumType;
+ arg->datumType = datumType;
/* lookup necessary attributes of the datum type */
get_typlenbyval(datumType, &typlen, &typbyval);
- state->datumTypeLen = typlen;
- state->tuples = !typbyval;
+ arg->datumTypeLen = typlen;
+ ops->tuples = !typbyval;
/* Prepare SortSupport data */
- state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+ ops->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
- state->sortKeys->ssup_cxt = CurrentMemoryContext;
- state->sortKeys->ssup_collation = sortCollation;
- state->sortKeys->ssup_nulls_first = nullsFirstFlag;
+ ops->sortKeys->ssup_cxt = CurrentMemoryContext;
+ ops->sortKeys->ssup_collation = sortCollation;
+ ops->sortKeys->ssup_nulls_first = nullsFirstFlag;
/*
* Abbreviation is possible here only for by-reference types. In theory,
@@ -1443,9 +1503,9 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* can't, because a datum sort only stores a single copy of the datum; the
* "tuple" field of each SortTuple is NULL.
*/
- state->sortKeys->abbreviate = !typbyval;
+ ops->sortKeys->abbreviate = !typbyval;
- PrepareSortSupportFromOrderingOp(sortOperator, state->sortKeys);
+ PrepareSortSupportFromOrderingOp(sortOperator, ops->sortKeys);
/*
* The "onlyKey" optimization cannot be used with abbreviated keys, since
@@ -1453,8 +1513,8 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (!state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (!ops->sortKeys->abbrev_converter)
+ ops->onlyKey = ops->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1479,7 +1539,7 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
/* Assert we're called before loading any tuples */
Assert(state->status == TSS_INITIAL && state->memtupcount == 0);
/* Assert we allow bounded sorts */
- Assert(state->sortopt & TUPLESORT_ALLOWBOUNDED);
+ Assert(state->ops.sortopt & TUPLESORT_ALLOWBOUNDED);
/* Can't set the bound twice, either */
Assert(!state->bounded);
/* Also, this shouldn't be called in a parallel worker */
@@ -1507,13 +1567,13 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
* optimization. Disable by setting state to be consistent with no
* abbreviation support.
*/
- state->sortKeys->abbrev_converter = NULL;
- if (state->sortKeys->abbrev_full_comparator)
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->ops.sortKeys->abbrev_converter = NULL;
+ if (state->ops.sortKeys->abbrev_full_comparator)
+ state->ops.sortKeys->comparator = state->ops.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->ops.sortKeys->abbrev_abort = NULL;
+ state->ops.sortKeys->abbrev_full_comparator = NULL;
}
/*
@@ -1536,7 +1596,7 @@ static void
tuplesort_free(Tuplesortstate *state)
{
/* context swap probably not needed, but let's be safe */
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
#ifdef TRACE_SORT
long spaceUsed;
@@ -1583,21 +1643,13 @@ tuplesort_free(Tuplesortstate *state)
TRACE_POSTGRESQL_SORT_DONE(state->tapeset != NULL, 0L);
#endif
- /* Free any execution state created for CLUSTER case */
- if (state->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(state->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(state->estate);
- }
-
+ FREESTATE(state);
MemoryContextSwitchTo(oldcontext);
/*
* Free the per-sort memory context, thereby releasing all working memory.
*/
- MemoryContextReset(state->sortcontext);
+ MemoryContextReset(state->ops.sortcontext);
}
/*
@@ -1618,7 +1670,7 @@ tuplesort_end(Tuplesortstate *state)
* Free the main memory context, including the Tuplesortstate struct
* itself.
*/
- MemoryContextDelete(state->maincontext);
+ MemoryContextDelete(state->ops.maincontext);
}
/*
@@ -1832,7 +1884,9 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleDesc tupDesc = (TupleDesc) ops->arg;
SortTuple stup;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1844,8 +1898,8 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup.datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ ops->sortKeys[0].ssup_attno,
+ tupDesc,
&stup.isnull1);
puttuple_common(state, &stup);
@@ -1862,7 +1916,9 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
@@ -1872,11 +1928,11 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* set up first-column key value, and potentially abbreviate, if it's a
* simple column
*/
- if (state->haveDatum1)
+ if (ops->haveDatum1)
{
stup.datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup.isnull1);
}
@@ -1894,9 +1950,11 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
ItemPointer self, Datum *values,
bool *isnull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
SortTuple stup;
IndexTuple tuple;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
tuple = ((IndexTuple) stup.tuple);
@@ -1904,7 +1962,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup.isnull1);
puttuple_common(state, &stup);
@@ -1920,7 +1978,9 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
void
tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.tuplecontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
SortTuple stup;
/*
@@ -1935,7 +1995,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* identical to stup.tuple.
*/
- if (isNull || !state->tuples)
+ if (isNull || !state->ops.tuples)
{
/*
* Set datum1 to zeroed representation for NULLs (to be consistent,
@@ -1948,7 +2008,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
else
{
stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
}
@@ -1963,15 +2023,15 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
Assert(!LEADER(state));
if (tuple->tuple != NULL)
USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
- if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
- !state->sortKeys->abbrev_converter || tuple->isnull1)
+ if (!state->ops.sortKeys || !state->ops.haveDatum1 || !state->ops.tuples ||
+ !state->ops.sortKeys->abbrev_converter || tuple->isnull1)
{
/*
* Store ordinary Datum representation, or NULL value. If there is a
@@ -1985,8 +2045,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
else if (!consider_abort_common(state))
{
/* Store abbreviated key representation */
- tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
- state->sortKeys);
+ tuple->datum1 = state->ops.sortKeys->abbrev_converter(tuple->datum1,
+ state->ops.sortKeys);
}
else
{
@@ -2130,9 +2190,9 @@ writetuple_common(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
static bool
consider_abort_common(Tuplesortstate *state)
{
- Assert(state->sortKeys[0].abbrev_converter != NULL);
- Assert(state->sortKeys[0].abbrev_abort != NULL);
- Assert(state->sortKeys[0].abbrev_full_comparator != NULL);
+ Assert(state->ops.sortKeys[0].abbrev_converter != NULL);
+ Assert(state->ops.sortKeys[0].abbrev_abort != NULL);
+ Assert(state->ops.sortKeys[0].abbrev_full_comparator != NULL);
/*
* Check effectiveness of abbreviation optimization. Consider aborting
@@ -2147,19 +2207,19 @@ consider_abort_common(Tuplesortstate *state)
* Check opclass-supplied abbreviation abort routine. It may indicate
* that abbreviation should not proceed.
*/
- if (!state->sortKeys->abbrev_abort(state->memtupcount,
- state->sortKeys))
+ if (!state->ops.sortKeys->abbrev_abort(state->memtupcount,
+ state->ops.sortKeys))
return false;
/*
* Finally, restore authoritative comparator, and indicate that
* abbreviation is not in play by setting abbrev_converter to NULL
*/
- state->sortKeys[0].comparator = state->sortKeys[0].abbrev_full_comparator;
- state->sortKeys[0].abbrev_converter = NULL;
+ state->ops.sortKeys[0].comparator = state->ops.sortKeys[0].abbrev_full_comparator;
+ state->ops.sortKeys[0].abbrev_converter = NULL;
/* Not strictly necessary, but be tidy */
- state->sortKeys[0].abbrev_abort = NULL;
- state->sortKeys[0].abbrev_full_comparator = NULL;
+ state->ops.sortKeys[0].abbrev_abort = NULL;
+ state->ops.sortKeys[0].abbrev_full_comparator = NULL;
/* Give up - expect original pass-by-value representation */
return true;
@@ -2174,7 +2234,7 @@ consider_abort_common(Tuplesortstate *state)
void
tuplesort_performsort(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
#ifdef TRACE_SORT
if (trace_sort)
@@ -2294,7 +2354,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
switch (state->status)
{
case TSS_SORTEDINMEM:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->ops.sortopt & TUPLESORT_RANDOMACCESS);
Assert(!state->slabAllocatorUsed);
if (forward)
{
@@ -2338,7 +2398,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
break;
case TSS_SORTEDONTAPE:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->ops.sortopt & TUPLESORT_RANDOMACCESS);
Assert(state->slabAllocatorUsed);
/*
@@ -2540,7 +2600,7 @@ bool
tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2551,7 +2611,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
if (stup.tuple)
{
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (state->ops.sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
if (copy)
@@ -2576,7 +2636,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2596,7 +2656,7 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2626,7 +2686,9 @@ bool
tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2639,10 +2701,10 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
MemoryContextSwitchTo(oldcontext);
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (ops->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (stup.isnull1 || !state->tuples)
+ if (stup.isnull1 || !state->ops.tuples)
{
*val = stup.datum1;
*isNull = stup.isnull1;
@@ -2650,7 +2712,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
*isNull = false;
}
@@ -2703,7 +2765,7 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
* We could probably optimize these cases better, but for now it's
* not worth the trouble.
*/
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
while (ntuples-- > 0)
{
SortTuple stup;
@@ -2979,7 +3041,7 @@ mergeruns(Tuplesortstate *state)
Assert(state->status == TSS_BUILDRUNS);
Assert(state->memtupcount == 0);
- if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
+ if (state->ops.sortKeys != NULL && state->ops.sortKeys->abbrev_converter != NULL)
{
/*
* If there are multiple runs to be merged, when we go to read back
@@ -2987,19 +3049,19 @@ mergeruns(Tuplesortstate *state)
* we don't care to regenerate them. Disable abbreviation from this
* point on.
*/
- state->sortKeys->abbrev_converter = NULL;
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->ops.sortKeys->abbrev_converter = NULL;
+ state->ops.sortKeys->comparator = state->ops.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->ops.sortKeys->abbrev_abort = NULL;
+ state->ops.sortKeys->abbrev_full_comparator = NULL;
}
/*
* Reset tuple memory. We've freed all the tuples that we previously
* allocated. We will use the slab allocator from now on.
*/
- MemoryContextResetOnly(state->tuplecontext);
+ MemoryContextResetOnly(state->ops.tuplecontext);
/*
* We no longer need a large memtuples array. (We will allocate a smaller
@@ -3022,7 +3084,7 @@ mergeruns(Tuplesortstate *state)
* From this point on, we no longer use the USEMEM()/LACKMEM() mechanism
* to track memory usage of individual tuples.
*/
- if (state->tuples)
+ if (state->ops.tuples)
init_slab_allocator(state, state->nOutputTapes + 1);
else
init_slab_allocator(state, 0);
@@ -3036,7 +3098,7 @@ mergeruns(Tuplesortstate *state)
* number of input tapes will not increase between passes.)
*/
state->memtupsize = state->nOutputTapes;
- state->memtuples = (SortTuple *) MemoryContextAlloc(state->maincontext,
+ state->memtuples = (SortTuple *) MemoryContextAlloc(state->ops.maincontext,
state->nOutputTapes * sizeof(SortTuple));
USEMEM(state, GetMemoryChunkSpace(state->memtuples));
@@ -3113,7 +3175,7 @@ mergeruns(Tuplesortstate *state)
* sorted tape, we can stop at this point and do the final merge
* on-the-fly.
*/
- if ((state->sortopt & TUPLESORT_RANDOMACCESS) == 0
+ if ((state->ops.sortopt & TUPLESORT_RANDOMACCESS) == 0
&& state->nInputRuns <= state->nInputTapes
&& !WORKER(state))
{
@@ -3339,7 +3401,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
* AllocSetFree's bucketing by size class might be particularly bad if
* this step wasn't taken.
*/
- MemoryContextReset(state->tuplecontext);
+ MemoryContextReset(state->ops.tuplecontext);
markrunend(state->destTape);
@@ -3357,9 +3419,9 @@ dumptuples(Tuplesortstate *state, bool alltuples)
void
tuplesort_rescan(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->ops.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3390,9 +3452,9 @@ tuplesort_rescan(Tuplesortstate *state)
void
tuplesort_markpos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->ops.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3421,9 +3483,9 @@ tuplesort_markpos(Tuplesortstate *state)
void
tuplesort_restorepos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->ops.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->ops.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3639,9 +3701,9 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
*/
- if (state->haveDatum1 && state->sortKeys)
+ if (state->ops.haveDatum1 && state->ops.sortKeys)
{
- if (state->sortKeys[0].comparator == ssup_datum_unsigned_cmp)
+ if (state->ops.sortKeys[0].comparator == ssup_datum_unsigned_cmp)
{
qsort_tuple_unsigned(state->memtuples,
state->memtupcount,
@@ -3649,7 +3711,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#if SIZEOF_DATUM >= 8
- else if (state->sortKeys[0].comparator == ssup_datum_signed_cmp)
+ else if (state->ops.sortKeys[0].comparator == ssup_datum_signed_cmp)
{
qsort_tuple_signed(state->memtuples,
state->memtupcount,
@@ -3657,7 +3719,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#endif
- else if (state->sortKeys[0].comparator == ssup_datum_int32_cmp)
+ else if (state->ops.sortKeys[0].comparator == ssup_datum_int32_cmp)
{
qsort_tuple_int32(state->memtuples,
state->memtupcount,
@@ -3667,16 +3729,16 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
}
/* Can we use the single-key sort function? */
- if (state->onlyKey != NULL)
+ if (state->ops.onlyKey != NULL)
{
qsort_ssup(state->memtuples, state->memtupcount,
- state->onlyKey);
+ state->ops.onlyKey);
}
else
{
qsort_tuple(state->memtuples,
state->memtupcount,
- state->comparetup,
+ state->ops.comparetup,
state);
}
}
@@ -3793,10 +3855,10 @@ tuplesort_heap_replace_top(Tuplesortstate *state, SortTuple *tuple)
static void
reversedirection(Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ SortSupport sortKey = state->ops.sortKeys;
int nkey;
- for (nkey = 0; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 0; nkey < state->ops.nKeys; nkey++, sortKey++)
{
sortKey->ssup_reverse = !sortKey->ssup_reverse;
sortKey->ssup_nulls_first = !sortKey->ssup_nulls_first;
@@ -3847,7 +3909,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
Assert(state->slabFreeHead);
if (tuplen > SLAB_SLOT_SIZE || !state->slabFreeHead)
- return MemoryContextAlloc(state->sortcontext, tuplen);
+ return MemoryContextAlloc(state->ops.sortcontext, tuplen);
else
{
buf = state->slabFreeHead;
@@ -3866,6 +3928,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
static void
getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
HeapTupleData htup;
htup.t_len = ((MinimalTuple) stup->tuple)->t_len +
@@ -3874,8 +3937,8 @@ getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
&stup->isnull1);
}
@@ -3883,7 +3946,8 @@ getdatum1_heap(Tuplesortstate *state, SortTuple *stup)
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ SortSupport sortKey = ops->sortKeys;
HeapTupleData ltup;
HeapTupleData rtup;
TupleDesc tupDesc;
@@ -3908,7 +3972,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = state->tupDesc;
+ tupDesc = (TupleDesc) ops->arg;
if (sortKey->abbrev_converter)
{
@@ -3925,7 +3989,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
}
sortKey++;
- for (nkey = 1; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 1; nkey < ops->nKeys; nkey++, sortKey++)
{
attno = sortKey->ssup_attno;
@@ -3945,6 +4009,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
MinimalTuple tuple = (MinimalTuple) stup->tuple;
/* the part of the MinimalTuple we'll write: */
@@ -3956,7 +4021,7 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -3969,12 +4034,13 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
HeapTupleData htup;
/* read in the tuple proper */
tuple->t_len = tuplen;
LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
@@ -3982,8 +4048,8 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ ops->sortKeys[0].ssup_attno,
+ (TupleDesc) ops->arg,
&stup->isnull1);
}
@@ -3995,12 +4061,14 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
static void
getdatum1_cluster(Tuplesortstate *state, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
HeapTuple tup;
tup = (HeapTuple) stup->tuple;
stup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
@@ -4008,7 +4076,9 @@ static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
HeapTuple ltup;
HeapTuple rtup;
TupleDesc tupDesc;
@@ -4022,10 +4092,10 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
/* Be prepared to compare additional sort keys */
ltup = (HeapTuple) a->tuple;
rtup = (HeapTuple) b->tuple;
- tupDesc = state->tupDesc;
+ tupDesc = arg->tupDesc;
/* Compare the leading sort key, if it's simple */
- if (state->haveDatum1)
+ if (ops->haveDatum1)
{
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -4035,7 +4105,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
if (sortKey->abbrev_converter)
{
- AttrNumber leading = state->indexInfo->ii_IndexAttrNumbers[0];
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
@@ -4044,7 +4114,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
datum2, isnull2,
sortKey);
}
- if (compare != 0 || state->nKeys == 1)
+ if (compare != 0 || ops->nKeys == 1)
return compare;
/* Compare additional columns the hard way */
sortKey++;
@@ -4056,13 +4126,13 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
nkey = 0;
}
- if (state->indexInfo->ii_Expressions == NULL)
+ if (arg->indexInfo->ii_Expressions == NULL)
{
/* If not expression index, just compare the proper heap attrs */
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
{
- AttrNumber attno = state->indexInfo->ii_IndexAttrNumbers[nkey];
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
@@ -4089,19 +4159,19 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
TupleTableSlot *ecxt_scantuple;
/* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(state->estate);
+ ResetPerTupleExprContext(arg->estate);
- ecxt_scantuple = GetPerTupleExprContext(state->estate)->ecxt_scantuple;
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
l_index_values, l_index_isnull);
ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
r_index_values, r_index_isnull);
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < ops->nKeys; nkey++, sortKey++)
{
compare = ApplySortComparator(l_index_values[nkey],
l_index_isnull[nkey],
@@ -4119,6 +4189,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
HeapTuple tuple = (HeapTuple) stup->tuple;
unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
@@ -4126,7 +4197,7 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
}
@@ -4135,6 +4206,8 @@ static void
readtup_cluster(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int tuplen)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
HeapTuple tuple = (HeapTuple) readtup_alloc(state,
t_len + HEAPTUPLESIZE);
@@ -4147,18 +4220,34 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
tuple->t_tableOid = InvalidOid;
/* Read in the tuple body */
LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value, if it's a simple column */
- if (state->haveDatum1)
+ if (ops->haveDatum1)
stup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortClusterArg *arg = (TupleSortClusterArg *) ops->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
/*
* Routines specialized for IndexTuple case
*
@@ -4170,12 +4259,14 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
static void
getdatum1_index(Tuplesortstate *state, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
IndexTuple tuple;
tuple = stup->tuple;
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4188,7 +4279,9 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* is also special handling for enforcing uniqueness, and special
* treatment for equal keys at the end.
*/
- SortSupport sortKey = state->sortKeys;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexBTreeArg *arg = (TupleSortIndexBTreeArg *) ops->arg;
+ SortSupport sortKey = ops->sortKeys;
IndexTuple tuple1;
IndexTuple tuple2;
int keysz;
@@ -4212,8 +4305,8 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
/* Compare additional sort keys */
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
- keysz = state->nKeys;
- tupDes = RelationGetDescr(state->indexRel);
+ keysz = ops->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
if (sortKey->abbrev_converter)
{
@@ -4258,7 +4351,7 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* sort algorithm wouldn't have checked whether one must appear before the
* other.
*/
- if (state->enforceUnique && !(!state->uniqueNullsNotDistinct && equal_hasnull))
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
@@ -4274,16 +4367,16 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
index_deform_tuple(tuple1, tupDes, values, isnull);
- key_desc = BuildIndexValueDescription(state->indexRel, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(state->indexRel)),
+ RelationGetRelationName(arg->index.indexRel)),
key_desc ? errdetail("Key %s is duplicated.", key_desc) :
errdetail("Duplicate keys exist."),
- errtableconstraint(state->heapRel,
- RelationGetRelationName(state->indexRel))));
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
}
/*
@@ -4321,6 +4414,8 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Bucket bucket2;
IndexTuple tuple1;
IndexTuple tuple2;
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexHashArg *arg = (TupleSortIndexHashArg *) ops->arg;
/*
* Fetch hash keys and mask off bits we don't want to sort by. We know
@@ -4328,12 +4423,12 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
*/
Assert(!a->isnull1);
bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
Assert(!b->isnull1);
bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
if (bucket1 > bucket2)
return 1;
else if (bucket1 < bucket2)
@@ -4371,13 +4466,14 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
IndexTuple tuple = (IndexTuple) stup->tuple;
unsigned int tuplen;
tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -4386,18 +4482,20 @@ static void
readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortIndexArg *arg = (TupleSortIndexArg *) ops->arg;
unsigned int tuplen = len - sizeof(unsigned int);
IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, tuple, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4414,20 +4512,21 @@ getdatum1_datum(Tuplesortstate *state, SortTuple *stup)
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
int compare;
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- state->sortKeys);
+ ops->sortKeys);
if (compare != 0)
return compare;
/* if we have abbreviations, then "tuple" has the original value */
- if (state->sortKeys->abbrev_converter)
+ if (ops->sortKeys->abbrev_converter)
compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
PointerGetDatum(b->tuple), b->isnull1,
- state->sortKeys);
+ ops->sortKeys);
return compare;
}
@@ -4435,6 +4534,8 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
+ TupleSortDatumArg *arg = (TupleSortDatumArg *) ops->arg;
void *waddr;
unsigned int tuplen;
unsigned int writtenlen;
@@ -4444,7 +4545,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
waddr = NULL;
tuplen = 0;
}
- else if (!state->tuples)
+ else if (!state->ops.tuples)
{
waddr = &stup->datum1;
tuplen = sizeof(Datum);
@@ -4452,7 +4553,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
else
{
waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, state->datumTypeLen);
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
Assert(tuplen != 0);
}
@@ -4460,7 +4561,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
LogicalTapeWrite(tape, waddr, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
}
@@ -4469,6 +4570,7 @@ static void
readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortOps *ops = TuplesortstateGetOps(state);
unsigned int tuplen = len - sizeof(unsigned int);
if (tuplen == 0)
@@ -4478,7 +4580,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->isnull1 = true;
stup->tuple = NULL;
}
- else if (!state->tuples)
+ else if (!state->ops.tuples)
{
Assert(tuplen == sizeof(Datum));
LogicalTapeReadExact(tape, &stup->datum1, tuplen);
@@ -4495,7 +4597,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->tuple = raddr;
}
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
+ if (ops->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
--
2.30.2
Attachment: v2-0001-Remove-Tuplesortstate.copytup.patch (text/x-patch)
From 03b78cdade3b86a0e97723721fa1d0bd64d0c7df Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 13:28:27 +0300
Subject: [PATCH v2 1/6] Remove Tuplesortstate.copytup
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 330 ++++++++++++-----------------
1 file changed, 132 insertions(+), 198 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 31554fd867d..0114855c83c 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,14 +279,6 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
- /*
- * Function to copy a supplied input tuple into palloc'd space and set up
- * its SortTuple representation (ie, set tuple/datum1/isnull1). Also,
- * state->availMem must be decreased by the amount of space used for the
- * tuple copy (note the SortTuple struct itself is not counted).
- */
- void (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup);
-
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -549,7 +541,6 @@ struct Sharedsort
} while(0)
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define COPYTUP(state,stup,tup) ((*(state)->copytup) (state, stup, tup))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
@@ -600,10 +591,7 @@ struct Sharedsort
* a lot better than what we were doing before 7.3. As of 9.6, a
* separate memory context is used for caller passed tuples. Resetting
* it at certain key increments significantly ameliorates fragmentation.
- * Note that this places a responsibility on copytup routines to use the
- * correct memory context for these tuples (and to not use the reset
- * context for anything whose lifetime needs to span multiple external
- * sort runs). readtup routines use the slab allocator (they cannot use
+ * readtup routines use the slab allocator (they cannot use
* the reset context because it gets deleted at the point that merging
* begins).
*/
@@ -643,14 +631,12 @@ static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
@@ -659,14 +645,12 @@ static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_datum(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
@@ -1059,7 +1043,6 @@ tuplesort_begin_heap(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_heap;
- state->copytup = copytup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
state->haveDatum1 = true;
@@ -1135,7 +1118,6 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_cluster;
- state->copytup = copytup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
state->abbrevNext = 10;
@@ -1240,7 +1222,6 @@ tuplesort_begin_index_btree(Relation heapRel,
PARALLEL_SORT(state));
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->abbrevNext = 10;
@@ -1317,7 +1298,6 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
state->comparetup = comparetup_index_hash;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1358,7 +1338,6 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1422,7 +1401,6 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
PARALLEL_SORT(state));
state->comparetup = comparetup_datum;
- state->copytup = copytup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
state->abbrevNext = 10;
@@ -1839,14 +1817,75 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
+ Datum original;
+ MinimalTuple tuple;
+ HeapTupleData htup;
- /*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
- */
- COPYTUP(state, &stup, (void *) slot);
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ USEMEM(state, GetMemoryChunkSpace(tuple));
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ original = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
+
+ MemoryContextSwitchTo(state->sortcontext);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ mtup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
puttuple_common(state, &stup);
@@ -1861,14 +1900,74 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
+ Datum original;
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+ USEMEM(state, GetMemoryChunkSpace(tup));
+
+ MemoryContextSwitchTo(state->sortcontext);
/*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
*/
- COPYTUP(state, &stup, (void *) tup);
+ if (state->haveDatum1)
+ {
+ original = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ tup = (HeapTuple) mtup->tuple;
+ mtup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
+ }
puttuple_common(state, &stup);
@@ -3946,84 +4045,6 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return 0;
}
-static void
-copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /*
- * We expect the passed "tup" to be a TupleTableSlot, and form a
- * MinimalTuple using the exported interface for that.
- */
- TupleTableSlot *slot = (TupleTableSlot *) tup;
- Datum original;
- MinimalTuple tuple;
- HeapTupleData htup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup->isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4192,79 +4213,6 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- HeapTuple tuple = (HeapTuple) tup;
- Datum original;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = heap_copytuple(tuple);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
-
- MemoryContextSwitchTo(oldcontext);
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (!state->haveDatum1)
- return;
-
- original = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup->isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4511,13 +4459,6 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_index() should not be called");
-}
-
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4582,13 +4523,6 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return compare;
}
-static void
-copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_datum() should not be called");
-}
-
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
--
2.30.2
Attachment: v2-0003-Put-abbreviation-logic-into-puttuple_common.patch (text/x-patch)
From 494d46dcf938e5f824a498e38ce77782751208e1 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH v2 3/6] Put abbreviation logic into puttuple_common()
Reported-by:
Bug:
Discussion:
Author:
Reviewed-by:
Tested-by:
Backpatch-through:
---
src/backend/utils/sort/tuplesort.c | 213 +++++++----------------------
1 file changed, 50 insertions(+), 163 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c649043fbb0..c4d8c183f62 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1832,7 +1832,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1843,51 +1842,13 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup.isnull1);
+ stup.datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -1902,7 +1863,6 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- Datum original;
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
/* copy the tuple into sort storage */
@@ -1918,48 +1878,10 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
*/
if (state->haveDatum1)
{
- original = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup.isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
+ stup.datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
}
puttuple_common(state, &stup);
@@ -1978,7 +1900,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
IndexTuple tuple;
stup.tuple = index_form_tuple(RelationGetDescr(rel), values, isnull);
@@ -1986,51 +1907,13 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
tuple->t_tid = *self;
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
- original = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &stup.isnull1);
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys || !state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
-
puttuple_common(state, &stup);
MemoryContextSwitchTo(oldcontext);
@@ -2072,43 +1955,11 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
}
else
{
- Datum original = datumCopy(val, false, state->datumTypeLen);
-
stup.isnull1 = false;
- stup.tuple = DatumGetPointer(original);
+ stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
MemoryContextSwitchTo(state->sortcontext);
-
- if (!state->sortKeys->abbrev_converter)
- {
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- for (i = 0; i < state->memtupcount; i++)
- GETDATUM1(state, &state->memtuples[i]);
- }
}
puttuple_common(state, &stup);
@@ -2124,6 +1975,42 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
{
Assert(!LEADER(state));
+ if (!state->sortKeys || !state->haveDatum1 || !state->tuples ||
+ !state->sortKeys->abbrev_converter || tuple->isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ GETDATUM1(state, &state->memtuples[i]);
+ }
+
switch (state->status)
{
case TSS_INITIAL:
--
2.30.2
Hi, Pavel!
Thank you for your feedback.
On Thu, Jun 23, 2022 at 2:26 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
Some PostgreSQL extensions need to sort their pieces of data. Then it
worth to re-use our tuplesort. But despite our tuplesort having
extensibility, it's hidden inside tuplesort.c. There are at least a
couple of examples of how extensions deal with that.1. RUM table access method: https://github.com/postgrespro/rum
RUM repository contains a copy of tuplesort.c for each major
PostgreSQL release. A reliable solution, but this is not how things
are intended to work, right?
2. OrioleDB table access method: https://github.com/orioledb/orioledb
OrioleDB runs on patches PostgreSQL. It contains a patch, which just
exposes all the guts of tuplesort.c to the tuplesort.h
https://github.com/orioledb/postgres/commit/d42755f52cI think we need a proper way to let extension re-use our core
tuplesort facility. The attached patchset is intended to do this the
right way. Patches don't revise all the comments and lack code
beautification. The intention behind publishing this revision is to
verify the direction and get some feedback for further work.

I still have one doubt about the thing: the compatibility with previous PG versions requires me to support code paths that I have already added to the RUM extension. I won't be able to drop them from the extension for quite a long time. That could be avoided if we backpatched this, which seems doubtful to me given the volume of code changes.
If we only change this in, say, v16, it will only help extensions that don't support earlier PG versions. I still consider the change beneficial, but I wonder whether you have a view on how it should be managed in existing extensions so that they benefit?
I don't think there is a way to help extensions on earlier PG
versions. This is a significant patchset, which shouldn't be subject
to backpatching. The existing extensions will benefit from simplified
maintenance for PG 16+ releases. I think that's all we can do.
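For older branches, about the best an extension can do is keep its bundled copy behind a version check and switch to the core facility once it becomes available. A rough sketch of what I mean (the entry point name tuplesort_begin_custom is hypothetical here, just a stand-in for whatever we end up exporting):

/*
 * Hypothetical extension-side switch: use the in-core extensible tuplesort
 * on servers that have it, fall back to the bundled copy otherwise.
 */
#if PG_VERSION_NUM >= 160000
#include "utils/tuplesort.h"
#define my_tuplesort_begin(a, b, c)   tuplesort_begin_custom(a, b, c)
#else
#include "my_tuplesort.h"             /* bundled copy of tuplesort.c */
#define my_tuplesort_begin(a, b, c)   my_tuplesort_begin_custom(a, b, c)
#endif

Not pretty, but it lets the extension drop the bundled copy as soon as it stops supporting pre-16 servers.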
------
Regards,
Alexander Korotkov
Hi, Maxim!
On Thu, Jun 23, 2022 at 3:12 PM Maxim Orlov <orlovmg@gmail.com> wrote:
I've reviewed the patchset and noticed some minor issues:
- extra semicolon in macro (leads to warnings)
- comparison of var isWorker should be done in a different way

Here is an upgraded version of the patchset.
Thank you for fixing this.
Overall, I consider this patchset useful. Any opinions?
Thank you.
------
Regards,
Alexander Korotkov
Hi!
Overall the patch looks good; let's mark it as ready for committer, shall we?
--
Best regards,
Maxim Orlov.
On Thu, 23 Jun 2022 at 14:12, Maxim Orlov <orlovmg@gmail.com> wrote:
Hi!
I've reviewed the patchset and noticed some minor issues:
- extra semicolon in macro (leads to warnings)
- comparison of var isWorker should be done in a different way

Here is an upgraded version of the patchset.
Overall, I consider this patchset useful. Any opinions?
All of the patches are currently missing descriptive commit messages,
which I think is critical for getting this committed. As for per-patch
comments:
0001: This patch removes copytup, but it is not quite clear why -
please describe the reasoning in the commit message.
0002: getdatum1 needs comments on what it does. From the name, it
would return the datum1 from a sorttuple, but that's not what it does;
a better name would be putdatum1 or populatedatum1.
0003: in the various tuplesort_put*tuple[values] functions, the datum1
field is manually extracted. Shouldn't we use the getdatum1 functions
from 0002 instead? We could use either them directly to allow
inlining, or use the indirection through tuplesortstate.
0004: Needs a commit message, but otherwise seems fine.
0005:
+struct TuplesortOps
This struct has no comment on what it is. Something like "Public
interface of tuplesort operators, containing data directly accessible
to users of tuplesort" should suffice, but feel free to update the
wording.
+ void *arg;
+};
This field could use a comment on what it is used for, and how to use it.
+struct Tuplesortstate
+{
+	TuplesortOps ops;
This field needs a comment too.
0006: Needs a commit message, but otherwise seems fine.
Kind regards,
Matthias van de Meent
Hi,
I think this needs to be evaluated for performance...
I agree with the nearby comment that the commits need a bit of justification
at least to review them.
On 2022-06-23 15:12:27 +0300, Maxim Orlov wrote:
From 03b78cdade3b86a0e97723721fa1d0bd64d0c7df Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 13:28:27 +0300
Subject: [PATCH v2 1/6] Remove Tuplesortstate.copytup
Yea. I was recently complaining about the pointlessness of copytup.
From 1d78e271b22d7c6a1557defbe15ea5039ff28510 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH v2 2/6] Tuplesortstate.getdatum1 method
Hm. This adds a bunch of indirect function calls where there previously
weren't any.
From 494d46dcf938e5f824a498e38ce77782751208e1 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH v2 3/6] Put abbreviation logic into puttuple_common()
There's definitely a lot of redundancy removed... But the list of branches in
puttuple_common() grew. Perhaps we can instead add a few flags to
puttuple_common() that determine whether abbreviation should happen, so that
we only do the work necessary for the "type" of sort?
From ee2dd46b07d62e13ed66b5a38272fb5667c943f3 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH v2 4/6] Move freeing memory away from writetup()
Seems to do more than just moving freeing around?
@@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
 static void
 puttuple_common(Tuplesortstate *state, SortTuple *tuple)
 {
+	MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
 	Assert(!LEADER(state));
+
+	if (tuple->tuple != NULL)
+		USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+
Adding even more branches into common code...
From 3a0e1fa7c7e4da46a86f7d5b9dd0392549f3b460 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH v2 5/6] Reorganize data structures
Hard to know what this is trying to achieve.
-struct Tuplesortstate
+struct TuplesortOps
 {
-	TupSortStatus status;		/* enumerated value as shown above */
-	int			nKeys;			/* number of columns in sort key */
-	int			sortopt;		/* Bitmask of flags used to setup sort */
-	bool		bounded;		/* did caller specify a maximum number of
-								 * tuples to return? */
-	bool		boundUsed;		/* true if we made use of a bounded heap */
-	int			bound;			/* if bounded, the maximum number of tuples */
-	bool		tuples;			/* Can SortTuple.tuple ever be set? */
-	int64		availMem;		/* remaining memory available, in bytes */
-	int64		allowedMem;		/* total memory allowed, in bytes */
-	int			maxTapes;		/* max number of input tapes to merge in each
-								 * pass */
-	int64		maxSpace;		/* maximum amount of space occupied among sort
-								 * of groups, either in-memory or on-disk */
-	bool		isMaxSpaceDisk; /* true when maxSpace is value for on-disk
-								 * space, false when it's value for in-memory
-								 * space */
-	TupSortStatus maxSpaceStatus;	/* sort status when maxSpace was reached */
 	MemoryContext maincontext;	/* memory context for tuple sort metadata that
 								 * persists across multiple batches */
 	MemoryContext sortcontext;	/* memory context holding most sort data */
 	MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-	LogicalTapeSet *tapeset;	/* logtape.c object for tapes in a temp file */

 	/*
 	 * These function pointers decouple the routines that must know what kind
To me it seems odd to have memory contexts and similar things in a
datastructure called *Ops.
From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH v2 6/6] Split tuplesortops.c
I strongly suspect this will cause a slowdown. There was potential for
cross-function optimization that's now removed.
Greetings,
Andres Freund
Hi!
On Wed, Jul 6, 2022 at 6:01 PM Andres Freund <andres@anarazel.de> wrote:
I think this needs to be evaluated for performance...
Surely, it does.
I agree with the nearby comment that the commits need a bit of justification
at least to review them.
Will do this.
From 1d78e271b22d7c6a1557defbe15ea5039ff28510 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH v2 2/6] Tuplesortstate.getdatum1 method

Hm. This adds a bunch of indirect function calls where there previously
weren't any.

Yep. I think it's worth changing this function to deal with many
SortTuples at once.
From 494d46dcf938e5f824a498e38ce77782751208e1 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH v2 3/6] Put abbreviation logic into puttuple_common()

There's definitely a lot of redundancy removed... But the list of branches in
puttuple_common() grew. Perhaps we can instead add a few flags to
puttuple_common() that determine whether abbreviation should happen, so that
we only do the work necessary for the "type" of sort?
Good point, will refactor that.
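Roughly, I'm thinking of computing the conditions once at tuplesort_begin_*() time into a single flag, so the per-tuple path tests one boolean. A sketch only (the field name doAbbrev is just a placeholder):

	/* in tuplesort_begin_*(); sketch, doAbbrev is a placeholder name */
	state->doAbbrev = state->sortKeys && state->haveDatum1 &&
		state->tuples && state->sortKeys->abbrev_converter;

	/* in puttuple_common(); sketch */
	if (state->doAbbrev && !tuple->isnull1)
	{
		if (!consider_abort_common(state))
		{
			/* store abbreviated key representation */
			tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
															  state->sortKeys);
		}
		else
		{
			int			i;

			/* abort abbreviation: restore datum1 of already-copied tuples */
			for (i = 0; i < state->memtupcount; i++)
				GETDATUM1(state, &state->memtuples[i]);
		}
	}

That way sorts that never use abbreviation pay for a single, well-predicted branch.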
From ee2dd46b07d62e13ed66b5a38272fb5667c943f3 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH v2 4/6] Move freeing memory away from writetup()

Seems to do more than just moving freeing around?
Yes, it also moves memory accounting from tuplesort_put*() to
puttuple_common(). Will revise this.
@@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
 static void
 puttuple_common(Tuplesortstate *state, SortTuple *tuple)
 {
+	MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
 	Assert(!LEADER(state));
+
+	if (tuple->tuple != NULL)
+		USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+

Adding even more branches into common code...
I will see how to reduce branching here.
From 3a0e1fa7c7e4da46a86f7d5b9dd0392549f3b460 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH v2 5/6] Reorganize data structures

Hard to know what this is trying to achieve.
Split the public interface part out of Tuplesortstate.
-struct Tuplesortstate
+struct TuplesortOps
 {
-	TupSortStatus status;		/* enumerated value as shown above */
-	int			nKeys;			/* number of columns in sort key */
-	int			sortopt;		/* Bitmask of flags used to setup sort */
-	bool		bounded;		/* did caller specify a maximum number of
-								 * tuples to return? */
-	bool		boundUsed;		/* true if we made use of a bounded heap */
-	int			bound;			/* if bounded, the maximum number of tuples */
-	bool		tuples;			/* Can SortTuple.tuple ever be set? */
-	int64		availMem;		/* remaining memory available, in bytes */
-	int64		allowedMem;		/* total memory allowed, in bytes */
-	int			maxTapes;		/* max number of input tapes to merge in each
-								 * pass */
-	int64		maxSpace;		/* maximum amount of space occupied among sort
-								 * of groups, either in-memory or on-disk */
-	bool		isMaxSpaceDisk; /* true when maxSpace is value for on-disk
-								 * space, false when it's value for in-memory
-								 * space */
-	TupSortStatus maxSpaceStatus;	/* sort status when maxSpace was reached */
 	MemoryContext maincontext;	/* memory context for tuple sort metadata that
 								 * persists across multiple batches */
 	MemoryContext sortcontext;	/* memory context holding most sort data */
 	MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-	LogicalTapeSet *tapeset;	/* logtape.c object for tapes in a temp file */

 	/*
 	 * These function pointers decouple the routines that must know what kind

To me it seems odd to have memory contexts and similar things in a
datastructure called *Ops.
Yep, it's worth renaming TuplesortOps to TuplesortPublic or something.
From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH v2 6/6] Split tuplesortops.c

I strongly suspect this will cause a slowdown. There was potential for
cross-function optimization that's now removed.

I wonder how cross-function optimizations can bypass function
pointers. Is that possible?
------
Regards,
Alexander Korotkov
On Wed, Jul 6, 2022 at 8:45 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH v2 6/6] Split tuplesortops.c

I strongly suspect this will cause a slowdown. There was potential for
cross-function optimization that's now removed.

I wonder how cross-function optimizations can bypass function
pointers. Is that possible?
Oh, it's not just the functions called by pointer. It's also
puttuple_common() etc. OK, this needs to be checked.
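To spell out what kind of optimization is at stake (a generic example, nothing tuplesort-specific): a direct call to a static function in the same file can be inlined, while a call through a function pointer normally cannot, unless the compiler can prove which function the pointer holds:

/* Both loops do the same work; only the call style differs. */
static int
cmp_int(int a, int b)
{
	return (a > b) - (a < b);
}

static int
count_descents_direct(const int *v, int n)
{
	int			res = 0;

	for (int i = 1; i < n; i++)
		res += (cmp_int(v[i - 1], v[i]) > 0);	/* can be inlined */
	return res;
}

static int
count_descents_indirect(const int *v, int n, int (*cmp) (int, int))
{
	int			res = 0;

	for (int i = 1; i < n; i++)
		res += (cmp(v[i - 1], v[i]) > 0);		/* usually stays an indirect call */
	return res;
}

With everything in one file, the compiler can at least in principle see which functions the pointers hold and specialize the callers; once the per-format routines move to tuplesortops.c, that chance is gone without LTO, so it's indeed worth measuring.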
------
Regards,
Alexander Korotkov
Hi, Matthias!
The revised patchset is attached.
On Wed, Jul 6, 2022 at 5:41 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
All of the patches are currently missing descriptive commit messages,
which I think is critical for getting this committed. As for per-patch
comments:

0001: This patch removes copytup, but it is not quite clear why -
please describe the reasoning in the commit message.
Because the split of logic between the Tuplesortstate.copytup() function
and the tuplesort_put*() functions is unclear. It doesn't look like we
need an abstraction here, since all the work can be done in
tuplesort_put*().
0002: getdatum1 needs comments on what it does. From the name, it
would return the datum1 from a sorttuple, but that's not what it does;
a better name would be putdatum1 or populatedatum1.

0003: in the various tuplesort_put*tuple[values] functions, the datum1
field is manually extracted. Shouldn't we use the getdatum1 functions
from 0002 instead? We could use either them directly to allow
inlining, or use the indirection through tuplesortstate.
getdatum1() was a bad name. Actually, it restores the original datum1
during rollback of abbreviations. I've replaced it with removeabbrev(),
which seems a better name to me and which processes many SortTuples in
one call.
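To make this more concrete, the callback now takes an array and a count, so aborting abbreviation makes one indirect call per tuple format rather than one per tuple. An abridged sketch, not the exact patch text:

	/* restore datum1 of "count" tuples after abbreviation is aborted */
	void		(*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
								 int count);

	static void
	removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
	{
		int			i;

		for (i = 0; i < count; i++)
		{
			HeapTupleData htup;

			htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
				MINIMAL_TUPLE_OFFSET;
			htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
											 MINIMAL_TUPLE_OFFSET);
			stups[i].datum1 = heap_getattr(&htup,
										   state->sortKeys[0].ssup_attno,
										   state->tupDesc,
										   &stups[i].isnull1);
		}
	}

puttuple_common() then calls it once over state->memtuples/state->memtupcount in the abbreviation-abort path.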
0004: Needs a commit message, but otherwise seems fine.
Commit message is added.
0005:
+struct TuplesortOps
This struct has no comment on what it is. Something like "Public
interface of tuplesort operators, containing data directly accessible
to users of tuplesort" should suffice, but feel free to update the
wording.

+ void *arg;
+};

This field could use a comment on what it is used for, and how to use it.

+struct Tuplesortstate
+{
+	TuplesortOps ops;

This field needs a comment too.
0006: Needs a commit message, but otherwise seems fine.
TuplesortOps was renamed to TuplesortPublic. Comments and commit
messages are added.
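For reference, the reorganization now looks roughly along these lines (heavily abridged, and some names here are only illustrative):

/*
 * TuplesortPublic: the part of the sort state that per-format routines and
 * extensions may access directly; the rest of Tuplesortstate stays private
 * to tuplesort.c.
 */
typedef struct TuplesortPublic
{
	int			nKeys;			/* number of columns in sort key */
	SortSupport sortKeys;		/* array of length nKeys */

	/*
	 * arg belongs to the per-format implementation: it points to whatever
	 * extra state (tuple descriptor, index relation, ...) its comparetup/
	 * writetup/readtup routines need.
	 */
	void	   *arg;
} TuplesortPublic;

struct Tuplesortstate
{
	/* public part must come first, so getting it from the state is a cast */
	TuplesortPublic base;

	/* ... private fields: status, memory accounting, tapes, and so on ... */
};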
There are some places which could potentially cause a slowdown. I'm
going to run some experiments on that.
------
Regards,
Alexander Korotkov
Attachments:
0001-Remove-Tuplesortstate.copytup-function-v2.patch (application/x-patch)
From 1df8f95586a7dc1afaba4c484d2b2502460458a5 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 13:28:27 +0300
Subject: [PATCH 1/6] Remove Tuplesortstate.copytup function
It's currently unclear how do we split functionality between
Tuplesortstate.copytup() function and tuplesort_put*() functions.
For instance, copytup_index() and copytup_datum() return error while
tuplesort_putindextuplevalues() and tuplesort_putdatum() do their work.
This commit removes Tuplesortstate.copytup() altogether, putting the
corresponding code into tuplesort_put*().
---
src/backend/utils/sort/tuplesort.c | 330 ++++++++++++-----------------
1 file changed, 132 insertions(+), 198 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 421afcf47d3..4812b1d9ae3 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,14 +279,6 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
- /*
- * Function to copy a supplied input tuple into palloc'd space and set up
- * its SortTuple representation (ie, set tuple/datum1/isnull1). Also,
- * state->availMem must be decreased by the amount of space used for the
- * tuple copy (note the SortTuple struct itself is not counted).
- */
- void (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup);
-
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -549,7 +541,6 @@ struct Sharedsort
} while(0)
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define COPYTUP(state,stup,tup) ((*(state)->copytup) (state, stup, tup))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
@@ -600,10 +591,7 @@ struct Sharedsort
* a lot better than what we were doing before 7.3. As of 9.6, a
* separate memory context is used for caller passed tuples. Resetting
* it at certain key increments significantly ameliorates fragmentation.
- * Note that this places a responsibility on copytup routines to use the
- * correct memory context for these tuples (and to not use the reset
- * context for anything whose lifetime needs to span multiple external
- * sort runs). readtup routines use the slab allocator (they cannot use
+ * readtup routines use the slab allocator (they cannot use
* the reset context because it gets deleted at the point that merging
* begins).
*/
@@ -643,14 +631,12 @@ static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
@@ -659,14 +645,12 @@ static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_datum(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
@@ -1059,7 +1043,6 @@ tuplesort_begin_heap(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_heap;
- state->copytup = copytup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
state->haveDatum1 = true;
@@ -1135,7 +1118,6 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_cluster;
- state->copytup = copytup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
state->abbrevNext = 10;
@@ -1240,7 +1222,6 @@ tuplesort_begin_index_btree(Relation heapRel,
PARALLEL_SORT(state));
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->abbrevNext = 10;
@@ -1317,7 +1298,6 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
state->comparetup = comparetup_index_hash;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1358,7 +1338,6 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1422,7 +1401,6 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
PARALLEL_SORT(state));
state->comparetup = comparetup_datum;
- state->copytup = copytup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
state->abbrevNext = 10;
@@ -1839,14 +1817,75 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
+ Datum original;
+ MinimalTuple tuple;
+ HeapTupleData htup;
- /*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
- */
- COPYTUP(state, &stup, (void *) slot);
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ USEMEM(state, GetMemoryChunkSpace(tuple));
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ original = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
+
+ MemoryContextSwitchTo(state->sortcontext);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ mtup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
puttuple_common(state, &stup);
@@ -1861,14 +1900,74 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
+ Datum original;
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+ USEMEM(state, GetMemoryChunkSpace(tup));
+
+ MemoryContextSwitchTo(state->sortcontext);
/*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
*/
- COPYTUP(state, &stup, (void *) tup);
+ if (state->haveDatum1)
+ {
+ original = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ tup = (HeapTuple) mtup->tuple;
+ mtup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
+ }
puttuple_common(state, &stup);
@@ -3947,84 +4046,6 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return 0;
}
-static void
-copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /*
- * We expect the passed "tup" to be a TupleTableSlot, and form a
- * MinimalTuple using the exported interface for that.
- */
- TupleTableSlot *slot = (TupleTableSlot *) tup;
- Datum original;
- MinimalTuple tuple;
- HeapTupleData htup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup->isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4193,79 +4214,6 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- HeapTuple tuple = (HeapTuple) tup;
- Datum original;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = heap_copytuple(tuple);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
-
- MemoryContextSwitchTo(oldcontext);
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (!state->haveDatum1)
- return;
-
- original = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup->isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4512,13 +4460,6 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_index() should not be called");
-}
-
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4583,13 +4524,6 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return compare;
}
-static void
-copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_datum() should not be called");
-}
-
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
--
2.24.3 (Apple Git-128)
0002-Add-new-Tuplesortstate.removeabbrev-function-v2.patch (application/x-patch)
From 58c9e73eef4cbe2912f59600a582afe91156fc33 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH 2/6] Add new Tuplesortstate.removeabbrev function
This commit prepares for moving the abbreviation logic into
puttuple_common(). The new removeabbrev function turns the datum1
representation of SortTuples back from the abbreviated key to the first
column value. It thereby encapsulates the part of the abbreviation handling
that differs between the tuplesort_put*() functions, making those functions
more alike.
---
src/backend/utils/sort/tuplesort.c | 156 +++++++++++++++++++----------
1 file changed, 103 insertions(+), 53 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 4812b1d9ae3..8b6b2bc1d38 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,6 +279,13 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
+ /*
+ * Alter datum1 representation in the SortTuple's array back from the
+ * abbreviated key to the first column value.
+ */
+ void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
+ int count);
+
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -540,6 +547,7 @@ struct Sharedsort
pfree(buf); \
} while(0)
+#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
@@ -629,6 +637,14 @@ static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
+static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
+ int count);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
@@ -1042,6 +1058,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_heap;
state->comparetup = comparetup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
@@ -1117,6 +1134,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_cluster;
state->comparetup = comparetup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
@@ -1221,6 +1239,7 @@ tuplesort_begin_index_btree(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1297,6 +1316,7 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_hash;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1337,6 +1357,7 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1400,6 +1421,7 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_datum;
state->comparetup = comparetup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
@@ -1871,20 +1893,7 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
* sorted on tape, since serialized tuples lack abbreviated keys
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
puttuple_common(state, &stup);
@@ -1925,12 +1934,12 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
if (!state->sortKeys->abbrev_converter || stup.isnull1)
{
/*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
+ * Store ordinary Datum representation, or NULL value. If there
+ * is a converter it won't expect NULL values, and cost model is
+ * not required to account for NULL, so in that case we avoid
+ * calling converter and just set datum1 to zeroed representation
+ * (to be consistent, and to support cheap inequality tests for
+ * NULL abbreviated keys).
*/
stup.datum1 = original;
}
@@ -1949,23 +1958,15 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/*
* Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tup = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any
+ * case).
+ */
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
}
@@ -2035,16 +2036,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
* sorted on tape, since serialized tuples lack abbreviated keys
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = mtup->tuple;
- mtup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &mtup->isnull1);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
puttuple_common(state, &stup);
@@ -2122,12 +2114,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* (TSS_BUILDRUNS state prevents control reaching here in any
* case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- mtup->datum1 = PointerGetDatum(mtup->tuple);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
}
@@ -3984,6 +3971,26 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
* Routines specialized for HeapTuple (actually MinimalTuple) case
*/
+static void
+removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
+ MINIMAL_TUPLE_OFFSET);
+ stups[i].datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
@@ -4102,6 +4109,23 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
* comparisons per a btree index definition)
*/
+static void
+removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTuple tup;
+
+ tup = (HeapTuple) stups[i].tuple;
+ stups[i].datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4271,6 +4295,23 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
* functions can be shared.
*/
+static void
+removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ IndexTuple tuple;
+
+ tuple = stups[i].tuple;
+ stups[i].datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4503,6 +4544,15 @@ readtup_index(Tuplesortstate *state, SortTuple *stup,
* Routines specialized for DatumTuple case
*/
+static void
+removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ stups[i].datum1 = PointerGetDatum(stups[i].tuple);
+}
+
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
--
2.24.3 (Apple Git-128)
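
To illustrate the interface this patch introduces: a removeabbrev callback
simply walks the in-memory SortTuple array and recomputes datum1/isnull1 from
the stored tuple, mirroring removeabbrev_heap()/removeabbrev_index(). A
minimal sketch for a hypothetical tuple format (MyTuple and
my_tuple_get_first_column are invented names, not part of the patch) could
look like this:

    static void
    removeabbrev_mytuple(Tuplesortstate *state, SortTuple *stups, int count)
    {
        int         i;

        for (i = 0; i < count; i++)
        {
            /* restore datum1 from the abbreviated key to the leading column */
            MyTuple    *tup = (MyTuple *) stups[i].tuple;

            stups[i].datum1 = my_tuple_get_first_column(tup, &stups[i].isnull1);
        }
    }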
0003-Put-abbreviation-logic-into-puttuple_common-v2.patch (application/x-patch)
From 3662dc6a8df1780696e4b861dd0f2517c146941e Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH 3/6] Put abbreviation logic into puttuple_common()
The abbreviation code is very similar across the tuplesort_put*() functions.
This commit unifies that code and moves it into puttuple_common(). The
tuplesort_put*() functions differ only in the condition under which
abbreviation applies, so that condition is now passed to puttuple_common()
as an argument.
---
src/backend/utils/sort/tuplesort.c | 222 ++++++++---------------------
1 file changed, 56 insertions(+), 166 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8b6b2bc1d38..828efe701e5 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -616,7 +616,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
SortCoordinate coordinate,
int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
+ bool useAbbrev);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1841,7 +1842,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1852,51 +1852,15 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup.isnull1);
+ stup.datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
-
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1910,7 +1874,6 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- Datum original;
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
/* copy the tuple into sort storage */
@@ -1926,51 +1889,14 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
*/
if (state->haveDatum1)
{
- original = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup.isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there
- * is a converter it won't expect NULL values, and cost model is
- * not required to account for NULL, so in that case we avoid
- * calling converter and just set datum1 to zeroed representation
- * (to be consistent, and to support cheap inequality tests for
- * NULL abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
+ stup.datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
}
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->haveDatum1 && state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1986,7 +1912,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
MemoryContext oldcontext;
SortTuple stup;
- Datum original;
IndexTuple tuple;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
@@ -1995,51 +1920,15 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
tuple->t_tid = *self;
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
- original = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &stup.isnull1);
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup.isnull1);
oldcontext = MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys || !state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
-
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -2080,45 +1969,15 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
}
else
{
- Datum original = datumCopy(val, false, state->datumTypeLen);
-
stup.isnull1 = false;
- stup.tuple = DatumGetPointer(original);
+ stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
MemoryContextSwitchTo(state->sortcontext);
-
- if (!state->sortKeys->abbrev_converter)
- {
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
}
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->tuples && !isNull && state->sortKeys->abbrev_converter);
MemoryContextSwitchTo(oldcontext);
}
@@ -2127,10 +1986,41 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* Shared code for tuple and datum cases.
*/
static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple)
+puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
Assert(!LEADER(state));
+ if (!useAbbrev)
+ {
+ /*
+ * Leave ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
+ state->sortKeys);
+ }
+ else
+ {
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
+ }
+
switch (state->status)
{
case TSS_INITIAL:
--
2.24.3 (Apple Git-128)
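
After this patch, each tuplesort_put*() function only has to copy the tuple,
fill datum1/isnull1 with the raw leading-column value, and tell
puttuple_common() whether abbreviation applies. A rough sketch for a
hypothetical tuple format (MyTuple, my_tuple_copy and
my_tuple_get_first_column are invented names, not part of the patch):

    void
    tuplesort_putmytuple(Tuplesortstate *state, MyTuple *tup)
    {
        MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
        SortTuple   stup;

        /* copy the tuple into sort storage and account for its memory */
        stup.tuple = (void *) my_tuple_copy(tup);
        USEMEM(state, GetMemoryChunkSpace(stup.tuple));

        /* fill datum1/isnull1 with the raw value of the leading sort column */
        stup.datum1 = my_tuple_get_first_column(stup.tuple, &stup.isnull1);

        MemoryContextSwitchTo(state->sortcontext);

        /* abbreviation (or its abort) is now handled inside puttuple_common() */
        puttuple_common(state, &stup,
                        state->sortKeys->abbrev_converter && !stup.isnull1);

        MemoryContextSwitchTo(oldcontext);
    }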
0005-Split-TuplesortPublic-from-Tuplesortstate-v2.patch (application/x-patch)
From 4cb346e0e93d3746550a0239ef2a4c7686f95632 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH 5/6] Split TuplesortPublic from Tuplesortstate
The new TuplesortPublic data structure contains the definitions of the
sort-variant-specific interface methods and the part of the tuplesort
operation state required by their implementations. This makes it possible to
define tuplesort variants without knowledge of Tuplesortstate, that is,
without knowledge of the generic sort implementation's guts.
---
src/backend/utils/sort/tuplesort.c | 811 ++++++++++++++++-------------
src/tools/pgindent/typedefs.list | 6 +
2 files changed, 469 insertions(+), 348 deletions(-)
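
As a sketch of how a sort variant could eventually be wired up on top of
TuplesortPublic (MyTupleArg and the *_mytuple callbacks are invented names;
this also assumes tuplesort_begin_common() and TuplesortstateGetPublic()
become visible outside tuplesort.c, which the later patches in the series
work towards):

    Tuplesortstate *
    tuplesort_begin_mytuple(int workMem, SortCoordinate coordinate, int sortopt)
    {
        Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate, sortopt);
        TuplesortPublic *base = TuplesortstateGetPublic(state);
        MemoryContext oldcontext = MemoryContextSwitchTo(base->maincontext);
        MyTupleArg *arg = (MyTupleArg *) palloc0(sizeof(MyTupleArg));

        base->nKeys = 1;
        base->removeabbrev = removeabbrev_mytuple;
        base->comparetup = comparetup_mytuple;
        base->writetup = writetup_mytuple;
        base->readtup = readtup_mytuple;
        base->freestate = NULL;     /* nothing beyond the memory contexts to free */
        base->haveDatum1 = true;
        base->arg = arg;

        /* one sort key; comparator/abbreviation setup omitted in this sketch */
        base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));

        MemoryContextSwitchTo(oldcontext);

        return state;
    }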
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c8c511fb8c5..4a9aeb8799a 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -126,8 +126,8 @@
#define CLUSTER_SORT 3
/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(state) ((state)->shared == NULL ? 0 : \
- (state)->worker >= 0 ? 1 : 2)
+#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker >= 0 ? 1 : 2)
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -236,38 +236,18 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
+typedef struct TuplesortPublic TuplesortPublic;
+
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
/*
- * Private state of a Tuplesort operation.
+ * The public part of a Tuple sort operation state. This data structure
+ * contains the definition of sort-variant-specific interface methods and
+ * the part of Tuple sort operation state required by their implementations.
*/
-struct Tuplesortstate
+struct TuplesortPublic
{
- TupSortStatus status; /* enumerated value as shown above */
- int nKeys; /* number of columns in sort key */
- int sortopt; /* Bitmask of flags used to setup sort */
- bool bounded; /* did caller specify a maximum number of
- * tuples to return? */
- bool boundUsed; /* true if we made use of a bounded heap */
- int bound; /* if bounded, the maximum number of tuples */
- bool tuples; /* Can SortTuple.tuple ever be set? */
- int64 availMem; /* remaining memory available, in bytes */
- int64 allowedMem; /* total memory allowed, in bytes */
- int maxTapes; /* max number of input tapes to merge in each
- * pass */
- int64 maxSpace; /* maximum amount of space occupied among sort
- * of groups, either in-memory or on-disk */
- bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
- * space, false when it's value for in-memory
- * space */
- TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
- LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
-
/*
* These function pointers decouple the routines that must know what kind
* of tuple we are sorting from the routines that don't need to know it.
@@ -301,12 +281,134 @@ struct Tuplesortstate
void (*readtup) (Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+ /*
+ * Function to do any variant-specific release of resources. In particular,
+ * this function should free everything stored in the "arg" field, which
+ * wouldn't be cleared on reset of the tuplesort memory contexts. This can
+ * be NULL if nothing specific needs to be done.
+ */
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * The subsequent fields are used in the implementations of the functions
+ * above.
+ */
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
/*
* Whether SortTuple's datum1 and isnull1 members are maintained by the
* above routines. If not, some sort specializations are disabled.
*/
bool haveDatum1;
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg; /* Specific information for the sort variant */
+};
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the CLUSTER case.
+ * Set by tuplesort_begin_cluster.
+ */
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TuplesortClusterArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the IndexTuple case.
+ * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+typedef struct
+{
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TuplesortIndexArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the index_btree subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TuplesortIndexBTreeArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the index_hash subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TuplesortIndexHashArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the Datum case.
+ * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+typedef struct
+{
+ /* the datatype oid of Datum's to be sorted */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TuplesortDatumArg;
+
+/*
+ * Private state of a Tuplesort operation.
+ */
+struct Tuplesortstate
+{
+ TuplesortPublic base;
+ TupSortStatus status; /* enumerated value as shown above */
+ bool bounded; /* did caller specify a maximum number of
+ * tuples to return? */
+ bool boundUsed; /* true if we made use of a bounded heap */
+ int bound; /* if bounded, the maximum number of tuples */
+ int64 availMem; /* remaining memory available, in bytes */
+ int64 allowedMem; /* total memory allowed, in bytes */
+ int maxTapes; /* max number of input tapes to merge in each
+ * pass */
+ int64 maxSpace; /* maximum amount of space occupied among sort
+ * of groups, either in-memory or on-disk */
+ bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
+ * space, false when it's value for in-memory
+ * space */
+ TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
+ LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
+
/*
* This array holds the tuples now in sort memory. If we are in state
* INITIAL, the tuples are in no particular order; if we are in state
@@ -421,24 +523,6 @@ struct Tuplesortstate
Sharedsort *shared;
int nParticipants;
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- TupleDesc tupDesc;
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
/*
* Additional state for managing "abbreviated key" sortsupport routines
* (which currently may be used by all cases except the hash index case).
@@ -448,37 +532,6 @@ struct Tuplesortstate
int64 abbrevNext; /* Tuple # at which to next check
* applicability */
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-
/*
* Resource snapshot for time of sort start.
*/
@@ -543,10 +596,13 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
-#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
+#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
+
+#define REMOVEABBREV(state,stup,count) ((*(state)->base.removeabbrev) (state, stup, count))
+#define COMPARETUP(state,a,b) ((*(state)->base.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
-#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
+#define READTUP(state,stup,tape,len) ((*(state)->base.readtup) (state, stup, tape, len))
+#define FREESTATE(state) ((state)->base.freestate ? (*(state)->base.freestate) (state) : (void) 0)
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
#define FREEMEM(state,amt) ((state)->availMem += (amt))
@@ -670,6 +726,7 @@ static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -700,7 +757,7 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyUnsignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -708,10 +765,10 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
#if SIZEOF_DATUM >= 8
@@ -723,7 +780,7 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplySignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -732,10 +789,10 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
#endif
@@ -747,7 +804,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyInt32SortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -756,10 +813,10 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
/*
@@ -886,8 +943,9 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
pg_rusage_init(&state->ru_start);
#endif
- state->sortopt = sortopt;
- state->tuples = true;
+ state->base.sortopt = sortopt;
+ state->base.tuples = true;
+ state->abbrevNext = 10;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -896,8 +954,8 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
* with very little memory.
*/
state->allowedMem = Max(workMem, 64) * (int64) 1024;
- state->sortcontext = sortcontext;
- state->maincontext = maincontext;
+ state->base.sortcontext = sortcontext;
+ state->base.maincontext = maincontext;
/*
* Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
@@ -956,7 +1014,7 @@ tuplesort_begin_batch(Tuplesortstate *state)
{
MemoryContext oldcontext;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->base.maincontext);
/*
* Caller tuple (e.g. IndexTuple) memory context.
@@ -971,14 +1029,14 @@ tuplesort_begin_batch(Tuplesortstate *state)
* generation.c context as this keeps allocations more compact with less
* wastage. Allocations are also slightly more CPU efficient.
*/
- if (state->sortopt & TUPLESORT_ALLOWBOUNDED)
- state->tuplecontext = AllocSetContextCreate(state->sortcontext,
- "Caller tuples",
- ALLOCSET_DEFAULT_SIZES);
+ if (state->base.sortopt & TUPLESORT_ALLOWBOUNDED)
+ state->base.tuplecontext = AllocSetContextCreate(state->base.sortcontext,
+ "Caller tuples",
+ ALLOCSET_DEFAULT_SIZES);
else
- state->tuplecontext = GenerationContextCreate(state->sortcontext,
- "Caller tuples",
- ALLOCSET_DEFAULT_SIZES);
+ state->base.tuplecontext = GenerationContextCreate(state->base.sortcontext,
+ "Caller tuples",
+ ALLOCSET_DEFAULT_SIZES);
state->status = TSS_INITIAL;
@@ -1034,10 +1092,11 @@ tuplesort_begin_heap(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
AssertArg(nkeys > 0);
@@ -1048,30 +1107,28 @@ tuplesort_begin_heap(TupleDesc tupDesc,
nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = nkeys;
+ base->nKeys = nkeys;
TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
false, /* no unique check */
nkeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
-
- state->removeabbrev = removeabbrev_heap;
- state->comparetup = comparetup_heap;
- state->writetup = writetup_heap;
- state->readtup = readtup_heap;
- state->haveDatum1 = true;
+ PARALLEL_SORT(coordinate));
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
- state->abbrevNext = 10;
+ base->removeabbrev = removeabbrev_heap;
+ base->comparetup = comparetup_heap;
+ base->writetup = writetup_heap;
+ base->readtup = readtup_heap;
+ base->haveDatum1 = true;
+ base->arg = tupDesc; /* assume we need not copy tupDesc */
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
for (i = 0; i < nkeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
AssertArg(attNums[i] != 0);
AssertArg(sortOperators[i] != 0);
@@ -1081,7 +1138,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortKey->ssup_nulls_first = nullsFirstFlags[i];
sortKey->ssup_attno = attNums[i];
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
}
@@ -1092,8 +1149,8 @@ tuplesort_begin_heap(TupleDesc tupDesc,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (nkeys == 1 && !state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (nkeys == 1 && !base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1108,13 +1165,16 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
BTScanInsert indexScanKey;
MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
int i;
Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1124,37 +1184,38 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
false, /* no unique check */
- state->nKeys,
+ base->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_cluster;
- state->comparetup = comparetup_cluster;
- state->writetup = writetup_cluster;
- state->readtup = readtup_cluster;
- state->abbrevNext = 10;
+ base->removeabbrev = removeabbrev_cluster;
+ base->comparetup = comparetup_cluster;
+ base->writetup = writetup_cluster;
+ base->readtup = readtup_cluster;
+ base->freestate = freestate_cluster;
+ base->arg = arg;
- state->indexInfo = BuildIndexInfo(indexRel);
+ arg->indexInfo = BuildIndexInfo(indexRel);
/*
* If we don't have a simple leading attribute, we don't currently
* initialize datum1, so disable optimizations that require it.
*/
- if (state->indexInfo->ii_IndexAttrNumbers[0] == 0)
- state->haveDatum1 = false;
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ base->haveDatum1 = false;
else
- state->haveDatum1 = true;
+ base->haveDatum1 = true;
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
indexScanKey = _bt_mkscankey(indexRel, NULL);
- if (state->indexInfo->ii_Expressions != NULL)
+ if (arg->indexInfo->ii_Expressions != NULL)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -1165,19 +1226,19 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
* TupleTableSlot to put the table tuples into. The econtext's
* scantuple has to point to that slot, too.
*/
- state->estate = CreateExecutorState();
+ arg->estate = CreateExecutorState();
slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(state->estate);
+ econtext = GetPerTupleExprContext(arg->estate);
econtext->ecxt_scantuple = slot;
}
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1187,7 +1248,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1215,11 +1276,14 @@ tuplesort_begin_index_btree(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
BTScanInsert indexScanKey;
+ TuplesortIndexBTreeArg *arg;
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1229,36 +1293,36 @@ tuplesort_begin_index_btree(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
enforceUnique,
state->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->abbrevNext = 10;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
- state->enforceUnique = enforceUnique;
- state->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
indexScanKey = _bt_mkscankey(indexRel, NULL);
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1268,7 +1332,7 @@ tuplesort_begin_index_btree(Relation heapRel,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1297,9 +1361,12 @@ tuplesort_begin_index_hash(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1313,20 +1380,21 @@ tuplesort_begin_index_hash(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* Only one sort column, the hash code */
+ base->nKeys = 1; /* Only one sort column, the hash code */
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_hash;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_hash;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
- state->high_mask = high_mask;
- state->low_mask = low_mask;
- state->max_buckets = max_buckets;
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
MemoryContextSwitchTo(oldcontext);
@@ -1342,10 +1410,13 @@ tuplesort_begin_index_gist(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
+ TuplesortIndexBTreeArg *arg;
int i;
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1354,31 +1425,34 @@ tuplesort_begin_index_gist(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
sortKey->ssup_cxt = CurrentMemoryContext;
sortKey->ssup_collation = indexRel->rd_indcollation[i];
sortKey->ssup_nulls_first = false;
sortKey->ssup_attno = i + 1;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1398,11 +1472,14 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg;
MemoryContext oldcontext;
int16 typlen;
bool typbyval;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1411,35 +1488,36 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* always a one-column sort */
+ base->nKeys = 1; /* always a one-column sort */
TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
false, /* no unique check */
1,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_datum;
- state->comparetup = comparetup_datum;
- state->writetup = writetup_datum;
- state->readtup = readtup_datum;
+ base->removeabbrev = removeabbrev_datum;
+ base->comparetup = comparetup_datum;
+ base->writetup = writetup_datum;
+ base->readtup = readtup_datum;
state->abbrevNext = 10;
- state->haveDatum1 = true;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->datumType = datumType;
+ arg->datumType = datumType;
/* lookup necessary attributes of the datum type */
get_typlenbyval(datumType, &typlen, &typbyval);
- state->datumTypeLen = typlen;
- state->tuples = !typbyval;
+ arg->datumTypeLen = typlen;
+ base->tuples = !typbyval;
/* Prepare SortSupport data */
- state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
- state->sortKeys->ssup_cxt = CurrentMemoryContext;
- state->sortKeys->ssup_collation = sortCollation;
- state->sortKeys->ssup_nulls_first = nullsFirstFlag;
+ base->sortKeys->ssup_cxt = CurrentMemoryContext;
+ base->sortKeys->ssup_collation = sortCollation;
+ base->sortKeys->ssup_nulls_first = nullsFirstFlag;
/*
* Abbreviation is possible here only for by-reference types. In theory,
@@ -1449,9 +1527,9 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* can't, because a datum sort only stores a single copy of the datum; the
* "tuple" field of each SortTuple is NULL.
*/
- state->sortKeys->abbreviate = !typbyval;
+ base->sortKeys->abbreviate = !typbyval;
- PrepareSortSupportFromOrderingOp(sortOperator, state->sortKeys);
+ PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
/*
* The "onlyKey" optimization cannot be used with abbreviated keys, since
@@ -1459,8 +1537,8 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (!state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (!base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1485,7 +1563,7 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
/* Assert we're called before loading any tuples */
Assert(state->status == TSS_INITIAL && state->memtupcount == 0);
/* Assert we allow bounded sorts */
- Assert(state->sortopt & TUPLESORT_ALLOWBOUNDED);
+ Assert(state->base.sortopt & TUPLESORT_ALLOWBOUNDED);
/* Can't set the bound twice, either */
Assert(!state->bounded);
/* Also, this shouldn't be called in a parallel worker */
@@ -1513,13 +1591,13 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
* optimization. Disable by setting state to be consistent with no
* abbreviation support.
*/
- state->sortKeys->abbrev_converter = NULL;
- if (state->sortKeys->abbrev_full_comparator)
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->base.sortKeys->abbrev_converter = NULL;
+ if (state->base.sortKeys->abbrev_full_comparator)
+ state->base.sortKeys->comparator = state->base.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->base.sortKeys->abbrev_abort = NULL;
+ state->base.sortKeys->abbrev_full_comparator = NULL;
}
/*
@@ -1542,7 +1620,7 @@ static void
tuplesort_free(Tuplesortstate *state)
{
/* context swap probably not needed, but let's be safe */
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
#ifdef TRACE_SORT
long spaceUsed;
@@ -1589,21 +1667,13 @@ tuplesort_free(Tuplesortstate *state)
TRACE_POSTGRESQL_SORT_DONE(state->tapeset != NULL, 0L);
#endif
- /* Free any execution state created for CLUSTER case */
- if (state->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(state->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(state->estate);
- }
-
+ FREESTATE(state);
MemoryContextSwitchTo(oldcontext);
/*
* Free the per-sort memory context, thereby releasing all working memory.
*/
- MemoryContextReset(state->sortcontext);
+ MemoryContextReset(state->base.sortcontext);
}
/*
@@ -1624,7 +1694,7 @@ tuplesort_end(Tuplesortstate *state)
* Free the main memory context, including the Tuplesortstate struct
* itself.
*/
- MemoryContextDelete(state->maincontext);
+ MemoryContextDelete(state->base.maincontext);
}
/*
@@ -1838,7 +1908,9 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) base->arg;
SortTuple stup;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1850,12 +1922,12 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup.datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ tupDesc,
&stup.isnull1);
puttuple_common(state, &stup,
- state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1869,7 +1941,9 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
@@ -1879,16 +1953,16 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* set up first-column key value, and potentially abbreviate, if it's a
* simple column
*/
- if (state->haveDatum1)
+ if (base->haveDatum1)
{
stup.datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup.isnull1);
}
puttuple_common(state, &stup,
- state->haveDatum1 && state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->haveDatum1 && base->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1904,19 +1978,21 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
SortTuple stup;
IndexTuple tuple;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
- isnull, state->tuplecontext);
+ isnull, base->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup.isnull1);
puttuple_common(state, &stup,
- state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->sortKeys && base->sortKeys->abbrev_converter && !stup.isnull1);
}
/*
@@ -1927,7 +2003,9 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
void
tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
SortTuple stup;
/*
@@ -1942,7 +2020,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* identical to stup.tuple.
*/
- if (isNull || !state->tuples)
+ if (isNull || !base->tuples)
{
/*
* Set datum1 to zeroed representation for NULLs (to be consistent,
@@ -1955,12 +2033,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
else
{
stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
}
puttuple_common(state, &stup,
- state->tuples && !isNull && state->sortKeys->abbrev_converter);
+ base->tuples && !isNull && base->sortKeys->abbrev_converter);
MemoryContextSwitchTo(oldcontext);
}
@@ -1971,7 +2049,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
Assert(!LEADER(state));
@@ -1993,8 +2071,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
else if (!consider_abort_common(state))
{
/* Store abbreviated key representation */
- tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
- state->sortKeys);
+ tuple->datum1 = state->base.sortKeys->abbrev_converter(tuple->datum1,
+ state->base.sortKeys);
}
else
{
@@ -2128,7 +2206,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
static void
writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- state->writetup(state, tape, stup);
+ state->base.writetup(state, tape, stup);
if (!state->slabAllocatorUsed && stup->tuple)
{
@@ -2140,9 +2218,9 @@ writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
static bool
consider_abort_common(Tuplesortstate *state)
{
- Assert(state->sortKeys[0].abbrev_converter != NULL);
- Assert(state->sortKeys[0].abbrev_abort != NULL);
- Assert(state->sortKeys[0].abbrev_full_comparator != NULL);
+ Assert(state->base.sortKeys[0].abbrev_converter != NULL);
+ Assert(state->base.sortKeys[0].abbrev_abort != NULL);
+ Assert(state->base.sortKeys[0].abbrev_full_comparator != NULL);
/*
* Check effectiveness of abbreviation optimization. Consider aborting
@@ -2157,19 +2235,19 @@ consider_abort_common(Tuplesortstate *state)
* Check opclass-supplied abbreviation abort routine. It may indicate
* that abbreviation should not proceed.
*/
- if (!state->sortKeys->abbrev_abort(state->memtupcount,
- state->sortKeys))
+ if (!state->base.sortKeys->abbrev_abort(state->memtupcount,
+ state->base.sortKeys))
return false;
/*
* Finally, restore authoritative comparator, and indicate that
* abbreviation is not in play by setting abbrev_converter to NULL
*/
- state->sortKeys[0].comparator = state->sortKeys[0].abbrev_full_comparator;
- state->sortKeys[0].abbrev_converter = NULL;
+ state->base.sortKeys[0].comparator = state->base.sortKeys[0].abbrev_full_comparator;
+ state->base.sortKeys[0].abbrev_converter = NULL;
/* Not strictly necessary, but be tidy */
- state->sortKeys[0].abbrev_abort = NULL;
- state->sortKeys[0].abbrev_full_comparator = NULL;
+ state->base.sortKeys[0].abbrev_abort = NULL;
+ state->base.sortKeys[0].abbrev_full_comparator = NULL;
/* Give up - expect original pass-by-value representation */
return true;
@@ -2184,7 +2262,7 @@ consider_abort_common(Tuplesortstate *state)
void
tuplesort_performsort(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
#ifdef TRACE_SORT
if (trace_sort)
@@ -2304,7 +2382,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
switch (state->status)
{
case TSS_SORTEDINMEM:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->base.sortopt & TUPLESORT_RANDOMACCESS);
Assert(!state->slabAllocatorUsed);
if (forward)
{
@@ -2348,7 +2426,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
break;
case TSS_SORTEDONTAPE:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->base.sortopt & TUPLESORT_RANDOMACCESS);
Assert(state->slabAllocatorUsed);
/*
@@ -2550,7 +2628,8 @@ bool
tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2561,7 +2640,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
if (stup.tuple)
{
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (base->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
if (copy)
@@ -2586,7 +2665,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2606,7 +2686,8 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2636,7 +2717,9 @@ bool
tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2649,10 +2732,10 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
MemoryContextSwitchTo(oldcontext);
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (base->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (stup.isnull1 || !state->tuples)
+ if (stup.isnull1 || !base->tuples)
{
*val = stup.datum1;
*isNull = stup.isnull1;
@@ -2660,7 +2743,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
*isNull = false;
}
@@ -2713,7 +2796,7 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
* We could probably optimize these cases better, but for now it's
* not worth the trouble.
*/
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
while (ntuples-- > 0)
{
SortTuple stup;
@@ -2989,7 +3072,7 @@ mergeruns(Tuplesortstate *state)
Assert(state->status == TSS_BUILDRUNS);
Assert(state->memtupcount == 0);
- if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
+ if (state->base.sortKeys != NULL && state->base.sortKeys->abbrev_converter != NULL)
{
/*
* If there are multiple runs to be merged, when we go to read back
@@ -2997,19 +3080,19 @@ mergeruns(Tuplesortstate *state)
* we don't care to regenerate them. Disable abbreviation from this
* point on.
*/
- state->sortKeys->abbrev_converter = NULL;
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->base.sortKeys->abbrev_converter = NULL;
+ state->base.sortKeys->comparator = state->base.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->base.sortKeys->abbrev_abort = NULL;
+ state->base.sortKeys->abbrev_full_comparator = NULL;
}
/*
* Reset tuple memory. We've freed all the tuples that we previously
* allocated. We will use the slab allocator from now on.
*/
- MemoryContextResetOnly(state->tuplecontext);
+ MemoryContextResetOnly(state->base.tuplecontext);
/*
* We no longer need a large memtuples array. (We will allocate a smaller
@@ -3032,7 +3115,7 @@ mergeruns(Tuplesortstate *state)
* From this point on, we no longer use the USEMEM()/LACKMEM() mechanism
* to track memory usage of individual tuples.
*/
- if (state->tuples)
+ if (state->base.tuples)
init_slab_allocator(state, state->nOutputTapes + 1);
else
init_slab_allocator(state, 0);
@@ -3046,7 +3129,7 @@ mergeruns(Tuplesortstate *state)
* number of input tapes will not increase between passes.)
*/
state->memtupsize = state->nOutputTapes;
- state->memtuples = (SortTuple *) MemoryContextAlloc(state->maincontext,
+ state->memtuples = (SortTuple *) MemoryContextAlloc(state->base.maincontext,
state->nOutputTapes * sizeof(SortTuple));
USEMEM(state, GetMemoryChunkSpace(state->memtuples));
@@ -3123,7 +3206,7 @@ mergeruns(Tuplesortstate *state)
* sorted tape, we can stop at this point and do the final merge
* on-the-fly.
*/
- if ((state->sortopt & TUPLESORT_RANDOMACCESS) == 0
+ if ((state->base.sortopt & TUPLESORT_RANDOMACCESS) == 0
&& state->nInputRuns <= state->nInputTapes
&& !WORKER(state))
{
@@ -3349,7 +3432,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
* AllocSetFree's bucketing by size class might be particularly bad if
* this step wasn't taken.
*/
- MemoryContextReset(state->tuplecontext);
+ MemoryContextReset(state->base.tuplecontext);
markrunend(state->destTape);
@@ -3367,9 +3450,9 @@ dumptuples(Tuplesortstate *state, bool alltuples)
void
tuplesort_rescan(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3400,9 +3483,9 @@ tuplesort_rescan(Tuplesortstate *state)
void
tuplesort_markpos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3431,9 +3514,9 @@ tuplesort_markpos(Tuplesortstate *state)
void
tuplesort_restorepos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3649,9 +3732,9 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
*/
- if (state->haveDatum1 && state->sortKeys)
+ if (state->base.haveDatum1 && state->base.sortKeys)
{
- if (state->sortKeys[0].comparator == ssup_datum_unsigned_cmp)
+ if (state->base.sortKeys[0].comparator == ssup_datum_unsigned_cmp)
{
qsort_tuple_unsigned(state->memtuples,
state->memtupcount,
@@ -3659,7 +3742,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#if SIZEOF_DATUM >= 8
- else if (state->sortKeys[0].comparator == ssup_datum_signed_cmp)
+ else if (state->base.sortKeys[0].comparator == ssup_datum_signed_cmp)
{
qsort_tuple_signed(state->memtuples,
state->memtupcount,
@@ -3667,7 +3750,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#endif
- else if (state->sortKeys[0].comparator == ssup_datum_int32_cmp)
+ else if (state->base.sortKeys[0].comparator == ssup_datum_int32_cmp)
{
qsort_tuple_int32(state->memtuples,
state->memtupcount,
@@ -3677,16 +3760,16 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
}
/* Can we use the single-key sort function? */
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
{
qsort_ssup(state->memtuples, state->memtupcount,
- state->onlyKey);
+ state->base.onlyKey);
}
else
{
qsort_tuple(state->memtuples,
state->memtupcount,
- state->comparetup,
+ state->base.comparetup,
state);
}
}
@@ -3803,10 +3886,10 @@ tuplesort_heap_replace_top(Tuplesortstate *state, SortTuple *tuple)
static void
reversedirection(Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ SortSupport sortKey = state->base.sortKeys;
int nkey;
- for (nkey = 0; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 0; nkey < state->base.nKeys; nkey++, sortKey++)
{
sortKey->ssup_reverse = !sortKey->ssup_reverse;
sortKey->ssup_nulls_first = !sortKey->ssup_nulls_first;
@@ -3857,7 +3940,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
Assert(state->slabFreeHead);
if (tuplen > SLAB_SLOT_SIZE || !state->slabFreeHead)
- return MemoryContextAlloc(state->sortcontext, tuplen);
+ return MemoryContextAlloc(state->base.sortcontext, tuplen);
else
{
buf = state->slabFreeHead;
@@ -3877,6 +3960,7 @@ static void
removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
{
int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
for (i = 0; i < count; i++)
{
@@ -3887,8 +3971,8 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
MINIMAL_TUPLE_OFFSET);
stups[i].datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
&stups[i].isnull1);
}
}
@@ -3896,7 +3980,8 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys;
HeapTupleData ltup;
HeapTupleData rtup;
TupleDesc tupDesc;
@@ -3921,7 +4006,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = state->tupDesc;
+ tupDesc = (TupleDesc) base->arg;
if (sortKey->abbrev_converter)
{
@@ -3938,7 +4023,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
}
sortKey++;
- for (nkey = 1; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
{
attno = sortKey->ssup_attno;
@@ -3958,6 +4043,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MinimalTuple tuple = (MinimalTuple) stup->tuple;
/* the part of the MinimalTuple we'll write: */
@@ -3969,8 +4055,7 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -3982,21 +4067,21 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
HeapTupleData htup;
/* read in the tuple proper */
tuple->t_len = tuplen;
LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
&stup->isnull1);
}
@@ -4009,6 +4094,8 @@ static void
removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
{
int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
for (i = 0; i < count; i++)
{
@@ -4016,8 +4103,8 @@ removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
tup = (HeapTuple) stups[i].tuple;
stups[i].datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stups[i].isnull1);
}
}
@@ -4026,7 +4113,9 @@ static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
HeapTuple ltup;
HeapTuple rtup;
TupleDesc tupDesc;
@@ -4040,10 +4129,10 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
/* Be prepared to compare additional sort keys */
ltup = (HeapTuple) a->tuple;
rtup = (HeapTuple) b->tuple;
- tupDesc = state->tupDesc;
+ tupDesc = arg->tupDesc;
/* Compare the leading sort key, if it's simple */
- if (state->haveDatum1)
+ if (base->haveDatum1)
{
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -4053,7 +4142,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
if (sortKey->abbrev_converter)
{
- AttrNumber leading = state->indexInfo->ii_IndexAttrNumbers[0];
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
@@ -4062,7 +4151,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
datum2, isnull2,
sortKey);
}
- if (compare != 0 || state->nKeys == 1)
+ if (compare != 0 || base->nKeys == 1)
return compare;
/* Compare additional columns the hard way */
sortKey++;
@@ -4074,13 +4163,13 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
nkey = 0;
}
- if (state->indexInfo->ii_Expressions == NULL)
+ if (arg->indexInfo->ii_Expressions == NULL)
{
/* If not expression index, just compare the proper heap attrs */
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < base->nKeys; nkey++, sortKey++)
{
- AttrNumber attno = state->indexInfo->ii_IndexAttrNumbers[nkey];
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
@@ -4107,19 +4196,19 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
TupleTableSlot *ecxt_scantuple;
/* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(state->estate);
+ ResetPerTupleExprContext(arg->estate);
- ecxt_scantuple = GetPerTupleExprContext(state->estate)->ecxt_scantuple;
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
l_index_values, l_index_isnull);
ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
r_index_values, r_index_isnull);
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < base->nKeys; nkey++, sortKey++)
{
compare = ApplySortComparator(l_index_values[nkey],
l_index_isnull[nkey],
@@ -4137,6 +4226,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
HeapTuple tuple = (HeapTuple) stup->tuple;
unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
@@ -4144,8 +4234,7 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
}
@@ -4153,6 +4242,8 @@ static void
readtup_cluster(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int tuplen)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
HeapTuple tuple = (HeapTuple) readtup_alloc(state,
t_len + HEAPTUPLESIZE);
@@ -4165,18 +4256,33 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
tuple->t_tableOid = InvalidOid;
/* Read in the tuple body */
LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value, if it's a simple column */
- if (state->haveDatum1)
+ if (base->haveDatum1)
stup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
/*
* Routines specialized for IndexTuple case
*
@@ -4188,6 +4294,8 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
static void
removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
int i;
for (i = 0; i < count; i++)
@@ -4197,7 +4305,7 @@ removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
tuple = stups[i].tuple;
stups[i].datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stups[i].isnull1);
}
}
@@ -4211,7 +4319,9 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* is also special handling for enforcing uniqueness, and special
* treatment for equal keys at the end.
*/
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
IndexTuple tuple1;
IndexTuple tuple2;
int keysz;
@@ -4235,8 +4345,8 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
/* Compare additional sort keys */
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
- keysz = state->nKeys;
- tupDes = RelationGetDescr(state->indexRel);
+ keysz = base->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
if (sortKey->abbrev_converter)
{
@@ -4281,7 +4391,7 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* sort algorithm wouldn't have checked whether one must appear before the
* other.
*/
- if (state->enforceUnique && !(!state->uniqueNullsNotDistinct && equal_hasnull))
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
@@ -4297,16 +4407,16 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
index_deform_tuple(tuple1, tupDes, values, isnull);
- key_desc = BuildIndexValueDescription(state->indexRel, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(state->indexRel)),
+ RelationGetRelationName(arg->index.indexRel)),
key_desc ? errdetail("Key %s is duplicated.", key_desc) :
errdetail("Duplicate keys exist."),
- errtableconstraint(state->heapRel,
- RelationGetRelationName(state->indexRel))));
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
}
/*
@@ -4344,6 +4454,8 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Bucket bucket2;
IndexTuple tuple1;
IndexTuple tuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
/*
* Fetch hash keys and mask off bits we don't want to sort by. We know
@@ -4351,12 +4463,12 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
*/
Assert(!a->isnull1);
bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
Assert(!b->isnull1);
bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
if (bucket1 > bucket2)
return 1;
else if (bucket1 < bucket2)
@@ -4394,14 +4506,14 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
IndexTuple tuple = (IndexTuple) stup->tuple;
unsigned int tuplen;
tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -4409,18 +4521,19 @@ static void
readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
unsigned int tuplen = len - sizeof(unsigned int);
IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, tuple, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4440,20 +4553,21 @@ removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
int compare;
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- state->sortKeys);
+ base->sortKeys);
if (compare != 0)
return compare;
/* if we have abbreviations, then "tuple" has the original value */
- if (state->sortKeys->abbrev_converter)
+ if (base->sortKeys->abbrev_converter)
compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
PointerGetDatum(b->tuple), b->isnull1,
- state->sortKeys);
+ base->sortKeys);
return compare;
}
@@ -4461,6 +4575,8 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
void *waddr;
unsigned int tuplen;
unsigned int writtenlen;
@@ -4470,7 +4586,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
waddr = NULL;
tuplen = 0;
}
- else if (!state->tuples)
+ else if (!base->tuples)
{
waddr = &stup->datum1;
tuplen = sizeof(Datum);
@@ -4478,7 +4594,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
else
{
waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, state->datumTypeLen);
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
Assert(tuplen != 0);
}
@@ -4486,8 +4602,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
LogicalTapeWrite(tape, waddr, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
}
@@ -4495,6 +4610,7 @@ static void
readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
unsigned int tuplen = len - sizeof(unsigned int);
if (tuplen == 0)
@@ -4504,7 +4620,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->isnull1 = true;
stup->tuple = NULL;
}
- else if (!state->tuples)
+ else if (!base->tuples)
{
Assert(tuplen == sizeof(Datum));
LogicalTapeReadExact(tape, &stup->datum1, tuplen);
@@ -4521,8 +4637,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->tuple = raddr;
}
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 34a76ceb60f..1f88be06aa1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2833,8 +2833,14 @@ TupleHashTable
TupleQueueReader
TupleTableSlot
TupleTableSlotOps
+TuplesortClusterArg
+TuplesortDatumArg
+TuplesortIndexArg
+TuplesortIndexBTreeArg
+TuplesortIndexHashArg
TuplesortInstrumentation
TuplesortMethod
+TuplesortPublic
TuplesortSpaceType
Tuplesortstate
Tuplestorestate
--
2.24.3 (Apple Git-128)
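The hunks above repeatedly replace direct Tuplesortstate fields with a per-variant struct reached through base->arg (TuplesortIndexBTreeArg, TuplesortIndexHashArg, TuplesortDatumArg, and so on), which each callback casts back to its own type. A rough standalone illustration of that shape only, with hypothetical names rather than the PostgreSQL API:

#include <stdio.h>

/* Hypothetical generic state: common fields plus an opaque per-variant
 * argument, loosely mirroring the base->arg idiom in the patch. */
typedef struct SortState
{
    int         nKeys;
    int       (*comparetup) (const void *a, const void *b,
                             struct SortState *state);
    void       *arg;            /* variant-specific state */
} SortState;

/* Variant-specific argument struct, filled in by the variant's "begin"
 * routine and read back by its callbacks. */
typedef struct
{
    int         reverse;        /* sort descending? */
} IntSortArg;

static int
comparetup_int(const void *a, const void *b, struct SortState *state)
{
    IntSortArg *arg = (IntSortArg *) state->arg;
    int         av = *(const int *) a;
    int         bv = *(const int *) b;
    int         cmp = (av > bv) - (av < bv);

    return arg->reverse ? -cmp : cmp;
}

int
main(void)
{
    IntSortArg  arg = {1};
    SortState   state = {1, comparetup_int, &arg};
    int         data[] = {3, 1, 2};

    /* Tiny insertion sort driven only through the generic interface */
    for (int i = 1; i < 3; i++)
        for (int j = i; j > 0 &&
             state.comparetup(&data[j - 1], &data[j], &state) > 0; j--)
        {
            int         tmp = data[j];

            data[j] = data[j - 1];
            data[j - 1] = tmp;
        }

    for (int i = 0; i < 3; i++)
        printf("%d\n", data[i]);
    return 0;
}

The generic code never looks inside arg; only the variant's own callbacks do, which is what lets the variant-specific fields leave the shared state.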
0004-Move-memory-management-away-from-writetup-and-tup-v2.patch
From e12d74ce7fb1fd79f433945113aa948f64e8cc2c Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH 4/6] Move memory management away from writetup() and
tuplesort_put*()
This commit moves some generic work out of the sort-variant-specific functions.
In particular, tuplesort_put*() no longer needs to decrease available memory
and switch to the sort context before calling puttuple_common(), and writetup()
no longer needs to free SortTuple.tuple and increase available memory.
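A minimal standalone sketch of that division of labour, using hypothetical names and plain malloc/free in place of the memory-context and LogicalTape machinery: the generic wrapper does the accounting, the variant callback only serializes.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified write state: a memory budget plus one
 * variant-specific serialization callback. */
typedef struct WriteState
{
    long        availMem;       /* remaining memory budget, in bytes */
    void      (*writetup) (struct WriteState *state, FILE *tape,
                           const void *tuple, size_t len);
} WriteState;

/* Variant-specific writer: serialization only, no memory bookkeeping. */
static void
writetup_bytes(struct WriteState *state, FILE *tape,
               const void *tuple, size_t len)
{
    fwrite(&len, sizeof(len), 1, tape);
    fwrite(tuple, 1, len, tape);
}

/* Generic wrapper: once the variant has written the tuple, free the
 * out-of-line data and credit the memory budget, so no per-variant
 * writer has to repeat that logic. */
static void
writetuple(WriteState *state, FILE *tape, void *tuple, size_t len)
{
    state->writetup(state, tape, tuple, len);
    state->availMem += (long) len;
    free(tuple);
}

int
main(void)
{
    WriteState  state = {1024, writetup_bytes};
    FILE       *tape = tmpfile();
    size_t      len = 6;
    char       *tuple = malloc(len);

    if (tape == NULL || tuple == NULL)
        return 1;
    memcpy(tuple, "hello", len);
    state.availMem -= (long) len;   /* charged once, in the common put path */

    writetuple(&state, tape, tuple, len);
    printf("availMem restored to %ld\n", state.availMem);
    fclose(tape);
    return 0;
}

The patch applies the same shape to the real code by centralizing FREEMEM() and pfree() in the new writetuple() wrapper, as the hunks below show.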
---
src/backend/utils/sort/tuplesort.c | 78 +++++++++++++-----------------
1 file changed, 33 insertions(+), 45 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 828efe701e5..c8c511fb8c5 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -288,11 +288,7 @@ struct Tuplesortstate
/*
* Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory; requirements on
- * the tape representation are given below. Unless the slab allocator is
- * used, after writing the tuple, pfree() the out-of-line data (not the
- * SortTuple struct!), and increase state->availMem by the amount of
- * memory space thereby released.
+ * tuple on tape need not be the same as it is in memory.
*/
void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
@@ -549,7 +545,7 @@ struct Sharedsort
#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
+#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
@@ -618,6 +614,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
static void tuplesort_begin_batch(Tuplesortstate *state);
static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
bool useAbbrev);
+static void writetuple(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1848,7 +1846,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* copy the tuple into sort storage */
tuple = ExecCopySlotMinimalTuple(slot);
stup.tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
@@ -1857,8 +1854,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
state->tupDesc,
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup,
state->sortKeys->abbrev_converter && !stup.isnull1);
@@ -1879,9 +1874,6 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
stup.tuple = (void *) tup;
- USEMEM(state, GetMemoryChunkSpace(tup));
-
- MemoryContextSwitchTo(state->sortcontext);
/*
* set up first-column key value, and potentially abbreviate, if it's a
@@ -1910,7 +1902,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
ItemPointer self, Datum *values,
bool *isnull)
{
- MemoryContext oldcontext;
SortTuple stup;
IndexTuple tuple;
@@ -1918,19 +1909,14 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
isnull, state->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
RelationGetDescr(state->indexRel),
&stup.isnull1);
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup,
state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
}
/*
@@ -1965,15 +1951,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
stup.datum1 = !isNull ? val : (Datum) 0;
stup.isnull1 = isNull;
stup.tuple = NULL; /* no separate storage */
- MemoryContextSwitchTo(state->sortcontext);
}
else
{
stup.isnull1 = false;
stup.datum1 = datumCopy(val, false, state->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
- MemoryContextSwitchTo(state->sortcontext);
}
puttuple_common(state, &stup,
@@ -1988,8 +1971,14 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
Assert(!LEADER(state));
+ /* Count the size of the out-of-line data */
+ if (tuple->tuple != NULL)
+ USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+
if (!useAbbrev)
{
/*
@@ -2062,6 +2051,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
pg_rusage_show(&state->ru_start));
#endif
make_bounded_heap(state);
+ MemoryContextSwitchTo(oldcontext);
return;
}
@@ -2069,7 +2059,10 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
* Done if we still fit in available memory and have array slots.
*/
if (state->memtupcount < state->memtupsize && !LACKMEM(state))
+ {
+ MemoryContextSwitchTo(oldcontext);
return;
+ }
/*
* Nope; time to switch to tape-based operation.
@@ -2123,6 +2116,25 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
elog(ERROR, "invalid tuplesort state");
break;
}
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Write a stored tuple onto tape. Unless the slab allocator is
+ * used, after writing the tuple, pfree() the out-of-line data (not the
+ * SortTuple struct!), and increase state->availMem by the amount of
+ * memory space thereby released.
+ */
+static void
+writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ state->writetup(state, tape, stup);
+
+ if (!state->slabAllocatorUsed && stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }
}
static bool
@@ -3960,12 +3972,6 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_free_minimal_tuple(tuple);
- }
}
static void
@@ -4141,12 +4147,6 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_freetuple(tuple);
- }
}
static void
@@ -4403,12 +4403,6 @@ writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- pfree(tuple);
- }
}
static void
@@ -4495,12 +4489,6 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-
- if (!state->slabAllocatorUsed && stup->tuple)
- {
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
- }
}
static void
--
2.24.3 (Apple Git-128)
0006-Split-tuplesortvariants.c-from-tuplesort.c-v2.patch
From d43afcde5982e01ce2acbcbd5826ceec3475b030 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH 6/6] Split tuplesortvariants.c from tuplesort.c
This commit moves the implementation of the tuple sort variants into a separate
file, tuplesortvariants.c. That gives better separation of the code and
demonstrates that a tuple sort variant can be defined outside of tuplesort.c.
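As a sketch only, with simplified hypothetical types rather than the real TuplesortPublic interface, the split amounts to a generic driver that calls through a callback struct, while each variant supplies the callbacks and could live in a different file or an extension:

#include <stdio.h>
#include <string.h>

/*
 * "Core" side: a generic driver that knows nothing beyond the callback
 * struct (hypothetical names, analogous in spirit to tuplesort.c after
 * the split).
 */
typedef struct SortOps
{
    int       (*comparetup) (const void *a, const void *b);
    void      (*writetup) (FILE *tape, const void *tuple);
} SortOps;

static void
perform_sort(const SortOps *ops, const void **tuples, int n, FILE *tape)
{
    /* insertion sort through the comparator, then write every tuple */
    for (int i = 1; i < n; i++)
        for (int j = i; j > 0 &&
             ops->comparetup(tuples[j - 1], tuples[j]) > 0; j--)
        {
            const void *tmp = tuples[j];

            tuples[j] = tuples[j - 1];
            tuples[j - 1] = tmp;
        }
    for (int i = 0; i < n; i++)
        ops->writetup(tape, tuples[i]);
}

/*
 * "Variant" side: everything specific to sorting C strings, which could
 * just as well be compiled in another file or an extension.
 */
static int
comparetup_cstring(const void *a, const void *b)
{
    return strcmp((const char *) a, (const char *) b);
}

static void
writetup_cstring(FILE *tape, const void *tuple)
{
    fprintf(tape, "%s\n", (const char *) tuple);
}

static const SortOps cstring_ops = {comparetup_cstring, writetup_cstring};

int
main(void)
{
    const void *names[] = {"gamma", "alpha", "beta"};

    perform_sort(&cstring_ops, names, 3, stdout);
    return 0;
}

The real interface carries more callbacks (readtup, removeabbrev, freestate) and the memory-context plumbing, but the dependency direction is the same: the generic code only calls through the struct.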
---
src/backend/utils/sort/Makefile | 1 +
src/backend/utils/sort/tuplesort.c | 1721 +-------------------
src/backend/utils/sort/tuplesortvariants.c | 1572 ++++++++++++++++++
src/include/utils/tuplesort.h | 221 ++-
4 files changed, 1774 insertions(+), 1741 deletions(-)
create mode 100644 src/backend/utils/sort/tuplesortvariants.c
diff --git a/src/backend/utils/sort/Makefile b/src/backend/utils/sort/Makefile
index 26f65fcaf7a..8e87699fdd2 100644
--- a/src/backend/utils/sort/Makefile
+++ b/src/backend/utils/sort/Makefile
@@ -19,6 +19,7 @@ OBJS = \
sharedtuplestore.o \
sortsupport.o \
tuplesort.o \
+ tuplesortvariants.o \
tuplestore.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 4a9aeb8799a..00abbe56dff 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -100,35 +100,17 @@
#include <limits.h>
-#include "access/hash.h"
-#include "access/htup_details.h"
-#include "access/nbtree.h"
-#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablespace.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "pg_trace.h"
-#include "utils/datum.h"
-#include "utils/logtape.h"
-#include "utils/lsyscache.h"
+#include "storage/shmem.h"
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/rel.h"
-#include "utils/sortsupport.h"
#include "utils/tuplesort.h"
-
-/* sort-type codes for sort__start probes */
-#define HEAP_SORT 0
-#define INDEX_SORT 1
-#define DATUM_SORT 2
-#define CLUSTER_SORT 3
-
-/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
- (coordinate)->isWorker >= 0 ? 1 : 2)
-
/*
* Initial size of memtuples array. We're trying to select this size so that
* array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
@@ -149,43 +131,6 @@ bool optimize_bounded_sort = true;
#endif
-/*
- * The objects we actually sort are SortTuple structs. These contain
- * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
- * which is a separate palloc chunk --- we assume it is just one chunk and
- * can be freed by a simple pfree() (except during merge, when we use a
- * simple slab allocator). SortTuples also contain the tuple's first key
- * column in Datum/nullflag format, and a source/input tape number that
- * tracks which tape each heap element/slot belongs to during merging.
- *
- * Storing the first key column lets us save heap_getattr or index_getattr
- * calls during tuple comparisons. We could extract and save all the key
- * columns not just the first, but this would increase code complexity and
- * overhead, and wouldn't actually save any comparison cycles in the common
- * case where the first key determines the comparison result. Note that
- * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
- *
- * There is one special case: when the sort support infrastructure provides an
- * "abbreviated key" representation, where the key is (typically) a pass by
- * value proxy for a pass by reference type. In this case, the abbreviated key
- * is stored in datum1 in place of the actual first key column.
- *
- * When sorting single Datums, the data value is represented directly by
- * datum1/isnull1 for pass by value types (or null values). If the datatype is
- * pass-by-reference and isnull1 is false, then "tuple" points to a separately
- * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
- * either the same pointer as "tuple", or is an abbreviated key value as
- * described above. Accordingly, "tuple" is always used in preference to
- * datum1 as the authoritative value for pass-by-reference cases.
- */
-typedef struct
-{
- void *tuple; /* the tuple itself */
- Datum datum1; /* value of first key column */
- bool isnull1; /* is first key column NULL? */
- int srctape; /* source tape number */
-} SortTuple;
-
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
* tuples. To avoid palloc/pfree overhead.
@@ -236,155 +181,6 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
-typedef struct TuplesortPublic TuplesortPublic;
-
-typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-
-/*
- * The public part of a Tuple sort operation state. This data structure
- * containsthe definition of sort-variant-specific interface methods and
- * the part of Tuple sort operation state required by their implementations.
- */
-struct TuplesortPublic
-{
- /*
- * These function pointers decouple the routines that must know what kind
- * of tuple we are sorting from the routines that don't need to know it.
- * They are set up by the tuplesort_begin_xxx routines.
- *
- * Function to compare two tuples; result is per qsort() convention, ie:
- * <0, 0, >0 according as a<b, a=b, a>b. The API must match
- * qsort_arg_comparator.
- */
- SortTupleComparator comparetup;
-
- /*
- * Alter datum1 representation in the SortTuple's array back from the
- * abbreviated key to the first column value.
- */
- void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
- int count);
-
- /*
- * Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory.
- */
- void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-
- /*
- * Function to read a stored tuple from tape back into memory. 'len' is
- * the already-read length of the stored tuple. The tuple is allocated
- * from the slab memory arena, or is palloc'd, see readtup_alloc().
- */
- void (*readtup) (Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-
- /*
- * Function to do some specific release of resources for the sort variant.
- * In particular, this function should free everything stored in the "arg"
- * field, which wouldn't be cleared on reset of the Tuple sort memory
- * contextes. This can be NULL if nothing specific needs to be done.
- */
- void (*freestate) (Tuplesortstate *state);
-
- /*
- * The subsequent fields are used in the implementations of the functions
- * above.
- */
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-
- /*
- * Whether SortTuple's datum1 and isnull1 members are maintained by the
- * above routines. If not, some sort specializations are disabled.
- */
- bool haveDatum1;
-
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- int nKeys; /* number of columns in sort key */
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
- int sortopt; /* Bitmask of flags used to setup sort */
-
- bool tuples; /* Can SortTuple.tuple ever be set? */
-
- void *arg; /* Specific information for the sort variant */
-};
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
- * the tuplesort_begin_cluster.
- */
-typedef struct
-{
- TupleDesc tupDesc;
-
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-} TuplesortClusterArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the IndexTuple case.
- * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
-typedef struct
-{
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-} TuplesortIndexArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_btree subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-} TuplesortIndexBTreeArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_hash subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-} TuplesortIndexHashArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the Datum case.
- * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
- */
-typedef struct
-{
- /* the datatype oid of Datum's to be sorted */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-} TuplesortDatumArg;
/*
* Private state of a Tuplesort operation.
@@ -596,8 +392,6 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state);
-
#define REMOVEABBREV(state,stup,count) ((*(state)->base.removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->base.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
@@ -656,20 +450,8 @@ struct Sharedsort
* begins).
*/
-/* When using this macro, beware of double evaluation of len */
-#define LogicalTapeReadExact(tape, ptr, len) \
- do { \
- if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
- elog(ERROR, "unexpected end of data"); \
- } while(0)
-
-static Tuplesortstate *tuplesort_begin_common(int workMem,
- SortCoordinate coordinate,
- int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
- bool useAbbrev);
static void writetuple(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
@@ -691,42 +473,6 @@ static void tuplesort_heap_delete_top(Tuplesortstate *state);
static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
-static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
-static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
- int count);
-static int comparetup_heap(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_datum(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -897,7 +643,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* sort options. See TUPLESORT_* definitions in tuplesort.h
*/
-static Tuplesortstate *
+Tuplesortstate *
tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
{
Tuplesortstate *state;
@@ -1083,468 +829,6 @@ tuplesort_begin_batch(Tuplesortstate *state)
MemoryContextSwitchTo(oldcontext);
}
-Tuplesortstate *
-tuplesort_begin_heap(TupleDesc tupDesc,
- int nkeys, AttrNumber *attNums,
- Oid *sortOperators, Oid *sortCollations,
- bool *nullsFirstFlags,
- int workMem, SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
-
- AssertArg(nkeys > 0);
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = nkeys;
-
- TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
- false, /* no unique check */
- nkeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_heap;
- base->comparetup = comparetup_heap;
- base->writetup = writetup_heap;
- base->readtup = readtup_heap;
- base->haveDatum1 = true;
- base->arg = tupDesc; /* assume we need not copy tupDesc */
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
-
- for (i = 0; i < nkeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
-
- AssertArg(attNums[i] != 0);
- AssertArg(sortOperators[i] != 0);
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = sortCollations[i];
- sortKey->ssup_nulls_first = nullsFirstFlags[i];
- sortKey->ssup_attno = attNums[i];
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
- }
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (nkeys == 1 && !base->sortKeys->abbrev_converter)
- base->onlyKey = base->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_cluster(TupleDesc tupDesc,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- MemoryContext oldcontext;
- TuplesortClusterArg *arg;
- int i;
-
- Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- RelationGetNumberOfAttributes(indexRel),
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
- false, /* no unique check */
- base->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_cluster;
- base->comparetup = comparetup_cluster;
- base->writetup = writetup_cluster;
- base->readtup = readtup_cluster;
- base->freestate = freestate_cluster;
- base->arg = arg;
-
- arg->indexInfo = BuildIndexInfo(indexRel);
-
- /*
- * If we don't have a simple leading attribute, we don't currently
- * initialize datum1, so disable optimizations that require it.
- */
- if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
- base->haveDatum1 = false;
- else
- base->haveDatum1 = true;
-
- arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- if (arg->indexInfo->ii_Expressions != NULL)
- {
- TupleTableSlot *slot;
- ExprContext *econtext;
-
- /*
- * We will need to use FormIndexDatum to evaluate the index
- * expressions. To do that, we need an EState, as well as a
- * TupleTableSlot to put the table tuples into. The econtext's
- * scantuple has to point to that slot, too.
- */
- arg->estate = CreateExecutorState();
- slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(arg->estate);
- econtext->ecxt_scantuple = slot;
- }
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_btree(Relation heapRel,
- Relation indexRel,
- bool enforceUnique,
- bool uniqueNullsNotDistinct,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- TuplesortIndexBTreeArg *arg;
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
- enforceUnique ? 't' : 'f',
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
- enforceUnique,
- state->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_btree;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = enforceUnique;
- arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_hash(Relation heapRel,
- Relation indexRel,
- uint32 high_mask,
- uint32 low_mask,
- uint32 max_buckets,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexHashArg *arg;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
- "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
- high_mask,
- low_mask,
- max_buckets,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = 1; /* Only one sort column, the hash code */
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_hash;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- arg->high_mask = high_mask;
- arg->low_mask = low_mask;
- arg->max_buckets = max_buckets;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_gist(Relation heapRel,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexBTreeArg *arg;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_btree;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = false;
- arg->uniqueNullsNotDistinct = false;
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = indexRel->rd_indcollation[i];
- sortKey->ssup_nulls_first = false;
- sortKey->ssup_attno = i + 1;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- /* Look for a sort support function */
- PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
- }
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
- bool nullsFirstFlag, int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg;
- MemoryContext oldcontext;
- int16 typlen;
- bool typbyval;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin datum sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = 1; /* always a one-column sort */
-
- TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
- false, /* no unique check */
- 1,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_datum;
- base->comparetup = comparetup_datum;
- base->writetup = writetup_datum;
- base->readtup = readtup_datum;
- state->abbrevNext = 10;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->datumType = datumType;
-
- /* lookup necessary attributes of the datum type */
- get_typlenbyval(datumType, &typlen, &typbyval);
- arg->datumTypeLen = typlen;
- base->tuples = !typbyval;
-
- /* Prepare SortSupport data */
- base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
-
- base->sortKeys->ssup_cxt = CurrentMemoryContext;
- base->sortKeys->ssup_collation = sortCollation;
- base->sortKeys->ssup_nulls_first = nullsFirstFlag;
-
- /*
- * Abbreviation is possible here only for by-reference types. In theory,
- * a pass-by-value datatype could have an abbreviated form that is cheaper
- * to compare. In a tuple sort, we could support that, because we can
- * always extract the original datum from the tuple as needed. Here, we
- * can't, because a datum sort only stores a single copy of the datum; the
- * "tuple" field of each SortTuple is NULL.
- */
- base->sortKeys->abbreviate = !typbyval;
-
- PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (!base->sortKeys->abbrev_converter)
- base->onlyKey = base->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
/*
* tuplesort_set_bound
*
@@ -1900,154 +1184,11 @@ noalloc:
return false;
}
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TupleDesc tupDesc = (TupleDesc) base->arg;
- SortTuple stup;
- MinimalTuple tuple;
- HeapTupleData htup;
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup.tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup.datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- tupDesc,
- &stup.isnull1);
-
- puttuple_common(state, &stup,
- base->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
-{
- SortTuple stup;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- /* copy the tuple into sort storage */
- tup = heap_copytuple(tup);
- stup.tuple = (void *) tup;
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (base->haveDatum1)
- {
- stup.datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup.isnull1);
- }
-
- puttuple_common(state, &stup,
- base->haveDatum1 && base->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Collect one index tuple while collecting input data for sort, building
- * it from caller-supplied values.
- */
-void
-tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
- ItemPointer self, Datum *values,
- bool *isnull)
-{
- SortTuple stup;
- IndexTuple tuple;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
-
- stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
- isnull, base->tuplecontext);
- tuple = ((IndexTuple) stup.tuple);
- tuple->t_tid = *self;
- /* set up first-column key value */
- stup.datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup.isnull1);
-
- puttuple_common(state, &stup,
- base->sortKeys && base->sortKeys->abbrev_converter && !stup.isnull1);
-}
-
-/*
- * Accept one Datum while collecting input data for sort.
- *
- * If the Datum is pass-by-ref type, the value will be copied.
- */
-void
-tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
-
- /*
- * Pass-by-value types or null values are just stored directly in
- * stup.datum1 (and stup.tuple is not used and set to NULL).
- *
- * Non-null pass-by-reference values need to be copied into memory we
- * control, and possibly abbreviated. The copied value is pointed to by
- * stup.tuple and is treated as the canonical copy (e.g. to return via
- * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
- * abbreviated value if abbreviation is happening, otherwise it's
- * identical to stup.tuple.
- */
-
- if (isNull || !base->tuples)
- {
- /*
- * Set datum1 to zeroed representation for NULLs (to be consistent,
- * and to support cheap inequality tests for NULL abbreviated keys).
- */
- stup.datum1 = !isNull ? val : (Datum) 0;
- stup.isnull1 = isNull;
- stup.tuple = NULL; /* no separate storage */
- }
- else
- {
- stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
- stup.tuple = DatumGetPointer(stup.datum1);
- }
-
- puttuple_common(state, &stup,
- base->tuples && !isNull && base->sortKeys->abbrev_converter);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
/*
* Shared code for tuple and datum cases.
*/
-static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
+void
+tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
@@ -2370,7 +1511,7 @@ tuplesort_performsort(Tuplesortstate *state)
* by caller. Note that fetched tuple is stored in memory that may be
* recycled by any future fetch.
*/
-static bool
+bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
SortTuple *stup)
{
@@ -2594,162 +1735,17 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
}
newtup.srctape = srcTapeIndex;
tuplesort_heap_replace_top(state, &newtup);
- return true;
- }
- return false;
-
- default:
- elog(ERROR, "invalid tuplesort state");
- return false; /* keep compiler quiet */
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * If successful, put tuple in slot and return true; else, clear the slot
- * and return false.
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value in leading attribute will set abbreviated value to zeroed
- * representation, which caller may rely on in abbreviated inequality check.
- *
- * If copy is true, the slot receives a tuple that's been copied into the
- * caller's memory context, so that it will stay valid regardless of future
- * manipulations of the tuplesort's state (up to and including deleting the
- * tuplesort). If copy is false, the slot will just receive a pointer to a
- * tuple held within the tuplesort, which is more efficient, but only safe for
- * callers that are prepared to have any subsequent manipulation of the
- * tuplesort's state invalidate slot contents.
- */
-bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
- TupleTableSlot *slot, Datum *abbrev)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- if (stup.tuple)
- {
- /* Record abbreviated key for caller */
- if (base->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
-
- if (copy)
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
-
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
- return true;
- }
- else
- {
- ExecClearTuple(slot);
- return false;
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return stup.tuple;
-}
-
-/*
- * Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return (IndexTuple) stup.tuple;
-}
-
-/*
- * Fetch the next Datum in either forward or back direction.
- * Returns false if no more datums.
- *
- * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * in caller's context, and is now owned by the caller (this differs from
- * similar routines for other types of tuplesorts).
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value will have a zeroed abbreviated value representation, which caller
- * may rely on in abbreviated inequality check.
- */
-bool
-tuplesort_getdatum(Tuplesortstate *state, bool forward,
- Datum *val, bool *isNull, Datum *abbrev)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- {
- MemoryContextSwitchTo(oldcontext);
- return false;
- }
-
- /* Ensure we copy into caller's memory context */
- MemoryContextSwitchTo(oldcontext);
-
- /* Record abbreviated key for caller */
- if (base->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
+ return true;
+ }
+ return false;
- if (stup.isnull1 || !base->tuples)
- {
- *val = stup.datum1;
- *isNull = stup.isnull1;
- }
- else
- {
- /* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
- *isNull = false;
+ default:
+ elog(ERROR, "invalid tuplesort state");
+ return false; /* keep compiler quiet */
}
-
- return true;
}
+
/*
* Advance over N tuples in either forward or back direction,
* without returning any data. N==0 is a no-op.
@@ -3928,8 +2924,8 @@ markrunend(LogicalTape *tape)
* We use next free slot from the slab allocator, or palloc() if the tuple
* is too large for that.
*/
-static void *
-readtup_alloc(Tuplesortstate *state, Size tuplen)
+void *
+tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen)
{
SlabSlot *buf;
@@ -3952,695 +2948,6 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
}
-/*
- * Routines specialized for HeapTuple (actually MinimalTuple) case
- */
-
-static void
-removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
-
- for (i = 0; i < count; i++)
- {
- HeapTupleData htup;
-
- htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
- MINIMAL_TUPLE_OFFSET);
- stups[i].datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- (TupleDesc) base->arg,
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- SortSupport sortKey = base->sortKeys;
- HeapTupleData ltup;
- HeapTupleData rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- AttrNumber attno;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
- rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = (TupleDesc) base->arg;
-
- if (sortKey->abbrev_converter)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(<up, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- sortKey++;
- for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(<up, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- return 0;
-}
-
-static void
-writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MinimalTuple tuple = (MinimalTuple) stup->tuple;
-
- /* the part of the MinimalTuple we'll write: */
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
-
- /* total on-disk footprint: */
- unsigned int tuplen = tupbodylen + sizeof(int);
-
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- unsigned int tupbodylen = len - sizeof(int);
- unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
- MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTupleData htup;
-
- /* read in the tuple proper */
- tuple->t_len = tuplen;
- LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup->datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- (TupleDesc) base->arg,
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for the CLUSTER case (HeapTuple data, with
- * comparisons per a btree index definition)
- */
-
-static void
-removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- for (i = 0; i < count; i++)
- {
- HeapTuple tup;
-
- tup = (HeapTuple) stups[i].tuple;
- stups[i].datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- HeapTuple ltup;
- HeapTuple rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
- /* Be prepared to compare additional sort keys */
- ltup = (HeapTuple) a->tuple;
- rtup = (HeapTuple) b->tuple;
- tupDesc = arg->tupDesc;
-
- /* Compare the leading sort key, if it's simple */
- if (base->haveDatum1)
- {
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- if (sortKey->abbrev_converter)
- {
- AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
-
- datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- }
- if (compare != 0 || base->nKeys == 1)
- return compare;
- /* Compare additional columns the hard way */
- sortKey++;
- nkey = 1;
- }
- else
- {
- /* Must compare all keys the hard way */
- nkey = 0;
- }
-
- if (arg->indexInfo->ii_Expressions == NULL)
- {
- /* If not expression index, just compare the proper heap attrs */
-
- for (; nkey < base->nKeys; nkey++, sortKey++)
- {
- AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
-
- datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
- else
- {
- /*
- * In the expression index case, compute the whole index tuple and
- * then compare values. It would perhaps be faster to compute only as
- * many columns as we need to compare, but that would require
- * duplicating all the logic in FormIndexDatum.
- */
- Datum l_index_values[INDEX_MAX_KEYS];
- bool l_index_isnull[INDEX_MAX_KEYS];
- Datum r_index_values[INDEX_MAX_KEYS];
- bool r_index_isnull[INDEX_MAX_KEYS];
- TupleTableSlot *ecxt_scantuple;
-
- /* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(arg->estate);
-
- ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
-
- ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- l_index_values, l_index_isnull);
-
- ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- r_index_values, r_index_isnull);
-
- for (; nkey < base->nKeys; nkey++, sortKey++)
- {
- compare = ApplySortComparator(l_index_values[nkey],
- l_index_isnull[nkey],
- r_index_values[nkey],
- r_index_isnull[nkey],
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
-
- return 0;
-}
-
-static void
-writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTuple tuple = (HeapTuple) stup->tuple;
- unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
-
- /* We need to store t_self, but not other fields of HeapTupleData */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
- LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int tuplen)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
- HeapTuple tuple = (HeapTuple) readtup_alloc(state,
- t_len + HEAPTUPLESIZE);
-
- /* Reconstruct the HeapTupleData header */
- tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
- tuple->t_len = t_len;
- LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
- /* We don't currently bother to reconstruct t_tableOid */
- tuple->t_tableOid = InvalidOid;
- /* Read in the tuple body */
- LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value, if it's a simple column */
- if (base->haveDatum1)
- stup->datum1 = heap_getattr(tuple,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static void
-freestate_cluster(Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- /* Free any execution state created for CLUSTER case */
- if (arg->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(arg->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(arg->estate);
- }
-}
-
-/*
- * Routines specialized for IndexTuple case
- *
- * The btree and hash cases require separate comparison functions, but the
- * IndexTuple representation is the same so the copy/write/read support
- * functions can be shared.
- */
-
-static void
-removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- int i;
-
- for (i = 0; i < count; i++)
- {
- IndexTuple tuple;
-
- tuple = stups[i].tuple;
- stups[i].datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- /*
- * This is similar to comparetup_heap(), but expects index tuples. There
- * is also special handling for enforcing uniqueness, and special
- * treatment for equal keys at the end.
- */
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- IndexTuple tuple1;
- IndexTuple tuple2;
- int keysz;
- TupleDesc tupDes;
- bool equal_hasnull = false;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
- keysz = base->nKeys;
- tupDes = RelationGetDescr(arg->index.indexRel);
-
- if (sortKey->abbrev_converter)
- {
- datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- /* they are equal, so we only need to examine one null flag */
- if (a->isnull1)
- equal_hasnull = true;
-
- sortKey++;
- for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
- {
- datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare; /* done when we find unequal attributes */
-
- /* they are equal, so we only need to examine one null flag */
- if (isnull1)
- equal_hasnull = true;
- }
-
- /*
- * If btree has asked us to enforce uniqueness, complain if two equal
- * tuples are detected (unless there was at least one NULL field and NULLS
- * NOT DISTINCT was not set).
- *
- * It is sufficient to make the test here, because if two tuples are equal
- * they *must* get compared at some stage of the sort --- otherwise the
- * sort algorithm wouldn't have checked whether one must appear before the
- * other.
- */
- if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
- {
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
- /*
- * Some rather brain-dead implementations of qsort (such as the one in
- * QNX 4) will sometimes call the comparison routine to compare a
- * value to itself, but we always use our own implementation, which
- * does not.
- */
- Assert(tuple1 != tuple2);
-
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
- }
-
- /*
- * If key values are equal, we sort on ItemPointer. This is required for
- * btree indexes, since heap TID is treated as an implicit last key
- * attribute in order to ensure that all keys in the index are physically
- * unique.
- */
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static int
-comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- Bucket bucket1;
- Bucket bucket2;
- IndexTuple tuple1;
- IndexTuple tuple2;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
-
- /*
- * Fetch hash keys and mask off bits we don't want to sort by. We know
- * that the first column of the index tuple is the hash key.
- */
- Assert(!a->isnull1);
- bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- Assert(!b->isnull1);
- bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- if (bucket1 > bucket2)
- return 1;
- else if (bucket1 < bucket2)
- return -1;
-
- /*
- * If hash values are equal, we sort on ItemPointer. This does not affect
- * validity of the finished index, but it may be useful to have index
- * scans in physical order.
- */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
-
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static void
-writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- IndexTuple tuple = (IndexTuple) stup->tuple;
- unsigned int tuplen;
-
- tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- unsigned int tuplen = len - sizeof(unsigned int);
- IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, tuple, tuplen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for DatumTuple case
- */
-
-static void
-removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
-
- for (i = 0; i < count; i++)
- stups[i].datum1 = PointerGetDatum(stups[i].tuple);
-}
-
-static int
-comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- int compare;
-
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- base->sortKeys);
- if (compare != 0)
- return compare;
-
- /* if we have abbreviations, then "tuple" has the original value */
-
- if (base->sortKeys->abbrev_converter)
- compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
- PointerGetDatum(b->tuple), b->isnull1,
- base->sortKeys);
-
- return compare;
-}
-
-static void
-writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- void *waddr;
- unsigned int tuplen;
- unsigned int writtenlen;
-
- if (stup->isnull1)
- {
- waddr = NULL;
- tuplen = 0;
- }
- else if (!base->tuples)
- {
- waddr = &stup->datum1;
- tuplen = sizeof(Datum);
- }
- else
- {
- waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
- Assert(tuplen != 0);
- }
-
- writtenlen = tuplen + sizeof(unsigned int);
-
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
- LogicalTapeWrite(tape, waddr, tuplen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-}
-
-static void
-readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- unsigned int tuplen = len - sizeof(unsigned int);
-
- if (tuplen == 0)
- {
- /* it's NULL */
- stup->datum1 = (Datum) 0;
- stup->isnull1 = true;
- stup->tuple = NULL;
- }
- else if (!base->tuples)
- {
- Assert(tuplen == sizeof(Datum));
- LogicalTapeReadExact(tape, &stup->datum1, tuplen);
- stup->isnull1 = false;
- stup->tuple = NULL;
- }
- else
- {
- void *raddr = readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, raddr, tuplen);
- stup->datum1 = PointerGetDatum(raddr);
- stup->isnull1 = false;
- stup->tuple = raddr;
- }
-
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
-}
-
/*
* Parallel sort routines
*/
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
new file mode 100644
index 00000000000..ae4df11b5d8
--- /dev/null
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -0,0 +1,1572 @@
+/*-------------------------------------------------------------------------
+ *
+ * tuplesortvariants.c
+ * Implementation of tuple sorting variants.
+ *
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/utils/sort/tuplesortvariants.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "access/htup_details.h"
+#include "access/nbtree.h"
+#include "catalog/index.h"
+#include "executor/executor.h"
+#include "pg_trace.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
+#include "utils/guc.h"
+#include "utils/tuplesort.h"
+
+
+/* sort-type codes for sort__start probes */
+#define HEAP_SORT 0
+#define INDEX_SORT 1
+#define DATUM_SORT 2
+#define CLUSTER_SORT 3
+
+static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static int comparetup_heap(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_datum(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the CLUSTER case.
+ * Set by tuplesort_begin_cluster.
+ */
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TuplesortClusterArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the IndexTuple case.
+ * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+typedef struct
+{
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TuplesortIndexArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the index_btree subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TuplesortIndexBTreeArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the index_hash subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TuplesortIndexHashArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the Datum case.
+ * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+typedef struct
+{
+ /* the datatype oid of Datum's to be sorted */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TuplesortDatumArg;
+
+Tuplesortstate *
+tuplesort_begin_heap(TupleDesc tupDesc,
+ int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags,
+ int workMem, SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+
+ AssertArg(nkeys > 0);
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = nkeys;
+
+ TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
+ false, /* no unique check */
+ nkeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_heap;
+ base->comparetup = comparetup_heap;
+ base->writetup = writetup_heap;
+ base->readtup = readtup_heap;
+ base->haveDatum1 = true;
+ base->arg = tupDesc; /* assume we need not copy tupDesc */
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+
+ for (i = 0; i < nkeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+
+ AssertArg(attNums[i] != 0);
+ AssertArg(sortOperators[i] != 0);
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = sortCollations[i];
+ sortKey->ssup_nulls_first = nullsFirstFlags[i];
+ sortKey->ssup_attno = attNums[i];
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
+ }
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (nkeys == 1 && !base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_cluster(TupleDesc tupDesc,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ BTScanInsert indexScanKey;
+ MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
+ int i;
+
+ Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ RelationGetNumberOfAttributes(indexRel),
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
+ false, /* no unique check */
+ base->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_cluster;
+ base->comparetup = comparetup_cluster;
+ base->writetup = writetup_cluster;
+ base->readtup = readtup_cluster;
+ base->freestate = freestate_cluster;
+ base->arg = arg;
+
+ arg->indexInfo = BuildIndexInfo(indexRel);
+
+ /*
+ * If we don't have a simple leading attribute, we don't currently
+ * initialize datum1, so disable optimizations that require it.
+ */
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ base->haveDatum1 = false;
+ else
+ base->haveDatum1 = true;
+
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ if (arg->indexInfo->ii_Expressions != NULL)
+ {
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * We will need to use FormIndexDatum to evaluate the index
+ * expressions. To do that, we need an EState, as well as a
+ * TupleTableSlot to put the table tuples into. The econtext's
+ * scantuple has to point to that slot, too.
+ */
+ arg->estate = CreateExecutorState();
+ slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
+ econtext = GetPerTupleExprContext(arg->estate);
+ econtext->ecxt_scantuple = slot;
+ }
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_btree(Relation heapRel,
+ Relation indexRel,
+ bool enforceUnique,
+ bool uniqueNullsNotDistinct,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ BTScanInsert indexScanKey;
+ TuplesortIndexBTreeArg *arg;
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
+ enforceUnique ? 't' : 'f',
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
+ enforceUnique,
+ state->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_hash(Relation heapRel,
+ Relation indexRel,
+ uint32 high_mask,
+ uint32 low_mask,
+ uint32 max_buckets,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
+ "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
+ high_mask,
+ low_mask,
+ max_buckets,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = 1; /* Only one sort column, the hash code */
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_hash;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_gist(Relation heapRel,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ TuplesortIndexBTreeArg *arg;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = indexRel->rd_indcollation[i];
+ sortKey->ssup_nulls_first = false;
+ sortKey->ssup_attno = i + 1;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ /* Look for a sort support function */
+ PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
+ }
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
+ bool nullsFirstFlag, int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg;
+ MemoryContext oldcontext;
+ int16 typlen;
+ bool typbyval;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin datum sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = 1; /* always a one-column sort */
+
+ TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
+ false, /* no unique check */
+ 1,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_datum;
+ base->comparetup = comparetup_datum;
+ base->writetup = writetup_datum;
+ base->readtup = readtup_datum;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->datumType = datumType;
+
+ /* lookup necessary attributes of the datum type */
+ get_typlenbyval(datumType, &typlen, &typbyval);
+ arg->datumTypeLen = typlen;
+ base->tuples = !typbyval;
+
+ /* Prepare SortSupport data */
+ base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+ base->sortKeys->ssup_cxt = CurrentMemoryContext;
+ base->sortKeys->ssup_collation = sortCollation;
+ base->sortKeys->ssup_nulls_first = nullsFirstFlag;
+
+ /*
+ * Abbreviation is possible here only for by-reference types. In theory,
+ * a pass-by-value datatype could have an abbreviated form that is cheaper
+ * to compare. In a tuple sort, we could support that, because we can
+ * always extract the original datum from the tuple as needed. Here, we
+ * can't, because a datum sort only stores a single copy of the datum; the
+ * "tuple" field of each SortTuple is NULL.
+ */
+ base->sortKeys->abbreviate = !typbyval;
+
+ PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (!base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) base->arg;
+ SortTuple stup;
+ MinimalTuple tuple;
+ HeapTupleData htup;
+
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup.datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ tupDesc,
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup,
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
+{
+ SortTuple stup;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+
+ /*
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
+ */
+ if (base->haveDatum1)
+ {
+ stup.datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup.isnull1);
+ }
+
+ tuplesort_puttuple_common(state, &stup,
+ base->haveDatum1 &&
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Collect one index tuple while collecting input data for sort, building
+ * it from caller-supplied values.
+ */
+void
+tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
+ ItemPointer self, Datum *values,
+ bool *isnull)
+{
+ SortTuple stup;
+ IndexTuple tuple;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+
+ stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
+ isnull, base->tuplecontext);
+ tuple = ((IndexTuple) stup.tuple);
+ tuple->t_tid = *self;
+ /* set up first-column key value */
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup,
+ base->sortKeys &&
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+}
+
+/*
+ * Accept one Datum while collecting input data for sort.
+ *
+ * If the Datum is pass-by-ref type, the value will be copied.
+ */
+void
+tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ SortTuple stup;
+
+ /*
+ * Pass-by-value types or null values are just stored directly in
+ * stup.datum1 (and stup.tuple is not used and set to NULL).
+ *
+ * Non-null pass-by-reference values need to be copied into memory we
+ * control, and possibly abbreviated. The copied value is pointed to by
+ * stup.tuple and is treated as the canonical copy (e.g. to return via
+ * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
+ * abbreviated value if abbreviation is happening, otherwise it's
+ * identical to stup.tuple.
+ */
+
+ if (isNull || !base->tuples)
+ {
+ /*
+ * Set datum1 to zeroed representation for NULLs (to be consistent,
+ * and to support cheap inequality tests for NULL abbreviated keys).
+ */
+ stup.datum1 = !isNull ? val : (Datum) 0;
+ stup.isnull1 = isNull;
+ stup.tuple = NULL; /* no separate storage */
+ }
+ else
+ {
+ stup.isnull1 = false;
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
+ }
+
+ tuplesort_puttuple_common(state, &stup,
+ base->tuples &&
+ base->sortKeys->abbrev_converter && !isNull);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * If successful, put tuple in slot and return true; else, clear the slot
+ * and return false.
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value in leading attribute will set abbreviated value to zeroed
+ * representation, which caller may rely on in abbreviated inequality check.
+ *
+ * If copy is true, the slot receives a tuple that's been copied into the
+ * caller's memory context, so that it will stay valid regardless of future
+ * manipulations of the tuplesort's state (up to and including deleting the
+ * tuplesort). If copy is false, the slot will just receive a pointer to a
+ * tuple held within the tuplesort, which is more efficient, but only safe for
+ * callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
+ */
+bool
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
+ TupleTableSlot *slot, Datum *abbrev)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (stup.tuple)
+ {
+ /* Record abbreviated key for caller */
+ if (base->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
+ return true;
+ }
+ else
+ {
+ ExecClearTuple(slot);
+ return false;
+ }
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+HeapTuple
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return stup.tuple;
+}
+
+/*
+ * Fetch the next index tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+IndexTuple
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return (IndexTuple) stup.tuple;
+}
+
+/*
+ * Fetch the next Datum in either forward or back direction.
+ * Returns false if no more datums.
+ *
+ * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value will have a zeroed abbreviated value representation, which caller
+ * may rely on in abbreviated inequality check.
+ */
+bool
+tuplesort_getdatum(Tuplesortstate *state, bool forward,
+ Datum *val, bool *isNull, Datum *abbrev)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ {
+ MemoryContextSwitchTo(oldcontext);
+ return false;
+ }
+
+ /* Ensure we copy into caller's memory context */
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Record abbreviated key for caller */
+ if (base->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (stup.isnull1 || !base->tuples)
+ {
+ *val = stup.datum1;
+ *isNull = stup.isnull1;
+ }
+ else
+ {
+ /* use stup.tuple because stup.datum1 may be an abbreviation */
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
+ *isNull = false;
+ }
+
+ return true;
+}
+
+
+/*
+ * Routines specialized for HeapTuple (actually MinimalTuple) case
+ */
+
+static void
+removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
+ MINIMAL_TUPLE_OFFSET);
+ stups[i].datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys;
+ HeapTupleData ltup;
+ HeapTupleData rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ AttrNumber attno;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
+ rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
+ tupDesc = (TupleDesc) base->arg;
+
+ if (sortKey->abbrev_converter)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ sortKey++;
+ for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ return 0;
+}
+
+static void
+writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MinimalTuple tuple = (MinimalTuple) stup->tuple;
+
+ /* the part of the MinimalTuple we'll write: */
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
+
+ /* total on-disk footprint: */
+ unsigned int tuplen = tupbodylen + sizeof(int);
+
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ unsigned int tupbodylen = len - sizeof(int);
+ unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
+ MinimalTuple tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ HeapTupleData htup;
+
+ /* read in the tuple proper */
+ tuple->t_len = tuplen;
+ LogicalTapeReadExact(tape, tupbody, tupbodylen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup->datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for the CLUSTER case (HeapTuple data, with
+ * comparisons per a btree index definition)
+ */
+
+static void
+removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTuple tup;
+
+ tup = (HeapTuple) stups[i].tuple;
+ stups[i].datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
+ HeapTuple ltup;
+ HeapTuple rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ /* Be prepared to compare additional sort keys */
+ ltup = (HeapTuple) a->tuple;
+ rtup = (HeapTuple) b->tuple;
+ tupDesc = arg->tupDesc;
+
+ /* Compare the leading sort key, if it's simple */
+ if (base->haveDatum1)
+ {
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ if (sortKey->abbrev_converter)
+ {
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
+
+ datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ }
+ if (compare != 0 || base->nKeys == 1)
+ return compare;
+ /* Compare additional columns the hard way */
+ sortKey++;
+ nkey = 1;
+ }
+ else
+ {
+ /* Must compare all keys the hard way */
+ nkey = 0;
+ }
+
+ if (arg->indexInfo->ii_Expressions == NULL)
+ {
+ /* If not expression index, just compare the proper heap attrs */
+
+ for (; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
+
+ datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+ else
+ {
+ /*
+ * In the expression index case, compute the whole index tuple and
+ * then compare values. It would perhaps be faster to compute only as
+ * many columns as we need to compare, but that would require
+ * duplicating all the logic in FormIndexDatum.
+ */
+ Datum l_index_values[INDEX_MAX_KEYS];
+ bool l_index_isnull[INDEX_MAX_KEYS];
+ Datum r_index_values[INDEX_MAX_KEYS];
+ bool r_index_isnull[INDEX_MAX_KEYS];
+ TupleTableSlot *ecxt_scantuple;
+
+ /* Reset context each time to prevent memory leakage */
+ ResetPerTupleExprContext(arg->estate);
+
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
+
+ ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ l_index_values, l_index_isnull);
+
+ ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ r_index_values, r_index_isnull);
+
+ for (; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ compare = ApplySortComparator(l_index_values[nkey],
+ l_index_isnull[nkey],
+ r_index_values[nkey],
+ r_index_isnull[nkey],
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+
+ return 0;
+}
+
+static void
+writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ HeapTuple tuple = (HeapTuple) stup->tuple;
+ unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
+
+ /* We need to store t_self, but not other fields of HeapTupleData */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
+ LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int tuplen)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
+ HeapTuple tuple = (HeapTuple) tuplesort_readtup_alloc(state,
+ t_len + HEAPTUPLESIZE);
+
+ /* Reconstruct the HeapTupleData header */
+ tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
+ tuple->t_len = t_len;
+ LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
+ /* We don't currently bother to reconstruct t_tableOid */
+ tuple->t_tableOid = InvalidOid;
+ /* Read in the tuple body */
+ LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value, if it's a simple column */
+ if (base->haveDatum1)
+ stup->datum1 = heap_getattr(tuple,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
+/*
+ * Routines specialized for IndexTuple case
+ *
+ * The btree and hash cases require separate comparison functions, but the
+ * IndexTuple representation is the same so the copy/write/read support
+ * functions can be shared.
+ */
+
+static void
+removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ IndexTuple tuple;
+
+ tuple = stups[i].tuple;
+ stups[i].datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ /*
+ * This is similar to comparetup_heap(), but expects index tuples. There
+ * is also special handling for enforcing uniqueness, and special
+ * treatment for equal keys at the end.
+ */
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ int keysz;
+ TupleDesc tupDes;
+ bool equal_hasnull = false;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+ keysz = base->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
+
+ if (sortKey->abbrev_converter)
+ {
+ datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ /* they are equal, so we only need to examine one null flag */
+ if (a->isnull1)
+ equal_hasnull = true;
+
+ sortKey++;
+ for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
+ {
+ datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare; /* done when we find unequal attributes */
+
+ /* they are equal, so we only need to examine one null flag */
+ if (isnull1)
+ equal_hasnull = true;
+ }
+
+ /*
+ * If btree has asked us to enforce uniqueness, complain if two equal
+ * tuples are detected (unless there was at least one NULL field and NULLS
+ * NOT DISTINCT was not set).
+ *
+ * It is sufficient to make the test here, because if two tuples are equal
+ * they *must* get compared at some stage of the sort --- otherwise the
+ * sort algorithm wouldn't have checked whether one must appear before the
+ * other.
+ */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
+ {
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ char *key_desc;
+
+ /*
+ * Some rather brain-dead implementations of qsort (such as the one in
+ * QNX 4) will sometimes call the comparison routine to compare a
+ * value to itself, but we always use our own implementation, which
+ * does not.
+ */
+ Assert(tuple1 != tuple2);
+
+ index_deform_tuple(tuple1, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static int
+comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ Bucket bucket1;
+ Bucket bucket2;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
+
+ /*
+ * Fetch hash keys and mask off bits we don't want to sort by. We know
+ * that the first column of the index tuple is the hash key.
+ */
+ Assert(!a->isnull1);
+ bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ Assert(!b->isnull1);
+ bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ if (bucket1 > bucket2)
+ return 1;
+ else if (bucket1 < bucket2)
+ return -1;
+
+ /*
+ * If hash values are equal, we sort on ItemPointer. This does not affect
+ * validity of the finished index, but it may be useful to have index
+ * scans in physical order.
+ */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static void
+writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ IndexTuple tuple = (IndexTuple) stup->tuple;
+ unsigned int tuplen;
+
+ tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ unsigned int tuplen = len - sizeof(unsigned int);
+ IndexTuple tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, tuple, tuplen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for DatumTuple case
+ */
+
+static void
+removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ stups[i].datum1 = PointerGetDatum(stups[i].tuple);
+}
+
+static int
+comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ int compare;
+
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ base->sortKeys);
+ if (compare != 0)
+ return compare;
+
+ /* if we have abbreviations, then "tuple" has the original value */
+
+ if (base->sortKeys->abbrev_converter)
+ compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
+ PointerGetDatum(b->tuple), b->isnull1,
+ base->sortKeys);
+
+ return compare;
+}
+
+static void
+writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ void *waddr;
+ unsigned int tuplen;
+ unsigned int writtenlen;
+
+ if (stup->isnull1)
+ {
+ waddr = NULL;
+ tuplen = 0;
+ }
+ else if (!base->tuples)
+ {
+ waddr = &stup->datum1;
+ tuplen = sizeof(Datum);
+ }
+ else
+ {
+ waddr = stup->tuple;
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
+ Assert(tuplen != 0);
+ }
+
+ writtenlen = tuplen + sizeof(unsigned int);
+
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+ LogicalTapeWrite(tape, waddr, tuplen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+}
+
+static void
+readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ unsigned int tuplen = len - sizeof(unsigned int);
+
+ if (tuplen == 0)
+ {
+ /* it's NULL */
+ stup->datum1 = (Datum) 0;
+ stup->isnull1 = true;
+ stup->tuple = NULL;
+ }
+ else if (!base->tuples)
+ {
+ Assert(tuplen == sizeof(Datum));
+ LogicalTapeReadExact(tape, &stup->datum1, tuplen);
+ stup->isnull1 = false;
+ stup->tuple = NULL;
+ }
+ else
+ {
+ void *raddr = tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, raddr, tuplen);
+ stup->datum1 = PointerGetDatum(raddr);
+ stup->isnull1 = false;
+ stup->tuple = raddr;
+ }
+
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+}
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 364cf132fcb..bb44b916380 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -24,7 +24,9 @@
#include "access/itup.h"
#include "executor/tuptable.h"
#include "storage/dsm.h"
+#include "utils/logtape.h"
#include "utils/relcache.h"
+#include "utils/sortsupport.h"
/*
@@ -102,6 +104,147 @@ typedef struct TuplesortInstrumentation
int64 spaceUsed; /* space consumption, in kB */
} TuplesortInstrumentation;
+/*
+ * The objects we actually sort are SortTuple structs. These contain
+ * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
+ * which is a separate palloc chunk --- we assume it is just one chunk and
+ * can be freed by a simple pfree() (except during merge, when we use a
+ * simple slab allocator). SortTuples also contain the tuple's first key
+ * column in Datum/nullflag format, and a source/input tape number that
+ * tracks which tape each heap element/slot belongs to during merging.
+ *
+ * Storing the first key column lets us save heap_getattr or index_getattr
+ * calls during tuple comparisons. We could extract and save all the key
+ * columns not just the first, but this would increase code complexity and
+ * overhead, and wouldn't actually save any comparison cycles in the common
+ * case where the first key determines the comparison result. Note that
+ * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
+ *
+ * There is one special case: when the sort support infrastructure provides an
+ * "abbreviated key" representation, where the key is (typically) a pass by
+ * value proxy for a pass by reference type. In this case, the abbreviated key
+ * is stored in datum1 in place of the actual first key column.
+ *
+ * When sorting single Datums, the data value is represented directly by
+ * datum1/isnull1 for pass by value types (or null values). If the datatype is
+ * pass-by-reference and isnull1 is false, then "tuple" points to a separately
+ * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
+ * either the same pointer as "tuple", or is an abbreviated key value as
+ * described above. Accordingly, "tuple" is always used in preference to
+ * datum1 as the authoritative value for pass-by-reference cases.
+ */
+typedef struct
+{
+ void *tuple; /* the tuple itself */
+ Datum datum1; /* value of first key column */
+ bool isnull1; /* is first key column NULL? */
+ int srctape; /* source tape number */
+} SortTuple;
+
+typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+
+/*
+ * The public part of a Tuple sort operation state. This data structure
+ * contains the definition of sort-variant-specific interface methods and
+ * the part of Tuple sort operation state required by their implementations.
+ */
+typedef struct
+{
+ /*
+ * These function pointers decouple the routines that must know what kind
+ * of tuple we are sorting from the routines that don't need to know it.
+ * They are set up by the tuplesort_begin_xxx routines.
+ *
+ * Function to compare two tuples; result is per qsort() convention, ie:
+ * <0, 0, >0 according as a<b, a=b, a>b. The API must match
+ * qsort_arg_comparator.
+ */
+ SortTupleComparator comparetup;
+
+ /*
+ * Alter datum1 representation in the SortTuple's array back from the
+ * abbreviated key to the first column value.
+ */
+ void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
+ int count);
+
+ /*
+ * Function to write a stored tuple onto tape. The representation of the
+ * tuple on tape need not be the same as it is in memory.
+ */
+ void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+
+ /*
+ * Function to read a stored tuple from tape back into memory. 'len' is
+ * the already-read length of the stored tuple. The tuple is allocated
+ * from the slab memory arena, or is palloc'd, see
+ * tuplesort_readtup_alloc().
+ */
+ void (*readtup) (Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+
+ /*
+ * Function to do some specific release of resources for the sort variant.
+ * In particular, this function should free everything stored in the "arg"
+ * field, which wouldn't be cleared on reset of the Tuple sort memory
+ * contexts. This can be NULL if nothing specific needs to be done.
+ */
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * The subsequent fields are used in the implementations of the functions
+ * above.
+ */
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
+ /*
+ * Whether SortTuple's datum1 and isnull1 members are maintained by the
+ * above routines. If not, some sort specializations are disabled.
+ */
+ bool haveDatum1;
+
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg; /* Specific information for the sort variant */
+} TuplesortPublic;
+
+/* Sort parallel code from state for sort__start probes */
+#define PARALLEL_SORT(coordinate) ((coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker >= 0 ? 1 : 2)
+
+#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
+
+/* When using this macro, beware of double evaluation of len */
+#define LogicalTapeReadExact(tape, ptr, len) \
+ do { \
+ if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
+ elog(ERROR, "unexpected end of data"); \
+ } while(0)
/*
* We provide multiple interfaces to what is essentially the same code,
@@ -205,6 +348,50 @@ typedef struct TuplesortInstrumentation
* generated (typically, caller uses a parallel heap scan).
*/
+
+extern Tuplesortstate *tuplesort_begin_common(int workMem,
+ SortCoordinate coordinate,
+ int sortopt);
+extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
+extern bool tuplesort_used_bound(Tuplesortstate *state);
+extern void tuplesort_puttuple_common(Tuplesortstate *state,
+ SortTuple *tuple, bool useAbbrev);
+extern void tuplesort_performsort(Tuplesortstate *state);
+extern bool tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
+ SortTuple *stup);
+extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
+ bool forward);
+extern void tuplesort_end(Tuplesortstate *state);
+extern void tuplesort_reset(Tuplesortstate *state);
+
+extern void tuplesort_get_stats(Tuplesortstate *state,
+ TuplesortInstrumentation *stats);
+extern const char *tuplesort_method_name(TuplesortMethod m);
+extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
+
+extern int tuplesort_merge_order(int64 allowedMem);
+
+extern Size tuplesort_estimate_shared(int nworkers);
+extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
+ dsm_segment *seg);
+extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
+
+/*
+ * These routines may only be called if randomAccess was specified 'true'.
+ * Likewise, backwards scan in gettuple/getdatum is only allowed if
+ * randomAccess was specified. Note that parallel sorts do not support
+ * randomAccess.
+ */
+
+extern void tuplesort_rescan(Tuplesortstate *state);
+extern void tuplesort_markpos(Tuplesortstate *state);
+extern void tuplesort_restorepos(Tuplesortstate *state);
+
+extern void *tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen);
+
+
+/* tuplesortops.c */
+
extern Tuplesortstate *tuplesort_begin_heap(TupleDesc tupDesc,
int nkeys, AttrNumber *attNums,
Oid *sortOperators, Oid *sortCollations,
@@ -238,9 +425,6 @@ extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
-extern bool tuplesort_used_bound(Tuplesortstate *state);
-
extern void tuplesort_puttupleslot(Tuplesortstate *state,
TupleTableSlot *slot);
extern void tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup);
@@ -250,8 +434,6 @@ extern void tuplesort_putindextuplevalues(Tuplesortstate *state,
extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
bool isNull);
-extern void tuplesort_performsort(Tuplesortstate *state);
-
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
@@ -259,34 +441,5 @@ extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
-extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
- bool forward);
-
-extern void tuplesort_end(Tuplesortstate *state);
-
-extern void tuplesort_reset(Tuplesortstate *state);
-
-extern void tuplesort_get_stats(Tuplesortstate *state,
- TuplesortInstrumentation *stats);
-extern const char *tuplesort_method_name(TuplesortMethod m);
-extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
-
-extern int tuplesort_merge_order(int64 allowedMem);
-
-extern Size tuplesort_estimate_shared(int nworkers);
-extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
- dsm_segment *seg);
-extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
-
-/*
- * These routines may only be called if randomAccess was specified 'true'.
- * Likewise, backwards scan in gettuple/getdatum is only allowed if
- * randomAccess was specified. Note that parallel sorts do not support
- * randomAccess.
- */
-
-extern void tuplesort_rescan(Tuplesortstate *state);
-extern void tuplesort_markpos(Tuplesortstate *state);
-extern void tuplesort_restorepos(Tuplesortstate *state);
#endif /* TUPLESORT_H */
--
2.24.3 (Apple Git-128)
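As a rough illustration of how the pieces exposed above fit together, an
extension sorting fixed-size items of its own might provide tape routines
like the following sketch. MyItem and the myitem function names are invented
here; everything else is taken from the declarations in the tuplesort.h hunk
above.

typedef struct MyItem
{
    int64       key;
    int32       payload;
} MyItem;

/* Write one fixed-size item to tape, mirroring writetup_index() above. */
static void
writetup_myitem(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
    TuplesortPublic *base = TuplesortstateGetPublic(state);
    unsigned int tuplen;

    tuplen = sizeof(MyItem) + sizeof(tuplen);
    LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
    LogicalTapeWrite(tape, stup->tuple, sizeof(MyItem));
    if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
        LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}

/* Read one item back; memory comes from tuplesort_readtup_alloc(). */
static void
readtup_myitem(Tuplesortstate *state, SortTuple *stup,
               LogicalTape *tape, unsigned int len)
{
    TuplesortPublic *base = TuplesortstateGetPublic(state);
    unsigned int tuplen = len - sizeof(unsigned int);
    MyItem     *item = (MyItem *) tuplesort_readtup_alloc(state, tuplen);

    LogicalTapeReadExact(tape, item, tuplen);
    if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
        LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
    stup->tuple = (void *) item;
    /* keep datum1/isnull1 in sync so comparetup can rely on them */
    stup->datum1 = Int64GetDatum(item->key);
    stup->isnull1 = false;
}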
On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov <aekorotkov@gmail.com>
wrote:
There are some places, which potentially could cause a slowdown. I'm
going to make some experiments with that.
I haven't looked at the patches, so I don't know of a specific place to
look for a slowdown, but I thought it might help to perform the same query
tests as my most recent test for evaluating qsort variants (some
description in [1]), and here is the spreadsheet. Overall, the differences
look like noise. A few cases with unabbreviatable text look a bit faster
with the patch. I'm not sure if that's a real difference, but in any case I
don't see a slowdown anywhere.
[1] /messages/by-id/CAFBsxsHeTACMP1JVQ+m35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ@mail.gmail.com
--
John Naylor
EDB: http://www.enterprisedb.com
Attachments:
Hi, John!
On Thu, Jul 21, 2022 at 6:44 AM John Naylor
<john.naylor@enterprisedb.com> wrote:
On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
There are some places, which potentially could cause a slowdown. I'm
going to make some experiments with that.
I haven't looked at the patches, so I don't know of a specific place to look for a slowdown, but I thought it might help to perform the same query tests as my most recent test for evaluating qsort variants (some description in [1]), and here is the spreadsheet. Overall, the differences look like noise. A few cases with unabbreviatable text look a bit faster with the patch. I'm not sure if that's a real difference, but in any case I don't see a slowdown anywhere.
[1] /messages/by-id/CAFBsxsHeTACMP1JVQ+m35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ@mail.gmail.com
Great, thank you very much for the feedback!
------
Regards,
Alexander Korotkov
I've looked through the updated patch. Overall it looks good enough.
Some minor things:
- The PARALLEL_SORT macro is based on the coordinate struct instead of the
state struct. In some calls (e.g. from _bt_spools_heapscan) coordinate can
be NULL, so dereferencing its members inside the macro may segfault (see the
sketch after this list).
- state->worker and coordinate->isWorker differ a little bit in semantics,
i.e.:
                          worker    leader
  state->worker             >=0        -1
  coordinate->isWorker        1         0
- in tuplesort_begin_index_btree I suppose it should be base->nKeys instead
of state->nKeys
- Cfbot reports gcc warnings due to mixed code and declarations. I used this
as an opportunity to beautify the code in tuplesortvariants.c a little.
(This is added as a separate patch 0007.)
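To illustrate the first two points, a NULL check plus testing isWorker
directly could look roughly like this (just a sketch; the exact wording in
v3 may differ):

#define PARALLEL_SORT(coordinate)   ((coordinate) == NULL || \
                                     (coordinate)->sharedsort == NULL ? 0 : \
                                     (coordinate)->isWorker ? 1 : 2)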
All these things are corrected/done in version 3 of the patchset (PFA).
For me, the patchset seems like a long-needed step toward PostgreSQL
extensibility: an extension can now plug its own tuple format into the core
tuplesort, roughly as sketched below. Overall the corrections in v3 are
minor, so I'd like to mark the patch as RfC if there are no objections.
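Here is a minimal sketch of what that could look like for an
extension-defined item type, using only what the patchset exports from
tuplesort.h. Everything named "myitem" is invented for the example, and
removeabbrev_myitem/writetup_myitem/readtup_myitem stand for the extension's
own callbacks matching the function-pointer fields of TuplesortPublic
(assumed to be declared earlier in the same file):

static int
comparetup_myitem(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
    TuplesortPublic *base = TuplesortstateGetPublic(state);

    /* datum1/isnull1 hold the single key column, so compare those directly */
    return ApplySortComparator(a->datum1, a->isnull1,
                               b->datum1, b->isnull1,
                               base->sortKeys);
}

static Tuplesortstate *
tuplesort_begin_myitem(Oid sortOperator, Oid sortCollation, bool nullsFirst,
                       int workMem, SortCoordinate coordinate, int sortopt)
{
    Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate, sortopt);
    TuplesortPublic *base = TuplesortstateGetPublic(state);
    MemoryContext oldcontext = MemoryContextSwitchTo(base->maincontext);

    base->nKeys = 1;            /* sort on a single key column */
    base->removeabbrev = removeabbrev_myitem;
    base->comparetup = comparetup_myitem;
    base->writetup = writetup_myitem;
    base->readtup = readtup_myitem;
    base->haveDatum1 = true;
    base->tuples = true;        /* SortTuple.tuple is used */
    base->arg = NULL;           /* no variant-specific state needed */

    /* Prepare SortSupport data for the single key */
    base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
    base->sortKeys->ssup_cxt = CurrentMemoryContext;
    base->sortKeys->ssup_collation = sortCollation;
    base->sortKeys->ssup_nulls_first = nullsFirst;
    base->sortKeys->ssup_attno = 1;
    PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);

    MemoryContextSwitchTo(oldcontext);
    return state;
}

static void
tuplesort_put_myitem(Tuplesortstate *state, Datum key, void *item, Size itemsz)
{
    TuplesortPublic *base = TuplesortstateGetPublic(state);
    MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
    SortTuple   stup;

    /* copy the item into sort-owned memory, like the in-core put routines */
    stup.tuple = palloc(itemsz);
    memcpy(stup.tuple, item, itemsz);
    stup.datum1 = key;
    stup.isnull1 = false;

    tuplesort_puttuple_common(state, &stup, false); /* no abbreviation here */

    MemoryContextSwitchTo(oldcontext);
}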
--
Best regards,
Pavel Borisov
Supabase.
Attachments:
v3-0001-Remove-Tuplesortstate.copytup-function.patch
From 197fe8c04af7587900b9dcad9704e09f0782e711 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 13:28:27 +0300
Subject: [PATCH v3 1/7] Remove Tuplesortstate.copytup function
It's currently unclear how we split functionality between the
Tuplesortstate.copytup() function and the tuplesort_put*() functions.
For instance, copytup_index() and copytup_datum() raise an error while
tuplesort_putindextuplevalues() and tuplesort_putdatum() do their work.
This commit removes Tuplesortstate.copytup() altogether, putting the
corresponding code into tuplesort_put*().
---
src/backend/utils/sort/tuplesort.c | 330 ++++++++++++-----------------
1 file changed, 132 insertions(+), 198 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 421afcf47d..4812b1d9ae 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,14 +279,6 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
- /*
- * Function to copy a supplied input tuple into palloc'd space and set up
- * its SortTuple representation (ie, set tuple/datum1/isnull1). Also,
- * state->availMem must be decreased by the amount of space used for the
- * tuple copy (note the SortTuple struct itself is not counted).
- */
- void (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup);
-
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -549,7 +541,6 @@ struct Sharedsort
} while(0)
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define COPYTUP(state,stup,tup) ((*(state)->copytup) (state, stup, tup))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
@@ -600,10 +591,7 @@ struct Sharedsort
* a lot better than what we were doing before 7.3. As of 9.6, a
* separate memory context is used for caller passed tuples. Resetting
* it at certain key increments significantly ameliorates fragmentation.
- * Note that this places a responsibility on copytup routines to use the
- * correct memory context for these tuples (and to not use the reset
- * context for anything whose lifetime needs to span multiple external
- * sort runs). readtup routines use the slab allocator (they cannot use
+ * readtup routines use the slab allocator (they cannot use
* the reset context because it gets deleted at the point that merging
* begins).
*/
@@ -643,14 +631,12 @@ static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
@@ -659,14 +645,12 @@ static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_datum(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
@@ -1059,7 +1043,6 @@ tuplesort_begin_heap(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_heap;
- state->copytup = copytup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
state->haveDatum1 = true;
@@ -1135,7 +1118,6 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_cluster;
- state->copytup = copytup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
state->abbrevNext = 10;
@@ -1240,7 +1222,6 @@ tuplesort_begin_index_btree(Relation heapRel,
PARALLEL_SORT(state));
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->abbrevNext = 10;
@@ -1317,7 +1298,6 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
state->comparetup = comparetup_index_hash;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1358,7 +1338,6 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1422,7 +1401,6 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
PARALLEL_SORT(state));
state->comparetup = comparetup_datum;
- state->copytup = copytup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
state->abbrevNext = 10;
@@ -1839,14 +1817,75 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
+ Datum original;
+ MinimalTuple tuple;
+ HeapTupleData htup;
- /*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
- */
- COPYTUP(state, &stup, (void *) slot);
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ USEMEM(state, GetMemoryChunkSpace(tuple));
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ original = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
+
+ MemoryContextSwitchTo(state->sortcontext);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ mtup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
puttuple_common(state, &stup);
@@ -1861,14 +1900,74 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
+ Datum original;
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+ USEMEM(state, GetMemoryChunkSpace(tup));
+
+ MemoryContextSwitchTo(state->sortcontext);
/*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
*/
- COPYTUP(state, &stup, (void *) tup);
+ if (state->haveDatum1)
+ {
+ original = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ tup = (HeapTuple) mtup->tuple;
+ mtup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
+ }
puttuple_common(state, &stup);
@@ -3947,84 +4046,6 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return 0;
}
-static void
-copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /*
- * We expect the passed "tup" to be a TupleTableSlot, and form a
- * MinimalTuple using the exported interface for that.
- */
- TupleTableSlot *slot = (TupleTableSlot *) tup;
- Datum original;
- MinimalTuple tuple;
- HeapTupleData htup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup->isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4193,79 +4214,6 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- HeapTuple tuple = (HeapTuple) tup;
- Datum original;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = heap_copytuple(tuple);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
-
- MemoryContextSwitchTo(oldcontext);
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (!state->haveDatum1)
- return;
-
- original = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup->isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4512,13 +4460,6 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_index() should not be called");
-}
-
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4583,13 +4524,6 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return compare;
}
-static void
-copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_datum() should not be called");
-}
-
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
--
2.24.3 (Apple Git-128)
v3-0007-Tuplesort-code-beautification.patch (application/octet-stream)
From e7ae8c8ebaa63446838816a80d63aa2051dc2c6e Mon Sep 17 00:00:00 2001
From: Pavel Borisov <pashkin.elfe@gmail.com>
Date: Fri, 22 Jul 2022 19:47:50 +0400
Subject: [PATCH v3 7/7] Tuplesort code beautification
---
src/backend/utils/sort/tuplesortvariants.c | 440 ++++++++++++---------
1 file changed, 252 insertions(+), 188 deletions(-)
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index e7bc87c71c..92f3206bf4 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -126,18 +126,18 @@ typedef struct
} TuplesortDatumArg;
Tuplesortstate *
-tuplesort_begin_heap(TupleDesc tupDesc,
- int nkeys, AttrNumber *attNums,
+tuplesort_begin_heap(TupleDesc tupDesc, int nkeys, AttrNumber *attNums,
Oid *sortOperators, Oid *sortCollations,
bool *nullsFirstFlags,
int workMem, SortCoordinate coordinate, int sortopt)
{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- int i;
+ Tuplesortstate *state;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ int i;
+ state = tuplesort_begin_common(workMem, coordinate, sortopt);
+ base = TuplesortstateGetPublic(state);
oldcontext = MemoryContextSwitchTo(base->maincontext);
AssertArg(nkeys > 0);
@@ -200,19 +200,18 @@ tuplesort_begin_heap(TupleDesc tupDesc,
}
Tuplesortstate *
-tuplesort_begin_cluster(TupleDesc tupDesc,
- Relation indexRel,
- int workMem,
+tuplesort_begin_cluster(TupleDesc tupDesc, Relation indexRel, int workMem,
SortCoordinate coordinate, int sortopt)
{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- MemoryContext oldcontext;
+ Tuplesortstate *state;
+ TuplesortPublic *base;
+ BTScanInsert indexScanKey;
+ MemoryContext oldcontext;
TuplesortClusterArg *arg;
- int i;
+ int i;
+ state = tuplesort_begin_common(workMem, coordinate, sortopt);
+ base = TuplesortstateGetPublic(state);
Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
oldcontext = MemoryContextSwitchTo(base->maincontext);
@@ -308,22 +307,20 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
}
Tuplesortstate *
-tuplesort_begin_index_btree(Relation heapRel,
- Relation indexRel,
- bool enforceUnique,
- bool uniqueNullsNotDistinct,
- int workMem,
- SortCoordinate coordinate,
+tuplesort_begin_index_btree(Relation heapRel, Relation indexRel,
+ bool enforceUnique, bool uniqueNullsNotDistinct,
+ int workMem, SortCoordinate coordinate,
int sortopt)
{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
+ Tuplesortstate *state;
+ TuplesortPublic *base;
+ BTScanInsert indexScanKey;
TuplesortIndexBTreeArg *arg;
- MemoryContext oldcontext;
- int i;
+ MemoryContext oldcontext;
+ int i;
+ state = tuplesort_begin_common(workMem, coordinate, sortopt);
+ base = TuplesortstateGetPublic(state);
oldcontext = MemoryContextSwitchTo(base->maincontext);
arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
@@ -392,21 +389,18 @@ tuplesort_begin_index_btree(Relation heapRel,
}
Tuplesortstate *
-tuplesort_begin_index_hash(Relation heapRel,
- Relation indexRel,
- uint32 high_mask,
- uint32 low_mask,
- uint32 max_buckets,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
+tuplesort_begin_index_hash(Relation heapRel, Relation indexRel,
+ uint32 high_mask, uint32 low_mask,
+ uint32 max_buckets, int workMem,
+ SortCoordinate coordinate, int sortopt)
{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexHashArg *arg;
+ Tuplesortstate *state;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
+ state = tuplesort_begin_common(workMem, coordinate, sortopt);
+ base = TuplesortstateGetPublic(state);
oldcontext = MemoryContextSwitchTo(base->maincontext);
arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
@@ -444,19 +438,18 @@ tuplesort_begin_index_hash(Relation heapRel,
}
Tuplesortstate *
-tuplesort_begin_index_gist(Relation heapRel,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate,
+tuplesort_begin_index_gist(Relation heapRel, Relation indexRel,
+ int workMem, SortCoordinate coordinate,
int sortopt)
{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
+ Tuplesortstate *state;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
TuplesortIndexBTreeArg *arg;
- int i;
+ int i;
+ state = tuplesort_begin_common(workMem, coordinate, sortopt);
+ base = TuplesortstateGetPublic(state);
oldcontext = MemoryContextSwitchTo(base->maincontext);
arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
@@ -512,14 +505,15 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
bool nullsFirstFlag, int workMem,
SortCoordinate coordinate, int sortopt)
{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg;
- MemoryContext oldcontext;
- int16 typlen;
- bool typbyval;
-
+ Tuplesortstate *state;
+ TuplesortPublic *base;
+ TuplesortDatumArg *arg;
+ MemoryContext oldcontext;
+ int16 typlen;
+ bool typbyval;
+
+ state = tuplesort_begin_common(workMem, coordinate, sortopt);
+ base = TuplesortstateGetPublic(state);
oldcontext = MemoryContextSwitchTo(base->maincontext);
arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
@@ -594,13 +588,16 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TupleDesc tupDesc = (TupleDesc) base->arg;
- SortTuple stup;
- MinimalTuple tuple;
- HeapTupleData htup;
-
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ TupleDesc tupDesc;
+ SortTuple stup;
+ MinimalTuple tuple;
+ HeapTupleData htup;
+
+ base = TuplesortstateGetPublic(state);
+ tupDesc = (TupleDesc) base->arg;
+ oldcontext = MemoryContextSwitchTo(base->tuplecontext);
/* copy the tuple into sort storage */
tuple = ExecCopySlotMinimalTuple(slot);
stup.tuple = (void *) tuple;
@@ -627,11 +624,14 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
- SortTuple stup;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortTuple stup;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
+ base = TuplesortstateGetPublic(state);
+ oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ arg = (TuplesortClusterArg *) base->arg;
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
stup.tuple = (void *) tup;
@@ -665,11 +665,13 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
ItemPointer self, Datum *values,
bool *isnull)
{
- SortTuple stup;
- IndexTuple tuple;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ SortTuple stup;
+ IndexTuple tuple;
+ TuplesortPublic *base;
+ TuplesortIndexArg *arg;
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortIndexArg *) base->arg;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
isnull, base->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
@@ -694,11 +696,14 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
void
tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ TuplesortDatumArg *arg;
+ SortTuple stup;
+ base = TuplesortstateGetPublic(state);
+ oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ arg = (TuplesortDatumArg *) base->arg;
/*
* Pass-by-value types or null values are just stored directly in
* stup.datum1 (and stup.tuple is not used and set to NULL).
@@ -759,9 +764,12 @@ bool
tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ SortTuple stup;
+
+ base = TuplesortstateGetPublic(state);
+ oldcontext = MemoryContextSwitchTo(base->sortcontext);
if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
@@ -796,9 +804,12 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ SortTuple stup;
+
+ base = TuplesortstateGetPublic(state);
+ oldcontext = MemoryContextSwitchTo(base->sortcontext);
if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
@@ -817,9 +828,12 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ SortTuple stup;
+
+ base = TuplesortstateGetPublic(state);
+ oldcontext = MemoryContextSwitchTo(base->sortcontext);
if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
@@ -848,10 +862,14 @@ bool
tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
+ TuplesortPublic *base;
+ MemoryContext oldcontext;
+ TuplesortDatumArg *arg;
+ SortTuple stup;
+
+ base = TuplesortstateGetPublic(state);
+ oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ arg = (TuplesortDatumArg *) base->arg;
if (!tuplesort_gettuple_common(state, forward, &stup))
{
@@ -910,20 +928,21 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- SortSupport sortKey = base->sortKeys;
- HeapTupleData ltup;
- HeapTupleData rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- AttrNumber attno;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
+ TuplesortPublic *base;
+ SortSupport sortKey;
+ HeapTupleData ltup,
+ rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ AttrNumber attno;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ base = TuplesortstateGetPublic(state);
+ sortKey = base->sortKeys;
/* Compare the leading sort key */
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -973,15 +992,17 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MinimalTuple tuple = (MinimalTuple) stup->tuple;
-
- /* the part of the MinimalTuple we'll write: */
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
-
- /* total on-disk footprint: */
- unsigned int tuplen = tupbodylen + sizeof(int);
+ TuplesortPublic *base;
+ MinimalTuple tuple;
+ char *tupbody; /* the part of the MinimalTuple we'll write: */
+ unsigned int tupbodylen,
+ tuplen; /* total on-disk footprint: */
+
+ base = TuplesortstateGetPublic(state);
+ tuple = (MinimalTuple) stup->tuple;
+ tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
+ tuplen = tupbodylen + sizeof(int);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
@@ -993,13 +1014,18 @@ static void
readtup_heap(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
- unsigned int tupbodylen = len - sizeof(int);
- unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
- MinimalTuple tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTupleData htup;
-
+ unsigned int tupbodylen,
+ tuplen;
+ MinimalTuple tuple;
+ char *tupbody;
+ TuplesortPublic *base;
+ HeapTupleData htup;
+
+ tupbodylen = len - sizeof(int);
+ tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
+ tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
+ tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ base = TuplesortstateGetPublic(state);
/* read in the tuple proper */
tuple->t_len = tuplen;
LogicalTapeReadExact(tape, tupbody, tupbodylen);
@@ -1023,9 +1049,12 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
static void
removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ int i;
+ TuplesortPublic *base;
+ TuplesortClusterArg *arg;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortClusterArg *) base->arg;
for (i = 0; i < count; i++)
{
@@ -1043,19 +1072,22 @@ static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- HeapTuple ltup;
- HeapTuple rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
+ TuplesortPublic *base;
+ TuplesortClusterArg *arg;
+ SortSupport sortKey;
+ HeapTuple ltup,
+ rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortClusterArg *) base->arg;
+ sortKey = base->sortKeys;
/* Be prepared to compare additional sort keys */
ltup = (HeapTuple) a->tuple;
rtup = (HeapTuple) b->tuple;
@@ -1156,9 +1188,13 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTuple tuple = (HeapTuple) stup->tuple;
- unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
+ TuplesortPublic *base;
+ HeapTuple tuple;
+ unsigned int tuplen;
+
+ base = TuplesortstateGetPublic(state);
+ tuple = (HeapTuple) stup->tuple;
+ tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
/* We need to store t_self, but not other fields of HeapTupleData */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
@@ -1172,11 +1208,15 @@ static void
readtup_cluster(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int tuplen)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
- HeapTuple tuple = (HeapTuple) tuplesort_readtup_alloc(state,
- t_len + HEAPTUPLESIZE);
+ TuplesortPublic *base;
+ TuplesortClusterArg *arg;
+ unsigned int t_len;
+ HeapTuple tuple;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortClusterArg *) base->arg;
+ t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
+ tuple = (HeapTuple) tuplesort_readtup_alloc(state, t_len + HEAPTUPLESIZE);
/* Reconstruct the HeapTupleData header */
tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
@@ -1200,8 +1240,11 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
static void
freestate_cluster(Tuplesortstate *state)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ TuplesortPublic *base;
+ TuplesortClusterArg *arg;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortClusterArg *) base->arg;
/* Free any execution state created for CLUSTER case */
if (arg->estate != NULL)
@@ -1224,9 +1267,12 @@ freestate_cluster(Tuplesortstate *state)
static void
removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- int i;
+ TuplesortPublic *base;
+ TuplesortIndexArg *arg;
+ int i;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortIndexArg *) base->arg;
for (i = 0; i < count; i++)
{
@@ -1249,22 +1295,24 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* is also special handling for enforcing uniqueness, and special
* treatment for equal keys at the end.
*/
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- IndexTuple tuple1;
- IndexTuple tuple2;
- int keysz;
- TupleDesc tupDes;
- bool equal_hasnull = false;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
+ TuplesortPublic *base;
+ TuplesortIndexBTreeArg *arg;
+ SortSupport sortKey;
+ IndexTuple tuple1,
+ tuple2;
+ int keysz;
+ TupleDesc tupDes;
+ bool equal_hasnull = false;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortIndexBTreeArg *) base->arg;
+ sortKey = base->sortKeys;
/* Compare the leading sort key */
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -1380,13 +1428,15 @@ static int
comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- Bucket bucket1;
- Bucket bucket2;
- IndexTuple tuple1;
- IndexTuple tuple2;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
-
+ Bucket bucket1,
+ bucket2;
+ IndexTuple tuple1,
+ tuple2;
+ TuplesortPublic *base;
+ TuplesortIndexHashArg *arg;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortIndexHashArg *) base->arg;
/*
* Fetch hash keys and mask off bits we don't want to sort by. We know
* that the first column of the index tuple is the hash key.
@@ -1436,10 +1486,12 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- IndexTuple tuple = (IndexTuple) stup->tuple;
- unsigned int tuplen;
+ TuplesortPublic *base;
+ IndexTuple tuple;
+ unsigned int tuplen;
+ base = TuplesortstateGetPublic(state);
+ tuple = (IndexTuple) stup->tuple;
tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
@@ -1451,10 +1503,15 @@ static void
readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- unsigned int tuplen = len - sizeof(unsigned int);
- IndexTuple tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
+ TuplesortPublic *base;
+ TuplesortIndexArg *arg;
+ unsigned int tuplen;
+ IndexTuple tuple;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortIndexArg *) base->arg;
+ tuplen = len - sizeof(unsigned int);
+ tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, tuple, tuplen);
if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
@@ -1483,9 +1540,10 @@ removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- int compare;
+ TuplesortPublic *base;
+ int compare;
+ base = TuplesortstateGetPublic(state);
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
base->sortKeys);
@@ -1505,11 +1563,14 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- void *waddr;
- unsigned int tuplen;
- unsigned int writtenlen;
+ TuplesortPublic *base;
+ TuplesortDatumArg *arg;
+ void *waddr;
+ unsigned int tuplen,
+ writtenlen;
+
+ base = TuplesortstateGetPublic(state);
+ arg = (TuplesortDatumArg *) base->arg;
if (stup->isnull1)
{
@@ -1540,8 +1601,11 @@ static void
readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- unsigned int tuplen = len - sizeof(unsigned int);
+ TuplesortPublic *base;
+ unsigned int tuplen;
+
+ base = TuplesortstateGetPublic(state);
+ tuplen = len - sizeof(unsigned int);
if (tuplen == 0)
{
--
2.24.3 (Apple Git-128)
v3-0006-Split-tuplesortvariants.c-from-tuplesort.c.patch (application/octet-stream)
From 7370c2ea7bee2f30dfb991661475630be2f59777 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH v3 6/7] Split tuplesortvariants.c from tuplesort.c
This commit moves the implementation of the tuple sort variants into a
separate file, tuplesortvariants.c.  This gives better separation of the
code and demonstrates that a tuple sort variant can be defined outside
of tuplesort.c.
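As a rough illustration of what this enables, a hypothetical extension
could set up its own sort variant along these lines once TuplesortPublic,
TuplesortstateGetPublic() and tuplesort_begin_common() are exported.
The *_mytuple names are made up for the example; they stand for
extension-provided callbacks with the function-pointer signatures shown
in the diff:

/* extension-provided callbacks (hypothetical names) */
static void removeabbrev_mytuple(Tuplesortstate *state, SortTuple *stups,
								 int count);
static int	comparetup_mytuple(const SortTuple *a, const SortTuple *b,
							   Tuplesortstate *state);
static void writetup_mytuple(Tuplesortstate *state, LogicalTape *tape,
							 SortTuple *stup);
static void readtup_mytuple(Tuplesortstate *state, SortTuple *stup,
							LogicalTape *tape, unsigned int len);

Tuplesortstate *
tuplesort_begin_mytuple(int workMem, SortCoordinate coordinate, int sortopt)
{
	Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
												   sortopt);
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	MemoryContext oldcontext = MemoryContextSwitchTo(base->maincontext);

	base->nKeys = 1;
	base->removeabbrev = removeabbrev_mytuple;
	base->comparetup = comparetup_mytuple;
	base->writetup = writetup_mytuple;
	base->readtup = readtup_mytuple;
	base->haveDatum1 = true;
	base->arg = NULL;			/* variant-specific data, if any */

	/* set up base->sortKeys here, as the in-core variants do */

	MemoryContextSwitchTo(oldcontext);
	return state;
}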
---
src/backend/utils/sort/Makefile | 1 +
src/backend/utils/sort/tuplesort.c | 1722 +-------------------
src/backend/utils/sort/tuplesortvariants.c | 1572 ++++++++++++++++++
src/include/utils/tuplesort.h | 222 ++-
4 files changed, 1775 insertions(+), 1742 deletions(-)
create mode 100644 src/backend/utils/sort/tuplesortvariants.c
diff --git a/src/backend/utils/sort/Makefile b/src/backend/utils/sort/Makefile
index 2c31fd453d..5bfca3040a 100644
--- a/src/backend/utils/sort/Makefile
+++ b/src/backend/utils/sort/Makefile
@@ -20,6 +20,7 @@ OBJS = \
sharedtuplestore.o \
sortsupport.o \
tuplesort.o \
+ tuplesortvariants.o \
tuplestore.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index fb711f51f8..00abbe56df 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -100,36 +100,17 @@
#include <limits.h>
-#include "access/hash.h"
-#include "access/htup_details.h"
-#include "access/nbtree.h"
-#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablespace.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "pg_trace.h"
-#include "utils/datum.h"
-#include "utils/logtape.h"
-#include "utils/lsyscache.h"
+#include "storage/shmem.h"
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/rel.h"
-#include "utils/sortsupport.h"
#include "utils/tuplesort.h"
-
-/* sort-type codes for sort__start probes */
-#define HEAP_SORT 0
-#define INDEX_SORT 1
-#define DATUM_SORT 2
-#define CLUSTER_SORT 3
-
-/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(coordinate) (coordinate == NULL || \
- (coordinate)->sharedsort == NULL ? 0 : \
- (coordinate)->isWorker ? 1 : 2)
-
/*
* Initial size of memtuples array. We're trying to select this size so that
* array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
@@ -150,43 +131,6 @@ bool optimize_bounded_sort = true;
#endif
-/*
- * The objects we actually sort are SortTuple structs. These contain
- * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
- * which is a separate palloc chunk --- we assume it is just one chunk and
- * can be freed by a simple pfree() (except during merge, when we use a
- * simple slab allocator). SortTuples also contain the tuple's first key
- * column in Datum/nullflag format, and a source/input tape number that
- * tracks which tape each heap element/slot belongs to during merging.
- *
- * Storing the first key column lets us save heap_getattr or index_getattr
- * calls during tuple comparisons. We could extract and save all the key
- * columns not just the first, but this would increase code complexity and
- * overhead, and wouldn't actually save any comparison cycles in the common
- * case where the first key determines the comparison result. Note that
- * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
- *
- * There is one special case: when the sort support infrastructure provides an
- * "abbreviated key" representation, where the key is (typically) a pass by
- * value proxy for a pass by reference type. In this case, the abbreviated key
- * is stored in datum1 in place of the actual first key column.
- *
- * When sorting single Datums, the data value is represented directly by
- * datum1/isnull1 for pass by value types (or null values). If the datatype is
- * pass-by-reference and isnull1 is false, then "tuple" points to a separately
- * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
- * either the same pointer as "tuple", or is an abbreviated key value as
- * described above. Accordingly, "tuple" is always used in preference to
- * datum1 as the authoritative value for pass-by-reference cases.
- */
-typedef struct
-{
- void *tuple; /* the tuple itself */
- Datum datum1; /* value of first key column */
- bool isnull1; /* is first key column NULL? */
- int srctape; /* source tape number */
-} SortTuple;
-
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
* tuples. To avoid palloc/pfree overhead.
@@ -237,155 +181,6 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
-typedef struct TuplesortPublic TuplesortPublic;
-
-typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-
-/*
- * The public part of a Tuple sort operation state. This data structure
- * containsthe definition of sort-variant-specific interface methods and
- * the part of Tuple sort operation state required by their implementations.
- */
-struct TuplesortPublic
-{
- /*
- * These function pointers decouple the routines that must know what kind
- * of tuple we are sorting from the routines that don't need to know it.
- * They are set up by the tuplesort_begin_xxx routines.
- *
- * Function to compare two tuples; result is per qsort() convention, ie:
- * <0, 0, >0 according as a<b, a=b, a>b. The API must match
- * qsort_arg_comparator.
- */
- SortTupleComparator comparetup;
-
- /*
- * Alter datum1 representation in the SortTuple's array back from the
- * abbreviated key to the first column value.
- */
- void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
- int count);
-
- /*
- * Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory.
- */
- void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-
- /*
- * Function to read a stored tuple from tape back into memory. 'len' is
- * the already-read length of the stored tuple. The tuple is allocated
- * from the slab memory arena, or is palloc'd, see readtup_alloc().
- */
- void (*readtup) (Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-
- /*
- * Function to do some specific release of resources for the sort variant.
- * In particular, this function should free everything stored in the "arg"
- * field, which wouldn't be cleared on reset of the Tuple sort memory
- * contextes. This can be NULL if nothing specific needs to be done.
- */
- void (*freestate) (Tuplesortstate *state);
-
- /*
- * The subsequent fields are used in the implementations of the functions
- * above.
- */
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-
- /*
- * Whether SortTuple's datum1 and isnull1 members are maintained by the
- * above routines. If not, some sort specializations are disabled.
- */
- bool haveDatum1;
-
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- int nKeys; /* number of columns in sort key */
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
- int sortopt; /* Bitmask of flags used to setup sort */
-
- bool tuples; /* Can SortTuple.tuple ever be set? */
-
- void *arg; /* Specific information for the sort variant */
-};
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
- * the tuplesort_begin_cluster.
- */
-typedef struct
-{
- TupleDesc tupDesc;
-
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-} TuplesortClusterArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the IndexTuple case.
- * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
-typedef struct
-{
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-} TuplesortIndexArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_btree subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-} TuplesortIndexBTreeArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_hash subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-} TuplesortIndexHashArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the Datum case.
- * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
- */
-typedef struct
-{
- /* the datatype oid of Datum's to be sorted */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-} TuplesortDatumArg;
/*
* Private state of a Tuplesort operation.
@@ -597,8 +392,6 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state);
-
#define REMOVEABBREV(state,stup,count) ((*(state)->base.removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->base.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
@@ -657,20 +450,8 @@ struct Sharedsort
* begins).
*/
-/* When using this macro, beware of double evaluation of len */
-#define LogicalTapeReadExact(tape, ptr, len) \
- do { \
- if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
- elog(ERROR, "unexpected end of data"); \
- } while(0)
-
-static Tuplesortstate *tuplesort_begin_common(int workMem,
- SortCoordinate coordinate,
- int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
- bool useAbbrev);
static void writetuple(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
@@ -692,42 +473,6 @@ static void tuplesort_heap_delete_top(Tuplesortstate *state);
static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
-static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
-static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
- int count);
-static int comparetup_heap(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_datum(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -898,7 +643,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* sort options. See TUPLESORT_* definitions in tuplesort.h
*/
-static Tuplesortstate *
+Tuplesortstate *
tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
{
Tuplesortstate *state;
@@ -1084,468 +829,6 @@ tuplesort_begin_batch(Tuplesortstate *state)
MemoryContextSwitchTo(oldcontext);
}
-Tuplesortstate *
-tuplesort_begin_heap(TupleDesc tupDesc,
- int nkeys, AttrNumber *attNums,
- Oid *sortOperators, Oid *sortCollations,
- bool *nullsFirstFlags,
- int workMem, SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
-
- AssertArg(nkeys > 0);
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = nkeys;
-
- TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
- false, /* no unique check */
- nkeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_heap;
- base->comparetup = comparetup_heap;
- base->writetup = writetup_heap;
- base->readtup = readtup_heap;
- base->haveDatum1 = true;
- base->arg = tupDesc; /* assume we need not copy tupDesc */
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
-
- for (i = 0; i < nkeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
-
- AssertArg(attNums[i] != 0);
- AssertArg(sortOperators[i] != 0);
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = sortCollations[i];
- sortKey->ssup_nulls_first = nullsFirstFlags[i];
- sortKey->ssup_attno = attNums[i];
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
- }
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (nkeys == 1 && !base->sortKeys->abbrev_converter)
- base->onlyKey = base->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_cluster(TupleDesc tupDesc,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- MemoryContext oldcontext;
- TuplesortClusterArg *arg;
- int i;
-
- Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- RelationGetNumberOfAttributes(indexRel),
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
- false, /* no unique check */
- base->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_cluster;
- base->comparetup = comparetup_cluster;
- base->writetup = writetup_cluster;
- base->readtup = readtup_cluster;
- base->freestate = freestate_cluster;
- base->arg = arg;
-
- arg->indexInfo = BuildIndexInfo(indexRel);
-
- /*
- * If we don't have a simple leading attribute, we don't currently
- * initialize datum1, so disable optimizations that require it.
- */
- if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
- base->haveDatum1 = false;
- else
- base->haveDatum1 = true;
-
- arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- if (arg->indexInfo->ii_Expressions != NULL)
- {
- TupleTableSlot *slot;
- ExprContext *econtext;
-
- /*
- * We will need to use FormIndexDatum to evaluate the index
- * expressions. To do that, we need an EState, as well as a
- * TupleTableSlot to put the table tuples into. The econtext's
- * scantuple has to point to that slot, too.
- */
- arg->estate = CreateExecutorState();
- slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(arg->estate);
- econtext->ecxt_scantuple = slot;
- }
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_btree(Relation heapRel,
- Relation indexRel,
- bool enforceUnique,
- bool uniqueNullsNotDistinct,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- TuplesortIndexBTreeArg *arg;
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
- enforceUnique ? 't' : 'f',
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
- enforceUnique,
- base->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_btree;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = enforceUnique;
- arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_hash(Relation heapRel,
- Relation indexRel,
- uint32 high_mask,
- uint32 low_mask,
- uint32 max_buckets,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexHashArg *arg;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
- "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
- high_mask,
- low_mask,
- max_buckets,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = 1; /* Only one sort column, the hash code */
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_hash;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- arg->high_mask = high_mask;
- arg->low_mask = low_mask;
- arg->max_buckets = max_buckets;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_gist(Relation heapRel,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexBTreeArg *arg;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_btree;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = false;
- arg->uniqueNullsNotDistinct = false;
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = indexRel->rd_indcollation[i];
- sortKey->ssup_nulls_first = false;
- sortKey->ssup_attno = i + 1;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- /* Look for a sort support function */
- PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
- }
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
- bool nullsFirstFlag, int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg;
- MemoryContext oldcontext;
- int16 typlen;
- bool typbyval;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin datum sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = 1; /* always a one-column sort */
-
- TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
- false, /* no unique check */
- 1,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_datum;
- base->comparetup = comparetup_datum;
- base->writetup = writetup_datum;
- base->readtup = readtup_datum;
- state->abbrevNext = 10;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->datumType = datumType;
-
- /* lookup necessary attributes of the datum type */
- get_typlenbyval(datumType, &typlen, &typbyval);
- arg->datumTypeLen = typlen;
- base->tuples = !typbyval;
-
- /* Prepare SortSupport data */
- base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
-
- base->sortKeys->ssup_cxt = CurrentMemoryContext;
- base->sortKeys->ssup_collation = sortCollation;
- base->sortKeys->ssup_nulls_first = nullsFirstFlag;
-
- /*
- * Abbreviation is possible here only for by-reference types. In theory,
- * a pass-by-value datatype could have an abbreviated form that is cheaper
- * to compare. In a tuple sort, we could support that, because we can
- * always extract the original datum from the tuple as needed. Here, we
- * can't, because a datum sort only stores a single copy of the datum; the
- * "tuple" field of each SortTuple is NULL.
- */
- base->sortKeys->abbreviate = !typbyval;
-
- PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (!base->sortKeys->abbrev_converter)
- base->onlyKey = base->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
/*
* tuplesort_set_bound
*
@@ -1901,154 +1184,11 @@ noalloc:
return false;
}
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TupleDesc tupDesc = (TupleDesc) base->arg;
- SortTuple stup;
- MinimalTuple tuple;
- HeapTupleData htup;
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup.tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup.datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- tupDesc,
- &stup.isnull1);
-
- puttuple_common(state, &stup,
- base->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
-{
- SortTuple stup;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- /* copy the tuple into sort storage */
- tup = heap_copytuple(tup);
- stup.tuple = (void *) tup;
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (base->haveDatum1)
- {
- stup.datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup.isnull1);
- }
-
- puttuple_common(state, &stup,
- base->haveDatum1 && base->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Collect one index tuple while collecting input data for sort, building
- * it from caller-supplied values.
- */
-void
-tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
- ItemPointer self, Datum *values,
- bool *isnull)
-{
- SortTuple stup;
- IndexTuple tuple;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
-
- stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
- isnull, base->tuplecontext);
- tuple = ((IndexTuple) stup.tuple);
- tuple->t_tid = *self;
- /* set up first-column key value */
- stup.datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup.isnull1);
-
- puttuple_common(state, &stup,
- base->sortKeys && base->sortKeys->abbrev_converter && !stup.isnull1);
-}
-
-/*
- * Accept one Datum while collecting input data for sort.
- *
- * If the Datum is pass-by-ref type, the value will be copied.
- */
-void
-tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
-
- /*
- * Pass-by-value types or null values are just stored directly in
- * stup.datum1 (and stup.tuple is not used and set to NULL).
- *
- * Non-null pass-by-reference values need to be copied into memory we
- * control, and possibly abbreviated. The copied value is pointed to by
- * stup.tuple and is treated as the canonical copy (e.g. to return via
- * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
- * abbreviated value if abbreviation is happening, otherwise it's
- * identical to stup.tuple.
- */
-
- if (isNull || !base->tuples)
- {
- /*
- * Set datum1 to zeroed representation for NULLs (to be consistent,
- * and to support cheap inequality tests for NULL abbreviated keys).
- */
- stup.datum1 = !isNull ? val : (Datum) 0;
- stup.isnull1 = isNull;
- stup.tuple = NULL; /* no separate storage */
- }
- else
- {
- stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
- stup.tuple = DatumGetPointer(stup.datum1);
- }
-
- puttuple_common(state, &stup,
- base->tuples && !isNull && base->sortKeys->abbrev_converter);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
/*
* Shared code for tuple and datum cases.
*/
-static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
+void
+tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
@@ -2371,7 +1511,7 @@ tuplesort_performsort(Tuplesortstate *state)
* by caller. Note that fetched tuple is stored in memory that may be
* recycled by any future fetch.
*/
-static bool
+bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
SortTuple *stup)
{
@@ -2595,162 +1735,17 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
}
newtup.srctape = srcTapeIndex;
tuplesort_heap_replace_top(state, &newtup);
- return true;
- }
- return false;
-
- default:
- elog(ERROR, "invalid tuplesort state");
- return false; /* keep compiler quiet */
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * If successful, put tuple in slot and return true; else, clear the slot
- * and return false.
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value in leading attribute will set abbreviated value to zeroed
- * representation, which caller may rely on in abbreviated inequality check.
- *
- * If copy is true, the slot receives a tuple that's been copied into the
- * caller's memory context, so that it will stay valid regardless of future
- * manipulations of the tuplesort's state (up to and including deleting the
- * tuplesort). If copy is false, the slot will just receive a pointer to a
- * tuple held within the tuplesort, which is more efficient, but only safe for
- * callers that are prepared to have any subsequent manipulation of the
- * tuplesort's state invalidate slot contents.
- */
-bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
- TupleTableSlot *slot, Datum *abbrev)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- if (stup.tuple)
- {
- /* Record abbreviated key for caller */
- if (base->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
-
- if (copy)
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
-
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
- return true;
- }
- else
- {
- ExecClearTuple(slot);
- return false;
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return stup.tuple;
-}
-
-/*
- * Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return (IndexTuple) stup.tuple;
-}
-
-/*
- * Fetch the next Datum in either forward or back direction.
- * Returns false if no more datums.
- *
- * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * in caller's context, and is now owned by the caller (this differs from
- * similar routines for other types of tuplesorts).
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value will have a zeroed abbreviated value representation, which caller
- * may rely on in abbreviated inequality check.
- */
-bool
-tuplesort_getdatum(Tuplesortstate *state, bool forward,
- Datum *val, bool *isNull, Datum *abbrev)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- {
- MemoryContextSwitchTo(oldcontext);
- return false;
- }
-
- /* Ensure we copy into caller's memory context */
- MemoryContextSwitchTo(oldcontext);
-
- /* Record abbreviated key for caller */
- if (base->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
+ return true;
+ }
+ return false;
- if (stup.isnull1 || !base->tuples)
- {
- *val = stup.datum1;
- *isNull = stup.isnull1;
- }
- else
- {
- /* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
- *isNull = false;
+ default:
+ elog(ERROR, "invalid tuplesort state");
+ return false; /* keep compiler quiet */
}
-
- return true;
}
+
/*
* Advance over N tuples in either forward or back direction,
* without returning any data. N==0 is a no-op.
@@ -3929,8 +2924,8 @@ markrunend(LogicalTape *tape)
* We use next free slot from the slab allocator, or palloc() if the tuple
* is too large for that.
*/
-static void *
-readtup_alloc(Tuplesortstate *state, Size tuplen)
+void *
+tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen)
{
SlabSlot *buf;
@@ -3953,695 +2948,6 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
}
-/*
- * Routines specialized for HeapTuple (actually MinimalTuple) case
- */
-
-static void
-removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
-
- for (i = 0; i < count; i++)
- {
- HeapTupleData htup;
-
- htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
- MINIMAL_TUPLE_OFFSET);
- stups[i].datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- (TupleDesc) base->arg,
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- SortSupport sortKey = base->sortKeys;
- HeapTupleData ltup;
- HeapTupleData rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- AttrNumber attno;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
- rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = (TupleDesc) base->arg;
-
- if (sortKey->abbrev_converter)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- sortKey++;
- for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- return 0;
-}
-
-static void
-writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MinimalTuple tuple = (MinimalTuple) stup->tuple;
-
- /* the part of the MinimalTuple we'll write: */
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
-
- /* total on-disk footprint: */
- unsigned int tuplen = tupbodylen + sizeof(int);
-
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- unsigned int tupbodylen = len - sizeof(int);
- unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
- MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTupleData htup;
-
- /* read in the tuple proper */
- tuple->t_len = tuplen;
- LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup->datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- (TupleDesc) base->arg,
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for the CLUSTER case (HeapTuple data, with
- * comparisons per a btree index definition)
- */
-
-static void
-removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- for (i = 0; i < count; i++)
- {
- HeapTuple tup;
-
- tup = (HeapTuple) stups[i].tuple;
- stups[i].datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- HeapTuple ltup;
- HeapTuple rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
- /* Be prepared to compare additional sort keys */
- ltup = (HeapTuple) a->tuple;
- rtup = (HeapTuple) b->tuple;
- tupDesc = arg->tupDesc;
-
- /* Compare the leading sort key, if it's simple */
- if (base->haveDatum1)
- {
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- if (sortKey->abbrev_converter)
- {
- AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
-
- datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- }
- if (compare != 0 || base->nKeys == 1)
- return compare;
- /* Compare additional columns the hard way */
- sortKey++;
- nkey = 1;
- }
- else
- {
- /* Must compare all keys the hard way */
- nkey = 0;
- }
-
- if (arg->indexInfo->ii_Expressions == NULL)
- {
- /* If not expression index, just compare the proper heap attrs */
-
- for (; nkey < base->nKeys; nkey++, sortKey++)
- {
- AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
-
- datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
- else
- {
- /*
- * In the expression index case, compute the whole index tuple and
- * then compare values. It would perhaps be faster to compute only as
- * many columns as we need to compare, but that would require
- * duplicating all the logic in FormIndexDatum.
- */
- Datum l_index_values[INDEX_MAX_KEYS];
- bool l_index_isnull[INDEX_MAX_KEYS];
- Datum r_index_values[INDEX_MAX_KEYS];
- bool r_index_isnull[INDEX_MAX_KEYS];
- TupleTableSlot *ecxt_scantuple;
-
- /* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(arg->estate);
-
- ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
-
- ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- l_index_values, l_index_isnull);
-
- ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- r_index_values, r_index_isnull);
-
- for (; nkey < base->nKeys; nkey++, sortKey++)
- {
- compare = ApplySortComparator(l_index_values[nkey],
- l_index_isnull[nkey],
- r_index_values[nkey],
- r_index_isnull[nkey],
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
-
- return 0;
-}
-
-static void
-writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTuple tuple = (HeapTuple) stup->tuple;
- unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
-
- /* We need to store t_self, but not other fields of HeapTupleData */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
- LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int tuplen)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
- HeapTuple tuple = (HeapTuple) readtup_alloc(state,
- t_len + HEAPTUPLESIZE);
-
- /* Reconstruct the HeapTupleData header */
- tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
- tuple->t_len = t_len;
- LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
- /* We don't currently bother to reconstruct t_tableOid */
- tuple->t_tableOid = InvalidOid;
- /* Read in the tuple body */
- LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value, if it's a simple column */
- if (base->haveDatum1)
- stup->datum1 = heap_getattr(tuple,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static void
-freestate_cluster(Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- /* Free any execution state created for CLUSTER case */
- if (arg->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(arg->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(arg->estate);
- }
-}
-
-/*
- * Routines specialized for IndexTuple case
- *
- * The btree and hash cases require separate comparison functions, but the
- * IndexTuple representation is the same so the copy/write/read support
- * functions can be shared.
- */
-
-static void
-removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- int i;
-
- for (i = 0; i < count; i++)
- {
- IndexTuple tuple;
-
- tuple = stups[i].tuple;
- stups[i].datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- /*
- * This is similar to comparetup_heap(), but expects index tuples. There
- * is also special handling for enforcing uniqueness, and special
- * treatment for equal keys at the end.
- */
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- IndexTuple tuple1;
- IndexTuple tuple2;
- int keysz;
- TupleDesc tupDes;
- bool equal_hasnull = false;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
- keysz = base->nKeys;
- tupDes = RelationGetDescr(arg->index.indexRel);
-
- if (sortKey->abbrev_converter)
- {
- datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- /* they are equal, so we only need to examine one null flag */
- if (a->isnull1)
- equal_hasnull = true;
-
- sortKey++;
- for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
- {
- datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare; /* done when we find unequal attributes */
-
- /* they are equal, so we only need to examine one null flag */
- if (isnull1)
- equal_hasnull = true;
- }
-
- /*
- * If btree has asked us to enforce uniqueness, complain if two equal
- * tuples are detected (unless there was at least one NULL field and NULLS
- * NOT DISTINCT was not set).
- *
- * It is sufficient to make the test here, because if two tuples are equal
- * they *must* get compared at some stage of the sort --- otherwise the
- * sort algorithm wouldn't have checked whether one must appear before the
- * other.
- */
- if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
- {
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
- /*
- * Some rather brain-dead implementations of qsort (such as the one in
- * QNX 4) will sometimes call the comparison routine to compare a
- * value to itself, but we always use our own implementation, which
- * does not.
- */
- Assert(tuple1 != tuple2);
-
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
- }
-
- /*
- * If key values are equal, we sort on ItemPointer. This is required for
- * btree indexes, since heap TID is treated as an implicit last key
- * attribute in order to ensure that all keys in the index are physically
- * unique.
- */
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static int
-comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- Bucket bucket1;
- Bucket bucket2;
- IndexTuple tuple1;
- IndexTuple tuple2;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
-
- /*
- * Fetch hash keys and mask off bits we don't want to sort by. We know
- * that the first column of the index tuple is the hash key.
- */
- Assert(!a->isnull1);
- bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- Assert(!b->isnull1);
- bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- if (bucket1 > bucket2)
- return 1;
- else if (bucket1 < bucket2)
- return -1;
-
- /*
- * If hash values are equal, we sort on ItemPointer. This does not affect
- * validity of the finished index, but it may be useful to have index
- * scans in physical order.
- */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
-
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static void
-writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- IndexTuple tuple = (IndexTuple) stup->tuple;
- unsigned int tuplen;
-
- tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- unsigned int tuplen = len - sizeof(unsigned int);
- IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, tuple, tuplen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for DatumTuple case
- */
-
-static void
-removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
-
- for (i = 0; i < count; i++)
- stups[i].datum1 = PointerGetDatum(stups[i].tuple);
-}
-
-static int
-comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- int compare;
-
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- base->sortKeys);
- if (compare != 0)
- return compare;
-
- /* if we have abbreviations, then "tuple" has the original value */
-
- if (base->sortKeys->abbrev_converter)
- compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
- PointerGetDatum(b->tuple), b->isnull1,
- base->sortKeys);
-
- return compare;
-}
-
-static void
-writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- void *waddr;
- unsigned int tuplen;
- unsigned int writtenlen;
-
- if (stup->isnull1)
- {
- waddr = NULL;
- tuplen = 0;
- }
- else if (!base->tuples)
- {
- waddr = &stup->datum1;
- tuplen = sizeof(Datum);
- }
- else
- {
- waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
- Assert(tuplen != 0);
- }
-
- writtenlen = tuplen + sizeof(unsigned int);
-
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
- LogicalTapeWrite(tape, waddr, tuplen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-}
-
-static void
-readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- unsigned int tuplen = len - sizeof(unsigned int);
-
- if (tuplen == 0)
- {
- /* it's NULL */
- stup->datum1 = (Datum) 0;
- stup->isnull1 = true;
- stup->tuple = NULL;
- }
- else if (!base->tuples)
- {
- Assert(tuplen == sizeof(Datum));
- LogicalTapeReadExact(tape, &stup->datum1, tuplen);
- stup->isnull1 = false;
- stup->tuple = NULL;
- }
- else
- {
- void *raddr = readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, raddr, tuplen);
- stup->datum1 = PointerGetDatum(raddr);
- stup->isnull1 = false;
- stup->tuple = raddr;
- }
-
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
-}
-
/*
* Parallel sort routines
*/
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
new file mode 100644
index 0000000000..e7bc87c71c
--- /dev/null
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -0,0 +1,1572 @@
+/*-------------------------------------------------------------------------
+ *
+ * tuplesortvariants.c
+ * Implementation of tuple sorting variants.
+ *
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/utils/sort/tuplesortvariants.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "access/htup_details.h"
+#include "access/nbtree.h"
+#include "catalog/index.h"
+#include "executor/executor.h"
+#include "pg_trace.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
+#include "utils/guc.h"
+#include "utils/tuplesort.h"
+
+
+/* sort-type codes for sort__start probes */
+#define HEAP_SORT 0
+#define INDEX_SORT 1
+#define DATUM_SORT 2
+#define CLUSTER_SORT 3
+
+static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static int comparetup_heap(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_datum(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
+ * the tuplesort_begin_cluster.
+ */
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TuplesortClusterArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the IndexTuple case.
+ * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+typedef struct
+{
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TuplesortIndexArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the index_btree subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TuplesortIndexBTreeArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the index_hash subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TuplesortIndexHashArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the Datum case.
+ * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+typedef struct
+{
+ /* the datatype oid of Datum's to be sorted */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TuplesortDatumArg;
+
+Tuplesortstate *
+tuplesort_begin_heap(TupleDesc tupDesc,
+ int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags,
+ int workMem, SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+
+ AssertArg(nkeys > 0);
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = nkeys;
+
+ TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
+ false, /* no unique check */
+ nkeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_heap;
+ base->comparetup = comparetup_heap;
+ base->writetup = writetup_heap;
+ base->readtup = readtup_heap;
+ base->haveDatum1 = true;
+ base->arg = tupDesc; /* assume we need not copy tupDesc */
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+
+ for (i = 0; i < nkeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+
+ AssertArg(attNums[i] != 0);
+ AssertArg(sortOperators[i] != 0);
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = sortCollations[i];
+ sortKey->ssup_nulls_first = nullsFirstFlags[i];
+ sortKey->ssup_attno = attNums[i];
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
+ }
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (nkeys == 1 && !base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_cluster(TupleDesc tupDesc,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ BTScanInsert indexScanKey;
+ MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
+ int i;
+
+ Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ RelationGetNumberOfAttributes(indexRel),
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
+ false, /* no unique check */
+ base->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_cluster;
+ base->comparetup = comparetup_cluster;
+ base->writetup = writetup_cluster;
+ base->readtup = readtup_cluster;
+ base->freestate = freestate_cluster;
+ base->arg = arg;
+
+ arg->indexInfo = BuildIndexInfo(indexRel);
+
+ /*
+ * If we don't have a simple leading attribute, we don't currently
+ * initialize datum1, so disable optimizations that require it.
+ */
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ base->haveDatum1 = false;
+ else
+ base->haveDatum1 = true;
+
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ if (arg->indexInfo->ii_Expressions != NULL)
+ {
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * We will need to use FormIndexDatum to evaluate the index
+ * expressions. To do that, we need an EState, as well as a
+ * TupleTableSlot to put the table tuples into. The econtext's
+ * scantuple has to point to that slot, too.
+ */
+ arg->estate = CreateExecutorState();
+ slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
+ econtext = GetPerTupleExprContext(arg->estate);
+ econtext->ecxt_scantuple = slot;
+ }
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_btree(Relation heapRel,
+ Relation indexRel,
+ bool enforceUnique,
+ bool uniqueNullsNotDistinct,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ BTScanInsert indexScanKey;
+ TuplesortIndexBTreeArg *arg;
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
+ enforceUnique ? 't' : 'f',
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
+ enforceUnique,
+ base->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_hash(Relation heapRel,
+ Relation indexRel,
+ uint32 high_mask,
+ uint32 low_mask,
+ uint32 max_buckets,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
+ "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
+ high_mask,
+ low_mask,
+ max_buckets,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = 1; /* Only one sort column, the hash code */
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_hash;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_gist(Relation heapRel,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ TuplesortIndexBTreeArg *arg;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = indexRel->rd_indcollation[i];
+ sortKey->ssup_nulls_first = false;
+ sortKey->ssup_attno = i + 1;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ /* Look for a sort support function */
+ PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
+ }
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
+ bool nullsFirstFlag, int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg;
+ MemoryContext oldcontext;
+ int16 typlen;
+ bool typbyval;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin datum sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = 1; /* always a one-column sort */
+
+ TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
+ false, /* no unique check */
+ 1,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_datum;
+ base->comparetup = comparetup_datum;
+ base->writetup = writetup_datum;
+ base->readtup = readtup_datum;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->datumType = datumType;
+
+ /* lookup necessary attributes of the datum type */
+ get_typlenbyval(datumType, &typlen, &typbyval);
+ arg->datumTypeLen = typlen;
+ base->tuples = !typbyval;
+
+ /* Prepare SortSupport data */
+ base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+ base->sortKeys->ssup_cxt = CurrentMemoryContext;
+ base->sortKeys->ssup_collation = sortCollation;
+ base->sortKeys->ssup_nulls_first = nullsFirstFlag;
+
+ /*
+ * Abbreviation is possible here only for by-reference types. In theory,
+ * a pass-by-value datatype could have an abbreviated form that is cheaper
+ * to compare. In a tuple sort, we could support that, because we can
+ * always extract the original datum from the tuple as needed. Here, we
+ * can't, because a datum sort only stores a single copy of the datum; the
+ * "tuple" field of each SortTuple is NULL.
+ */
+ base->sortKeys->abbreviate = !typbyval;
+
+ PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (!base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) base->arg;
+ SortTuple stup;
+ MinimalTuple tuple;
+ HeapTupleData htup;
+
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup.datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ tupDesc,
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup,
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
+{
+ SortTuple stup;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+
+ /*
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
+ */
+ if (base->haveDatum1)
+ {
+ stup.datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup.isnull1);
+ }
+
+ tuplesort_puttuple_common(state, &stup,
+ base->haveDatum1 &&
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Collect one index tuple while collecting input data for sort, building
+ * it from caller-supplied values.
+ */
+void
+tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
+ ItemPointer self, Datum *values,
+ bool *isnull)
+{
+ SortTuple stup;
+ IndexTuple tuple;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+
+ stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
+ isnull, base->tuplecontext);
+ tuple = ((IndexTuple) stup.tuple);
+ tuple->t_tid = *self;
+ /* set up first-column key value */
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup,
+ base->sortKeys &&
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+}
+
+/*
+ * Accept one Datum while collecting input data for sort.
+ *
+ * If the Datum is pass-by-ref type, the value will be copied.
+ */
+void
+tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ SortTuple stup;
+
+ /*
+ * Pass-by-value types or null values are just stored directly in
+ * stup.datum1 (and stup.tuple is not used and set to NULL).
+ *
+ * Non-null pass-by-reference values need to be copied into memory we
+ * control, and possibly abbreviated. The copied value is pointed to by
+ * stup.tuple and is treated as the canonical copy (e.g. to return via
+ * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
+ * abbreviated value if abbreviation is happening, otherwise it's
+ * identical to stup.tuple.
+ */
+
+ if (isNull || !base->tuples)
+ {
+ /*
+ * Set datum1 to zeroed representation for NULLs (to be consistent,
+ * and to support cheap inequality tests for NULL abbreviated keys).
+ */
+ stup.datum1 = !isNull ? val : (Datum) 0;
+ stup.isnull1 = isNull;
+ stup.tuple = NULL; /* no separate storage */
+ }
+ else
+ {
+ stup.isnull1 = false;
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
+ }
+
+ tuplesort_puttuple_common(state, &stup,
+ base->tuples &&
+ base->sortKeys->abbrev_converter && !isNull);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * If successful, put tuple in slot and return true; else, clear the slot
+ * and return false.
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value in leading attribute will set abbreviated value to zeroed
+ * representation, which caller may rely on in abbreviated inequality check.
+ *
+ * If copy is true, the slot receives a tuple that's been copied into the
+ * caller's memory context, so that it will stay valid regardless of future
+ * manipulations of the tuplesort's state (up to and including deleting the
+ * tuplesort). If copy is false, the slot will just receive a pointer to a
+ * tuple held within the tuplesort, which is more efficient, but only safe for
+ * callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
+ */
+bool
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
+ TupleTableSlot *slot, Datum *abbrev)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (stup.tuple)
+ {
+ /* Record abbreviated key for caller */
+ if (base->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
+ return true;
+ }
+ else
+ {
+ ExecClearTuple(slot);
+ return false;
+ }
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+HeapTuple
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return stup.tuple;
+}
+
+/*
+ * Fetch the next index tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+IndexTuple
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return (IndexTuple) stup.tuple;
+}
+
+/*
+ * Fetch the next Datum in either forward or back direction.
+ * Returns false if no more datums.
+ *
+ * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value will have a zeroed abbreviated value representation, which caller
+ * may rely on in abbreviated inequality check.
+ */
+bool
+tuplesort_getdatum(Tuplesortstate *state, bool forward,
+ Datum *val, bool *isNull, Datum *abbrev)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ {
+ MemoryContextSwitchTo(oldcontext);
+ return false;
+ }
+
+ /* Ensure we copy into caller's memory context */
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Record abbreviated key for caller */
+ if (base->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (stup.isnull1 || !base->tuples)
+ {
+ *val = stup.datum1;
+ *isNull = stup.isnull1;
+ }
+ else
+ {
+ /* use stup.tuple because stup.datum1 may be an abbreviation */
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
+ *isNull = false;
+ }
+
+ return true;
+}
+
+
+/*
+ * Routines specialized for HeapTuple (actually MinimalTuple) case
+ */
+
+static void
+removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
+ MINIMAL_TUPLE_OFFSET);
+ stups[i].datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys;
+ HeapTupleData ltup;
+ HeapTupleData rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ AttrNumber attno;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
+ rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
+ tupDesc = (TupleDesc) base->arg;
+
+ if (sortKey->abbrev_converter)
+ {
+ attno = sortKey->ssup_attno;
+
+		datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ sortKey++;
+ for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ attno = sortKey->ssup_attno;
+
+		datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ return 0;
+}
+
+static void
+writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MinimalTuple tuple = (MinimalTuple) stup->tuple;
+
+ /* the part of the MinimalTuple we'll write: */
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
+
+ /* total on-disk footprint: */
+ unsigned int tuplen = tupbodylen + sizeof(int);
+
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ unsigned int tupbodylen = len - sizeof(int);
+ unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
+ MinimalTuple tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ HeapTupleData htup;
+
+ /* read in the tuple proper */
+ tuple->t_len = tuplen;
+ LogicalTapeReadExact(tape, tupbody, tupbodylen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup->datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for the CLUSTER case (HeapTuple data, with
+ * comparisons per a btree index definition)
+ */
+
+static void
+removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTuple tup;
+
+ tup = (HeapTuple) stups[i].tuple;
+ stups[i].datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
+ HeapTuple ltup;
+ HeapTuple rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ /* Be prepared to compare additional sort keys */
+ ltup = (HeapTuple) a->tuple;
+ rtup = (HeapTuple) b->tuple;
+ tupDesc = arg->tupDesc;
+
+ /* Compare the leading sort key, if it's simple */
+ if (base->haveDatum1)
+ {
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ if (sortKey->abbrev_converter)
+ {
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
+
+ datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ }
+ if (compare != 0 || base->nKeys == 1)
+ return compare;
+ /* Compare additional columns the hard way */
+ sortKey++;
+ nkey = 1;
+ }
+ else
+ {
+ /* Must compare all keys the hard way */
+ nkey = 0;
+ }
+
+ if (arg->indexInfo->ii_Expressions == NULL)
+ {
+ /* If not expression index, just compare the proper heap attrs */
+
+ for (; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
+
+ datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+ else
+ {
+ /*
+ * In the expression index case, compute the whole index tuple and
+ * then compare values. It would perhaps be faster to compute only as
+ * many columns as we need to compare, but that would require
+ * duplicating all the logic in FormIndexDatum.
+ */
+ Datum l_index_values[INDEX_MAX_KEYS];
+ bool l_index_isnull[INDEX_MAX_KEYS];
+ Datum r_index_values[INDEX_MAX_KEYS];
+ bool r_index_isnull[INDEX_MAX_KEYS];
+ TupleTableSlot *ecxt_scantuple;
+
+ /* Reset context each time to prevent memory leakage */
+ ResetPerTupleExprContext(arg->estate);
+
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
+
+ ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ l_index_values, l_index_isnull);
+
+ ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ r_index_values, r_index_isnull);
+
+ for (; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ compare = ApplySortComparator(l_index_values[nkey],
+ l_index_isnull[nkey],
+ r_index_values[nkey],
+ r_index_isnull[nkey],
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+
+ return 0;
+}
+
+static void
+writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ HeapTuple tuple = (HeapTuple) stup->tuple;
+ unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
+
+ /* We need to store t_self, but not other fields of HeapTupleData */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
+ LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int tuplen)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
+ HeapTuple tuple = (HeapTuple) tuplesort_readtup_alloc(state,
+ t_len + HEAPTUPLESIZE);
+
+ /* Reconstruct the HeapTupleData header */
+ tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
+ tuple->t_len = t_len;
+ LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
+ /* We don't currently bother to reconstruct t_tableOid */
+ tuple->t_tableOid = InvalidOid;
+ /* Read in the tuple body */
+ LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value, if it's a simple column */
+ if (base->haveDatum1)
+ stup->datum1 = heap_getattr(tuple,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
+/*
+ * Routines specialized for IndexTuple case
+ *
+ * The btree and hash cases require separate comparison functions, but the
+ * IndexTuple representation is the same so the copy/write/read support
+ * functions can be shared.
+ */
+
+static void
+removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ IndexTuple tuple;
+
+ tuple = stups[i].tuple;
+ stups[i].datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ /*
+ * This is similar to comparetup_heap(), but expects index tuples. There
+ * is also special handling for enforcing uniqueness, and special
+ * treatment for equal keys at the end.
+ */
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ int keysz;
+ TupleDesc tupDes;
+ bool equal_hasnull = false;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+ keysz = base->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
+
+ if (sortKey->abbrev_converter)
+ {
+ datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ /* they are equal, so we only need to examine one null flag */
+ if (a->isnull1)
+ equal_hasnull = true;
+
+ sortKey++;
+ for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
+ {
+ datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare; /* done when we find unequal attributes */
+
+ /* they are equal, so we only need to examine one null flag */
+ if (isnull1)
+ equal_hasnull = true;
+ }
+
+ /*
+ * If btree has asked us to enforce uniqueness, complain if two equal
+ * tuples are detected (unless there was at least one NULL field and NULLS
+ * NOT DISTINCT was not set).
+ *
+ * It is sufficient to make the test here, because if two tuples are equal
+ * they *must* get compared at some stage of the sort --- otherwise the
+ * sort algorithm wouldn't have checked whether one must appear before the
+ * other.
+ */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
+ {
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ char *key_desc;
+
+ /*
+ * Some rather brain-dead implementations of qsort (such as the one in
+ * QNX 4) will sometimes call the comparison routine to compare a
+ * value to itself, but we always use our own implementation, which
+ * does not.
+ */
+ Assert(tuple1 != tuple2);
+
+ index_deform_tuple(tuple1, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static int
+comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ Bucket bucket1;
+ Bucket bucket2;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
+
+ /*
+ * Fetch hash keys and mask off bits we don't want to sort by. We know
+ * that the first column of the index tuple is the hash key.
+ */
+ Assert(!a->isnull1);
+ bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ Assert(!b->isnull1);
+ bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ if (bucket1 > bucket2)
+ return 1;
+ else if (bucket1 < bucket2)
+ return -1;
+
+ /*
+ * If hash values are equal, we sort on ItemPointer. This does not affect
+ * validity of the finished index, but it may be useful to have index
+ * scans in physical order.
+ */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static void
+writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ IndexTuple tuple = (IndexTuple) stup->tuple;
+ unsigned int tuplen;
+
+ tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ unsigned int tuplen = len - sizeof(unsigned int);
+ IndexTuple tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, tuple, tuplen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for DatumTuple case
+ */
+
+static void
+removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ stups[i].datum1 = PointerGetDatum(stups[i].tuple);
+}
+
+static int
+comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ int compare;
+
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ base->sortKeys);
+ if (compare != 0)
+ return compare;
+
+ /* if we have abbreviations, then "tuple" has the original value */
+
+ if (base->sortKeys->abbrev_converter)
+ compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
+ PointerGetDatum(b->tuple), b->isnull1,
+ base->sortKeys);
+
+ return compare;
+}
+
+static void
+writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ void *waddr;
+ unsigned int tuplen;
+ unsigned int writtenlen;
+
+ if (stup->isnull1)
+ {
+ waddr = NULL;
+ tuplen = 0;
+ }
+ else if (!base->tuples)
+ {
+ waddr = &stup->datum1;
+ tuplen = sizeof(Datum);
+ }
+ else
+ {
+ waddr = stup->tuple;
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
+ Assert(tuplen != 0);
+ }
+
+ writtenlen = tuplen + sizeof(unsigned int);
+
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+ LogicalTapeWrite(tape, waddr, tuplen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+}
+
+static void
+readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ unsigned int tuplen = len - sizeof(unsigned int);
+
+ if (tuplen == 0)
+ {
+ /* it's NULL */
+ stup->datum1 = (Datum) 0;
+ stup->isnull1 = true;
+ stup->tuple = NULL;
+ }
+ else if (!base->tuples)
+ {
+ Assert(tuplen == sizeof(Datum));
+ LogicalTapeReadExact(tape, &stup->datum1, tuplen);
+ stup->isnull1 = false;
+ stup->tuple = NULL;
+ }
+ else
+ {
+ void *raddr = tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, raddr, tuplen);
+ stup->datum1 = PointerGetDatum(raddr);
+ stup->isnull1 = false;
+ stup->tuple = raddr;
+ }
+
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+}
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 364cf132fc..22b7daf1e0 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -24,7 +24,9 @@
#include "access/itup.h"
#include "executor/tuptable.h"
#include "storage/dsm.h"
+#include "utils/logtape.h"
#include "utils/relcache.h"
+#include "utils/sortsupport.h"
/*
@@ -102,6 +104,148 @@ typedef struct TuplesortInstrumentation
int64 spaceUsed; /* space consumption, in kB */
} TuplesortInstrumentation;
+/*
+ * The objects we actually sort are SortTuple structs. These contain
+ * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
+ * which is a separate palloc chunk --- we assume it is just one chunk and
+ * can be freed by a simple pfree() (except during merge, when we use a
+ * simple slab allocator). SortTuples also contain the tuple's first key
+ * column in Datum/nullflag format, and a source/input tape number that
+ * tracks which tape each heap element/slot belongs to during merging.
+ *
+ * Storing the first key column lets us save heap_getattr or index_getattr
+ * calls during tuple comparisons. We could extract and save all the key
+ * columns not just the first, but this would increase code complexity and
+ * overhead, and wouldn't actually save any comparison cycles in the common
+ * case where the first key determines the comparison result. Note that
+ * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
+ *
+ * There is one special case: when the sort support infrastructure provides an
+ * "abbreviated key" representation, where the key is (typically) a pass by
+ * value proxy for a pass by reference type. In this case, the abbreviated key
+ * is stored in datum1 in place of the actual first key column.
+ *
+ * When sorting single Datums, the data value is represented directly by
+ * datum1/isnull1 for pass by value types (or null values). If the datatype is
+ * pass-by-reference and isnull1 is false, then "tuple" points to a separately
+ * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
+ * either the same pointer as "tuple", or is an abbreviated key value as
+ * described above. Accordingly, "tuple" is always used in preference to
+ * datum1 as the authoritative value for pass-by-reference cases.
+ */
+typedef struct
+{
+ void *tuple; /* the tuple itself */
+ Datum datum1; /* value of first key column */
+ bool isnull1; /* is first key column NULL? */
+ int srctape; /* source tape number */
+} SortTuple;
+
+typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+
+/*
+ * The public part of a Tuple sort operation state. This data structure
+ * contains the definition of sort-variant-specific interface methods and
+ * the part of Tuple sort operation state required by their implementations.
+ */
+typedef struct
+{
+ /*
+ * These function pointers decouple the routines that must know what kind
+ * of tuple we are sorting from the routines that don't need to know it.
+ * They are set up by the tuplesort_begin_xxx routines.
+ *
+ * Function to compare two tuples; result is per qsort() convention, ie:
+ * <0, 0, >0 according as a<b, a=b, a>b. The API must match
+ * qsort_arg_comparator.
+ */
+ SortTupleComparator comparetup;
+
+ /*
+ * Alter datum1 representation in the SortTuple's array back from the
+ * abbreviated key to the first column value.
+ */
+ void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
+ int count);
+
+ /*
+ * Function to write a stored tuple onto tape. The representation of the
+ * tuple on tape need not be the same as it is in memory.
+ */
+ void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+
+ /*
+ * Function to read a stored tuple from tape back into memory. 'len' is
+ * the already-read length of the stored tuple. The tuple is allocated
+ * from the slab memory arena, or is palloc'd, see
+ * tuplesort_readtup_alloc().
+ */
+ void (*readtup) (Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+
+ /*
+ * Function to do some specific release of resources for the sort variant.
+ * In particular, this function should free everything stored in the "arg"
+ * field, which wouldn't be cleared on reset of the Tuple sort memory
+ * contextes. This can be NULL if nothing specific needs to be done.
+ */
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * The subsequent fields are used in the implementations of the functions
+ * above.
+ */
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
+ /*
+ * Whether SortTuple's datum1 and isnull1 members are maintained by the
+ * above routines. If not, some sort specializations are disabled.
+ */
+ bool haveDatum1;
+
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg; /* Specific information for the sort variant */
+} TuplesortPublic;
+
+/* Sort parallel code from state for sort__start probes */
+#define PARALLEL_SORT(coordinate) (coordinate == NULL || \
+ (coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker ? 1 : 2)
+
+#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
+
+/* When using this macro, beware of double evaluation of len */
+#define LogicalTapeReadExact(tape, ptr, len) \
+ do { \
+ if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
+ elog(ERROR, "unexpected end of data"); \
+ } while(0)
/*
* We provide multiple interfaces to what is essentially the same code,
@@ -205,6 +349,50 @@ typedef struct TuplesortInstrumentation
* generated (typically, caller uses a parallel heap scan).
*/
+
+extern Tuplesortstate *tuplesort_begin_common(int workMem,
+ SortCoordinate coordinate,
+ int sortopt);
+extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
+extern bool tuplesort_used_bound(Tuplesortstate *state);
+extern void tuplesort_puttuple_common(Tuplesortstate *state,
+ SortTuple *tuple, bool useAbbrev);
+extern void tuplesort_performsort(Tuplesortstate *state);
+extern bool tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
+ SortTuple *stup);
+extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
+ bool forward);
+extern void tuplesort_end(Tuplesortstate *state);
+extern void tuplesort_reset(Tuplesortstate *state);
+
+extern void tuplesort_get_stats(Tuplesortstate *state,
+ TuplesortInstrumentation *stats);
+extern const char *tuplesort_method_name(TuplesortMethod m);
+extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
+
+extern int tuplesort_merge_order(int64 allowedMem);
+
+extern Size tuplesort_estimate_shared(int nworkers);
+extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
+ dsm_segment *seg);
+extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
+
+/*
+ * These routines may only be called if randomAccess was specified 'true'.
+ * Likewise, backwards scan in gettuple/getdatum is only allowed if
+ * randomAccess was specified. Note that parallel sorts do not support
+ * randomAccess.
+ */
+
+extern void tuplesort_rescan(Tuplesortstate *state);
+extern void tuplesort_markpos(Tuplesortstate *state);
+extern void tuplesort_restorepos(Tuplesortstate *state);
+
+extern void *tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen);
+
+
+/* tuplesortops.c */
+
extern Tuplesortstate *tuplesort_begin_heap(TupleDesc tupDesc,
int nkeys, AttrNumber *attNums,
Oid *sortOperators, Oid *sortCollations,
@@ -238,9 +426,6 @@ extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
-extern bool tuplesort_used_bound(Tuplesortstate *state);
-
extern void tuplesort_puttupleslot(Tuplesortstate *state,
TupleTableSlot *slot);
extern void tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup);
@@ -250,8 +435,6 @@ extern void tuplesort_putindextuplevalues(Tuplesortstate *state,
extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
bool isNull);
-extern void tuplesort_performsort(Tuplesortstate *state);
-
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
@@ -259,34 +442,5 @@ extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
-extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
- bool forward);
-
-extern void tuplesort_end(Tuplesortstate *state);
-
-extern void tuplesort_reset(Tuplesortstate *state);
-
-extern void tuplesort_get_stats(Tuplesortstate *state,
- TuplesortInstrumentation *stats);
-extern const char *tuplesort_method_name(TuplesortMethod m);
-extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
-
-extern int tuplesort_merge_order(int64 allowedMem);
-
-extern Size tuplesort_estimate_shared(int nworkers);
-extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
- dsm_segment *seg);
-extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
-
-/*
- * These routines may only be called if randomAccess was specified 'true'.
- * Likewise, backwards scan in gettuple/getdatum is only allowed if
- * randomAccess was specified. Note that parallel sorts do not support
- * randomAccess.
- */
-
-extern void tuplesort_rescan(Tuplesortstate *state);
-extern void tuplesort_markpos(Tuplesortstate *state);
-extern void tuplesort_restorepos(Tuplesortstate *state);
#endif /* TUPLESORT_H */
--
2.24.3 (Apple Git-128)
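
To make the intended usage concrete, here is a rough, untested sketch of what an
out-of-core sort variant could look like against the interface the patchset
exposes in tuplesort.h (TuplesortPublic, tuplesort_begin_common(),
tuplesort_puttuple_common(), and friends). It sorts plain int32 values; every
name containing "myitem" is hypothetical and exists only for illustration.

static int
comparetup_myitem(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);

	/* int32 items are fully represented by datum1, one comparison is enough */
	return ApplySortComparator(a->datum1, a->isnull1,
							   b->datum1, b->isnull1,
							   base->sortKeys);
}

static void
writetup_myitem(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	int32		item = DatumGetInt32(stup->datum1);
	unsigned int writtenlen = sizeof(item) + sizeof(unsigned int);

	/* same on-tape convention as writetup_datum(): leading length word */
	LogicalTapeWrite(tape, &writtenlen, sizeof(writtenlen));
	LogicalTapeWrite(tape, &item, sizeof(item));
	if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
		LogicalTapeWrite(tape, &writtenlen, sizeof(writtenlen));
}

static void
readtup_myitem(Tuplesortstate *state, SortTuple *stup,
			   LogicalTape *tape, unsigned int len)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	int32		item;
	unsigned int tuplen = len - sizeof(unsigned int);

	Assert(tuplen == sizeof(item));
	LogicalTapeReadExact(tape, &item, tuplen);
	if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
		LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
	stup->datum1 = Int32GetDatum(item);
	stup->isnull1 = false;
	stup->tuple = NULL;			/* pass-by-value, no out-of-line data */
}

static void
removeabbrev_myitem(Tuplesortstate *state, SortTuple *stups, int count)
{
	/* abbreviation is never enabled for this variant, nothing to undo */
}

Tuplesortstate *
tuplesort_begin_myitem(Oid sortOperator, Oid collation, bool nullsFirstFlag,
					   int workMem, SortCoordinate coordinate, int sortopt)
{
	Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate, sortopt);
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	MemoryContext oldcontext = MemoryContextSwitchTo(base->maincontext);

	base->nKeys = 1;
	base->removeabbrev = removeabbrev_myitem;
	base->comparetup = comparetup_myitem;
	base->writetup = writetup_myitem;
	base->readtup = readtup_myitem;
	base->freestate = NULL;
	base->haveDatum1 = true;
	base->tuples = false;		/* SortTuple.tuple is never set */
	base->arg = NULL;

	base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
	base->sortKeys->ssup_cxt = CurrentMemoryContext;
	base->sortKeys->ssup_collation = collation;
	base->sortKeys->ssup_nulls_first = nullsFirstFlag;
	base->sortKeys->ssup_attno = 1;
	PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
	base->onlyKey = base->sortKeys;

	MemoryContextSwitchTo(oldcontext);
	return state;
}
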
v3-0004-Move-memory-management-away-from-writetup-and-tup.patchapplication/octet-stream; name=v3-0004-Move-memory-management-away-from-writetup-and-tup.patchDownload
From cf3409444bfbd1060d5bfd6d8a9162a076affcfc Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH v3 4/7] Move memory management away from writetup() and
tuplesort_put*()
This commit moves some generic work out of the sort-variant-specific functions.
In particular, tuplesort_put*() no longer needs to decrease available memory
and switch to the sort context before calling puttuple_common(), and writetup()
no longer needs to free SortTuple.tuple and increase available memory.
---
src/backend/utils/sort/tuplesort.c | 78 +++++++++++++-----------------
1 file changed, 33 insertions(+), 45 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 828efe701e..c8c511fb8c 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -288,11 +288,7 @@ struct Tuplesortstate
/*
* Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory; requirements on
- * the tape representation are given below. Unless the slab allocator is
- * used, after writing the tuple, pfree() the out-of-line data (not the
- * SortTuple struct!), and increase state->availMem by the amount of
- * memory space thereby released.
+ * tuple on tape need not be the same as it is in memory.
*/
void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
@@ -549,7 +545,7 @@ struct Sharedsort
#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
+#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
@@ -618,6 +614,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
static void tuplesort_begin_batch(Tuplesortstate *state);
static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
bool useAbbrev);
+static void writetuple(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1848,7 +1846,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* copy the tuple into sort storage */
tuple = ExecCopySlotMinimalTuple(slot);
stup.tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
@@ -1857,8 +1854,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
state->tupDesc,
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup,
state->sortKeys->abbrev_converter && !stup.isnull1);
@@ -1879,9 +1874,6 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
stup.tuple = (void *) tup;
- USEMEM(state, GetMemoryChunkSpace(tup));
-
- MemoryContextSwitchTo(state->sortcontext);
/*
* set up first-column key value, and potentially abbreviate, if it's a
@@ -1910,7 +1902,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
ItemPointer self, Datum *values,
bool *isnull)
{
- MemoryContext oldcontext;
SortTuple stup;
IndexTuple tuple;
@@ -1918,19 +1909,14 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
isnull, state->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
RelationGetDescr(state->indexRel),
&stup.isnull1);
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup,
state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
}
/*
@@ -1965,15 +1951,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
stup.datum1 = !isNull ? val : (Datum) 0;
stup.isnull1 = isNull;
stup.tuple = NULL; /* no separate storage */
- MemoryContextSwitchTo(state->sortcontext);
}
else
{
stup.isnull1 = false;
stup.datum1 = datumCopy(val, false, state->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
- MemoryContextSwitchTo(state->sortcontext);
}
puttuple_common(state, &stup,
@@ -1988,8 +1971,14 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
Assert(!LEADER(state));
+ /* Count the size of the out-of-line data */
+ if (tuple->tuple != NULL)
+ USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+
if (!useAbbrev)
{
/*
@@ -2062,6 +2051,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
pg_rusage_show(&state->ru_start));
#endif
make_bounded_heap(state);
+ MemoryContextSwitchTo(oldcontext);
return;
}
@@ -2069,7 +2059,10 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
* Done if we still fit in available memory and have array slots.
*/
if (state->memtupcount < state->memtupsize && !LACKMEM(state))
+ {
+ MemoryContextSwitchTo(oldcontext);
return;
+ }
/*
* Nope; time to switch to tape-based operation.
@@ -2123,6 +2116,25 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
elog(ERROR, "invalid tuplesort state");
break;
}
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Write a stored tuple onto tape. Unless the slab allocator is
+ * used, after writing the tuple, pfree() the out-of-line data (not the
+ * SortTuple struct!), and increase state->availMem by the amount of
+ * memory space thereby released.
+ */
+static void
+writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ state->writetup(state, tape, stup);
+
+ if (!state->slabAllocatorUsed && stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }
}
static bool
@@ -3960,12 +3972,6 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_free_minimal_tuple(tuple);
- }
}
static void
@@ -4141,12 +4147,6 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_freetuple(tuple);
- }
}
static void
@@ -4403,12 +4403,6 @@ writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- pfree(tuple);
- }
}
static void
@@ -4495,12 +4489,6 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-
- if (!state->slabAllocatorUsed && stup->tuple)
- {
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
- }
}
static void
--
2.24.3 (Apple Git-128)
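
Continuing the hypothetical "myitem" sketch from above, the put and get wrappers
for such a variant reduce to the exposed *_common() entry points; with this
patch the variant-specific code no longer has to touch the USEMEM/FREEMEM
accounting at all (again an untested sketch, for illustration only):

void
tuplesort_putmyitem(Tuplesortstate *state, int32 val)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
	SortTuple	stup;

	stup.datum1 = Int32GetDatum(val);
	stup.isnull1 = false;
	stup.tuple = NULL;			/* nothing palloc'd, so nothing to account for */

	/* no abbreviation for this variant, hence useAbbrev = false */
	tuplesort_puttuple_common(state, &stup, false);

	MemoryContextSwitchTo(oldcontext);
}

bool
tuplesort_getmyitem(Tuplesortstate *state, bool forward, int32 *val)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
	SortTuple	stup;
	bool		found = tuplesort_gettuple_common(state, forward, &stup);

	MemoryContextSwitchTo(oldcontext);

	if (found)
		*val = DatumGetInt32(stup.datum1);
	return found;
}
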
v3-0005-Split-TuplesortPublic-from-Tuplesortstate.patchapplication/octet-stream; name=v3-0005-Split-TuplesortPublic-from-Tuplesortstate.patchDownload
From 4423f3aea8e6ba265716f764d47779ddf2523a20 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH v3 5/7] Split TuplesortPublic from Tuplesortstate
The new TuplesortPublic data structure contains the definition of
sort-variant-specific interface methods and the part of Tuple sort operation
state required by their implementations. This makes it possible to define
tuplesort variants without knowledge of Tuplesortstate, that is, without
knowledge of the generic sort implementation's guts.
---
src/backend/utils/sort/tuplesort.c | 814 ++++++++++++++++-------------
src/tools/pgindent/typedefs.list | 6 +
2 files changed, 471 insertions(+), 349 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c8c511fb8c..fb711f51f8 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -126,8 +126,9 @@
#define CLUSTER_SORT 3
/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(state) ((state)->shared == NULL ? 0 : \
- (state)->worker >= 0 ? 1 : 2)
+#define PARALLEL_SORT(coordinate) (coordinate == NULL || \
+ (coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker ? 1 : 2)
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -236,38 +237,18 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
+typedef struct TuplesortPublic TuplesortPublic;
+
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
/*
- * Private state of a Tuplesort operation.
+ * The public part of a Tuple sort operation state. This data structure
+ * contains the definition of sort-variant-specific interface methods and
+ * the part of Tuple sort operation state required by their implementations.
*/
-struct Tuplesortstate
+struct TuplesortPublic
{
- TupSortStatus status; /* enumerated value as shown above */
- int nKeys; /* number of columns in sort key */
- int sortopt; /* Bitmask of flags used to setup sort */
- bool bounded; /* did caller specify a maximum number of
- * tuples to return? */
- bool boundUsed; /* true if we made use of a bounded heap */
- int bound; /* if bounded, the maximum number of tuples */
- bool tuples; /* Can SortTuple.tuple ever be set? */
- int64 availMem; /* remaining memory available, in bytes */
- int64 allowedMem; /* total memory allowed, in bytes */
- int maxTapes; /* max number of input tapes to merge in each
- * pass */
- int64 maxSpace; /* maximum amount of space occupied among sort
- * of groups, either in-memory or on-disk */
- bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
- * space, false when it's value for in-memory
- * space */
- TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
- LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
-
/*
* These function pointers decouple the routines that must know what kind
* of tuple we are sorting from the routines that don't need to know it.
@@ -301,12 +282,134 @@ struct Tuplesortstate
void (*readtup) (Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+ /*
+ * Function to do some specific release of resources for the sort variant.
+ * In particular, this function should free everything stored in the "arg"
+ * field, which wouldn't be cleared on reset of the Tuple sort memory
+ * contexts. This can be NULL if nothing specific needs to be done.
+ */
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * The subsequent fields are used in the implementations of the functions
+ * above.
+ */
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
/*
* Whether SortTuple's datum1 and isnull1 members are maintained by the
* above routines. If not, some sort specializations are disabled.
*/
bool haveDatum1;
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg; /* Specific information for the sort variant */
+};
+
+/*
+ * Data structure pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
+ * the tuplesort_begin_cluster.
+ */
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TuplesortClusterArg;
+
+/*
+ * Data structure pointed by "TuplesortPublic.arg" for the IndexTuple case.
+ * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+typedef struct
+{
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TuplesortIndexArg;
+
+/*
+ * Data structure pointed by "TuplesortPublic.arg" for the index_btree subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TuplesortIndexBTreeArg;
+
+/*
+ * Data structure pointed by "TuplesortPublic.arg" for the index_hash subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TuplesortIndexHashArg;
+
+/*
+ * Data structure pointed by "TuplesortPublic.arg" for the Datum case.
+ * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+typedef struct
+{
+ /* the datatype oid of Datum's to be sorted */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TuplesortDatumArg;
+
+/*
+ * Private state of a Tuplesort operation.
+ */
+struct Tuplesortstate
+{
+ TuplesortPublic base;
+ TupSortStatus status; /* enumerated value as shown above */
+ bool bounded; /* did caller specify a maximum number of
+ * tuples to return? */
+ bool boundUsed; /* true if we made use of a bounded heap */
+ int bound; /* if bounded, the maximum number of tuples */
+ int64 availMem; /* remaining memory available, in bytes */
+ int64 allowedMem; /* total memory allowed, in bytes */
+ int maxTapes; /* max number of input tapes to merge in each
+ * pass */
+ int64 maxSpace; /* maximum amount of space occupied among sort
+ * of groups, either in-memory or on-disk */
+ bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
+ * space, false when it's value for in-memory
+ * space */
+ TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
+ LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
+
/*
* This array holds the tuples now in sort memory. If we are in state
* INITIAL, the tuples are in no particular order; if we are in state
@@ -421,24 +524,6 @@ struct Tuplesortstate
Sharedsort *shared;
int nParticipants;
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- TupleDesc tupDesc;
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
/*
* Additional state for managing "abbreviated key" sortsupport routines
* (which currently may be used by all cases except the hash index case).
@@ -448,37 +533,6 @@ struct Tuplesortstate
int64 abbrevNext; /* Tuple # at which to next check
* applicability */
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-
/*
* Resource snapshot for time of sort start.
*/
@@ -543,10 +597,13 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
-#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
+#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
+
+#define REMOVEABBREV(state,stup,count) ((*(state)->base.removeabbrev) (state, stup, count))
+#define COMPARETUP(state,a,b) ((*(state)->base.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
-#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
+#define READTUP(state,stup,tape,len) ((*(state)->base.readtup) (state, stup, tape, len))
+#define FREESTATE(state) ((state)->base.freestate ? (*(state)->base.freestate) (state) : (void) 0)
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
#define FREEMEM(state,amt) ((state)->availMem += (amt))
@@ -670,6 +727,7 @@ static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -700,7 +758,7 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyUnsignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -708,10 +766,10 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
#if SIZEOF_DATUM >= 8
@@ -723,7 +781,7 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplySignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -732,10 +790,10 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
#endif
@@ -747,7 +805,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyInt32SortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -756,10 +814,10 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
/*
@@ -886,8 +944,9 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
pg_rusage_init(&state->ru_start);
#endif
- state->sortopt = sortopt;
- state->tuples = true;
+ state->base.sortopt = sortopt;
+ state->base.tuples = true;
+ state->abbrevNext = 10;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -896,8 +955,8 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
* with very little memory.
*/
state->allowedMem = Max(workMem, 64) * (int64) 1024;
- state->sortcontext = sortcontext;
- state->maincontext = maincontext;
+ state->base.sortcontext = sortcontext;
+ state->base.maincontext = maincontext;
/*
* Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
@@ -956,7 +1015,7 @@ tuplesort_begin_batch(Tuplesortstate *state)
{
MemoryContext oldcontext;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->base.maincontext);
/*
* Caller tuple (e.g. IndexTuple) memory context.
@@ -971,14 +1030,14 @@ tuplesort_begin_batch(Tuplesortstate *state)
* generation.c context as this keeps allocations more compact with less
* wastage. Allocations are also slightly more CPU efficient.
*/
- if (state->sortopt & TUPLESORT_ALLOWBOUNDED)
- state->tuplecontext = AllocSetContextCreate(state->sortcontext,
- "Caller tuples",
- ALLOCSET_DEFAULT_SIZES);
+ if (state->base.sortopt & TUPLESORT_ALLOWBOUNDED)
+ state->base.tuplecontext = AllocSetContextCreate(state->base.sortcontext,
+ "Caller tuples",
+ ALLOCSET_DEFAULT_SIZES);
else
- state->tuplecontext = GenerationContextCreate(state->sortcontext,
- "Caller tuples",
- ALLOCSET_DEFAULT_SIZES);
+ state->base.tuplecontext = GenerationContextCreate(state->base.sortcontext,
+ "Caller tuples",
+ ALLOCSET_DEFAULT_SIZES);
state->status = TSS_INITIAL;
@@ -1034,10 +1093,11 @@ tuplesort_begin_heap(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
AssertArg(nkeys > 0);
@@ -1048,30 +1108,28 @@ tuplesort_begin_heap(TupleDesc tupDesc,
nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = nkeys;
+ base->nKeys = nkeys;
TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
false, /* no unique check */
nkeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_heap;
- state->comparetup = comparetup_heap;
- state->writetup = writetup_heap;
- state->readtup = readtup_heap;
- state->haveDatum1 = true;
-
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
- state->abbrevNext = 10;
+ base->removeabbrev = removeabbrev_heap;
+ base->comparetup = comparetup_heap;
+ base->writetup = writetup_heap;
+ base->readtup = readtup_heap;
+ base->haveDatum1 = true;
+ base->arg = tupDesc; /* assume we need not copy tupDesc */
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
for (i = 0; i < nkeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
AssertArg(attNums[i] != 0);
AssertArg(sortOperators[i] != 0);
@@ -1081,7 +1139,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortKey->ssup_nulls_first = nullsFirstFlags[i];
sortKey->ssup_attno = attNums[i];
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
}
@@ -1092,8 +1150,8 @@ tuplesort_begin_heap(TupleDesc tupDesc,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (nkeys == 1 && !state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (nkeys == 1 && !base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1108,13 +1166,16 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
BTScanInsert indexScanKey;
MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
int i;
Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1124,37 +1185,38 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
false, /* no unique check */
- state->nKeys,
+ base->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_cluster;
- state->comparetup = comparetup_cluster;
- state->writetup = writetup_cluster;
- state->readtup = readtup_cluster;
- state->abbrevNext = 10;
+ base->removeabbrev = removeabbrev_cluster;
+ base->comparetup = comparetup_cluster;
+ base->writetup = writetup_cluster;
+ base->readtup = readtup_cluster;
+ base->freestate = freestate_cluster;
+ base->arg = arg;
- state->indexInfo = BuildIndexInfo(indexRel);
+ arg->indexInfo = BuildIndexInfo(indexRel);
/*
* If we don't have a simple leading attribute, we don't currently
* initialize datum1, so disable optimizations that require it.
*/
- if (state->indexInfo->ii_IndexAttrNumbers[0] == 0)
- state->haveDatum1 = false;
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ base->haveDatum1 = false;
else
- state->haveDatum1 = true;
+ base->haveDatum1 = true;
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
indexScanKey = _bt_mkscankey(indexRel, NULL);
- if (state->indexInfo->ii_Expressions != NULL)
+ if (arg->indexInfo->ii_Expressions != NULL)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -1165,19 +1227,19 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
* TupleTableSlot to put the table tuples into. The econtext's
* scantuple has to point to that slot, too.
*/
- state->estate = CreateExecutorState();
+ arg->estate = CreateExecutorState();
slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(state->estate);
+ econtext = GetPerTupleExprContext(arg->estate);
econtext->ecxt_scantuple = slot;
}
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1187,7 +1249,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1215,11 +1277,14 @@ tuplesort_begin_index_btree(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
BTScanInsert indexScanKey;
+ TuplesortIndexBTreeArg *arg;
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1229,36 +1294,36 @@ tuplesort_begin_index_btree(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
enforceUnique,
- state->nKeys,
+ base->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->abbrevNext = 10;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
- state->enforceUnique = enforceUnique;
- state->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
indexScanKey = _bt_mkscankey(indexRel, NULL);
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1268,7 +1333,7 @@ tuplesort_begin_index_btree(Relation heapRel,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1297,9 +1362,12 @@ tuplesort_begin_index_hash(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1313,20 +1381,21 @@ tuplesort_begin_index_hash(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* Only one sort column, the hash code */
+ base->nKeys = 1; /* Only one sort column, the hash code */
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_hash;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_hash;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
- state->high_mask = high_mask;
- state->low_mask = low_mask;
- state->max_buckets = max_buckets;
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
MemoryContextSwitchTo(oldcontext);
@@ -1342,10 +1411,13 @@ tuplesort_begin_index_gist(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
+ TuplesortIndexBTreeArg *arg;
int i;
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1354,31 +1426,34 @@ tuplesort_begin_index_gist(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
sortKey->ssup_cxt = CurrentMemoryContext;
sortKey->ssup_collation = indexRel->rd_indcollation[i];
sortKey->ssup_nulls_first = false;
sortKey->ssup_attno = i + 1;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1398,11 +1473,14 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg;
MemoryContext oldcontext;
int16 typlen;
bool typbyval;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1411,35 +1489,36 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* always a one-column sort */
+ base->nKeys = 1; /* always a one-column sort */
TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
false, /* no unique check */
1,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_datum;
- state->comparetup = comparetup_datum;
- state->writetup = writetup_datum;
- state->readtup = readtup_datum;
+ base->removeabbrev = removeabbrev_datum;
+ base->comparetup = comparetup_datum;
+ base->writetup = writetup_datum;
+ base->readtup = readtup_datum;
state->abbrevNext = 10;
- state->haveDatum1 = true;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->datumType = datumType;
+ arg->datumType = datumType;
/* lookup necessary attributes of the datum type */
get_typlenbyval(datumType, &typlen, &typbyval);
- state->datumTypeLen = typlen;
- state->tuples = !typbyval;
+ arg->datumTypeLen = typlen;
+ base->tuples = !typbyval;
/* Prepare SortSupport data */
- state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
- state->sortKeys->ssup_cxt = CurrentMemoryContext;
- state->sortKeys->ssup_collation = sortCollation;
- state->sortKeys->ssup_nulls_first = nullsFirstFlag;
+ base->sortKeys->ssup_cxt = CurrentMemoryContext;
+ base->sortKeys->ssup_collation = sortCollation;
+ base->sortKeys->ssup_nulls_first = nullsFirstFlag;
/*
* Abbreviation is possible here only for by-reference types. In theory,
@@ -1449,9 +1528,9 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* can't, because a datum sort only stores a single copy of the datum; the
* "tuple" field of each SortTuple is NULL.
*/
- state->sortKeys->abbreviate = !typbyval;
+ base->sortKeys->abbreviate = !typbyval;
- PrepareSortSupportFromOrderingOp(sortOperator, state->sortKeys);
+ PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
/*
* The "onlyKey" optimization cannot be used with abbreviated keys, since
@@ -1459,8 +1538,8 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (!state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (!base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1485,7 +1564,7 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
/* Assert we're called before loading any tuples */
Assert(state->status == TSS_INITIAL && state->memtupcount == 0);
/* Assert we allow bounded sorts */
- Assert(state->sortopt & TUPLESORT_ALLOWBOUNDED);
+ Assert(state->base.sortopt & TUPLESORT_ALLOWBOUNDED);
/* Can't set the bound twice, either */
Assert(!state->bounded);
/* Also, this shouldn't be called in a parallel worker */
@@ -1513,13 +1592,13 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
* optimization. Disable by setting state to be consistent with no
* abbreviation support.
*/
- state->sortKeys->abbrev_converter = NULL;
- if (state->sortKeys->abbrev_full_comparator)
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->base.sortKeys->abbrev_converter = NULL;
+ if (state->base.sortKeys->abbrev_full_comparator)
+ state->base.sortKeys->comparator = state->base.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->base.sortKeys->abbrev_abort = NULL;
+ state->base.sortKeys->abbrev_full_comparator = NULL;
}
/*
@@ -1542,7 +1621,7 @@ static void
tuplesort_free(Tuplesortstate *state)
{
/* context swap probably not needed, but let's be safe */
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
#ifdef TRACE_SORT
long spaceUsed;
@@ -1589,21 +1668,13 @@ tuplesort_free(Tuplesortstate *state)
TRACE_POSTGRESQL_SORT_DONE(state->tapeset != NULL, 0L);
#endif
- /* Free any execution state created for CLUSTER case */
- if (state->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(state->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(state->estate);
- }
-
+ FREESTATE(state);
MemoryContextSwitchTo(oldcontext);
/*
* Free the per-sort memory context, thereby releasing all working memory.
*/
- MemoryContextReset(state->sortcontext);
+ MemoryContextReset(state->base.sortcontext);
}
/*
@@ -1624,7 +1695,7 @@ tuplesort_end(Tuplesortstate *state)
* Free the main memory context, including the Tuplesortstate struct
* itself.
*/
- MemoryContextDelete(state->maincontext);
+ MemoryContextDelete(state->base.maincontext);
}
/*
@@ -1838,7 +1909,9 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) base->arg;
SortTuple stup;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1850,12 +1923,12 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup.datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ tupDesc,
&stup.isnull1);
puttuple_common(state, &stup,
- state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1869,7 +1942,9 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
@@ -1879,16 +1954,16 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* set up first-column key value, and potentially abbreviate, if it's a
* simple column
*/
- if (state->haveDatum1)
+ if (base->haveDatum1)
{
stup.datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup.isnull1);
}
puttuple_common(state, &stup,
- state->haveDatum1 && state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->haveDatum1 && base->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1904,19 +1979,21 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
SortTuple stup;
IndexTuple tuple;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
- isnull, state->tuplecontext);
+ isnull, base->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup.isnull1);
puttuple_common(state, &stup,
- state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->sortKeys && base->sortKeys->abbrev_converter && !stup.isnull1);
}
/*
@@ -1927,7 +2004,9 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
void
tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
SortTuple stup;
/*
@@ -1942,7 +2021,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* identical to stup.tuple.
*/
- if (isNull || !state->tuples)
+ if (isNull || !base->tuples)
{
/*
* Set datum1 to zeroed representation for NULLs (to be consistent,
@@ -1955,12 +2034,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
else
{
stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
}
puttuple_common(state, &stup,
- state->tuples && !isNull && state->sortKeys->abbrev_converter);
+ base->tuples && !isNull && base->sortKeys->abbrev_converter);
MemoryContextSwitchTo(oldcontext);
}
@@ -1971,7 +2050,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
Assert(!LEADER(state));
@@ -1993,8 +2072,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
else if (!consider_abort_common(state))
{
/* Store abbreviated key representation */
- tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
- state->sortKeys);
+ tuple->datum1 = state->base.sortKeys->abbrev_converter(tuple->datum1,
+ state->base.sortKeys);
}
else
{
@@ -2128,7 +2207,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
static void
writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- state->writetup(state, tape, stup);
+ state->base.writetup(state, tape, stup);
if (!state->slabAllocatorUsed && stup->tuple)
{
@@ -2140,9 +2219,9 @@ writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
static bool
consider_abort_common(Tuplesortstate *state)
{
- Assert(state->sortKeys[0].abbrev_converter != NULL);
- Assert(state->sortKeys[0].abbrev_abort != NULL);
- Assert(state->sortKeys[0].abbrev_full_comparator != NULL);
+ Assert(state->base.sortKeys[0].abbrev_converter != NULL);
+ Assert(state->base.sortKeys[0].abbrev_abort != NULL);
+ Assert(state->base.sortKeys[0].abbrev_full_comparator != NULL);
/*
* Check effectiveness of abbreviation optimization. Consider aborting
@@ -2157,19 +2236,19 @@ consider_abort_common(Tuplesortstate *state)
* Check opclass-supplied abbreviation abort routine. It may indicate
* that abbreviation should not proceed.
*/
- if (!state->sortKeys->abbrev_abort(state->memtupcount,
- state->sortKeys))
+ if (!state->base.sortKeys->abbrev_abort(state->memtupcount,
+ state->base.sortKeys))
return false;
/*
* Finally, restore authoritative comparator, and indicate that
* abbreviation is not in play by setting abbrev_converter to NULL
*/
- state->sortKeys[0].comparator = state->sortKeys[0].abbrev_full_comparator;
- state->sortKeys[0].abbrev_converter = NULL;
+ state->base.sortKeys[0].comparator = state->base.sortKeys[0].abbrev_full_comparator;
+ state->base.sortKeys[0].abbrev_converter = NULL;
/* Not strictly necessary, but be tidy */
- state->sortKeys[0].abbrev_abort = NULL;
- state->sortKeys[0].abbrev_full_comparator = NULL;
+ state->base.sortKeys[0].abbrev_abort = NULL;
+ state->base.sortKeys[0].abbrev_full_comparator = NULL;
/* Give up - expect original pass-by-value representation */
return true;
@@ -2184,7 +2263,7 @@ consider_abort_common(Tuplesortstate *state)
void
tuplesort_performsort(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
#ifdef TRACE_SORT
if (trace_sort)
@@ -2304,7 +2383,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
switch (state->status)
{
case TSS_SORTEDINMEM:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->base.sortopt & TUPLESORT_RANDOMACCESS);
Assert(!state->slabAllocatorUsed);
if (forward)
{
@@ -2348,7 +2427,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
break;
case TSS_SORTEDONTAPE:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->base.sortopt & TUPLESORT_RANDOMACCESS);
Assert(state->slabAllocatorUsed);
/*
@@ -2550,7 +2629,8 @@ bool
tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2561,7 +2641,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
if (stup.tuple)
{
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (base->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
if (copy)
@@ -2586,7 +2666,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2606,7 +2687,8 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2636,7 +2718,9 @@ bool
tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2649,10 +2733,10 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
MemoryContextSwitchTo(oldcontext);
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (base->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (stup.isnull1 || !state->tuples)
+ if (stup.isnull1 || !base->tuples)
{
*val = stup.datum1;
*isNull = stup.isnull1;
@@ -2660,7 +2744,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
*isNull = false;
}
@@ -2713,7 +2797,7 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
* We could probably optimize these cases better, but for now it's
* not worth the trouble.
*/
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
while (ntuples-- > 0)
{
SortTuple stup;
@@ -2989,7 +3073,7 @@ mergeruns(Tuplesortstate *state)
Assert(state->status == TSS_BUILDRUNS);
Assert(state->memtupcount == 0);
- if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
+ if (state->base.sortKeys != NULL && state->base.sortKeys->abbrev_converter != NULL)
{
/*
* If there are multiple runs to be merged, when we go to read back
@@ -2997,19 +3081,19 @@ mergeruns(Tuplesortstate *state)
* we don't care to regenerate them. Disable abbreviation from this
* point on.
*/
- state->sortKeys->abbrev_converter = NULL;
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->base.sortKeys->abbrev_converter = NULL;
+ state->base.sortKeys->comparator = state->base.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->base.sortKeys->abbrev_abort = NULL;
+ state->base.sortKeys->abbrev_full_comparator = NULL;
}
/*
* Reset tuple memory. We've freed all the tuples that we previously
* allocated. We will use the slab allocator from now on.
*/
- MemoryContextResetOnly(state->tuplecontext);
+ MemoryContextResetOnly(state->base.tuplecontext);
/*
* We no longer need a large memtuples array. (We will allocate a smaller
@@ -3032,7 +3116,7 @@ mergeruns(Tuplesortstate *state)
* From this point on, we no longer use the USEMEM()/LACKMEM() mechanism
* to track memory usage of individual tuples.
*/
- if (state->tuples)
+ if (state->base.tuples)
init_slab_allocator(state, state->nOutputTapes + 1);
else
init_slab_allocator(state, 0);
@@ -3046,7 +3130,7 @@ mergeruns(Tuplesortstate *state)
* number of input tapes will not increase between passes.)
*/
state->memtupsize = state->nOutputTapes;
- state->memtuples = (SortTuple *) MemoryContextAlloc(state->maincontext,
+ state->memtuples = (SortTuple *) MemoryContextAlloc(state->base.maincontext,
state->nOutputTapes * sizeof(SortTuple));
USEMEM(state, GetMemoryChunkSpace(state->memtuples));
@@ -3123,7 +3207,7 @@ mergeruns(Tuplesortstate *state)
* sorted tape, we can stop at this point and do the final merge
* on-the-fly.
*/
- if ((state->sortopt & TUPLESORT_RANDOMACCESS) == 0
+ if ((state->base.sortopt & TUPLESORT_RANDOMACCESS) == 0
&& state->nInputRuns <= state->nInputTapes
&& !WORKER(state))
{
@@ -3349,7 +3433,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
* AllocSetFree's bucketing by size class might be particularly bad if
* this step wasn't taken.
*/
- MemoryContextReset(state->tuplecontext);
+ MemoryContextReset(state->base.tuplecontext);
markrunend(state->destTape);
@@ -3367,9 +3451,9 @@ dumptuples(Tuplesortstate *state, bool alltuples)
void
tuplesort_rescan(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3400,9 +3484,9 @@ tuplesort_rescan(Tuplesortstate *state)
void
tuplesort_markpos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3431,9 +3515,9 @@ tuplesort_markpos(Tuplesortstate *state)
void
tuplesort_restorepos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3649,9 +3733,9 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
*/
- if (state->haveDatum1 && state->sortKeys)
+ if (state->base.haveDatum1 && state->base.sortKeys)
{
- if (state->sortKeys[0].comparator == ssup_datum_unsigned_cmp)
+ if (state->base.sortKeys[0].comparator == ssup_datum_unsigned_cmp)
{
qsort_tuple_unsigned(state->memtuples,
state->memtupcount,
@@ -3659,7 +3743,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#if SIZEOF_DATUM >= 8
- else if (state->sortKeys[0].comparator == ssup_datum_signed_cmp)
+ else if (state->base.sortKeys[0].comparator == ssup_datum_signed_cmp)
{
qsort_tuple_signed(state->memtuples,
state->memtupcount,
@@ -3667,7 +3751,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#endif
- else if (state->sortKeys[0].comparator == ssup_datum_int32_cmp)
+ else if (state->base.sortKeys[0].comparator == ssup_datum_int32_cmp)
{
qsort_tuple_int32(state->memtuples,
state->memtupcount,
@@ -3677,16 +3761,16 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
}
/* Can we use the single-key sort function? */
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
{
qsort_ssup(state->memtuples, state->memtupcount,
- state->onlyKey);
+ state->base.onlyKey);
}
else
{
qsort_tuple(state->memtuples,
state->memtupcount,
- state->comparetup,
+ state->base.comparetup,
state);
}
}
@@ -3803,10 +3887,10 @@ tuplesort_heap_replace_top(Tuplesortstate *state, SortTuple *tuple)
static void
reversedirection(Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ SortSupport sortKey = state->base.sortKeys;
int nkey;
- for (nkey = 0; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 0; nkey < state->base.nKeys; nkey++, sortKey++)
{
sortKey->ssup_reverse = !sortKey->ssup_reverse;
sortKey->ssup_nulls_first = !sortKey->ssup_nulls_first;
@@ -3857,7 +3941,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
Assert(state->slabFreeHead);
if (tuplen > SLAB_SLOT_SIZE || !state->slabFreeHead)
- return MemoryContextAlloc(state->sortcontext, tuplen);
+ return MemoryContextAlloc(state->base.sortcontext, tuplen);
else
{
buf = state->slabFreeHead;
@@ -3877,6 +3961,7 @@ static void
removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
{
int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
for (i = 0; i < count; i++)
{
@@ -3887,8 +3972,8 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
MINIMAL_TUPLE_OFFSET);
stups[i].datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
&stups[i].isnull1);
}
}
@@ -3896,7 +3981,8 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys;
HeapTupleData ltup;
HeapTupleData rtup;
TupleDesc tupDesc;
@@ -3921,7 +4007,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = state->tupDesc;
+ tupDesc = (TupleDesc) base->arg;
if (sortKey->abbrev_converter)
{
@@ -3938,7 +4024,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
}
sortKey++;
- for (nkey = 1; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
{
attno = sortKey->ssup_attno;
@@ -3958,6 +4044,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MinimalTuple tuple = (MinimalTuple) stup->tuple;
/* the part of the MinimalTuple we'll write: */
@@ -3969,8 +4056,7 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -3982,21 +4068,21 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
HeapTupleData htup;
/* read in the tuple proper */
tuple->t_len = tuplen;
LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
&stup->isnull1);
}
@@ -4009,6 +4095,8 @@ static void
removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
{
int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
for (i = 0; i < count; i++)
{
@@ -4016,8 +4104,8 @@ removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
tup = (HeapTuple) stups[i].tuple;
stups[i].datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stups[i].isnull1);
}
}
@@ -4026,7 +4114,9 @@ static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
HeapTuple ltup;
HeapTuple rtup;
TupleDesc tupDesc;
@@ -4040,10 +4130,10 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
/* Be prepared to compare additional sort keys */
ltup = (HeapTuple) a->tuple;
rtup = (HeapTuple) b->tuple;
- tupDesc = state->tupDesc;
+ tupDesc = arg->tupDesc;
/* Compare the leading sort key, if it's simple */
- if (state->haveDatum1)
+ if (base->haveDatum1)
{
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -4053,7 +4143,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
if (sortKey->abbrev_converter)
{
- AttrNumber leading = state->indexInfo->ii_IndexAttrNumbers[0];
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
@@ -4062,7 +4152,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
datum2, isnull2,
sortKey);
}
- if (compare != 0 || state->nKeys == 1)
+ if (compare != 0 || base->nKeys == 1)
return compare;
/* Compare additional columns the hard way */
sortKey++;
@@ -4074,13 +4164,13 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
nkey = 0;
}
- if (state->indexInfo->ii_Expressions == NULL)
+ if (arg->indexInfo->ii_Expressions == NULL)
{
/* If not expression index, just compare the proper heap attrs */
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < base->nKeys; nkey++, sortKey++)
{
- AttrNumber attno = state->indexInfo->ii_IndexAttrNumbers[nkey];
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
@@ -4107,19 +4197,19 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
TupleTableSlot *ecxt_scantuple;
/* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(state->estate);
+ ResetPerTupleExprContext(arg->estate);
- ecxt_scantuple = GetPerTupleExprContext(state->estate)->ecxt_scantuple;
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
l_index_values, l_index_isnull);
ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
r_index_values, r_index_isnull);
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < base->nKeys; nkey++, sortKey++)
{
compare = ApplySortComparator(l_index_values[nkey],
l_index_isnull[nkey],
@@ -4137,6 +4227,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
HeapTuple tuple = (HeapTuple) stup->tuple;
unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
@@ -4144,8 +4235,7 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
}
@@ -4153,6 +4243,8 @@ static void
readtup_cluster(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int tuplen)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
HeapTuple tuple = (HeapTuple) readtup_alloc(state,
t_len + HEAPTUPLESIZE);
@@ -4165,18 +4257,33 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
tuple->t_tableOid = InvalidOid;
/* Read in the tuple body */
LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value, if it's a simple column */
- if (state->haveDatum1)
+ if (base->haveDatum1)
stup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
/*
* Routines specialized for IndexTuple case
*
@@ -4188,6 +4295,8 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
static void
removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
int i;
for (i = 0; i < count; i++)
@@ -4197,7 +4306,7 @@ removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
tuple = stups[i].tuple;
stups[i].datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stups[i].isnull1);
}
}
@@ -4211,7 +4320,9 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* is also special handling for enforcing uniqueness, and special
* treatment for equal keys at the end.
*/
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
IndexTuple tuple1;
IndexTuple tuple2;
int keysz;
@@ -4235,8 +4346,8 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
/* Compare additional sort keys */
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
- keysz = state->nKeys;
- tupDes = RelationGetDescr(state->indexRel);
+ keysz = base->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
if (sortKey->abbrev_converter)
{
@@ -4281,7 +4392,7 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* sort algorithm wouldn't have checked whether one must appear before the
* other.
*/
- if (state->enforceUnique && !(!state->uniqueNullsNotDistinct && equal_hasnull))
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
@@ -4297,16 +4408,16 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
index_deform_tuple(tuple1, tupDes, values, isnull);
- key_desc = BuildIndexValueDescription(state->indexRel, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(state->indexRel)),
+ RelationGetRelationName(arg->index.indexRel)),
key_desc ? errdetail("Key %s is duplicated.", key_desc) :
errdetail("Duplicate keys exist."),
- errtableconstraint(state->heapRel,
- RelationGetRelationName(state->indexRel))));
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
}
/*
@@ -4344,6 +4455,8 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Bucket bucket2;
IndexTuple tuple1;
IndexTuple tuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
/*
* Fetch hash keys and mask off bits we don't want to sort by. We know
@@ -4351,12 +4464,12 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
*/
Assert(!a->isnull1);
bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
Assert(!b->isnull1);
bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
if (bucket1 > bucket2)
return 1;
else if (bucket1 < bucket2)
@@ -4394,14 +4507,14 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
IndexTuple tuple = (IndexTuple) stup->tuple;
unsigned int tuplen;
tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -4409,18 +4522,19 @@ static void
readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
unsigned int tuplen = len - sizeof(unsigned int);
IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, tuple, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4440,20 +4554,21 @@ removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
int compare;
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- state->sortKeys);
+ base->sortKeys);
if (compare != 0)
return compare;
/* if we have abbreviations, then "tuple" has the original value */
- if (state->sortKeys->abbrev_converter)
+ if (base->sortKeys->abbrev_converter)
compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
PointerGetDatum(b->tuple), b->isnull1,
- state->sortKeys);
+ base->sortKeys);
return compare;
}
@@ -4461,6 +4576,8 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
void *waddr;
unsigned int tuplen;
unsigned int writtenlen;
@@ -4470,7 +4587,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
waddr = NULL;
tuplen = 0;
}
- else if (!state->tuples)
+ else if (!base->tuples)
{
waddr = &stup->datum1;
tuplen = sizeof(Datum);
@@ -4478,7 +4595,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
else
{
waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, state->datumTypeLen);
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
Assert(tuplen != 0);
}
@@ -4486,8 +4603,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
LogicalTapeWrite(tape, waddr, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
}
@@ -4495,6 +4611,7 @@ static void
readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
unsigned int tuplen = len - sizeof(unsigned int);
if (tuplen == 0)
@@ -4504,7 +4621,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->isnull1 = true;
stup->tuple = NULL;
}
- else if (!state->tuples)
+ else if (!base->tuples)
{
Assert(tuplen == sizeof(Datum));
LogicalTapeReadExact(tape, &stup->datum1, tuplen);
@@ -4521,8 +4638,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->tuple = raddr;
}
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 34a76ceb60..1f88be06aa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2833,8 +2833,14 @@ TupleHashTable
TupleQueueReader
TupleTableSlot
TupleTableSlotOps
+TuplesortClusterArg
+TuplesortDatumArg
+TuplesortIndexArg
+TuplesortIndexBTreeArg
+TuplesortIndexHashArg
TuplesortInstrumentation
TuplesortMethod
+TuplesortPublic
TuplesortSpaceType
Tuplesortstate
Tuplestorestate
--
2.24.3 (Apple Git-128)
Attachment: v3-0003-Put-abbreviation-logic-into-puttuple_common.patch (application/octet-stream)
From 597a1d83da52a1a4219f617360a68b2fe89e6c1e Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH v3 3/7] Put abbreviation logic into puttuple_common()
The abbreviation code is very similar across the tuplesort_put*() functions.
This commit unifies that code and moves it into puttuple_common(). The
tuplesort_put*() functions differ only in the abbreviation condition, which
is now passed to puttuple_common() as an argument.
---
src/backend/utils/sort/tuplesort.c | 222 ++++++++---------------------
1 file changed, 56 insertions(+), 166 deletions(-)
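To illustrate the direction, here is a condensed sketch (not the literal
patch text) of how the unified abbreviation handling looks inside
puttuple_common() after this change; the caller is only responsible for
computing the useAbbrev condition, and field names follow this revision of
the series:

static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
    if (!useAbbrev)
    {
        /* Ordinary Datum representation or NULL: keep datum1 as the caller set it */
    }
    else if (!consider_abort_common(state))
    {
        /* Store abbreviated key representation */
        tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
                                                          state->sortKeys);
    }
    else
    {
        /*
         * Abort abbreviation.  datum1 already holds the original value for
         * this tuple; strip abbreviated keys from tuples collected earlier.
         */
        REMOVEABBREV(state, state->memtuples, state->memtupcount);
    }

    /* ... the existing memtuples growth / run-dumping logic follows unchanged ... */
}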
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8b6b2bc1d3..828efe701e 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -616,7 +616,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
SortCoordinate coordinate,
int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
+ bool useAbbrev);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1841,7 +1842,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1852,51 +1852,15 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup.isnull1);
+ stup.datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
-
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1910,7 +1874,6 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- Datum original;
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
/* copy the tuple into sort storage */
@@ -1926,51 +1889,14 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
*/
if (state->haveDatum1)
{
- original = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup.isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there
- * is a converter it won't expect NULL values, and cost model is
- * not required to account for NULL, so in that case we avoid
- * calling converter and just set datum1 to zeroed representation
- * (to be consistent, and to support cheap inequality tests for
- * NULL abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
+ stup.datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
}
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->haveDatum1 && state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1986,7 +1912,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
MemoryContext oldcontext;
SortTuple stup;
- Datum original;
IndexTuple tuple;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
@@ -1995,51 +1920,15 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
tuple->t_tid = *self;
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
- original = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &stup.isnull1);
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup.isnull1);
oldcontext = MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys || !state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
-
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -2080,45 +1969,15 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
}
else
{
- Datum original = datumCopy(val, false, state->datumTypeLen);
-
stup.isnull1 = false;
- stup.tuple = DatumGetPointer(original);
+ stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
MemoryContextSwitchTo(state->sortcontext);
-
- if (!state->sortKeys->abbrev_converter)
- {
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
}
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->tuples && !isNull && state->sortKeys->abbrev_converter);
MemoryContextSwitchTo(oldcontext);
}
@@ -2127,10 +1986,41 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* Shared code for tuple and datum cases.
*/
static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple)
+puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
Assert(!LEADER(state));
+ if (!useAbbrev)
+ {
+ /*
+ * Leave ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
+ state->sortKeys);
+ }
+ else
+ {
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
+ }
+
switch (state->status)
{
case TSS_INITIAL:
--
2.24.3 (Apple Git-128)
v3-0002-Add-new-Tuplesortstate.removeabbrev-function.patch (application/octet-stream)
From 468e26a90cad4ef63283f1cf693b9ab9121b4fe2 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH v3 2/7] Add new Tuplesortstate.removeabbrev function
This commit is preparation for moving the abbreviation logic into
puttuple_common(). The new removeabbrev function turns the datum1
representation of SortTuples from the abbreviated key back into the first
column value. It thereby encapsulates the part of the abbreviation handling
code that differs among the tuplesort_put*() functions, making those
functions more alike.
---
src/backend/utils/sort/tuplesort.c | 156 +++++++++++++++++++----------
1 file changed, 103 insertions(+), 53 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 4812b1d9ae..8b6b2bc1d3 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,6 +279,13 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
+ /*
+ * Alter datum1 representation in the SortTuple's array back from the
+ * abbreviated key to the first column value.
+ */
+ void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
+ int count);
+
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -540,6 +547,7 @@ struct Sharedsort
pfree(buf); \
} while(0)
+#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
@@ -629,6 +637,14 @@ static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
+static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
+ int count);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
@@ -1042,6 +1058,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_heap;
state->comparetup = comparetup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
@@ -1117,6 +1134,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_cluster;
state->comparetup = comparetup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
@@ -1221,6 +1239,7 @@ tuplesort_begin_index_btree(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1297,6 +1316,7 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_hash;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1337,6 +1357,7 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1400,6 +1421,7 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_datum;
state->comparetup = comparetup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
@@ -1871,20 +1893,7 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
* sorted on tape, since serialized tuples lack abbreviated keys
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
puttuple_common(state, &stup);
@@ -1925,12 +1934,12 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
if (!state->sortKeys->abbrev_converter || stup.isnull1)
{
/*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
+ * Store ordinary Datum representation, or NULL value. If there
+ * is a converter it won't expect NULL values, and cost model is
+ * not required to account for NULL, so in that case we avoid
+ * calling converter and just set datum1 to zeroed representation
+ * (to be consistent, and to support cheap inequality tests for
+ * NULL abbreviated keys).
*/
stup.datum1 = original;
}
@@ -1949,23 +1958,15 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/*
* Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tup = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any
+ * case).
+ */
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
}
@@ -2035,16 +2036,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
* sorted on tape, since serialized tuples lack abbreviated keys
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = mtup->tuple;
- mtup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &mtup->isnull1);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
puttuple_common(state, &stup);
@@ -2122,12 +2114,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* (TSS_BUILDRUNS state prevents control reaching here in any
* case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- mtup->datum1 = PointerGetDatum(mtup->tuple);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
}
@@ -3984,6 +3971,26 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
* Routines specialized for HeapTuple (actually MinimalTuple) case
*/
+static void
+removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
+ MINIMAL_TUPLE_OFFSET);
+ stups[i].datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
@@ -4102,6 +4109,23 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
* comparisons per a btree index definition)
*/
+static void
+removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTuple tup;
+
+ tup = (HeapTuple) stups[i].tuple;
+ stups[i].datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4271,6 +4295,23 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
* functions can be shared.
*/
+static void
+removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ IndexTuple tuple;
+
+ tuple = stups[i].tuple;
+ stups[i].datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4503,6 +4544,15 @@ readtup_index(Tuplesortstate *state, SortTuple *stup,
* Routines specialized for DatumTuple case
*/
+static void
+removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ stups[i].datum1 = PointerGetDatum(stups[i].tuple);
+}
+
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
--
2.24.3 (Apple Git-128)
Hi, Pavel!
Thank you for your review and corrections.
On Fri, Jul 22, 2022 at 6:57 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
I've looked through the updated patch. Overall it looks good enough.
Some minor things:
- The PARALLEL_SORT macro is based on the coordinate struct instead of the
  state struct. In some calls (e.g. from _bt_spools_heapscan) coordinate can
  be NULL, which would segfault when the macro dereferences its members.
- state->worker and coordinate->isWorker differ slightly in semantics, i.e.:

                             worker     leader
  state->worker              >= 0       -1
  coordinate->isWorker       1          0

- In tuplesort_begin_index_btree I suppose it should be base->nKeys instead
  of state->nKeys.
Perfect, thank you!
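For reference, here is a minimal sketch of the NULL-safe, coordinate-based
macro, mirroring what the attached patches now do (the worker/leader mapping
in the comment follows the table above):

    /*
     * Sort parallel code from state for sort__start probes:
     * 0 = serial sort, 1 = parallel worker, 2 = parallel leader.
     *
     * Unlike state->worker (>= 0 in a worker, -1 in the leader),
     * coordinate->isWorker is simply true in a worker and false in the
     * leader.  A NULL coordinate (e.g. the serial path in
     * _bt_spools_heapscan) maps to 0 instead of being dereferenced.
     */
    #define PARALLEL_SORT(coordinate) \
        ((coordinate) == NULL || (coordinate)->sharedsort == NULL ? 0 : \
         (coordinate)->isWorker ? 1 : 2)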
- Cfbot reports gcc warnings due to mixed code and declarations. So I used this as an occasion to beautify the code in tuplesortvariants.c a little. (This is added as a separate patch 0007.)
It appears the warnings were caused by the extra semicolon in the
TuplesortstateGetPublic() macro. I've removed that semicolon, and I
don't think we need a beautification patch. Also, please note that
there is no point in adding indentation that doesn't survive pgindent.
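To illustrate why a stray trailing semicolon in that macro triggers the
"mixed declarations and code" warnings, here is a simplified, self-contained
sketch (the struct layout and the macro body shown are assumptions made for
the illustration, not the exact patch code):

    typedef struct { int nKeys; } TuplesortPublic;
    typedef struct { TuplesortPublic base; } Tuplesortstate;

    /* note the stray trailing semicolon */
    #define TuplesortstateGetPublic(state)  ((TuplesortPublic *) (state));

    static int
    example(Tuplesortstate *state)
    {
        TuplesortPublic *base = TuplesortstateGetPublic(state);
        int         nkeys = base->nKeys;    /* this declaration now follows the
                                             * empty statement left by the extra
                                             * ';', so gcc warns under
                                             * -Wdeclaration-after-statement */

        return nkeys;
    }

The initializer of "base" expands to "... ((TuplesortPublic *) (state));;",
and the trailing empty statement is what turns the following declaration of
"nkeys" into a declaration after a statement.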
All these things are corrected/done in the new version 3 of the patchset (PFA). To me, the patchset seems like a long-needed step toward supporting PostgreSQL extensibility. Overall the corrections in v3 are minor, so I'd like to mark the patch as RfC if there are no objections.
Thank you. I've also revised the comments at the top of tuplesort.c
and tuplesortvariants.c. The revised patchset is attached.
Also, my OrioleDB colleagues Ilya Kobets and Tatsiana Yaumenenka ran tests
to check whether the patchset causes a performance regression. The scripts
and results are in the "tuplesort_patch_test.zip" archive; the final
comparison is given in result/final_table.txt. In short, each test was
repeated 10 times, and there is no difference exceeding random variation.
------
Regards,
Alexander Korotkov
Attachments:
0001-Remove-Tuplesortstate.copytup-function-v4.patch (application/octet-stream)
From 1b883b1fd18e45095e671b7f62ef2fd3656f1a53 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 13:28:27 +0300
Subject: [PATCH 1/6] Remove Tuplesortstate.copytup function
It's currently unclear how we split functionality between the
Tuplesortstate.copytup() function and the tuplesort_put*() functions.
For instance, copytup_index() and copytup_datum() raise an error, while
tuplesort_putindextuplevalues() and tuplesort_putdatum() do the work
themselves. This commit removes Tuplesortstate.copytup() altogether, moving
the corresponding code into tuplesort_put*().
---
src/backend/utils/sort/tuplesort.c | 330 ++++++++++++-----------------
1 file changed, 132 insertions(+), 198 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 421afcf47d3..4812b1d9ae3 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,14 +279,6 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
- /*
- * Function to copy a supplied input tuple into palloc'd space and set up
- * its SortTuple representation (ie, set tuple/datum1/isnull1). Also,
- * state->availMem must be decreased by the amount of space used for the
- * tuple copy (note the SortTuple struct itself is not counted).
- */
- void (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup);
-
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -549,7 +541,6 @@ struct Sharedsort
} while(0)
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define COPYTUP(state,stup,tup) ((*(state)->copytup) (state, stup, tup))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
@@ -600,10 +591,7 @@ struct Sharedsort
* a lot better than what we were doing before 7.3. As of 9.6, a
* separate memory context is used for caller passed tuples. Resetting
* it at certain key increments significantly ameliorates fragmentation.
- * Note that this places a responsibility on copytup routines to use the
- * correct memory context for these tuples (and to not use the reset
- * context for anything whose lifetime needs to span multiple external
- * sort runs). readtup routines use the slab allocator (they cannot use
+ * readtup routines use the slab allocator (they cannot use
* the reset context because it gets deleted at the point that merging
* begins).
*/
@@ -643,14 +631,12 @@ static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
@@ -659,14 +645,12 @@ static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
static int comparetup_datum(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
-static void copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup);
static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
@@ -1059,7 +1043,6 @@ tuplesort_begin_heap(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_heap;
- state->copytup = copytup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
state->haveDatum1 = true;
@@ -1135,7 +1118,6 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
PARALLEL_SORT(state));
state->comparetup = comparetup_cluster;
- state->copytup = copytup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
state->abbrevNext = 10;
@@ -1240,7 +1222,6 @@ tuplesort_begin_index_btree(Relation heapRel,
PARALLEL_SORT(state));
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->abbrevNext = 10;
@@ -1317,7 +1298,6 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
state->comparetup = comparetup_index_hash;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1358,7 +1338,6 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
state->comparetup = comparetup_index_btree;
- state->copytup = copytup_index;
state->writetup = writetup_index;
state->readtup = readtup_index;
state->haveDatum1 = true;
@@ -1422,7 +1401,6 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
PARALLEL_SORT(state));
state->comparetup = comparetup_datum;
- state->copytup = copytup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
state->abbrevNext = 10;
@@ -1839,14 +1817,75 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
+ Datum original;
+ MinimalTuple tuple;
+ HeapTupleData htup;
- /*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
- */
- COPYTUP(state, &stup, (void *) slot);
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ USEMEM(state, GetMemoryChunkSpace(tuple));
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ original = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
+
+ MemoryContextSwitchTo(state->sortcontext);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
+ MINIMAL_TUPLE_OFFSET);
+
+ mtup->datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
puttuple_common(state, &stup);
@@ -1861,14 +1900,74 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
+ Datum original;
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+ USEMEM(state, GetMemoryChunkSpace(tup));
+
+ MemoryContextSwitchTo(state->sortcontext);
/*
- * Copy the given tuple into memory we control, and decrease availMem.
- * Then call the common code.
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
*/
- COPYTUP(state, &stup, (void *) tup);
+ if (state->haveDatum1)
+ {
+ original = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
+
+ if (!state->sortKeys->abbrev_converter || stup.isnull1)
+ {
+ /*
+ * Store ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ stup.datum1 = original;
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ stup.datum1 = state->sortKeys->abbrev_converter(original,
+ state->sortKeys);
+ }
+ else
+ {
+ /* Abort abbreviation */
+ int i;
+
+ stup.datum1 = original;
+
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ for (i = 0; i < state->memtupcount; i++)
+ {
+ SortTuple *mtup = &state->memtuples[i];
+
+ tup = (HeapTuple) mtup->tuple;
+ mtup->datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &mtup->isnull1);
+ }
+ }
+ }
puttuple_common(state, &stup);
@@ -3947,84 +4046,6 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return 0;
}
-static void
-copytup_heap(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /*
- * We expect the passed "tup" to be a TupleTableSlot, and form a
- * MinimalTuple using the exported interface for that.
- */
- TupleTableSlot *slot = (TupleTableSlot *) tup;
- Datum original;
- MinimalTuple tuple;
- HeapTupleData htup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup->isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4193,79 +4214,6 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_cluster(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- HeapTuple tuple = (HeapTuple) tup;
- Datum original;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
-
- /* copy the tuple into sort storage */
- tuple = heap_copytuple(tuple);
- stup->tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
-
- MemoryContextSwitchTo(oldcontext);
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (!state->haveDatum1)
- return;
-
- original = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup->isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup->isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup->datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup->datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup->datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
- }
-}
-
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4512,13 +4460,6 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
return 0;
}
-static void
-copytup_index(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_index() should not be called");
-}
-
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
@@ -4583,13 +4524,6 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
return compare;
}
-static void
-copytup_datum(Tuplesortstate *state, SortTuple *stup, void *tup)
-{
- /* Not currently needed */
- elog(ERROR, "copytup_datum() should not be called");
-}
-
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
--
2.24.3 (Apple Git-128)
0004-Move-memory-management-away-from-writetup-and-tup-v4.patch (application/octet-stream)
From fa779c7ac6405aab1d5ced163501695c9e8b2777 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 00:14:51 +0300
Subject: [PATCH 4/6] Move memory management away from writetup() and
tuplesort_put*()
This commit moves some generic work out of the sort-variant-specific
functions. In particular, tuplesort_put*() no longer needs to decrease
available memory and switch to the sort context before calling
puttuple_common(), and writetup() no longer needs to free SortTuple.tuple
and increase available memory.
---
src/backend/utils/sort/tuplesort.c | 78 +++++++++++++-----------------
1 file changed, 33 insertions(+), 45 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 828efe701e5..c8c511fb8c5 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -288,11 +288,7 @@ struct Tuplesortstate
/*
* Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory; requirements on
- * the tape representation are given below. Unless the slab allocator is
- * used, after writing the tuple, pfree() the out-of-line data (not the
- * SortTuple struct!), and increase state->availMem by the amount of
- * memory space thereby released.
+ * tuple on tape need not be the same as it is in memory.
*/
void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
@@ -549,7 +545,7 @@ struct Sharedsort
#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
-#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
+#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
@@ -618,6 +614,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
static void tuplesort_begin_batch(Tuplesortstate *state);
static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
bool useAbbrev);
+static void writetuple(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1848,7 +1846,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* copy the tuple into sort storage */
tuple = ExecCopySlotMinimalTuple(slot);
stup.tuple = (void *) tuple;
- USEMEM(state, GetMemoryChunkSpace(tuple));
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
@@ -1857,8 +1854,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
state->tupDesc,
&stup.isnull1);
- MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup,
state->sortKeys->abbrev_converter && !stup.isnull1);
@@ -1879,9 +1874,6 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
stup.tuple = (void *) tup;
- USEMEM(state, GetMemoryChunkSpace(tup));
-
- MemoryContextSwitchTo(state->sortcontext);
/*
* set up first-column key value, and potentially abbreviate, if it's a
@@ -1910,7 +1902,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
ItemPointer self, Datum *values,
bool *isnull)
{
- MemoryContext oldcontext;
SortTuple stup;
IndexTuple tuple;
@@ -1918,19 +1909,14 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
isnull, state->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
RelationGetDescr(state->indexRel),
&stup.isnull1);
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
-
puttuple_common(state, &stup,
state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
}
/*
@@ -1965,15 +1951,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
stup.datum1 = !isNull ? val : (Datum) 0;
stup.isnull1 = isNull;
stup.tuple = NULL; /* no separate storage */
- MemoryContextSwitchTo(state->sortcontext);
}
else
{
stup.isnull1 = false;
stup.datum1 = datumCopy(val, false, state->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
- USEMEM(state, GetMemoryChunkSpace(stup.tuple));
- MemoryContextSwitchTo(state->sortcontext);
}
puttuple_common(state, &stup,
@@ -1988,8 +1971,14 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+
Assert(!LEADER(state));
+ /* Count the size of the out-of-line data */
+ if (tuple->tuple != NULL)
+ USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
+
if (!useAbbrev)
{
/*
@@ -2062,6 +2051,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
pg_rusage_show(&state->ru_start));
#endif
make_bounded_heap(state);
+ MemoryContextSwitchTo(oldcontext);
return;
}
@@ -2069,7 +2059,10 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
* Done if we still fit in available memory and have array slots.
*/
if (state->memtupcount < state->memtupsize && !LACKMEM(state))
+ {
+ MemoryContextSwitchTo(oldcontext);
return;
+ }
/*
* Nope; time to switch to tape-based operation.
@@ -2123,6 +2116,25 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
elog(ERROR, "invalid tuplesort state");
break;
}
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Write a stored tuple onto tape. Unless the slab allocator is
+ * used, after writing the tuple, pfree() the out-of-line data (not the
+ * SortTuple struct!), and increase state->availMem by the amount of
+ * memory space thereby released.
+ */
+static void
+writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ state->writetup(state, tape, stup);
+
+ if (!state->slabAllocatorUsed && stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }
}
static bool
@@ -3960,12 +3972,6 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_free_minimal_tuple(tuple);
- }
}
static void
@@ -4141,12 +4147,6 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- heap_freetuple(tuple);
- }
}
static void
@@ -4403,12 +4403,6 @@ writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-
- if (!state->slabAllocatorUsed)
- {
- FREEMEM(state, GetMemoryChunkSpace(tuple));
- pfree(tuple);
- }
}
static void
@@ -4495,12 +4489,6 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
* word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-
- if (!state->slabAllocatorUsed && stup->tuple)
- {
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
- }
}
static void
--
2.24.3 (Apple Git-128)
0002-Add-new-Tuplesortstate.removeabbrev-function-v4.patch (application/octet-stream)
From 0ae121f5756241e8bc5831f74130d74bf4e5f8a4 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:03:13 +0300
Subject: [PATCH 2/6] Add new Tuplesortstate.removeabbrev function
This commit is preparation for moving the abbreviation logic into
puttuple_common(). The new removeabbrev function turns the datum1
representation of SortTuples from the abbreviated key back into the first
column value. It thereby encapsulates the part of the abbreviation handling
code that differs among the tuplesort_put*() functions, making those
functions more alike.
---
src/backend/utils/sort/tuplesort.c | 156 +++++++++++++++++++----------
1 file changed, 103 insertions(+), 53 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 4812b1d9ae3..8b6b2bc1d38 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -279,6 +279,13 @@ struct Tuplesortstate
*/
SortTupleComparator comparetup;
+ /*
+ * Alter datum1 representation in the SortTuple's array back from the
+ * abbreviated key to the first column value.
+ */
+ void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
+ int count);
+
/*
* Function to write a stored tuple onto tape. The representation of the
* tuple on tape need not be the same as it is in memory; requirements on
@@ -540,6 +547,7 @@ struct Sharedsort
pfree(buf); \
} while(0)
+#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) ((*(state)->writetup) (state, tape, stup))
#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
@@ -629,6 +637,14 @@ static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
+static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
+ int count);
static int comparetup_heap(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
@@ -1042,6 +1058,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_heap;
state->comparetup = comparetup_heap;
state->writetup = writetup_heap;
state->readtup = readtup_heap;
@@ -1117,6 +1134,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_cluster;
state->comparetup = comparetup_cluster;
state->writetup = writetup_cluster;
state->readtup = readtup_cluster;
@@ -1221,6 +1239,7 @@ tuplesort_begin_index_btree(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1297,6 +1316,7 @@ tuplesort_begin_index_hash(Relation heapRel,
state->nKeys = 1; /* Only one sort column, the hash code */
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_hash;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1337,6 +1357,7 @@ tuplesort_begin_index_gist(Relation heapRel,
state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ state->removeabbrev = removeabbrev_index;
state->comparetup = comparetup_index_btree;
state->writetup = writetup_index;
state->readtup = readtup_index;
@@ -1400,6 +1421,7 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
sortopt & TUPLESORT_RANDOMACCESS,
PARALLEL_SORT(state));
+ state->removeabbrev = removeabbrev_datum;
state->comparetup = comparetup_datum;
state->writetup = writetup_datum;
state->readtup = readtup_datum;
@@ -1871,20 +1893,7 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
* sorted on tape, since serialized tuples lack abbreviated keys
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- htup.t_len = ((MinimalTuple) mtup->tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) mtup->tuple -
- MINIMAL_TUPLE_OFFSET);
-
- mtup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &mtup->isnull1);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
puttuple_common(state, &stup);
@@ -1925,12 +1934,12 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
if (!state->sortKeys->abbrev_converter || stup.isnull1)
{
/*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
+ * Store ordinary Datum representation, or NULL value. If there
+ * is a converter it won't expect NULL values, and cost model is
+ * not required to account for NULL, so in that case we avoid
+ * calling converter and just set datum1 to zeroed representation
+ * (to be consistent, and to support cheap inequality tests for
+ * NULL abbreviated keys).
*/
stup.datum1 = original;
}
@@ -1949,23 +1958,15 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
/*
* Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tup = (HeapTuple) mtup->tuple;
- mtup->datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &mtup->isnull1);
- }
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any
+ * case).
+ */
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
}
@@ -2035,16 +2036,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
* sorted on tape, since serialized tuples lack abbreviated keys
* (TSS_BUILDRUNS state prevents control reaching here in any case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- tuple = mtup->tuple;
- mtup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &mtup->isnull1);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
puttuple_common(state, &stup);
@@ -2122,12 +2114,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* (TSS_BUILDRUNS state prevents control reaching here in any
* case).
*/
- for (i = 0; i < state->memtupcount; i++)
- {
- SortTuple *mtup = &state->memtuples[i];
-
- mtup->datum1 = PointerGetDatum(mtup->tuple);
- }
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
}
}
@@ -3984,6 +3971,26 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
* Routines specialized for HeapTuple (actually MinimalTuple) case
*/
+static void
+removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
+ MINIMAL_TUPLE_OFFSET);
+ stups[i].datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
@@ -4102,6 +4109,23 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
* comparisons per a btree index definition)
*/
+static void
+removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTuple tup;
+
+ tup = (HeapTuple) stups[i].tuple;
+ stups[i].datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4271,6 +4295,23 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
* functions can be shared.
*/
+static void
+removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ IndexTuple tuple;
+
+ tuple = stups[i].tuple;
+ stups[i].datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stups[i].isnull1);
+ }
+}
+
static int
comparetup_index_btree(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
@@ -4503,6 +4544,15 @@ readtup_index(Tuplesortstate *state, SortTuple *stup,
* Routines specialized for DatumTuple case
*/
+static void
+removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ stups[i].datum1 = PointerGetDatum(stups[i].tuple);
+}
+
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
--
2.24.3 (Apple Git-128)
0005-Split-TuplesortPublic-from-Tuplesortstate-v4.patch (application/octet-stream)
From 2a4f2d0530b533d3c923272121b731d1a011dbbc Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 18:11:26 +0300
Subject: [PATCH 5/6] Split TuplesortPublic from Tuplesortstate
The new TuplesortPublic data structure contains the definitions of the
sort-variant-specific interface methods and the part of the tuplesort
operation state required by their implementations. This will allow defining
tuplesort variants without knowledge of Tuplesortstate, that is, without
knowledge of the generic sort implementation's guts.
---
src/backend/utils/sort/tuplesort.c | 814 ++++++++++++++++-------------
src/tools/pgindent/typedefs.list | 6 +
2 files changed, 471 insertions(+), 349 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index c8c511fb8c5..0a630956dc1 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -126,8 +126,9 @@
#define CLUSTER_SORT 3
/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(state) ((state)->shared == NULL ? 0 : \
- (state)->worker >= 0 ? 1 : 2)
+#define PARALLEL_SORT(coordinate) (coordinate == NULL || \
+ (coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker ? 1 : 2)
/*
* Initial size of memtuples array. We're trying to select this size so that
@@ -236,38 +237,18 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
+typedef struct TuplesortPublic TuplesortPublic;
+
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
Tuplesortstate *state);
/*
- * Private state of a Tuplesort operation.
+ * The public part of a tuplesort operation state. This data structure
+ * contains the definition of sort-variant-specific interface methods and
+ * the part of the tuplesort operation state required by their implementations.
*/
-struct Tuplesortstate
+struct TuplesortPublic
{
- TupSortStatus status; /* enumerated value as shown above */
- int nKeys; /* number of columns in sort key */
- int sortopt; /* Bitmask of flags used to setup sort */
- bool bounded; /* did caller specify a maximum number of
- * tuples to return? */
- bool boundUsed; /* true if we made use of a bounded heap */
- int bound; /* if bounded, the maximum number of tuples */
- bool tuples; /* Can SortTuple.tuple ever be set? */
- int64 availMem; /* remaining memory available, in bytes */
- int64 allowedMem; /* total memory allowed, in bytes */
- int maxTapes; /* max number of input tapes to merge in each
- * pass */
- int64 maxSpace; /* maximum amount of space occupied among sort
- * of groups, either in-memory or on-disk */
- bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
- * space, false when it's value for in-memory
- * space */
- TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
- LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
-
/*
* These function pointers decouple the routines that must know what kind
* of tuple we are sorting from the routines that don't need to know it.
@@ -301,12 +282,134 @@ struct Tuplesortstate
void (*readtup) (Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+ /*
+ * Function to do some specific release of resources for the sort variant.
+ * In particular, this function should free everything stored in the "arg"
+ * field, which wouldn't be cleared on reset of the Tuple sort memory
+ * contexts. This can be NULL if nothing specific needs to be done.
+ */
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * The subsequent fields are used in the implementations of the functions
+ * above.
+ */
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
/*
* Whether SortTuple's datum1 and isnull1 members are maintained by the
* above routines. If not, some sort specializations are disabled.
*/
bool haveDatum1;
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg; /* Specific information for the sort variant */
+};
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
+ * the tuplesort_begin_cluster.
+ */
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TuplesortClusterArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the IndexTuple case.
+ * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+typedef struct
+{
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TuplesortIndexArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the index_btree subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TuplesortIndexBTreeArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the index_hash subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TuplesortIndexHashArg;
+
+/*
+ * Data struture pointed by "TuplesortPublic.arg" for the Datum case.
+ * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+typedef struct
+{
+ /* the datatype oid of Datum's to be sorted */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TuplesortDatumArg;
+
+/*
+ * Private state of a Tuplesort operation.
+ */
+struct Tuplesortstate
+{
+ TuplesortPublic base;
+ TupSortStatus status; /* enumerated value as shown above */
+ bool bounded; /* did caller specify a maximum number of
+ * tuples to return? */
+ bool boundUsed; /* true if we made use of a bounded heap */
+ int bound; /* if bounded, the maximum number of tuples */
+ int64 availMem; /* remaining memory available, in bytes */
+ int64 allowedMem; /* total memory allowed, in bytes */
+ int maxTapes; /* max number of input tapes to merge in each
+ * pass */
+ int64 maxSpace; /* maximum amount of space occupied among sort
+ * of groups, either in-memory or on-disk */
+ bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
+ * space, false when it's value for in-memory
+ * space */
+ TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
+ LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
+
/*
* This array holds the tuples now in sort memory. If we are in state
* INITIAL, the tuples are in no particular order; if we are in state
@@ -421,24 +524,6 @@ struct Tuplesortstate
Sharedsort *shared;
int nParticipants;
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- TupleDesc tupDesc;
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
/*
* Additional state for managing "abbreviated key" sortsupport routines
* (which currently may be used by all cases except the hash index case).
@@ -448,37 +533,6 @@ struct Tuplesortstate
int64 abbrevNext; /* Tuple # at which to next check
* applicability */
- /*
- * These variables are specific to the CLUSTER case; they are set by
- * tuplesort_begin_cluster.
- */
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-
- /*
- * These variables are specific to the IndexTuple case; they are set by
- * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-
- /* These are specific to the index_btree subcase: */
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-
- /* These are specific to the index_hash subcase: */
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-
- /*
- * These variables are specific to the Datum case; they are set by
- * tuplesort_begin_datum and used only by the DatumTuple routines.
- */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-
/*
* Resource snapshot for time of sort start.
*/
@@ -543,10 +597,13 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define REMOVEABBREV(state,stup,count) ((*(state)->removeabbrev) (state, stup, count))
-#define COMPARETUP(state,a,b) ((*(state)->comparetup) (a, b, state))
+#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
+
+#define REMOVEABBREV(state,stup,count) ((*(state)->base.removeabbrev) (state, stup, count))
+#define COMPARETUP(state,a,b) ((*(state)->base.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
-#define READTUP(state,stup,tape,len) ((*(state)->readtup) (state, stup, tape, len))
+#define READTUP(state,stup,tape,len) ((*(state)->base.readtup) (state, stup, tape, len))
+#define FREESTATE(state) ((state)->base.freestate ? (*(state)->base.freestate) (state) : (void) 0)
#define LACKMEM(state) ((state)->availMem < 0 && !(state)->slabAllocatorUsed)
#define USEMEM(state,amt) ((state)->availMem -= (amt))
#define FREEMEM(state,amt) ((state)->availMem += (amt))
@@ -670,6 +727,7 @@ static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -700,7 +758,7 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyUnsignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -708,10 +766,10 @@ qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
#if SIZEOF_DATUM >= 8
@@ -723,7 +781,7 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplySignedSortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -732,10 +790,10 @@ qsort_tuple_signed_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
#endif
@@ -747,7 +805,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
compare = ApplyInt32SortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- &state->sortKeys[0]);
+ &state->base.sortKeys[0]);
if (compare != 0)
return compare;
@@ -756,10 +814,10 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* No need to waste effort calling the tiebreak function when there are no
* other keys to sort on.
*/
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
return 0;
- return state->comparetup(a, b, state);
+ return state->base.comparetup(a, b, state);
}
/*
@@ -886,8 +944,9 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
pg_rusage_init(&state->ru_start);
#endif
- state->sortopt = sortopt;
- state->tuples = true;
+ state->base.sortopt = sortopt;
+ state->base.tuples = true;
+ state->abbrevNext = 10;
/*
* workMem is forced to be at least 64KB, the current minimum valid value
@@ -896,8 +955,8 @@ tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
* with very little memory.
*/
state->allowedMem = Max(workMem, 64) * (int64) 1024;
- state->sortcontext = sortcontext;
- state->maincontext = maincontext;
+ state->base.sortcontext = sortcontext;
+ state->base.maincontext = maincontext;
/*
* Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
@@ -956,7 +1015,7 @@ tuplesort_begin_batch(Tuplesortstate *state)
{
MemoryContext oldcontext;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(state->base.maincontext);
/*
* Caller tuple (e.g. IndexTuple) memory context.
@@ -971,14 +1030,14 @@ tuplesort_begin_batch(Tuplesortstate *state)
* generation.c context as this keeps allocations more compact with less
* wastage. Allocations are also slightly more CPU efficient.
*/
- if (state->sortopt & TUPLESORT_ALLOWBOUNDED)
- state->tuplecontext = AllocSetContextCreate(state->sortcontext,
- "Caller tuples",
- ALLOCSET_DEFAULT_SIZES);
+ if (state->base.sortopt & TUPLESORT_ALLOWBOUNDED)
+ state->base.tuplecontext = AllocSetContextCreate(state->base.sortcontext,
+ "Caller tuples",
+ ALLOCSET_DEFAULT_SIZES);
else
- state->tuplecontext = GenerationContextCreate(state->sortcontext,
- "Caller tuples",
- ALLOCSET_DEFAULT_SIZES);
+ state->base.tuplecontext = GenerationContextCreate(state->base.sortcontext,
+ "Caller tuples",
+ ALLOCSET_DEFAULT_SIZES);
state->status = TSS_INITIAL;
@@ -1034,10 +1093,11 @@ tuplesort_begin_heap(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
AssertArg(nkeys > 0);
@@ -1048,30 +1108,28 @@ tuplesort_begin_heap(TupleDesc tupDesc,
nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = nkeys;
+ base->nKeys = nkeys;
TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
false, /* no unique check */
nkeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_heap;
- state->comparetup = comparetup_heap;
- state->writetup = writetup_heap;
- state->readtup = readtup_heap;
- state->haveDatum1 = true;
-
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
- state->abbrevNext = 10;
+ base->removeabbrev = removeabbrev_heap;
+ base->comparetup = comparetup_heap;
+ base->writetup = writetup_heap;
+ base->readtup = readtup_heap;
+ base->haveDatum1 = true;
+ base->arg = tupDesc; /* assume we need not copy tupDesc */
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
for (i = 0; i < nkeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
AssertArg(attNums[i] != 0);
AssertArg(sortOperators[i] != 0);
@@ -1081,7 +1139,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
sortKey->ssup_nulls_first = nullsFirstFlags[i];
sortKey->ssup_attno = attNums[i];
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
}
@@ -1092,8 +1150,8 @@ tuplesort_begin_heap(TupleDesc tupDesc,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (nkeys == 1 && !state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (nkeys == 1 && !base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1108,13 +1166,16 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
BTScanInsert indexScanKey;
MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
int i;
Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1124,37 +1185,38 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
false, /* no unique check */
- state->nKeys,
+ base->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_cluster;
- state->comparetup = comparetup_cluster;
- state->writetup = writetup_cluster;
- state->readtup = readtup_cluster;
- state->abbrevNext = 10;
+ base->removeabbrev = removeabbrev_cluster;
+ base->comparetup = comparetup_cluster;
+ base->writetup = writetup_cluster;
+ base->readtup = readtup_cluster;
+ base->freestate = freestate_cluster;
+ base->arg = arg;
- state->indexInfo = BuildIndexInfo(indexRel);
+ arg->indexInfo = BuildIndexInfo(indexRel);
/*
* If we don't have a simple leading attribute, we don't currently
* initialize datum1, so disable optimizations that require it.
*/
- if (state->indexInfo->ii_IndexAttrNumbers[0] == 0)
- state->haveDatum1 = false;
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ base->haveDatum1 = false;
else
- state->haveDatum1 = true;
+ base->haveDatum1 = true;
- state->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
indexScanKey = _bt_mkscankey(indexRel, NULL);
- if (state->indexInfo->ii_Expressions != NULL)
+ if (arg->indexInfo->ii_Expressions != NULL)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -1165,19 +1227,19 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
* TupleTableSlot to put the table tuples into. The econtext's
* scantuple has to point to that slot, too.
*/
- state->estate = CreateExecutorState();
+ arg->estate = CreateExecutorState();
slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(state->estate);
+ econtext = GetPerTupleExprContext(arg->estate);
econtext->ecxt_scantuple = slot;
}
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1187,7 +1249,7 @@ tuplesort_begin_cluster(TupleDesc tupDesc,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1215,11 +1277,14 @@ tuplesort_begin_index_btree(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
BTScanInsert indexScanKey;
+ TuplesortIndexBTreeArg *arg;
MemoryContext oldcontext;
int i;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1229,36 +1294,36 @@ tuplesort_begin_index_btree(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
enforceUnique,
- state->nKeys,
+ base->nKeys,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->abbrevNext = 10;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
- state->enforceUnique = enforceUnique;
- state->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
indexScanKey = _bt_mkscankey(indexRel, NULL);
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
ScanKey scanKey = indexScanKey->scankeys + i;
int16 strategy;
@@ -1268,7 +1333,7 @@ tuplesort_begin_index_btree(Relation heapRel,
(scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
sortKey->ssup_attno = scanKey->sk_attno;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1297,9 +1362,12 @@ tuplesort_begin_index_hash(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1313,20 +1381,21 @@ tuplesort_begin_index_hash(Relation heapRel,
sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* Only one sort column, the hash code */
+ base->nKeys = 1; /* Only one sort column, the hash code */
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_hash;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_hash;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
- state->high_mask = high_mask;
- state->low_mask = low_mask;
- state->max_buckets = max_buckets;
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
MemoryContextSwitchTo(oldcontext);
@@ -1342,10 +1411,13 @@ tuplesort_begin_index_gist(Relation heapRel,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MemoryContext oldcontext;
+ TuplesortIndexBTreeArg *arg;
int i;
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1354,31 +1426,34 @@ tuplesort_begin_index_gist(Relation heapRel,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
- state->removeabbrev = removeabbrev_index;
- state->comparetup = comparetup_index_btree;
- state->writetup = writetup_index;
- state->readtup = readtup_index;
- state->haveDatum1 = true;
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->heapRel = heapRel;
- state->indexRel = indexRel;
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
/* Prepare SortSupport data for each column */
- state->sortKeys = (SortSupport) palloc0(state->nKeys *
- sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
- for (i = 0; i < state->nKeys; i++)
+ for (i = 0; i < base->nKeys; i++)
{
- SortSupport sortKey = state->sortKeys + i;
+ SortSupport sortKey = base->sortKeys + i;
sortKey->ssup_cxt = CurrentMemoryContext;
sortKey->ssup_collation = indexRel->rd_indcollation[i];
sortKey->ssup_nulls_first = false;
sortKey->ssup_attno = i + 1;
/* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && state->haveDatum1);
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
AssertState(sortKey->ssup_attno != 0);
@@ -1398,11 +1473,14 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg;
MemoryContext oldcontext;
int16 typlen;
bool typbyval;
- oldcontext = MemoryContextSwitchTo(state->maincontext);
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -1411,35 +1489,36 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
#endif
- state->nKeys = 1; /* always a one-column sort */
+ base->nKeys = 1; /* always a one-column sort */
TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
false, /* no unique check */
1,
workMem,
sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(state));
+ PARALLEL_SORT(coordinate));
- state->removeabbrev = removeabbrev_datum;
- state->comparetup = comparetup_datum;
- state->writetup = writetup_datum;
- state->readtup = readtup_datum;
+ base->removeabbrev = removeabbrev_datum;
+ base->comparetup = comparetup_datum;
+ base->writetup = writetup_datum;
+ base->readtup = readtup_datum;
state->abbrevNext = 10;
- state->haveDatum1 = true;
+ base->haveDatum1 = true;
+ base->arg = arg;
- state->datumType = datumType;
+ arg->datumType = datumType;
/* lookup necessary attributes of the datum type */
get_typlenbyval(datumType, &typlen, &typbyval);
- state->datumTypeLen = typlen;
- state->tuples = !typbyval;
+ arg->datumTypeLen = typlen;
+ base->tuples = !typbyval;
/* Prepare SortSupport data */
- state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+ base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
- state->sortKeys->ssup_cxt = CurrentMemoryContext;
- state->sortKeys->ssup_collation = sortCollation;
- state->sortKeys->ssup_nulls_first = nullsFirstFlag;
+ base->sortKeys->ssup_cxt = CurrentMemoryContext;
+ base->sortKeys->ssup_collation = sortCollation;
+ base->sortKeys->ssup_nulls_first = nullsFirstFlag;
/*
* Abbreviation is possible here only for by-reference types. In theory,
@@ -1449,9 +1528,9 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* can't, because a datum sort only stores a single copy of the datum; the
* "tuple" field of each SortTuple is NULL.
*/
- state->sortKeys->abbreviate = !typbyval;
+ base->sortKeys->abbreviate = !typbyval;
- PrepareSortSupportFromOrderingOp(sortOperator, state->sortKeys);
+ PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
/*
* The "onlyKey" optimization cannot be used with abbreviated keys, since
@@ -1459,8 +1538,8 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
* is only of value to pass-by-value types anyway, whereas abbreviated
* keys are typically only of value to pass-by-reference types.
*/
- if (!state->sortKeys->abbrev_converter)
- state->onlyKey = state->sortKeys;
+ if (!base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
MemoryContextSwitchTo(oldcontext);
@@ -1485,7 +1564,7 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
/* Assert we're called before loading any tuples */
Assert(state->status == TSS_INITIAL && state->memtupcount == 0);
/* Assert we allow bounded sorts */
- Assert(state->sortopt & TUPLESORT_ALLOWBOUNDED);
+ Assert(state->base.sortopt & TUPLESORT_ALLOWBOUNDED);
/* Can't set the bound twice, either */
Assert(!state->bounded);
/* Also, this shouldn't be called in a parallel worker */
@@ -1513,13 +1592,13 @@ tuplesort_set_bound(Tuplesortstate *state, int64 bound)
* optimization. Disable by setting state to be consistent with no
* abbreviation support.
*/
- state->sortKeys->abbrev_converter = NULL;
- if (state->sortKeys->abbrev_full_comparator)
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->base.sortKeys->abbrev_converter = NULL;
+ if (state->base.sortKeys->abbrev_full_comparator)
+ state->base.sortKeys->comparator = state->base.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->base.sortKeys->abbrev_abort = NULL;
+ state->base.sortKeys->abbrev_full_comparator = NULL;
}
/*
@@ -1542,7 +1621,7 @@ static void
tuplesort_free(Tuplesortstate *state)
{
/* context swap probably not needed, but let's be safe */
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
#ifdef TRACE_SORT
long spaceUsed;
@@ -1589,21 +1668,13 @@ tuplesort_free(Tuplesortstate *state)
TRACE_POSTGRESQL_SORT_DONE(state->tapeset != NULL, 0L);
#endif
- /* Free any execution state created for CLUSTER case */
- if (state->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(state->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(state->estate);
- }
-
+ FREESTATE(state);
MemoryContextSwitchTo(oldcontext);
/*
* Free the per-sort memory context, thereby releasing all working memory.
*/
- MemoryContextReset(state->sortcontext);
+ MemoryContextReset(state->base.sortcontext);
}
/*
@@ -1624,7 +1695,7 @@ tuplesort_end(Tuplesortstate *state)
* Free the main memory context, including the Tuplesortstate struct
* itself.
*/
- MemoryContextDelete(state->maincontext);
+ MemoryContextDelete(state->base.maincontext);
}
/*
@@ -1838,7 +1909,9 @@ noalloc:
void
tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) base->arg;
SortTuple stup;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1850,12 +1923,12 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup.datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ tupDesc,
&stup.isnull1);
puttuple_common(state, &stup,
- state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1869,7 +1942,9 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
/* copy the tuple into sort storage */
tup = heap_copytuple(tup);
@@ -1879,16 +1954,16 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
* set up first-column key value, and potentially abbreviate, if it's a
* simple column
*/
- if (state->haveDatum1)
+ if (base->haveDatum1)
{
stup.datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup.isnull1);
}
puttuple_common(state, &stup,
- state->haveDatum1 && state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->haveDatum1 && base->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1904,19 +1979,21 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
SortTuple stup;
IndexTuple tuple;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
- isnull, state->tuplecontext);
+ isnull, base->tuplecontext);
tuple = ((IndexTuple) stup.tuple);
tuple->t_tid = *self;
/* set up first-column key value */
stup.datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup.isnull1);
puttuple_common(state, &stup,
- state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
+ base->sortKeys && base->sortKeys->abbrev_converter && !stup.isnull1);
}
/*
@@ -1927,7 +2004,9 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
void
tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
SortTuple stup;
/*
@@ -1942,7 +2021,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* identical to stup.tuple.
*/
- if (isNull || !state->tuples)
+ if (isNull || !base->tuples)
{
/*
* Set datum1 to zeroed representation for NULLs (to be consistent,
@@ -1955,12 +2034,12 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
else
{
stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
stup.tuple = DatumGetPointer(stup.datum1);
}
puttuple_common(state, &stup,
- state->tuples && !isNull && state->sortKeys->abbrev_converter);
+ base->tuples && !isNull && base->sortKeys->abbrev_converter);
MemoryContextSwitchTo(oldcontext);
}
@@ -1971,7 +2050,7 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
static void
puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
Assert(!LEADER(state));
@@ -1993,8 +2072,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
else if (!consider_abort_common(state))
{
/* Store abbreviated key representation */
- tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
- state->sortKeys);
+ tuple->datum1 = state->base.sortKeys->abbrev_converter(tuple->datum1,
+ state->base.sortKeys);
}
else
{
@@ -2128,7 +2207,7 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
static void
writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
- state->writetup(state, tape, stup);
+ state->base.writetup(state, tape, stup);
if (!state->slabAllocatorUsed && stup->tuple)
{
@@ -2140,9 +2219,9 @@ writetuple(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
static bool
consider_abort_common(Tuplesortstate *state)
{
- Assert(state->sortKeys[0].abbrev_converter != NULL);
- Assert(state->sortKeys[0].abbrev_abort != NULL);
- Assert(state->sortKeys[0].abbrev_full_comparator != NULL);
+ Assert(state->base.sortKeys[0].abbrev_converter != NULL);
+ Assert(state->base.sortKeys[0].abbrev_abort != NULL);
+ Assert(state->base.sortKeys[0].abbrev_full_comparator != NULL);
/*
* Check effectiveness of abbreviation optimization. Consider aborting
@@ -2157,19 +2236,19 @@ consider_abort_common(Tuplesortstate *state)
* Check opclass-supplied abbreviation abort routine. It may indicate
* that abbreviation should not proceed.
*/
- if (!state->sortKeys->abbrev_abort(state->memtupcount,
- state->sortKeys))
+ if (!state->base.sortKeys->abbrev_abort(state->memtupcount,
+ state->base.sortKeys))
return false;
/*
* Finally, restore authoritative comparator, and indicate that
* abbreviation is not in play by setting abbrev_converter to NULL
*/
- state->sortKeys[0].comparator = state->sortKeys[0].abbrev_full_comparator;
- state->sortKeys[0].abbrev_converter = NULL;
+ state->base.sortKeys[0].comparator = state->base.sortKeys[0].abbrev_full_comparator;
+ state->base.sortKeys[0].abbrev_converter = NULL;
/* Not strictly necessary, but be tidy */
- state->sortKeys[0].abbrev_abort = NULL;
- state->sortKeys[0].abbrev_full_comparator = NULL;
+ state->base.sortKeys[0].abbrev_abort = NULL;
+ state->base.sortKeys[0].abbrev_full_comparator = NULL;
/* Give up - expect original pass-by-value representation */
return true;
@@ -2184,7 +2263,7 @@ consider_abort_common(Tuplesortstate *state)
void
tuplesort_performsort(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
#ifdef TRACE_SORT
if (trace_sort)
@@ -2304,7 +2383,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
switch (state->status)
{
case TSS_SORTEDINMEM:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->base.sortopt & TUPLESORT_RANDOMACCESS);
Assert(!state->slabAllocatorUsed);
if (forward)
{
@@ -2348,7 +2427,7 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
break;
case TSS_SORTEDONTAPE:
- Assert(forward || state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(forward || state->base.sortopt & TUPLESORT_RANDOMACCESS);
Assert(state->slabAllocatorUsed);
/*
@@ -2550,7 +2629,8 @@ bool
tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2561,7 +2641,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
if (stup.tuple)
{
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (base->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
if (copy)
@@ -2586,7 +2666,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2606,7 +2687,8 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2636,7 +2718,9 @@ bool
tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
SortTuple stup;
if (!tuplesort_gettuple_common(state, forward, &stup))
@@ -2649,10 +2733,10 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
MemoryContextSwitchTo(oldcontext);
/* Record abbreviated key for caller */
- if (state->sortKeys->abbrev_converter && abbrev)
+ if (base->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (stup.isnull1 || !state->tuples)
+ if (stup.isnull1 || !base->tuples)
{
*val = stup.datum1;
*isNull = stup.isnull1;
@@ -2660,7 +2744,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
*isNull = false;
}
@@ -2713,7 +2797,7 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
* We could probably optimize these cases better, but for now it's
* not worth the trouble.
*/
- oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
while (ntuples-- > 0)
{
SortTuple stup;
@@ -2989,7 +3073,7 @@ mergeruns(Tuplesortstate *state)
Assert(state->status == TSS_BUILDRUNS);
Assert(state->memtupcount == 0);
- if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
+ if (state->base.sortKeys != NULL && state->base.sortKeys->abbrev_converter != NULL)
{
/*
* If there are multiple runs to be merged, when we go to read back
@@ -2997,19 +3081,19 @@ mergeruns(Tuplesortstate *state)
* we don't care to regenerate them. Disable abbreviation from this
* point on.
*/
- state->sortKeys->abbrev_converter = NULL;
- state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;
+ state->base.sortKeys->abbrev_converter = NULL;
+ state->base.sortKeys->comparator = state->base.sortKeys->abbrev_full_comparator;
/* Not strictly necessary, but be tidy */
- state->sortKeys->abbrev_abort = NULL;
- state->sortKeys->abbrev_full_comparator = NULL;
+ state->base.sortKeys->abbrev_abort = NULL;
+ state->base.sortKeys->abbrev_full_comparator = NULL;
}
/*
* Reset tuple memory. We've freed all the tuples that we previously
* allocated. We will use the slab allocator from now on.
*/
- MemoryContextResetOnly(state->tuplecontext);
+ MemoryContextResetOnly(state->base.tuplecontext);
/*
* We no longer need a large memtuples array. (We will allocate a smaller
@@ -3032,7 +3116,7 @@ mergeruns(Tuplesortstate *state)
* From this point on, we no longer use the USEMEM()/LACKMEM() mechanism
* to track memory usage of individual tuples.
*/
- if (state->tuples)
+ if (state->base.tuples)
init_slab_allocator(state, state->nOutputTapes + 1);
else
init_slab_allocator(state, 0);
@@ -3046,7 +3130,7 @@ mergeruns(Tuplesortstate *state)
* number of input tapes will not increase between passes.)
*/
state->memtupsize = state->nOutputTapes;
- state->memtuples = (SortTuple *) MemoryContextAlloc(state->maincontext,
+ state->memtuples = (SortTuple *) MemoryContextAlloc(state->base.maincontext,
state->nOutputTapes * sizeof(SortTuple));
USEMEM(state, GetMemoryChunkSpace(state->memtuples));
@@ -3123,7 +3207,7 @@ mergeruns(Tuplesortstate *state)
* sorted tape, we can stop at this point and do the final merge
* on-the-fly.
*/
- if ((state->sortopt & TUPLESORT_RANDOMACCESS) == 0
+ if ((state->base.sortopt & TUPLESORT_RANDOMACCESS) == 0
&& state->nInputRuns <= state->nInputTapes
&& !WORKER(state))
{
@@ -3349,7 +3433,7 @@ dumptuples(Tuplesortstate *state, bool alltuples)
* AllocSetFree's bucketing by size class might be particularly bad if
* this step wasn't taken.
*/
- MemoryContextReset(state->tuplecontext);
+ MemoryContextReset(state->base.tuplecontext);
markrunend(state->destTape);
@@ -3367,9 +3451,9 @@ dumptuples(Tuplesortstate *state, bool alltuples)
void
tuplesort_rescan(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3400,9 +3484,9 @@ tuplesort_rescan(Tuplesortstate *state)
void
tuplesort_markpos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3431,9 +3515,9 @@ tuplesort_markpos(Tuplesortstate *state)
void
tuplesort_restorepos(Tuplesortstate *state)
{
- MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
+ MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
- Assert(state->sortopt & TUPLESORT_RANDOMACCESS);
+ Assert(state->base.sortopt & TUPLESORT_RANDOMACCESS);
switch (state->status)
{
@@ -3649,9 +3733,9 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
* Do we have the leading column's value or abbreviation in datum1,
* and is there a specialization for its comparator?
*/
- if (state->haveDatum1 && state->sortKeys)
+ if (state->base.haveDatum1 && state->base.sortKeys)
{
- if (state->sortKeys[0].comparator == ssup_datum_unsigned_cmp)
+ if (state->base.sortKeys[0].comparator == ssup_datum_unsigned_cmp)
{
qsort_tuple_unsigned(state->memtuples,
state->memtupcount,
@@ -3659,7 +3743,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#if SIZEOF_DATUM >= 8
- else if (state->sortKeys[0].comparator == ssup_datum_signed_cmp)
+ else if (state->base.sortKeys[0].comparator == ssup_datum_signed_cmp)
{
qsort_tuple_signed(state->memtuples,
state->memtupcount,
@@ -3667,7 +3751,7 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
return;
}
#endif
- else if (state->sortKeys[0].comparator == ssup_datum_int32_cmp)
+ else if (state->base.sortKeys[0].comparator == ssup_datum_int32_cmp)
{
qsort_tuple_int32(state->memtuples,
state->memtupcount,
@@ -3677,16 +3761,16 @@ tuplesort_sort_memtuples(Tuplesortstate *state)
}
/* Can we use the single-key sort function? */
- if (state->onlyKey != NULL)
+ if (state->base.onlyKey != NULL)
{
qsort_ssup(state->memtuples, state->memtupcount,
- state->onlyKey);
+ state->base.onlyKey);
}
else
{
qsort_tuple(state->memtuples,
state->memtupcount,
- state->comparetup,
+ state->base.comparetup,
state);
}
}
@@ -3803,10 +3887,10 @@ tuplesort_heap_replace_top(Tuplesortstate *state, SortTuple *tuple)
static void
reversedirection(Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ SortSupport sortKey = state->base.sortKeys;
int nkey;
- for (nkey = 0; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 0; nkey < state->base.nKeys; nkey++, sortKey++)
{
sortKey->ssup_reverse = !sortKey->ssup_reverse;
sortKey->ssup_nulls_first = !sortKey->ssup_nulls_first;
@@ -3857,7 +3941,7 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
Assert(state->slabFreeHead);
if (tuplen > SLAB_SLOT_SIZE || !state->slabFreeHead)
- return MemoryContextAlloc(state->sortcontext, tuplen);
+ return MemoryContextAlloc(state->base.sortcontext, tuplen);
else
{
buf = state->slabFreeHead;
@@ -3877,6 +3961,7 @@ static void
removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
{
int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
for (i = 0; i < count; i++)
{
@@ -3887,8 +3972,8 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
MINIMAL_TUPLE_OFFSET);
stups[i].datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
&stups[i].isnull1);
}
}
@@ -3896,7 +3981,8 @@ removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys;
HeapTupleData ltup;
HeapTupleData rtup;
TupleDesc tupDesc;
@@ -3921,7 +4007,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = state->tupDesc;
+ tupDesc = (TupleDesc) base->arg;
if (sortKey->abbrev_converter)
{
@@ -3938,7 +4024,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
}
sortKey++;
- for (nkey = 1; nkey < state->nKeys; nkey++, sortKey++)
+ for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
{
attno = sortKey->ssup_attno;
@@ -3958,6 +4044,7 @@ comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
MinimalTuple tuple = (MinimalTuple) stup->tuple;
/* the part of the MinimalTuple we'll write: */
@@ -3969,8 +4056,7 @@ writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -3982,21 +4068,21 @@ readtup_heap(Tuplesortstate *state, SortTuple *stup,
unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
HeapTupleData htup;
/* read in the tuple proper */
tuple->t_len = tuplen;
LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
stup->datum1 = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
&stup->isnull1);
}
@@ -4009,6 +4095,8 @@ static void
removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
{
int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
for (i = 0; i < count; i++)
{
@@ -4016,8 +4104,8 @@ removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
tup = (HeapTuple) stups[i].tuple;
stups[i].datum1 = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stups[i].isnull1);
}
}
@@ -4026,7 +4114,9 @@ static int
comparetup_cluster(const SortTuple *a, const SortTuple *b,
Tuplesortstate *state)
{
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
HeapTuple ltup;
HeapTuple rtup;
TupleDesc tupDesc;
@@ -4040,10 +4130,10 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
/* Be prepared to compare additional sort keys */
ltup = (HeapTuple) a->tuple;
rtup = (HeapTuple) b->tuple;
- tupDesc = state->tupDesc;
+ tupDesc = arg->tupDesc;
/* Compare the leading sort key, if it's simple */
- if (state->haveDatum1)
+ if (base->haveDatum1)
{
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
@@ -4053,7 +4143,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
if (sortKey->abbrev_converter)
{
- AttrNumber leading = state->indexInfo->ii_IndexAttrNumbers[0];
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
@@ -4062,7 +4152,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
datum2, isnull2,
sortKey);
}
- if (compare != 0 || state->nKeys == 1)
+ if (compare != 0 || base->nKeys == 1)
return compare;
/* Compare additional columns the hard way */
sortKey++;
@@ -4074,13 +4164,13 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
nkey = 0;
}
- if (state->indexInfo->ii_Expressions == NULL)
+ if (arg->indexInfo->ii_Expressions == NULL)
{
/* If not expression index, just compare the proper heap attrs */
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < base->nKeys; nkey++, sortKey++)
{
- AttrNumber attno = state->indexInfo->ii_IndexAttrNumbers[nkey];
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
@@ -4107,19 +4197,19 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
TupleTableSlot *ecxt_scantuple;
/* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(state->estate);
+ ResetPerTupleExprContext(arg->estate);
- ecxt_scantuple = GetPerTupleExprContext(state->estate)->ecxt_scantuple;
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
l_index_values, l_index_isnull);
ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(state->indexInfo, ecxt_scantuple, state->estate,
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
r_index_values, r_index_isnull);
- for (; nkey < state->nKeys; nkey++, sortKey++)
+ for (; nkey < base->nKeys; nkey++, sortKey++)
{
compare = ApplySortComparator(l_index_values[nkey],
l_index_isnull[nkey],
@@ -4137,6 +4227,7 @@ comparetup_cluster(const SortTuple *a, const SortTuple *b,
static void
writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
HeapTuple tuple = (HeapTuple) stup->tuple;
unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
@@ -4144,8 +4235,7 @@ writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
}
@@ -4153,6 +4243,8 @@ static void
readtup_cluster(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int tuplen)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
HeapTuple tuple = (HeapTuple) readtup_alloc(state,
t_len + HEAPTUPLESIZE);
@@ -4165,18 +4257,33 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
tuple->t_tableOid = InvalidOid;
/* Read in the tuple body */
LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value, if it's a simple column */
- if (state->haveDatum1)
+ if (base->haveDatum1)
stup->datum1 = heap_getattr(tuple,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
&stup->isnull1);
}
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
/*
* Routines specialized for IndexTuple case
*
@@ -4188,6 +4295,8 @@ readtup_cluster(Tuplesortstate *state, SortTuple *stup,
static void
removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
int i;
for (i = 0; i < count; i++)
@@ -4197,7 +4306,7 @@ removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
tuple = stups[i].tuple;
stups[i].datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stups[i].isnull1);
}
}
@@ -4211,7 +4320,9 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* is also special handling for enforcing uniqueness, and special
* treatment for equal keys at the end.
*/
- SortSupport sortKey = state->sortKeys;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
IndexTuple tuple1;
IndexTuple tuple2;
int keysz;
@@ -4235,8 +4346,8 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
/* Compare additional sort keys */
tuple1 = (IndexTuple) a->tuple;
tuple2 = (IndexTuple) b->tuple;
- keysz = state->nKeys;
- tupDes = RelationGetDescr(state->indexRel);
+ keysz = base->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
if (sortKey->abbrev_converter)
{
@@ -4281,7 +4392,7 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
* sort algorithm wouldn't have checked whether one must appear before the
* other.
*/
- if (state->enforceUnique && !(!state->uniqueNullsNotDistinct && equal_hasnull))
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
{
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
@@ -4297,16 +4408,16 @@ comparetup_index_btree(const SortTuple *a, const SortTuple *b,
index_deform_tuple(tuple1, tupDes, values, isnull);
- key_desc = BuildIndexValueDescription(state->indexRel, values, isnull);
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
ereport(ERROR,
(errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(state->indexRel)),
+ RelationGetRelationName(arg->index.indexRel)),
key_desc ? errdetail("Key %s is duplicated.", key_desc) :
errdetail("Duplicate keys exist."),
- errtableconstraint(state->heapRel,
- RelationGetRelationName(state->indexRel))));
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
}
/*
@@ -4344,6 +4455,8 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
Bucket bucket2;
IndexTuple tuple1;
IndexTuple tuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
/*
* Fetch hash keys and mask off bits we don't want to sort by. We know
@@ -4351,12 +4464,12 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
*/
Assert(!a->isnull1);
bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
Assert(!b->isnull1);
bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- state->max_buckets, state->high_mask,
- state->low_mask);
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
if (bucket1 > bucket2)
return 1;
else if (bucket1 < bucket2)
@@ -4394,14 +4507,14 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,
static void
writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
IndexTuple tuple = (IndexTuple) stup->tuple;
unsigned int tuplen;
tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
}
@@ -4409,18 +4522,19 @@ static void
readtup_index(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
unsigned int tuplen = len - sizeof(unsigned int);
IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
LogicalTapeReadExact(tape, tuple, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
stup->tuple = (void *) tuple;
/* set up first-column key value */
stup->datum1 = index_getattr(tuple,
1,
- RelationGetDescr(state->indexRel),
+ RelationGetDescr(arg->indexRel),
&stup->isnull1);
}
@@ -4440,20 +4554,21 @@ removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
static int
comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
int compare;
compare = ApplySortComparator(a->datum1, a->isnull1,
b->datum1, b->isnull1,
- state->sortKeys);
+ base->sortKeys);
if (compare != 0)
return compare;
/* if we have abbreviations, then "tuple" has the original value */
- if (state->sortKeys->abbrev_converter)
+ if (base->sortKeys->abbrev_converter)
compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
PointerGetDatum(b->tuple), b->isnull1,
- state->sortKeys);
+ base->sortKeys);
return compare;
}
@@ -4461,6 +4576,8 @@ comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
static void
writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
void *waddr;
unsigned int tuplen;
unsigned int writtenlen;
@@ -4470,7 +4587,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
waddr = NULL;
tuplen = 0;
}
- else if (!state->tuples)
+ else if (!base->tuples)
{
waddr = &stup->datum1;
tuplen = sizeof(Datum);
@@ -4478,7 +4595,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
else
{
waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, state->datumTypeLen);
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
Assert(tuplen != 0);
}
@@ -4486,8 +4603,7 @@ writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
LogicalTapeWrite(tape, waddr, tuplen);
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
}
@@ -4495,6 +4611,7 @@ static void
readtup_datum(Tuplesortstate *state, SortTuple *stup,
LogicalTape *tape, unsigned int len)
{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
unsigned int tuplen = len - sizeof(unsigned int);
if (tuplen == 0)
@@ -4504,7 +4621,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->isnull1 = true;
stup->tuple = NULL;
}
- else if (!state->tuples)
+ else if (!base->tuples)
{
Assert(tuplen == sizeof(Datum));
LogicalTapeReadExact(tape, &stup->datum1, tuplen);
@@ -4521,8 +4638,7 @@ readtup_datum(Tuplesortstate *state, SortTuple *stup,
stup->tuple = raddr;
}
- if (state->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length
- * word? */
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 34a76ceb60f..1f88be06aa1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2833,8 +2833,14 @@ TupleHashTable
TupleQueueReader
TupleTableSlot
TupleTableSlotOps
+TuplesortClusterArg
+TuplesortDatumArg
+TuplesortIndexArg
+TuplesortIndexBTreeArg
+TuplesortIndexHashArg
TuplesortInstrumentation
TuplesortMethod
+TuplesortPublic
TuplesortSpaceType
Tuplesortstate
Tuplestorestate
--
2.24.3 (Apple Git-128)
0003-Put-abbreviation-logic-into-puttuple_common-v4.patch
From ba69019340ad853f7d0ab4cb696ec32d5f361733 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 21 Jun 2022 14:13:56 +0300
Subject: [PATCH 3/6] Put abbreviation logic into puttuple_common()
Abbreviation code is very similar across the tuplesort_put*() functions. This
commit unifies that code and puts it into puttuple_common(). The
tuplesort_put*() functions differ only in the abbreviation condition, so that
condition is now passed as an argument to puttuple_common().
---
src/backend/utils/sort/tuplesort.c | 222 ++++++++---------------------
1 file changed, 56 insertions(+), 166 deletions(-)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8b6b2bc1d38..828efe701e5 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -616,7 +616,8 @@ static Tuplesortstate *tuplesort_begin_common(int workMem,
SortCoordinate coordinate,
int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple);
+static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
+ bool useAbbrev);
static bool consider_abort_common(Tuplesortstate *state);
static void inittapes(Tuplesortstate *state, bool mergeruns);
static void inittapestate(Tuplesortstate *state, int maxTapes);
@@ -1841,7 +1842,6 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
SortTuple stup;
- Datum original;
MinimalTuple tuple;
HeapTupleData htup;
@@ -1852,51 +1852,15 @@ tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
/* set up first-column key value */
htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- original = heap_getattr(&htup,
- state->sortKeys[0].ssup_attno,
- state->tupDesc,
- &stup.isnull1);
+ stup.datum1 = heap_getattr(&htup,
+ state->sortKeys[0].ssup_attno,
+ state->tupDesc,
+ &stup.isnull1);
MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
-
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1910,7 +1874,6 @@ void
tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
{
SortTuple stup;
- Datum original;
MemoryContext oldcontext = MemoryContextSwitchTo(state->tuplecontext);
/* copy the tuple into sort storage */
@@ -1926,51 +1889,14 @@ tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
*/
if (state->haveDatum1)
{
- original = heap_getattr(tup,
- state->indexInfo->ii_IndexAttrNumbers[0],
- state->tupDesc,
- &stup.isnull1);
-
- if (!state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there
- * is a converter it won't expect NULL values, and cost model is
- * not required to account for NULL, so in that case we avoid
- * calling converter and just set datum1 to zeroed representation
- * (to be consistent, and to support cheap inequality tests for
- * NULL abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
+ stup.datum1 = heap_getattr(tup,
+ state->indexInfo->ii_IndexAttrNumbers[0],
+ state->tupDesc,
+ &stup.isnull1);
}
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->haveDatum1 && state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -1986,7 +1912,6 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
{
MemoryContext oldcontext;
SortTuple stup;
- Datum original;
IndexTuple tuple;
stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
@@ -1995,51 +1920,15 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
tuple->t_tid = *self;
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
/* set up first-column key value */
- original = index_getattr(tuple,
- 1,
- RelationGetDescr(state->indexRel),
- &stup.isnull1);
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(state->indexRel),
+ &stup.isnull1);
oldcontext = MemoryContextSwitchTo(state->sortcontext);
- if (!state->sortKeys || !state->sortKeys->abbrev_converter || stup.isnull1)
- {
- /*
- * Store ordinary Datum representation, or NULL value. If there is a
- * converter it won't expect NULL values, and cost model is not
- * required to account for NULL, so in that case we avoid calling
- * converter and just set datum1 to zeroed representation (to be
- * consistent, and to support cheap inequality tests for NULL
- * abbreviated keys).
- */
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
-
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->sortKeys && state->sortKeys->abbrev_converter && !stup.isnull1);
MemoryContextSwitchTo(oldcontext);
}
@@ -2080,45 +1969,15 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
}
else
{
- Datum original = datumCopy(val, false, state->datumTypeLen);
-
stup.isnull1 = false;
- stup.tuple = DatumGetPointer(original);
+ stup.datum1 = datumCopy(val, false, state->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
USEMEM(state, GetMemoryChunkSpace(stup.tuple));
MemoryContextSwitchTo(state->sortcontext);
-
- if (!state->sortKeys->abbrev_converter)
- {
- stup.datum1 = original;
- }
- else if (!consider_abort_common(state))
- {
- /* Store abbreviated key representation */
- stup.datum1 = state->sortKeys->abbrev_converter(original,
- state->sortKeys);
- }
- else
- {
- /* Abort abbreviation */
- int i;
-
- stup.datum1 = original;
-
- /*
- * Set state to be consistent with never trying abbreviation.
- *
- * Alter datum1 representation in already-copied tuples, so as to
- * ensure a consistent representation (current tuple was just
- * handled). It does not matter if some dumped tuples are already
- * sorted on tape, since serialized tuples lack abbreviated keys
- * (TSS_BUILDRUNS state prevents control reaching here in any
- * case).
- */
- REMOVEABBREV(state, state->memtuples, state->memtupcount);
- }
}
- puttuple_common(state, &stup);
+ puttuple_common(state, &stup,
+ state->tuples && !isNull && state->sortKeys->abbrev_converter);
MemoryContextSwitchTo(oldcontext);
}
@@ -2127,10 +1986,41 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
* Shared code for tuple and datum cases.
*/
static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple)
+puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
Assert(!LEADER(state));
+ if (!useAbbrev)
+ {
+ /*
+ * Leave ordinary Datum representation, or NULL value. If there is a
+ * converter it won't expect NULL values, and cost model is not
+ * required to account for NULL, so in that case we avoid calling
+ * converter and just set datum1 to zeroed representation (to be
+ * consistent, and to support cheap inequality tests for NULL
+ * abbreviated keys).
+ */
+ }
+ else if (!consider_abort_common(state))
+ {
+ /* Store abbreviated key representation */
+ tuple->datum1 = state->sortKeys->abbrev_converter(tuple->datum1,
+ state->sortKeys);
+ }
+ else
+ {
+ /*
+ * Set state to be consistent with never trying abbreviation.
+ *
+ * Alter datum1 representation in already-copied tuples, so as to
+ * ensure a consistent representation (current tuple was just
+ * handled). It does not matter if some dumped tuples are already
+ * sorted on tape, since serialized tuples lack abbreviated keys
+ * (TSS_BUILDRUNS state prevents control reaching here in any case).
+ */
+ REMOVEABBREV(state, state->memtuples, state->memtupcount);
+ }
+
switch (state->status)
{
case TSS_INITIAL:
--
2.24.3 (Apple Git-128)
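To illustrate the intended usage, here is a rough sketch of a sort variant
defined outside of core on top of the interface that 0005/0006 expose. It
sorts bare pass-by-value Datums by a single key and never uses abbreviation
or NULL keys. The my_* names are hypothetical; the sketch assumes that
SortTuple, TuplesortPublic, TuplesortstateGetPublic(), LogicalTapeReadExact()
and the tuplesort_*_common() entry points all become visible through
tuplesort.h, as the last patch intends.

#include "postgres.h"

#include "utils/logtape.h"
#include "utils/sortsupport.h"
#include "utils/tuplesort.h"

/* No-op: this sketch never sets up abbreviated keys. */
static void
my_removeabbrev(Tuplesortstate *state, SortTuple *stups, int count)
{
}

/* Compare by the single pass-by-value key stored in datum1. */
static int
my_comparetup(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);

	return ApplySortComparator(a->datum1, a->isnull1,
							   b->datum1, b->isnull1,
							   base->sortKeys);
}

/* On-tape format: leading length word, the Datum, optional trailing length. */
static void
my_writetup(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	unsigned int writtenlen = sizeof(Datum) + sizeof(unsigned int);

	LogicalTapeWrite(tape, &writtenlen, sizeof(writtenlen));
	LogicalTapeWrite(tape, &stup->datum1, sizeof(Datum));
	if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
		LogicalTapeWrite(tape, &writtenlen, sizeof(writtenlen));
}

static void
my_readtup(Tuplesortstate *state, SortTuple *stup,
		   LogicalTape *tape, unsigned int len)
{
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	unsigned int tuplen = len - sizeof(unsigned int);

	Assert(tuplen == sizeof(Datum));
	LogicalTapeReadExact(tape, &stup->datum1, tuplen);
	stup->isnull1 = false;
	stup->tuple = NULL;			/* key is pass-by-value, no extra storage */
	if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
		LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
}

/* Begin a sort of bare pass-by-value Datums ordered by sortOperator. */
Tuplesortstate *
my_tuplesort_begin(Oid sortOperator, Oid collation, int workMem, int sortopt)
{
	Tuplesortstate *state = tuplesort_begin_common(workMem, NULL, sortopt);
	TuplesortPublic *base = TuplesortstateGetPublic(state);
	MemoryContext oldcontext = MemoryContextSwitchTo(base->maincontext);

	base->nKeys = 1;
	base->removeabbrev = my_removeabbrev;
	base->comparetup = my_comparetup;
	base->writetup = my_writetup;
	base->readtup = my_readtup;
	base->haveDatum1 = true;
	base->tuples = false;		/* SortTuple.tuple is never set */

	/* Prepare SortSupport data for the single key */
	base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
	base->sortKeys->ssup_cxt = CurrentMemoryContext;
	base->sortKeys->ssup_collation = collation;
	PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
	base->onlyKey = base->sortKeys; /* no abbreviation, so this is safe */

	MemoryContextSwitchTo(oldcontext);
	return state;
}

/* Accept one non-NULL pass-by-value key while collecting input data. */
void
my_tuplesort_put(Tuplesortstate *state, Datum key)
{
	SortTuple	stup;

	stup.datum1 = key;
	stup.isnull1 = false;
	stup.tuple = NULL;

	tuplesort_puttuple_common(state, &stup, false); /* no abbreviation */
}

Reading the sorted keys back would go through tuplesort_gettuple_common() in
the same manner as tuplesort_getdatum() does in tuplesortvariants.c.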
0006-Split-tuplesortvariants.c-from-tuplesort.c-v4.patch
From b87ae062a74469dfef0579f9b4ad19ed78a9f8d1 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Wed, 22 Jun 2022 21:48:05 +0300
Subject: [PATCH 6/6] Split tuplesortvariants.c from tuplesort.c
This commit moves the implementation of the tuple sort variants into the
separate file tuplesortvariants.c. That gives a better separation of the code
and demonstrates that a tuple sort variant can be defined outside of
tuplesort.c.
---
src/backend/utils/sort/Makefile | 1 +
src/backend/utils/sort/tuplesort.c | 1729 +-------------------
src/backend/utils/sort/tuplesortvariants.c | 1577 ++++++++++++++++++
src/include/utils/tuplesort.h | 222 ++-
4 files changed, 1784 insertions(+), 1745 deletions(-)
create mode 100644 src/backend/utils/sort/tuplesortvariants.c
diff --git a/src/backend/utils/sort/Makefile b/src/backend/utils/sort/Makefile
index 2c31fd453d6..5bfca3040aa 100644
--- a/src/backend/utils/sort/Makefile
+++ b/src/backend/utils/sort/Makefile
@@ -20,6 +20,7 @@ OBJS = \
sharedtuplestore.o \
sortsupport.o \
tuplesort.o \
+ tuplesortvariants.o \
tuplestore.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 0a630956dc1..d90a1aebdf0 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -3,9 +3,10 @@
* tuplesort.c
* Generalized tuple sorting routines.
*
- * This module handles sorting of heap tuples, index tuples, or single
- * Datums (and could easily support other kinds of sortable objects,
- * if necessary). It works efficiently for both small and large amounts
+ * This module provides a generalized facility for tuple sorting, which can be
+ * applied to different kinds of sortable objects. Implementation of
+ * the particular sorting variants is given in tuplesortvariants.c.
+ * This module works efficiently for both small and large amounts
* of data. Small amounts are sorted in-memory using qsort(). Large
* amounts are sorted using temporary files and a standard external sort
* algorithm.
@@ -100,36 +101,17 @@
#include <limits.h>
-#include "access/hash.h"
-#include "access/htup_details.h"
-#include "access/nbtree.h"
-#include "catalog/index.h"
#include "catalog/pg_am.h"
#include "commands/tablespace.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "pg_trace.h"
-#include "utils/datum.h"
-#include "utils/logtape.h"
-#include "utils/lsyscache.h"
+#include "storage/shmem.h"
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
#include "utils/rel.h"
-#include "utils/sortsupport.h"
#include "utils/tuplesort.h"
-
-/* sort-type codes for sort__start probes */
-#define HEAP_SORT 0
-#define INDEX_SORT 1
-#define DATUM_SORT 2
-#define CLUSTER_SORT 3
-
-/* Sort parallel code from state for sort__start probes */
-#define PARALLEL_SORT(coordinate) (coordinate == NULL || \
- (coordinate)->sharedsort == NULL ? 0 : \
- (coordinate)->isWorker ? 1 : 2)
-
/*
* Initial size of memtuples array. We're trying to select this size so that
* array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
@@ -150,43 +132,6 @@ bool optimize_bounded_sort = true;
#endif
-/*
- * The objects we actually sort are SortTuple structs. These contain
- * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
- * which is a separate palloc chunk --- we assume it is just one chunk and
- * can be freed by a simple pfree() (except during merge, when we use a
- * simple slab allocator). SortTuples also contain the tuple's first key
- * column in Datum/nullflag format, and a source/input tape number that
- * tracks which tape each heap element/slot belongs to during merging.
- *
- * Storing the first key column lets us save heap_getattr or index_getattr
- * calls during tuple comparisons. We could extract and save all the key
- * columns not just the first, but this would increase code complexity and
- * overhead, and wouldn't actually save any comparison cycles in the common
- * case where the first key determines the comparison result. Note that
- * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
- *
- * There is one special case: when the sort support infrastructure provides an
- * "abbreviated key" representation, where the key is (typically) a pass by
- * value proxy for a pass by reference type. In this case, the abbreviated key
- * is stored in datum1 in place of the actual first key column.
- *
- * When sorting single Datums, the data value is represented directly by
- * datum1/isnull1 for pass by value types (or null values). If the datatype is
- * pass-by-reference and isnull1 is false, then "tuple" points to a separately
- * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
- * either the same pointer as "tuple", or is an abbreviated key value as
- * described above. Accordingly, "tuple" is always used in preference to
- * datum1 as the authoritative value for pass-by-reference cases.
- */
-typedef struct
-{
- void *tuple; /* the tuple itself */
- Datum datum1; /* value of first key column */
- bool isnull1; /* is first key column NULL? */
- int srctape; /* source tape number */
-} SortTuple;
-
/*
* During merge, we use a pre-allocated set of fixed-size slots to hold
* tuples. To avoid palloc/pfree overhead.
@@ -237,155 +182,6 @@ typedef enum
#define TAPE_BUFFER_OVERHEAD BLCKSZ
#define MERGE_BUFFER_SIZE (BLCKSZ * 32)
-typedef struct TuplesortPublic TuplesortPublic;
-
-typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-
-/*
- * The public part of a Tuple sort operation state. This data structure
- * containsthe definition of sort-variant-specific interface methods and
- * the part of Tuple sort operation state required by their implementations.
- */
-struct TuplesortPublic
-{
- /*
- * These function pointers decouple the routines that must know what kind
- * of tuple we are sorting from the routines that don't need to know it.
- * They are set up by the tuplesort_begin_xxx routines.
- *
- * Function to compare two tuples; result is per qsort() convention, ie:
- * <0, 0, >0 according as a<b, a=b, a>b. The API must match
- * qsort_arg_comparator.
- */
- SortTupleComparator comparetup;
-
- /*
- * Alter datum1 representation in the SortTuple's array back from the
- * abbreviated key to the first column value.
- */
- void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
- int count);
-
- /*
- * Function to write a stored tuple onto tape. The representation of the
- * tuple on tape need not be the same as it is in memory.
- */
- void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-
- /*
- * Function to read a stored tuple from tape back into memory. 'len' is
- * the already-read length of the stored tuple. The tuple is allocated
- * from the slab memory arena, or is palloc'd, see readtup_alloc().
- */
- void (*readtup) (Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-
- /*
- * Function to do some specific release of resources for the sort variant.
- * In particular, this function should free everything stored in the "arg"
- * field, which wouldn't be cleared on reset of the Tuple sort memory
- * contextes. This can be NULL if nothing specific needs to be done.
- */
- void (*freestate) (Tuplesortstate *state);
-
- /*
- * The subsequent fields are used in the implementations of the functions
- * above.
- */
- MemoryContext maincontext; /* memory context for tuple sort metadata that
- * persists across multiple batches */
- MemoryContext sortcontext; /* memory context holding most sort data */
- MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
-
- /*
- * Whether SortTuple's datum1 and isnull1 members are maintained by the
- * above routines. If not, some sort specializations are disabled.
- */
- bool haveDatum1;
-
- /*
- * The sortKeys variable is used by every case other than the hash index
- * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
- * MinimalTuple and CLUSTER routines, though.
- */
- int nKeys; /* number of columns in sort key */
- SortSupport sortKeys; /* array of length nKeys */
-
- /*
- * This variable is shared by the single-key MinimalTuple case and the
- * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
- * presence of a value in this field is also checked by various sort
- * specialization functions as an optimization when comparing the leading
- * key in a tiebreak situation to determine if there are any subsequent
- * keys to sort on.
- */
- SortSupport onlyKey;
-
- int sortopt; /* Bitmask of flags used to setup sort */
-
- bool tuples; /* Can SortTuple.tuple ever be set? */
-
- void *arg; /* Specific information for the sort variant */
-};
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the CLUSTER case. Set by
- * the tuplesort_begin_cluster.
- */
-typedef struct
-{
- TupleDesc tupDesc;
-
- IndexInfo *indexInfo; /* info about index being used for reference */
- EState *estate; /* for evaluating index expressions */
-} TuplesortClusterArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the IndexTuple case.
- * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
- */
-typedef struct
-{
- Relation heapRel; /* table the index is being built on */
- Relation indexRel; /* index being built */
-} TuplesortIndexArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_btree subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- bool enforceUnique; /* complain if we find duplicate tuples */
- bool uniqueNullsNotDistinct; /* unique constraint null treatment */
-} TuplesortIndexBTreeArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_hash subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- uint32 high_mask; /* masks for sortable part of hash code */
- uint32 low_mask;
- uint32 max_buckets;
-} TuplesortIndexHashArg;
-
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the Datum case.
- * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
- */
-typedef struct
-{
- /* the datatype oid of Datum's to be sorted */
- Oid datumType;
- /* we need typelen in order to know how to copy the Datums. */
- int datumTypeLen;
-} TuplesortDatumArg;
/*
* Private state of a Tuplesort operation.
@@ -597,8 +393,6 @@ struct Sharedsort
pfree(buf); \
} while(0)
-#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
-
#define REMOVEABBREV(state,stup,count) ((*(state)->base.removeabbrev) (state, stup, count))
#define COMPARETUP(state,a,b) ((*(state)->base.comparetup) (a, b, state))
#define WRITETUP(state,tape,stup) (writetuple(state, tape, stup))
@@ -657,20 +451,8 @@ struct Sharedsort
* begins).
*/
-/* When using this macro, beware of double evaluation of len */
-#define LogicalTapeReadExact(tape, ptr, len) \
- do { \
- if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
- elog(ERROR, "unexpected end of data"); \
- } while(0)
-
-static Tuplesortstate *tuplesort_begin_common(int workMem,
- SortCoordinate coordinate,
- int sortopt);
static void tuplesort_begin_batch(Tuplesortstate *state);
-static void puttuple_common(Tuplesortstate *state, SortTuple *tuple,
- bool useAbbrev);
static void writetuple(Tuplesortstate *state, LogicalTape *tape,
SortTuple *stup);
static bool consider_abort_common(Tuplesortstate *state);
@@ -692,42 +474,6 @@ static void tuplesort_heap_delete_top(Tuplesortstate *state);
static void reversedirection(Tuplesortstate *state);
static unsigned int getlen(LogicalTape *tape, bool eofOK);
static void markrunend(LogicalTape *tape);
-static void *readtup_alloc(Tuplesortstate *state, Size tuplen);
-static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
- int count);
-static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
- int count);
-static int comparetup_heap(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static int comparetup_datum(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state);
-static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
- SortTuple *stup);
-static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len);
-static void freestate_cluster(Tuplesortstate *state);
static int worker_get_identifier(Tuplesortstate *state);
static void worker_freeze_result_tape(Tuplesortstate *state);
static void worker_nomergeruns(Tuplesortstate *state);
@@ -898,7 +644,7 @@ qsort_tuple_int32_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
* sort options. See TUPLESORT_* definitions in tuplesort.h
*/
-static Tuplesortstate *
+Tuplesortstate *
tuplesort_begin_common(int workMem, SortCoordinate coordinate, int sortopt)
{
Tuplesortstate *state;
@@ -1084,468 +830,6 @@ tuplesort_begin_batch(Tuplesortstate *state)
MemoryContextSwitchTo(oldcontext);
}
-Tuplesortstate *
-tuplesort_begin_heap(TupleDesc tupDesc,
- int nkeys, AttrNumber *attNums,
- Oid *sortOperators, Oid *sortCollations,
- bool *nullsFirstFlags,
- int workMem, SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
-
- AssertArg(nkeys > 0);
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = nkeys;
-
- TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
- false, /* no unique check */
- nkeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_heap;
- base->comparetup = comparetup_heap;
- base->writetup = writetup_heap;
- base->readtup = readtup_heap;
- base->haveDatum1 = true;
- base->arg = tupDesc; /* assume we need not copy tupDesc */
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
-
- for (i = 0; i < nkeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
-
- AssertArg(attNums[i] != 0);
- AssertArg(sortOperators[i] != 0);
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = sortCollations[i];
- sortKey->ssup_nulls_first = nullsFirstFlags[i];
- sortKey->ssup_attno = attNums[i];
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
- }
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (nkeys == 1 && !base->sortKeys->abbrev_converter)
- base->onlyKey = base->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_cluster(TupleDesc tupDesc,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- MemoryContext oldcontext;
- TuplesortClusterArg *arg;
- int i;
-
- Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
- RelationGetNumberOfAttributes(indexRel),
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
- false, /* no unique check */
- base->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_cluster;
- base->comparetup = comparetup_cluster;
- base->writetup = writetup_cluster;
- base->readtup = readtup_cluster;
- base->freestate = freestate_cluster;
- base->arg = arg;
-
- arg->indexInfo = BuildIndexInfo(indexRel);
-
- /*
- * If we don't have a simple leading attribute, we don't currently
- * initialize datum1, so disable optimizations that require it.
- */
- if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
- base->haveDatum1 = false;
- else
- base->haveDatum1 = true;
-
- arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- if (arg->indexInfo->ii_Expressions != NULL)
- {
- TupleTableSlot *slot;
- ExprContext *econtext;
-
- /*
- * We will need to use FormIndexDatum to evaluate the index
- * expressions. To do that, we need an EState, as well as a
- * TupleTableSlot to put the table tuples into. The econtext's
- * scantuple has to point to that slot, too.
- */
- arg->estate = CreateExecutorState();
- slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
- econtext = GetPerTupleExprContext(arg->estate);
- econtext->ecxt_scantuple = slot;
- }
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_btree(Relation heapRel,
- Relation indexRel,
- bool enforceUnique,
- bool uniqueNullsNotDistinct,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- BTScanInsert indexScanKey;
- TuplesortIndexBTreeArg *arg;
- MemoryContext oldcontext;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
- enforceUnique ? 't' : 'f',
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
- enforceUnique,
- base->nKeys,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_btree;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = enforceUnique;
- arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
-
- indexScanKey = _bt_mkscankey(indexRel, NULL);
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
- ScanKey scanKey = indexScanKey->scankeys + i;
- int16 strategy;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = scanKey->sk_collation;
- sortKey->ssup_nulls_first =
- (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
- sortKey->ssup_attno = scanKey->sk_attno;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
- BTGreaterStrategyNumber : BTLessStrategyNumber;
-
- PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
- }
-
- pfree(indexScanKey);
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_hash(Relation heapRel,
- Relation indexRel,
- uint32 high_mask,
- uint32 low_mask,
- uint32 max_buckets,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexHashArg *arg;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
- "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
- high_mask,
- low_mask,
- max_buckets,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = 1; /* Only one sort column, the hash code */
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_hash;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- arg->high_mask = high_mask;
- arg->low_mask = low_mask;
- arg->max_buckets = max_buckets;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_index_gist(Relation heapRel,
- Relation indexRel,
- int workMem,
- SortCoordinate coordinate,
- int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexBTreeArg *arg;
- int i;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin index sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
-
- base->removeabbrev = removeabbrev_index;
- base->comparetup = comparetup_index_btree;
- base->writetup = writetup_index;
- base->readtup = readtup_index;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
- arg->enforceUnique = false;
- arg->uniqueNullsNotDistinct = false;
-
- /* Prepare SortSupport data for each column */
- base->sortKeys = (SortSupport) palloc0(base->nKeys *
- sizeof(SortSupportData));
-
- for (i = 0; i < base->nKeys; i++)
- {
- SortSupport sortKey = base->sortKeys + i;
-
- sortKey->ssup_cxt = CurrentMemoryContext;
- sortKey->ssup_collation = indexRel->rd_indcollation[i];
- sortKey->ssup_nulls_first = false;
- sortKey->ssup_attno = i + 1;
- /* Convey if abbreviation optimization is applicable in principle */
- sortKey->abbreviate = (i == 0 && base->haveDatum1);
-
- AssertState(sortKey->ssup_attno != 0);
-
- /* Look for a sort support function */
- PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
- }
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
-Tuplesortstate *
-tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
- bool nullsFirstFlag, int workMem,
- SortCoordinate coordinate, int sortopt)
-{
- Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
- sortopt);
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg;
- MemoryContext oldcontext;
- int16 typlen;
- bool typbyval;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
-
-#ifdef TRACE_SORT
- if (trace_sort)
- elog(LOG,
- "begin datum sort: workMem = %d, randomAccess = %c",
- workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
-#endif
-
- base->nKeys = 1; /* always a one-column sort */
-
- TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
- false, /* no unique check */
- 1,
- workMem,
- sortopt & TUPLESORT_RANDOMACCESS,
- PARALLEL_SORT(coordinate));
-
- base->removeabbrev = removeabbrev_datum;
- base->comparetup = comparetup_datum;
- base->writetup = writetup_datum;
- base->readtup = readtup_datum;
- state->abbrevNext = 10;
- base->haveDatum1 = true;
- base->arg = arg;
-
- arg->datumType = datumType;
-
- /* lookup necessary attributes of the datum type */
- get_typlenbyval(datumType, &typlen, &typbyval);
- arg->datumTypeLen = typlen;
- base->tuples = !typbyval;
-
- /* Prepare SortSupport data */
- base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
-
- base->sortKeys->ssup_cxt = CurrentMemoryContext;
- base->sortKeys->ssup_collation = sortCollation;
- base->sortKeys->ssup_nulls_first = nullsFirstFlag;
-
- /*
- * Abbreviation is possible here only for by-reference types. In theory,
- * a pass-by-value datatype could have an abbreviated form that is cheaper
- * to compare. In a tuple sort, we could support that, because we can
- * always extract the original datum from the tuple as needed. Here, we
- * can't, because a datum sort only stores a single copy of the datum; the
- * "tuple" field of each SortTuple is NULL.
- */
- base->sortKeys->abbreviate = !typbyval;
-
- PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
-
- /*
- * The "onlyKey" optimization cannot be used with abbreviated keys, since
- * tie-breaker comparisons may be required. Typically, the optimization
- * is only of value to pass-by-value types anyway, whereas abbreviated
- * keys are typically only of value to pass-by-reference types.
- */
- if (!base->sortKeys->abbrev_converter)
- base->onlyKey = base->sortKeys;
-
- MemoryContextSwitchTo(oldcontext);
-
- return state;
-}
-
/*
* tuplesort_set_bound
*
@@ -1901,154 +1185,11 @@ noalloc:
return false;
}
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TupleDesc tupDesc = (TupleDesc) base->arg;
- SortTuple stup;
- MinimalTuple tuple;
- HeapTupleData htup;
-
- /* copy the tuple into sort storage */
- tuple = ExecCopySlotMinimalTuple(slot);
- stup.tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup.datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- tupDesc,
- &stup.isnull1);
-
- puttuple_common(state, &stup,
- base->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Accept one tuple while collecting input data for sort.
- *
- * Note that the input data is always copied; the caller need not save it.
- */
-void
-tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
-{
- SortTuple stup;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- /* copy the tuple into sort storage */
- tup = heap_copytuple(tup);
- stup.tuple = (void *) tup;
-
- /*
- * set up first-column key value, and potentially abbreviate, if it's a
- * simple column
- */
- if (base->haveDatum1)
- {
- stup.datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup.isnull1);
- }
-
- puttuple_common(state, &stup,
- base->haveDatum1 && base->sortKeys->abbrev_converter && !stup.isnull1);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
-/*
- * Collect one index tuple while collecting input data for sort, building
- * it from caller-supplied values.
- */
-void
-tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
- ItemPointer self, Datum *values,
- bool *isnull)
-{
- SortTuple stup;
- IndexTuple tuple;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
-
- stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
- isnull, base->tuplecontext);
- tuple = ((IndexTuple) stup.tuple);
- tuple->t_tid = *self;
- /* set up first-column key value */
- stup.datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup.isnull1);
-
- puttuple_common(state, &stup,
- base->sortKeys && base->sortKeys->abbrev_converter && !stup.isnull1);
-}
-
-/*
- * Accept one Datum while collecting input data for sort.
- *
- * If the Datum is pass-by-ref type, the value will be copied.
- */
-void
-tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
-
- /*
- * Pass-by-value types or null values are just stored directly in
- * stup.datum1 (and stup.tuple is not used and set to NULL).
- *
- * Non-null pass-by-reference values need to be copied into memory we
- * control, and possibly abbreviated. The copied value is pointed to by
- * stup.tuple and is treated as the canonical copy (e.g. to return via
- * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
- * abbreviated value if abbreviation is happening, otherwise it's
- * identical to stup.tuple.
- */
-
- if (isNull || !base->tuples)
- {
- /*
- * Set datum1 to zeroed representation for NULLs (to be consistent,
- * and to support cheap inequality tests for NULL abbreviated keys).
- */
- stup.datum1 = !isNull ? val : (Datum) 0;
- stup.isnull1 = isNull;
- stup.tuple = NULL; /* no separate storage */
- }
- else
- {
- stup.isnull1 = false;
- stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
- stup.tuple = DatumGetPointer(stup.datum1);
- }
-
- puttuple_common(state, &stup,
- base->tuples && !isNull && base->sortKeys->abbrev_converter);
-
- MemoryContextSwitchTo(oldcontext);
-}
-
/*
* Shared code for tuple and datum cases.
*/
-static void
-puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
+void
+tuplesort_puttuple_common(Tuplesortstate *state, SortTuple *tuple, bool useAbbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->base.sortcontext);
@@ -2371,7 +1512,7 @@ tuplesort_performsort(Tuplesortstate *state)
* by caller. Note that fetched tuple is stored in memory that may be
* recycled by any future fetch.
*/
-static bool
+bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
SortTuple *stup)
{
@@ -2595,162 +1736,17 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
}
newtup.srctape = srcTapeIndex;
tuplesort_heap_replace_top(state, &newtup);
- return true;
- }
- return false;
-
- default:
- elog(ERROR, "invalid tuplesort state");
- return false; /* keep compiler quiet */
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * If successful, put tuple in slot and return true; else, clear the slot
- * and return false.
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value in leading attribute will set abbreviated value to zeroed
- * representation, which caller may rely on in abbreviated inequality check.
- *
- * If copy is true, the slot receives a tuple that's been copied into the
- * caller's memory context, so that it will stay valid regardless of future
- * manipulations of the tuplesort's state (up to and including deleting the
- * tuplesort). If copy is false, the slot will just receive a pointer to a
- * tuple held within the tuplesort, which is more efficient, but only safe for
- * callers that are prepared to have any subsequent manipulation of the
- * tuplesort's state invalidate slot contents.
- */
-bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
- TupleTableSlot *slot, Datum *abbrev)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- if (stup.tuple)
- {
- /* Record abbreviated key for caller */
- if (base->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
-
- if (copy)
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
-
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
- return true;
- }
- else
- {
- ExecClearTuple(slot);
- return false;
- }
-}
-
-/*
- * Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return stup.tuple;
-}
-
-/*
- * Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller may not rely on tuple
- * remaining valid after any further manipulation of tuplesort.
- */
-IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- stup.tuple = NULL;
-
- MemoryContextSwitchTo(oldcontext);
-
- return (IndexTuple) stup.tuple;
-}
-
-/*
- * Fetch the next Datum in either forward or back direction.
- * Returns false if no more datums.
- *
- * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * in caller's context, and is now owned by the caller (this differs from
- * similar routines for other types of tuplesorts).
- *
- * Caller may optionally be passed back abbreviated value (on true return
- * value) when abbreviation was used, which can be used to cheaply avoid
- * equality checks that might otherwise be required. Caller can safely make a
- * determination of "non-equal tuple" based on simple binary inequality. A
- * NULL value will have a zeroed abbreviated value representation, which caller
- * may rely on in abbreviated inequality check.
- */
-bool
-tuplesort_getdatum(Tuplesortstate *state, bool forward,
- Datum *val, bool *isNull, Datum *abbrev)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- SortTuple stup;
-
- if (!tuplesort_gettuple_common(state, forward, &stup))
- {
- MemoryContextSwitchTo(oldcontext);
- return false;
- }
-
- /* Ensure we copy into caller's memory context */
- MemoryContextSwitchTo(oldcontext);
-
- /* Record abbreviated key for caller */
- if (base->sortKeys->abbrev_converter && abbrev)
- *abbrev = stup.datum1;
+ return true;
+ }
+ return false;
- if (stup.isnull1 || !base->tuples)
- {
- *val = stup.datum1;
- *isNull = stup.isnull1;
- }
- else
- {
- /* use stup.tuple because stup.datum1 may be an abbreviation */
- *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
- *isNull = false;
+ default:
+ elog(ERROR, "invalid tuplesort state");
+ return false; /* keep compiler quiet */
}
-
- return true;
}
+
/*
* Advance over N tuples in either forward or back direction,
* without returning any data. N==0 is a no-op.
@@ -3929,8 +2925,8 @@ markrunend(LogicalTape *tape)
* We use next free slot from the slab allocator, or palloc() if the tuple
* is too large for that.
*/
-static void *
-readtup_alloc(Tuplesortstate *state, Size tuplen)
+void *
+tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen)
{
SlabSlot *buf;
@@ -3953,695 +2949,6 @@ readtup_alloc(Tuplesortstate *state, Size tuplen)
}
-/*
- * Routines specialized for HeapTuple (actually MinimalTuple) case
- */
-
-static void
-removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
-
- for (i = 0; i < count; i++)
- {
- HeapTupleData htup;
-
- htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
- MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
- MINIMAL_TUPLE_OFFSET);
- stups[i].datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- (TupleDesc) base->arg,
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- SortSupport sortKey = base->sortKeys;
- HeapTupleData ltup;
- HeapTupleData rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- AttrNumber attno;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
- rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
- rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
- tupDesc = (TupleDesc) base->arg;
-
- if (sortKey->abbrev_converter)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- sortKey++;
- for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
- {
- attno = sortKey->ssup_attno;
-
- datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- return 0;
-}
-
-static void
-writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- MinimalTuple tuple = (MinimalTuple) stup->tuple;
-
- /* the part of the MinimalTuple we'll write: */
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
-
- /* total on-disk footprint: */
- unsigned int tuplen = tupbodylen + sizeof(int);
-
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_heap(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- unsigned int tupbodylen = len - sizeof(int);
- unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
- MinimalTuple tuple = (MinimalTuple) readtup_alloc(state, tuplen);
- char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTupleData htup;
-
- /* read in the tuple proper */
- tuple->t_len = tuplen;
- LogicalTapeReadExact(tape, tupbody, tupbodylen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
- htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
- stup->datum1 = heap_getattr(&htup,
- base->sortKeys[0].ssup_attno,
- (TupleDesc) base->arg,
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for the CLUSTER case (HeapTuple data, with
- * comparisons per a btree index definition)
- */
-
-static void
-removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- for (i = 0; i < count; i++)
- {
- HeapTuple tup;
-
- tup = (HeapTuple) stups[i].tuple;
- stups[i].datum1 = heap_getattr(tup,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_cluster(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- HeapTuple ltup;
- HeapTuple rtup;
- TupleDesc tupDesc;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
- /* Be prepared to compare additional sort keys */
- ltup = (HeapTuple) a->tuple;
- rtup = (HeapTuple) b->tuple;
- tupDesc = arg->tupDesc;
-
- /* Compare the leading sort key, if it's simple */
- if (base->haveDatum1)
- {
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- if (sortKey->abbrev_converter)
- {
- AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
-
- datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- }
- if (compare != 0 || base->nKeys == 1)
- return compare;
- /* Compare additional columns the hard way */
- sortKey++;
- nkey = 1;
- }
- else
- {
- /* Must compare all keys the hard way */
- nkey = 0;
- }
-
- if (arg->indexInfo->ii_Expressions == NULL)
- {
- /* If not expression index, just compare the proper heap attrs */
-
- for (; nkey < base->nKeys; nkey++, sortKey++)
- {
- AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
-
- datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
- datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
- else
- {
- /*
- * In the expression index case, compute the whole index tuple and
- * then compare values. It would perhaps be faster to compute only as
- * many columns as we need to compare, but that would require
- * duplicating all the logic in FormIndexDatum.
- */
- Datum l_index_values[INDEX_MAX_KEYS];
- bool l_index_isnull[INDEX_MAX_KEYS];
- Datum r_index_values[INDEX_MAX_KEYS];
- bool r_index_isnull[INDEX_MAX_KEYS];
- TupleTableSlot *ecxt_scantuple;
-
- /* Reset context each time to prevent memory leakage */
- ResetPerTupleExprContext(arg->estate);
-
- ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
-
- ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- l_index_values, l_index_isnull);
-
- ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
- FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
- r_index_values, r_index_isnull);
-
- for (; nkey < base->nKeys; nkey++, sortKey++)
- {
- compare = ApplySortComparator(l_index_values[nkey],
- l_index_isnull[nkey],
- r_index_values[nkey],
- r_index_isnull[nkey],
- sortKey);
- if (compare != 0)
- return compare;
- }
- }
-
- return 0;
-}
-
-static void
-writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- HeapTuple tuple = (HeapTuple) stup->tuple;
- unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
-
- /* We need to store t_self, but not other fields of HeapTupleData */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
- LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_cluster(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int tuplen)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
- unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
- HeapTuple tuple = (HeapTuple) readtup_alloc(state,
- t_len + HEAPTUPLESIZE);
-
- /* Reconstruct the HeapTupleData header */
- tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
- tuple->t_len = t_len;
- LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
- /* We don't currently bother to reconstruct t_tableOid */
- tuple->t_tableOid = InvalidOid;
- /* Read in the tuple body */
- LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value, if it's a simple column */
- if (base->haveDatum1)
- stup->datum1 = heap_getattr(tuple,
- arg->indexInfo->ii_IndexAttrNumbers[0],
- arg->tupDesc,
- &stup->isnull1);
-}
-
-static void
-freestate_cluster(Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
-
- /* Free any execution state created for CLUSTER case */
- if (arg->estate != NULL)
- {
- ExprContext *econtext = GetPerTupleExprContext(arg->estate);
-
- ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
- FreeExecutorState(arg->estate);
- }
-}
-
-/*
- * Routines specialized for IndexTuple case
- *
- * The btree and hash cases require separate comparison functions, but the
- * IndexTuple representation is the same so the copy/write/read support
- * functions can be shared.
- */
-
-static void
-removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- int i;
-
- for (i = 0; i < count; i++)
- {
- IndexTuple tuple;
-
- tuple = stups[i].tuple;
- stups[i].datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stups[i].isnull1);
- }
-}
-
-static int
-comparetup_index_btree(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- /*
- * This is similar to comparetup_heap(), but expects index tuples. There
- * is also special handling for enforcing uniqueness, and special
- * treatment for equal keys at the end.
- */
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
- SortSupport sortKey = base->sortKeys;
- IndexTuple tuple1;
- IndexTuple tuple2;
- int keysz;
- TupleDesc tupDes;
- bool equal_hasnull = false;
- int nkey;
- int32 compare;
- Datum datum1,
- datum2;
- bool isnull1,
- isnull2;
-
-
- /* Compare the leading sort key */
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- sortKey);
- if (compare != 0)
- return compare;
-
- /* Compare additional sort keys */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
- keysz = base->nKeys;
- tupDes = RelationGetDescr(arg->index.indexRel);
-
- if (sortKey->abbrev_converter)
- {
- datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
-
- compare = ApplySortAbbrevFullComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare;
- }
-
- /* they are equal, so we only need to examine one null flag */
- if (a->isnull1)
- equal_hasnull = true;
-
- sortKey++;
- for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
- {
- datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
- datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
-
- compare = ApplySortComparator(datum1, isnull1,
- datum2, isnull2,
- sortKey);
- if (compare != 0)
- return compare; /* done when we find unequal attributes */
-
- /* they are equal, so we only need to examine one null flag */
- if (isnull1)
- equal_hasnull = true;
- }
-
- /*
- * If btree has asked us to enforce uniqueness, complain if two equal
- * tuples are detected (unless there was at least one NULL field and NULLS
- * NOT DISTINCT was not set).
- *
- * It is sufficient to make the test here, because if two tuples are equal
- * they *must* get compared at some stage of the sort --- otherwise the
- * sort algorithm wouldn't have checked whether one must appear before the
- * other.
- */
- if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
- {
- Datum values[INDEX_MAX_KEYS];
- bool isnull[INDEX_MAX_KEYS];
- char *key_desc;
-
- /*
- * Some rather brain-dead implementations of qsort (such as the one in
- * QNX 4) will sometimes call the comparison routine to compare a
- * value to itself, but we always use our own implementation, which
- * does not.
- */
- Assert(tuple1 != tuple2);
-
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
- }
-
- /*
- * If key values are equal, we sort on ItemPointer. This is required for
- * btree indexes, since heap TID is treated as an implicit last key
- * attribute in order to ensure that all keys in the index are physically
- * unique.
- */
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static int
-comparetup_index_hash(const SortTuple *a, const SortTuple *b,
- Tuplesortstate *state)
-{
- Bucket bucket1;
- Bucket bucket2;
- IndexTuple tuple1;
- IndexTuple tuple2;
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
-
- /*
- * Fetch hash keys and mask off bits we don't want to sort by. We know
- * that the first column of the index tuple is the hash key.
- */
- Assert(!a->isnull1);
- bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- Assert(!b->isnull1);
- bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
- arg->max_buckets, arg->high_mask,
- arg->low_mask);
- if (bucket1 > bucket2)
- return 1;
- else if (bucket1 < bucket2)
- return -1;
-
- /*
- * If hash values are equal, we sort on ItemPointer. This does not affect
- * validity of the finished index, but it may be useful to have index
- * scans in physical order.
- */
- tuple1 = (IndexTuple) a->tuple;
- tuple2 = (IndexTuple) b->tuple;
-
- {
- BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
- BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
-
- if (blk1 != blk2)
- return (blk1 < blk2) ? -1 : 1;
- }
- {
- OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
- OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
-
- if (pos1 != pos2)
- return (pos1 < pos2) ? -1 : 1;
- }
-
- /* ItemPointer values should never be equal */
- Assert(false);
-
- return 0;
-}
-
-static void
-writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- IndexTuple tuple = (IndexTuple) stup->tuple;
- unsigned int tuplen;
-
- tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
- LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
-}
-
-static void
-readtup_index(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
- unsigned int tuplen = len - sizeof(unsigned int);
- IndexTuple tuple = (IndexTuple) readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, tuple, tuplen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
- stup->tuple = (void *) tuple;
- /* set up first-column key value */
- stup->datum1 = index_getattr(tuple,
- 1,
- RelationGetDescr(arg->indexRel),
- &stup->isnull1);
-}
-
-/*
- * Routines specialized for DatumTuple case
- */
-
-static void
-removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
-{
- int i;
-
- for (i = 0; i < count; i++)
- stups[i].datum1 = PointerGetDatum(stups[i].tuple);
-}
-
-static int
-comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- int compare;
-
- compare = ApplySortComparator(a->datum1, a->isnull1,
- b->datum1, b->isnull1,
- base->sortKeys);
- if (compare != 0)
- return compare;
-
- /* if we have abbreviations, then "tuple" has the original value */
-
- if (base->sortKeys->abbrev_converter)
- compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
- PointerGetDatum(b->tuple), b->isnull1,
- base->sortKeys);
-
- return compare;
-}
-
-static void
-writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
- void *waddr;
- unsigned int tuplen;
- unsigned int writtenlen;
-
- if (stup->isnull1)
- {
- waddr = NULL;
- tuplen = 0;
- }
- else if (!base->tuples)
- {
- waddr = &stup->datum1;
- tuplen = sizeof(Datum);
- }
- else
- {
- waddr = stup->tuple;
- tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
- Assert(tuplen != 0);
- }
-
- writtenlen = tuplen + sizeof(unsigned int);
-
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
- LogicalTapeWrite(tape, waddr, tuplen);
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
-}
-
-static void
-readtup_datum(Tuplesortstate *state, SortTuple *stup,
- LogicalTape *tape, unsigned int len)
-{
- TuplesortPublic *base = TuplesortstateGetPublic(state);
- unsigned int tuplen = len - sizeof(unsigned int);
-
- if (tuplen == 0)
- {
- /* it's NULL */
- stup->datum1 = (Datum) 0;
- stup->isnull1 = true;
- stup->tuple = NULL;
- }
- else if (!base->tuples)
- {
- Assert(tuplen == sizeof(Datum));
- LogicalTapeReadExact(tape, &stup->datum1, tuplen);
- stup->isnull1 = false;
- stup->tuple = NULL;
- }
- else
- {
- void *raddr = readtup_alloc(state, tuplen);
-
- LogicalTapeReadExact(tape, raddr, tuplen);
- stup->datum1 = PointerGetDatum(raddr);
- stup->isnull1 = false;
- stup->tuple = raddr;
- }
-
- if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
- LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
-}
-
/*
* Parallel sort routines
*/
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
new file mode 100644
index 00000000000..2933020dcc8
--- /dev/null
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -0,0 +1,1577 @@
+/*-------------------------------------------------------------------------
+ *
+ * tuplesortvariants.c
+ * Implementation of tuple sorting variants.
+ *
+ * This module handles the sorting of heap tuples, index tuples, or single
+ * Datums. The implementation is based on the generalized tuple sorting
+ * facility given in tuplesort.c. Support for other kinds of sortable objects
+ * could easily be added here, in another module, or even in an extension.
+ *
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/utils/sort/tuplesortvariants.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/hash.h"
+#include "access/htup_details.h"
+#include "access/nbtree.h"
+#include "catalog/index.h"
+#include "executor/executor.h"
+#include "pg_trace.h"
+#include "utils/datum.h"
+#include "utils/lsyscache.h"
+#include "utils/guc.h"
+#include "utils/tuplesort.h"
+
+
+/* sort-type codes for sort__start probes */
+#define HEAP_SORT 0
+#define INDEX_SORT 1
+#define DATUM_SORT 2
+#define CLUSTER_SORT 3
+
+static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_index(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static void removeabbrev_datum(Tuplesortstate *state, SortTuple *stups,
+ int count);
+static int comparetup_heap(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_heap(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_cluster(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static int comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_index(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static int comparetup_datum(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+static void writetup_datum(Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+static void readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+static void freestate_cluster(Tuplesortstate *state);
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the CLUSTER case.
+ * Set by tuplesort_begin_cluster.
+ */
+typedef struct
+{
+ TupleDesc tupDesc;
+
+ IndexInfo *indexInfo; /* info about index being used for reference */
+ EState *estate; /* for evaluating index expressions */
+} TuplesortClusterArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the IndexTuple case.
+ * Set by tuplesort_begin_index_xxx and used only by the IndexTuple routines.
+ */
+typedef struct
+{
+ Relation heapRel; /* table the index is being built on */
+ Relation indexRel; /* index being built */
+} TuplesortIndexArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the index_btree subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ bool enforceUnique; /* complain if we find duplicate tuples */
+ bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+} TuplesortIndexBTreeArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the index_hash subcase.
+ */
+typedef struct
+{
+ TuplesortIndexArg index;
+
+ uint32 high_mask; /* masks for sortable part of hash code */
+ uint32 low_mask;
+ uint32 max_buckets;
+} TuplesortIndexHashArg;
+
+/*
+ * Data structure pointed to by "TuplesortPublic.arg" for the Datum case.
+ * Set by tuplesort_begin_datum and used only by the DatumTuple routines.
+ */
+typedef struct
+{
+ /* the datatype oid of Datum's to be sorted */
+ Oid datumType;
+ /* we need typelen in order to know how to copy the Datums. */
+ int datumTypeLen;
+} TuplesortDatumArg;
+
+Tuplesortstate *
+tuplesort_begin_heap(TupleDesc tupDesc,
+ int nkeys, AttrNumber *attNums,
+ Oid *sortOperators, Oid *sortCollations,
+ bool *nullsFirstFlags,
+ int workMem, SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+
+ AssertArg(nkeys > 0);
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ nkeys, workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = nkeys;
+
+ TRACE_POSTGRESQL_SORT_START(HEAP_SORT,
+ false, /* no unique check */
+ nkeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_heap;
+ base->comparetup = comparetup_heap;
+ base->writetup = writetup_heap;
+ base->readtup = readtup_heap;
+ base->haveDatum1 = true;
+ base->arg = tupDesc; /* assume we need not copy tupDesc */
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData));
+
+ for (i = 0; i < nkeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+
+ AssertArg(attNums[i] != 0);
+ AssertArg(sortOperators[i] != 0);
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = sortCollations[i];
+ sortKey->ssup_nulls_first = nullsFirstFlags[i];
+ sortKey->ssup_attno = attNums[i];
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
+ }
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (nkeys == 1 && !base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_cluster(TupleDesc tupDesc,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ BTScanInsert indexScanKey;
+ MemoryContext oldcontext;
+ TuplesortClusterArg *arg;
+ int i;
+
+ Assert(indexRel->rd_rel->relam == BTREE_AM_OID);
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortClusterArg *) palloc0(sizeof(TuplesortClusterArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
+ RelationGetNumberOfAttributes(indexRel),
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(CLUSTER_SORT,
+ false, /* no unique check */
+ base->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_cluster;
+ base->comparetup = comparetup_cluster;
+ base->writetup = writetup_cluster;
+ base->readtup = readtup_cluster;
+ base->freestate = freestate_cluster;
+ base->arg = arg;
+
+ arg->indexInfo = BuildIndexInfo(indexRel);
+
+ /*
+ * If we don't have a simple leading attribute, we don't currently
+ * initialize datum1, so disable optimizations that require it.
+ */
+ if (arg->indexInfo->ii_IndexAttrNumbers[0] == 0)
+ base->haveDatum1 = false;
+ else
+ base->haveDatum1 = true;
+
+ arg->tupDesc = tupDesc; /* assume we need not copy tupDesc */
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ if (arg->indexInfo->ii_Expressions != NULL)
+ {
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * We will need to use FormIndexDatum to evaluate the index
+ * expressions. To do that, we need an EState, as well as a
+ * TupleTableSlot to put the table tuples into. The econtext's
+ * scantuple has to point to that slot, too.
+ */
+ arg->estate = CreateExecutorState();
+ slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsHeapTuple);
+ econtext = GetPerTupleExprContext(arg->estate);
+ econtext->ecxt_scantuple = slot;
+ }
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_btree(Relation heapRel,
+ Relation indexRel,
+ bool enforceUnique,
+ bool uniqueNullsNotDistinct,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ BTScanInsert indexScanKey;
+ TuplesortIndexBTreeArg *arg;
+ MemoryContext oldcontext;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: unique = %c, workMem = %d, randomAccess = %c",
+ enforceUnique ? 't' : 'f',
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ TRACE_POSTGRESQL_SORT_START(INDEX_SORT,
+ enforceUnique,
+ base->nKeys,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = enforceUnique;
+ arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+
+ indexScanKey = _bt_mkscankey(indexRel, NULL);
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+ ScanKey scanKey = indexScanKey->scankeys + i;
+ int16 strategy;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = scanKey->sk_collation;
+ sortKey->ssup_nulls_first =
+ (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0;
+ sortKey->ssup_attno = scanKey->sk_attno;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ strategy = (scanKey->sk_flags & SK_BT_DESC) != 0 ?
+ BTGreaterStrategyNumber : BTLessStrategyNumber;
+
+ PrepareSortSupportFromIndexRel(indexRel, strategy, sortKey);
+ }
+
+ pfree(indexScanKey);
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_hash(Relation heapRel,
+ Relation indexRel,
+ uint32 high_mask,
+ uint32 low_mask,
+ uint32 max_buckets,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ TuplesortIndexHashArg *arg;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexHashArg *) palloc(sizeof(TuplesortIndexHashArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: high_mask = 0x%x, low_mask = 0x%x, "
+ "max_buckets = 0x%x, workMem = %d, randomAccess = %c",
+ high_mask,
+ low_mask,
+ max_buckets,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = 1; /* Only one sort column, the hash code */
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_hash;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+
+ arg->high_mask = high_mask;
+ arg->low_mask = low_mask;
+ arg->max_buckets = max_buckets;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_index_gist(Relation heapRel,
+ Relation indexRel,
+ int workMem,
+ SortCoordinate coordinate,
+ int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext;
+ TuplesortIndexBTreeArg *arg;
+ int i;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortIndexBTreeArg *) palloc(sizeof(TuplesortIndexBTreeArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin index sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel);
+
+ base->removeabbrev = removeabbrev_index;
+ base->comparetup = comparetup_index_btree;
+ base->writetup = writetup_index;
+ base->readtup = readtup_index;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->index.heapRel = heapRel;
+ arg->index.indexRel = indexRel;
+ arg->enforceUnique = false;
+ arg->uniqueNullsNotDistinct = false;
+
+ /* Prepare SortSupport data for each column */
+ base->sortKeys = (SortSupport) palloc0(base->nKeys *
+ sizeof(SortSupportData));
+
+ for (i = 0; i < base->nKeys; i++)
+ {
+ SortSupport sortKey = base->sortKeys + i;
+
+ sortKey->ssup_cxt = CurrentMemoryContext;
+ sortKey->ssup_collation = indexRel->rd_indcollation[i];
+ sortKey->ssup_nulls_first = false;
+ sortKey->ssup_attno = i + 1;
+ /* Convey if abbreviation optimization is applicable in principle */
+ sortKey->abbreviate = (i == 0 && base->haveDatum1);
+
+ AssertState(sortKey->ssup_attno != 0);
+
+ /* Look for a sort support function */
+ PrepareSortSupportFromGistIndexRel(indexRel, sortKey);
+ }
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+Tuplesortstate *
+tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
+ bool nullsFirstFlag, int workMem,
+ SortCoordinate coordinate, int sortopt)
+{
+ Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
+ sortopt);
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg;
+ MemoryContext oldcontext;
+ int16 typlen;
+ bool typbyval;
+
+ oldcontext = MemoryContextSwitchTo(base->maincontext);
+ arg = (TuplesortDatumArg *) palloc(sizeof(TuplesortDatumArg));
+
+#ifdef TRACE_SORT
+ if (trace_sort)
+ elog(LOG,
+ "begin datum sort: workMem = %d, randomAccess = %c",
+ workMem, sortopt & TUPLESORT_RANDOMACCESS ? 't' : 'f');
+#endif
+
+ base->nKeys = 1; /* always a one-column sort */
+
+ TRACE_POSTGRESQL_SORT_START(DATUM_SORT,
+ false, /* no unique check */
+ 1,
+ workMem,
+ sortopt & TUPLESORT_RANDOMACCESS,
+ PARALLEL_SORT(coordinate));
+
+ base->removeabbrev = removeabbrev_datum;
+ base->comparetup = comparetup_datum;
+ base->writetup = writetup_datum;
+ base->readtup = readtup_datum;
+ base->haveDatum1 = true;
+ base->arg = arg;
+
+ arg->datumType = datumType;
+
+ /* lookup necessary attributes of the datum type */
+ get_typlenbyval(datumType, &typlen, &typbyval);
+ arg->datumTypeLen = typlen;
+ base->tuples = !typbyval;
+
+ /* Prepare SortSupport data */
+ base->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+ base->sortKeys->ssup_cxt = CurrentMemoryContext;
+ base->sortKeys->ssup_collation = sortCollation;
+ base->sortKeys->ssup_nulls_first = nullsFirstFlag;
+
+ /*
+ * Abbreviation is possible here only for by-reference types. In theory,
+ * a pass-by-value datatype could have an abbreviated form that is cheaper
+ * to compare. In a tuple sort, we could support that, because we can
+ * always extract the original datum from the tuple as needed. Here, we
+ * can't, because a datum sort only stores a single copy of the datum; the
+ * "tuple" field of each SortTuple is NULL.
+ */
+ base->sortKeys->abbreviate = !typbyval;
+
+ PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys);
+
+ /*
+ * The "onlyKey" optimization cannot be used with abbreviated keys, since
+ * tie-breaker comparisons may be required. Typically, the optimization
+ * is only of value to pass-by-value types anyway, whereas abbreviated
+ * keys are typically only of value to pass-by-reference types.
+ */
+ if (!base->sortKeys->abbrev_converter)
+ base->onlyKey = base->sortKeys;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return state;
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_puttupleslot(Tuplesortstate *state, TupleTableSlot *slot)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TupleDesc tupDesc = (TupleDesc) base->arg;
+ SortTuple stup;
+ MinimalTuple tuple;
+ HeapTupleData htup;
+
+ /* copy the tuple into sort storage */
+ tuple = ExecCopySlotMinimalTuple(slot);
+ stup.tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup.datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ tupDesc,
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup,
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Accept one tuple while collecting input data for sort.
+ *
+ * Note that the input data is always copied; the caller need not save it.
+ */
+void
+tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup)
+{
+ SortTuple stup;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* copy the tuple into sort storage */
+ tup = heap_copytuple(tup);
+ stup.tuple = (void *) tup;
+
+ /*
+ * set up first-column key value, and potentially abbreviate, if it's a
+ * simple column
+ */
+ if (base->haveDatum1)
+ {
+ stup.datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup.isnull1);
+ }
+
+ tuplesort_puttuple_common(state, &stup,
+ base->haveDatum1 &&
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Collect one index tuple while collecting input data for sort, building
+ * it from caller-supplied values.
+ */
+void
+tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
+ ItemPointer self, Datum *values,
+ bool *isnull)
+{
+ SortTuple stup;
+ IndexTuple tuple;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+
+ stup.tuple = index_form_tuple_context(RelationGetDescr(rel), values,
+ isnull, base->tuplecontext);
+ tuple = ((IndexTuple) stup.tuple);
+ tuple->t_tid = *self;
+ /* set up first-column key value */
+ stup.datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup.isnull1);
+
+ tuplesort_puttuple_common(state, &stup,
+ base->sortKeys &&
+ base->sortKeys->abbrev_converter &&
+ !stup.isnull1);
+}
+
+/*
+ * Accept one Datum while collecting input data for sort.
+ *
+ * If the Datum is pass-by-ref type, the value will be copied.
+ */
+void
+tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->tuplecontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ SortTuple stup;
+
+ /*
+ * Pass-by-value types or null values are just stored directly in
+ * stup.datum1 (and stup.tuple is not used and set to NULL).
+ *
+ * Non-null pass-by-reference values need to be copied into memory we
+ * control, and possibly abbreviated. The copied value is pointed to by
+ * stup.tuple and is treated as the canonical copy (e.g. to return via
+ * tuplesort_getdatum or when writing to tape); stup.datum1 gets the
+ * abbreviated value if abbreviation is happening, otherwise it's
+ * identical to stup.tuple.
+ */
+
+ if (isNull || !base->tuples)
+ {
+ /*
+ * Set datum1 to zeroed representation for NULLs (to be consistent,
+ * and to support cheap inequality tests for NULL abbreviated keys).
+ */
+ stup.datum1 = !isNull ? val : (Datum) 0;
+ stup.isnull1 = isNull;
+ stup.tuple = NULL; /* no separate storage */
+ }
+ else
+ {
+ stup.isnull1 = false;
+ stup.datum1 = datumCopy(val, false, arg->datumTypeLen);
+ stup.tuple = DatumGetPointer(stup.datum1);
+ }
+
+ tuplesort_puttuple_common(state, &stup,
+ base->tuples &&
+ base->sortKeys->abbrev_converter && !isNull);
+
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * If successful, put tuple in slot and return true; else, clear the slot
+ * and return false.
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value in leading attribute will set abbreviated value to zeroed
+ * representation, which caller may rely on in abbreviated inequality check.
+ *
+ * If copy is true, the slot receives a tuple that's been copied into the
+ * caller's memory context, so that it will stay valid regardless of future
+ * manipulations of the tuplesort's state (up to and including deleting the
+ * tuplesort). If copy is false, the slot will just receive a pointer to a
+ * tuple held within the tuplesort, which is more efficient, but only safe for
+ * callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
+ */
+bool
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
+ TupleTableSlot *slot, Datum *abbrev)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ if (stup.tuple)
+ {
+ /* Record abbreviated key for caller */
+ if (base->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
+ return true;
+ }
+ else
+ {
+ ExecClearTuple(slot);
+ return false;
+ }
+}
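
To make the copy = false semantics described above concrete, a minimal read-back
loop could look like the following sketch (not part of the patch; it assumes
"sortstate" was created with tuplesort_begin_heap() over "tupDesc" and has
already been fed and sorted, and the slot helpers come from executor/executor.h):

    TupleTableSlot *slot;

    slot = MakeSingleTupleTableSlot(tupDesc, &TTSOpsMinimalTuple);

    /*
     * With copy = false the slot contents remain valid only until the next
     * tuplesort_* call, which is fine while tuples are consumed one by one.
     */
    while (tuplesort_gettupleslot(sortstate, true, false, slot, NULL))
    {
        /* process the tuple currently stored in the slot */
    }

    ExecDropSingleTupleTableSlot(slot);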
+
+/*
+ * Fetch the next tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+HeapTuple
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return stup.tuple;
+}
+
+/*
+ * Fetch the next index tuple in either forward or back direction.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller may not rely on tuple
+ * remaining valid after any further manipulation of tuplesort.
+ */
+IndexTuple
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ stup.tuple = NULL;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ return (IndexTuple) stup.tuple;
+}
+
+/*
+ * Fetch the next Datum in either forward or back direction.
+ * Returns false if no more datums.
+ *
+ * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
+ *
+ * Caller may optionally be passed back abbreviated value (on true return
+ * value) when abbreviation was used, which can be used to cheaply avoid
+ * equality checks that might otherwise be required. Caller can safely make a
+ * determination of "non-equal tuple" based on simple binary inequality. A
+ * NULL value will have a zeroed abbreviated value representation, which caller
+ * may rely on in abbreviated inequality check.
+ */
+bool
+tuplesort_getdatum(Tuplesortstate *state, bool forward,
+ Datum *val, bool *isNull, Datum *abbrev)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MemoryContext oldcontext = MemoryContextSwitchTo(base->sortcontext);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ SortTuple stup;
+
+ if (!tuplesort_gettuple_common(state, forward, &stup))
+ {
+ MemoryContextSwitchTo(oldcontext);
+ return false;
+ }
+
+ /* Ensure we copy into caller's memory context */
+ MemoryContextSwitchTo(oldcontext);
+
+ /* Record abbreviated key for caller */
+ if (base->sortKeys->abbrev_converter && abbrev)
+ *abbrev = stup.datum1;
+
+ if (stup.isnull1 || !base->tuples)
+ {
+ *val = stup.datum1;
+ *isNull = stup.isnull1;
+ }
+ else
+ {
+ /* use stup.tuple because stup.datum1 may be an abbreviation */
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, arg->datumTypeLen);
+ *isNull = false;
+ }
+
+ return true;
+}
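
To show how an extension could drive these Datum-sort entry points end to end,
here is a minimal sketch (not part of the patch; the int4 element type, the
type-cache lookup of its "<" operator, and the helper name sort_int4_datums are
assumptions for the example, and error handling is omitted):

    #include "postgres.h"

    #include "catalog/pg_type.h"
    #include "miscadmin.h"
    #include "utils/tuplesort.h"
    #include "utils/typcache.h"

    /* Sort an array of int4 Datums and walk the result in ascending order. */
    static void
    sort_int4_datums(Datum *values, int nvalues)
    {
        TypeCacheEntry *typentry = lookup_type_cache(INT4OID, TYPECACHE_LT_OPR);
        Tuplesortstate *sortstate;
        Datum       val;
        bool        isnull;
        int         i;

        sortstate = tuplesort_begin_datum(INT4OID,
                                          typentry->lt_opr, /* "<" operator */
                                          InvalidOid,       /* collation */
                                          false,            /* nulls last */
                                          work_mem,
                                          NULL,             /* no parallel coordination */
                                          TUPLESORT_NONE);

        for (i = 0; i < nvalues; i++)
            tuplesort_putdatum(sortstate, values[i], false);

        tuplesort_performsort(sortstate);

        /* int4 is pass-by-value, so no per-value pfree is needed here */
        while (tuplesort_getdatum(sortstate, true, &val, &isnull, NULL))
        {
            /* consume val in ascending order */
        }

        tuplesort_end(sortstate);
    }

The same begin/put/performsort/get/end cycle applies to the heap, cluster and
index variants; only the begin function and the put/get calls change.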
+
+
+/*
+ * Routines specialized for HeapTuple (actually MinimalTuple) case
+ */
+
+static void
+removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTupleData htup;
+
+ htup.t_len = ((MinimalTuple) stups[i].tuple)->t_len +
+ MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) stups[i].tuple -
+ MINIMAL_TUPLE_OFFSET);
+ stups[i].datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_heap(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ SortSupport sortKey = base->sortKeys;
+ HeapTupleData ltup;
+ HeapTupleData rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ AttrNumber attno;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ ltup.t_len = ((MinimalTuple) a->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ ltup.t_data = (HeapTupleHeader) ((char *) a->tuple - MINIMAL_TUPLE_OFFSET);
+ rtup.t_len = ((MinimalTuple) b->tuple)->t_len + MINIMAL_TUPLE_OFFSET;
+ rtup.t_data = (HeapTupleHeader) ((char *) b->tuple - MINIMAL_TUPLE_OFFSET);
+ tupDesc = (TupleDesc) base->arg;
+
+ if (sortKey->abbrev_converter)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ sortKey++;
+ for (nkey = 1; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ attno = sortKey->ssup_attno;
+
+ datum1 = heap_getattr(&ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(&rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ return 0;
+}
+
+static void
+writetup_heap(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ MinimalTuple tuple = (MinimalTuple) stup->tuple;
+
+ /* the part of the MinimalTuple we'll write: */
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ unsigned int tupbodylen = tuple->t_len - MINIMAL_TUPLE_DATA_OFFSET;
+
+ /* total on-disk footprint: */
+ unsigned int tuplen = tupbodylen + sizeof(int);
+
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tupbody, tupbodylen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_heap(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ unsigned int tupbodylen = len - sizeof(int);
+ unsigned int tuplen = tupbodylen + MINIMAL_TUPLE_DATA_OFFSET;
+ MinimalTuple tuple = (MinimalTuple) tuplesort_readtup_alloc(state, tuplen);
+ char *tupbody = (char *) tuple + MINIMAL_TUPLE_DATA_OFFSET;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ HeapTupleData htup;
+
+ /* read in the tuple proper */
+ tuple->t_len = tuplen;
+ LogicalTapeReadExact(tape, tupbody, tupbodylen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ htup.t_len = tuple->t_len + MINIMAL_TUPLE_OFFSET;
+ htup.t_data = (HeapTupleHeader) ((char *) tuple - MINIMAL_TUPLE_OFFSET);
+ stup->datum1 = heap_getattr(&htup,
+ base->sortKeys[0].ssup_attno,
+ (TupleDesc) base->arg,
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for the CLUSTER case (HeapTuple data, with
+ * comparisons per a btree index definition)
+ */
+
+static void
+removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ for (i = 0; i < count; i++)
+ {
+ HeapTuple tup;
+
+ tup = (HeapTuple) stups[i].tuple;
+ stups[i].datum1 = heap_getattr(tup,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_cluster(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
+ HeapTuple ltup;
+ HeapTuple rtup;
+ TupleDesc tupDesc;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+ /* Be prepared to compare additional sort keys */
+ ltup = (HeapTuple) a->tuple;
+ rtup = (HeapTuple) b->tuple;
+ tupDesc = arg->tupDesc;
+
+ /* Compare the leading sort key, if it's simple */
+ if (base->haveDatum1)
+ {
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ if (sortKey->abbrev_converter)
+ {
+ AttrNumber leading = arg->indexInfo->ii_IndexAttrNumbers[0];
+
+ datum1 = heap_getattr(ltup, leading, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, leading, tupDesc, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ }
+ if (compare != 0 || base->nKeys == 1)
+ return compare;
+ /* Compare additional columns the hard way */
+ sortKey++;
+ nkey = 1;
+ }
+ else
+ {
+ /* Must compare all keys the hard way */
+ nkey = 0;
+ }
+
+ if (arg->indexInfo->ii_Expressions == NULL)
+ {
+ /* If not expression index, just compare the proper heap attrs */
+
+ for (; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ AttrNumber attno = arg->indexInfo->ii_IndexAttrNumbers[nkey];
+
+ datum1 = heap_getattr(ltup, attno, tupDesc, &isnull1);
+ datum2 = heap_getattr(rtup, attno, tupDesc, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+ else
+ {
+ /*
+ * In the expression index case, compute the whole index tuple and
+ * then compare values. It would perhaps be faster to compute only as
+ * many columns as we need to compare, but that would require
+ * duplicating all the logic in FormIndexDatum.
+ */
+ Datum l_index_values[INDEX_MAX_KEYS];
+ bool l_index_isnull[INDEX_MAX_KEYS];
+ Datum r_index_values[INDEX_MAX_KEYS];
+ bool r_index_isnull[INDEX_MAX_KEYS];
+ TupleTableSlot *ecxt_scantuple;
+
+ /* Reset context each time to prevent memory leakage */
+ ResetPerTupleExprContext(arg->estate);
+
+ ecxt_scantuple = GetPerTupleExprContext(arg->estate)->ecxt_scantuple;
+
+ ExecStoreHeapTuple(ltup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ l_index_values, l_index_isnull);
+
+ ExecStoreHeapTuple(rtup, ecxt_scantuple, false);
+ FormIndexDatum(arg->indexInfo, ecxt_scantuple, arg->estate,
+ r_index_values, r_index_isnull);
+
+ for (; nkey < base->nKeys; nkey++, sortKey++)
+ {
+ compare = ApplySortComparator(l_index_values[nkey],
+ l_index_isnull[nkey],
+ r_index_values[nkey],
+ r_index_isnull[nkey],
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+ }
+
+ return 0;
+}
+
+static void
+writetup_cluster(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ HeapTuple tuple = (HeapTuple) stup->tuple;
+ unsigned int tuplen = tuple->t_len + sizeof(ItemPointerData) + sizeof(int);
+
+ /* We need to store t_self, but not other fields of HeapTupleData */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, &tuple->t_self, sizeof(ItemPointerData));
+ LogicalTapeWrite(tape, tuple->t_data, tuple->t_len);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_cluster(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int tuplen)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+ unsigned int t_len = tuplen - sizeof(ItemPointerData) - sizeof(int);
+ HeapTuple tuple = (HeapTuple) tuplesort_readtup_alloc(state,
+ t_len + HEAPTUPLESIZE);
+
+ /* Reconstruct the HeapTupleData header */
+ tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
+ tuple->t_len = t_len;
+ LogicalTapeReadExact(tape, &tuple->t_self, sizeof(ItemPointerData));
+ /* We don't currently bother to reconstruct t_tableOid */
+ tuple->t_tableOid = InvalidOid;
+ /* Read in the tuple body */
+ LogicalTapeReadExact(tape, tuple->t_data, tuple->t_len);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value, if it's a simple column */
+ if (base->haveDatum1)
+ stup->datum1 = heap_getattr(tuple,
+ arg->indexInfo->ii_IndexAttrNumbers[0],
+ arg->tupDesc,
+ &stup->isnull1);
+}
+
+static void
+freestate_cluster(Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortClusterArg *arg = (TuplesortClusterArg *) base->arg;
+
+ /* Free any execution state created for CLUSTER case */
+ if (arg->estate != NULL)
+ {
+ ExprContext *econtext = GetPerTupleExprContext(arg->estate);
+
+ ExecDropSingleTupleTableSlot(econtext->ecxt_scantuple);
+ FreeExecutorState(arg->estate);
+ }
+}
+
+/*
+ * Routines specialized for IndexTuple case
+ *
+ * The btree and hash cases require separate comparison functions, but the
+ * IndexTuple representation is the same so the copy/write/read support
+ * functions can be shared.
+ */
+
+static void
+removeabbrev_index(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ int i;
+
+ for (i = 0; i < count; i++)
+ {
+ IndexTuple tuple;
+
+ tuple = stups[i].tuple;
+ stups[i].datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stups[i].isnull1);
+ }
+}
+
+static int
+comparetup_index_btree(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ /*
+ * This is similar to comparetup_heap(), but expects index tuples. There
+ * is also special handling for enforcing uniqueness, and special
+ * treatment for equal keys at the end.
+ */
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexBTreeArg *arg = (TuplesortIndexBTreeArg *) base->arg;
+ SortSupport sortKey = base->sortKeys;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ int keysz;
+ TupleDesc tupDes;
+ bool equal_hasnull = false;
+ int nkey;
+ int32 compare;
+ Datum datum1,
+ datum2;
+ bool isnull1,
+ isnull2;
+
+
+ /* Compare the leading sort key */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ sortKey);
+ if (compare != 0)
+ return compare;
+
+ /* Compare additional sort keys */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+ keysz = base->nKeys;
+ tupDes = RelationGetDescr(arg->index.indexRel);
+
+ if (sortKey->abbrev_converter)
+ {
+ datum1 = index_getattr(tuple1, 1, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, 1, tupDes, &isnull2);
+
+ compare = ApplySortAbbrevFullComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare;
+ }
+
+ /* they are equal, so we only need to examine one null flag */
+ if (a->isnull1)
+ equal_hasnull = true;
+
+ sortKey++;
+ for (nkey = 2; nkey <= keysz; nkey++, sortKey++)
+ {
+ datum1 = index_getattr(tuple1, nkey, tupDes, &isnull1);
+ datum2 = index_getattr(tuple2, nkey, tupDes, &isnull2);
+
+ compare = ApplySortComparator(datum1, isnull1,
+ datum2, isnull2,
+ sortKey);
+ if (compare != 0)
+ return compare; /* done when we find unequal attributes */
+
+ /* they are equal, so we only need to examine one null flag */
+ if (isnull1)
+ equal_hasnull = true;
+ }
+
+ /*
+ * If btree has asked us to enforce uniqueness, complain if two equal
+ * tuples are detected (unless there was at least one NULL field and NULLS
+ * NOT DISTINCT was not set).
+ *
+ * It is sufficient to make the test here, because if two tuples are equal
+ * they *must* get compared at some stage of the sort --- otherwise the
+ * sort algorithm wouldn't have checked whether one must appear before the
+ * other.
+ */
+ if (arg->enforceUnique && !(!arg->uniqueNullsNotDistinct && equal_hasnull))
+ {
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ char *key_desc;
+
+ /*
+ * Some rather brain-dead implementations of qsort (such as the one in
+ * QNX 4) will sometimes call the comparison routine to compare a
+ * value to itself, but we always use our own implementation, which
+ * does not.
+ */
+ Assert(tuple1 != tuple2);
+
+ index_deform_tuple(tuple1, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+ }
+
+ /*
+ * If key values are equal, we sort on ItemPointer. This is required for
+ * btree indexes, since heap TID is treated as an implicit last key
+ * attribute in order to ensure that all keys in the index are physically
+ * unique.
+ */
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static int
+comparetup_index_hash(const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state)
+{
+ Bucket bucket1;
+ Bucket bucket2;
+ IndexTuple tuple1;
+ IndexTuple tuple2;
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;
+
+ /*
+ * Fetch hash keys and mask off bits we don't want to sort by. We know
+ * that the first column of the index tuple is the hash key.
+ */
+ Assert(!a->isnull1);
+ bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ Assert(!b->isnull1);
+ bucket2 = _hash_hashkey2bucket(DatumGetUInt32(b->datum1),
+ arg->max_buckets, arg->high_mask,
+ arg->low_mask);
+ if (bucket1 > bucket2)
+ return 1;
+ else if (bucket1 < bucket2)
+ return -1;
+
+ /*
+ * If hash values are equal, we sort on ItemPointer. This does not affect
+ * validity of the finished index, but it may be useful to have index
+ * scans in physical order.
+ */
+ tuple1 = (IndexTuple) a->tuple;
+ tuple2 = (IndexTuple) b->tuple;
+
+ {
+ BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
+ BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
+
+ if (blk1 != blk2)
+ return (blk1 < blk2) ? -1 : 1;
+ }
+ {
+ OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
+ OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
+
+ if (pos1 != pos2)
+ return (pos1 < pos2) ? -1 : 1;
+ }
+
+ /* ItemPointer values should never be equal */
+ Assert(false);
+
+ return 0;
+}
+
+static void
+writetup_index(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ IndexTuple tuple = (IndexTuple) stup->tuple;
+ unsigned int tuplen;
+
+ tuplen = IndexTupleSize(tuple) + sizeof(tuplen);
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+ LogicalTapeWrite(tape, (void *) tuple, IndexTupleSize(tuple));
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &tuplen, sizeof(tuplen));
+}
+
+static void
+readtup_index(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortIndexArg *arg = (TuplesortIndexArg *) base->arg;
+ unsigned int tuplen = len - sizeof(unsigned int);
+ IndexTuple tuple = (IndexTuple) tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, tuple, tuplen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+ stup->tuple = (void *) tuple;
+ /* set up first-column key value */
+ stup->datum1 = index_getattr(tuple,
+ 1,
+ RelationGetDescr(arg->indexRel),
+ &stup->isnull1);
+}
+
+/*
+ * Routines specialized for DatumTuple case
+ */
+
+static void
+removeabbrev_datum(Tuplesortstate *state, SortTuple *stups, int count)
+{
+ int i;
+
+ for (i = 0; i < count; i++)
+ stups[i].datum1 = PointerGetDatum(stups[i].tuple);
+}
+
+static int
+comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ int compare;
+
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+ b->datum1, b->isnull1,
+ base->sortKeys);
+ if (compare != 0)
+ return compare;
+
+ /* if we have abbreviations, then "tuple" has the original value */
+
+ if (base->sortKeys->abbrev_converter)
+ compare = ApplySortAbbrevFullComparator(PointerGetDatum(a->tuple), a->isnull1,
+ PointerGetDatum(b->tuple), b->isnull1,
+ base->sortKeys);
+
+ return compare;
+}
+
+static void
+writetup_datum(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ TuplesortDatumArg *arg = (TuplesortDatumArg *) base->arg;
+ void *waddr;
+ unsigned int tuplen;
+ unsigned int writtenlen;
+
+ if (stup->isnull1)
+ {
+ waddr = NULL;
+ tuplen = 0;
+ }
+ else if (!base->tuples)
+ {
+ waddr = &stup->datum1;
+ tuplen = sizeof(Datum);
+ }
+ else
+ {
+ waddr = stup->tuple;
+ tuplen = datumGetSize(PointerGetDatum(stup->tuple), false, arg->datumTypeLen);
+ Assert(tuplen != 0);
+ }
+
+ writtenlen = tuplen + sizeof(unsigned int);
+
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+ LogicalTapeWrite(tape, waddr, tuplen);
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeWrite(tape, (void *) &writtenlen, sizeof(writtenlen));
+}
+
+static void
+readtup_datum(Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len)
+{
+ TuplesortPublic *base = TuplesortstateGetPublic(state);
+ unsigned int tuplen = len - sizeof(unsigned int);
+
+ if (tuplen == 0)
+ {
+ /* it's NULL */
+ stup->datum1 = (Datum) 0;
+ stup->isnull1 = true;
+ stup->tuple = NULL;
+ }
+ else if (!base->tuples)
+ {
+ Assert(tuplen == sizeof(Datum));
+ LogicalTapeReadExact(tape, &stup->datum1, tuplen);
+ stup->isnull1 = false;
+ stup->tuple = NULL;
+ }
+ else
+ {
+ void *raddr = tuplesort_readtup_alloc(state, tuplen);
+
+ LogicalTapeReadExact(tape, raddr, tuplen);
+ stup->datum1 = PointerGetDatum(raddr);
+ stup->isnull1 = false;
+ stup->tuple = raddr;
+ }
+
+ if (base->sortopt & TUPLESORT_RANDOMACCESS) /* need trailing length word? */
+ LogicalTapeReadExact(tape, &tuplen, sizeof(tuplen));
+}
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 364cf132fcb..e82b5a638d2 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -24,7 +24,9 @@
#include "access/itup.h"
#include "executor/tuptable.h"
#include "storage/dsm.h"
+#include "utils/logtape.h"
#include "utils/relcache.h"
+#include "utils/sortsupport.h"
/*
@@ -102,6 +104,148 @@ typedef struct TuplesortInstrumentation
int64 spaceUsed; /* space consumption, in kB */
} TuplesortInstrumentation;
+/*
+ * The objects we actually sort are SortTuple structs. These contain
+ * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
+ * which is a separate palloc chunk --- we assume it is just one chunk and
+ * can be freed by a simple pfree() (except during merge, when we use a
+ * simple slab allocator). SortTuples also contain the tuple's first key
+ * column in Datum/nullflag format, and a source/input tape number that
+ * tracks which tape each heap element/slot belongs to during merging.
+ *
+ * Storing the first key column lets us save heap_getattr or index_getattr
+ * calls during tuple comparisons. We could extract and save all the key
+ * columns not just the first, but this would increase code complexity and
+ * overhead, and wouldn't actually save any comparison cycles in the common
+ * case where the first key determines the comparison result. Note that
+ * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
+ *
+ * There is one special case: when the sort support infrastructure provides an
+ * "abbreviated key" representation, where the key is (typically) a pass by
+ * value proxy for a pass by reference type. In this case, the abbreviated key
+ * is stored in datum1 in place of the actual first key column.
+ *
+ * When sorting single Datums, the data value is represented directly by
+ * datum1/isnull1 for pass by value types (or null values). If the datatype is
+ * pass-by-reference and isnull1 is false, then "tuple" points to a separately
+ * palloc'd data value, otherwise "tuple" is NULL. The value of datum1 is then
+ * either the same pointer as "tuple", or is an abbreviated key value as
+ * described above. Accordingly, "tuple" is always used in preference to
+ * datum1 as the authoritative value for pass-by-reference cases.
+ */
+typedef struct
+{
+ void *tuple; /* the tuple itself */
+ Datum datum1; /* value of first key column */
+ bool isnull1; /* is first key column NULL? */
+ int srctape; /* source tape number */
+} SortTuple;
+
+typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
+ Tuplesortstate *state);
+
+/*
+ * The public part of a Tuple sort operation state. This data structure
+ * contains the definition of sort-variant-specific interface methods and
+ * the part of Tuple sort operation state required by their implementations.
+ */
+typedef struct
+{
+ /*
+ * These function pointers decouple the routines that must know what kind
+ * of tuple we are sorting from the routines that don't need to know it.
+ * They are set up by the tuplesort_begin_xxx routines.
+ *
+ * Function to compare two tuples; result is per qsort() convention, ie:
+ * <0, 0, >0 according as a<b, a=b, a>b. The API must match
+ * qsort_arg_comparator.
+ */
+ SortTupleComparator comparetup;
+
+ /*
+ * Alter datum1 representation in the SortTuple's array back from the
+ * abbreviated key to the first column value.
+ */
+ void (*removeabbrev) (Tuplesortstate *state, SortTuple *stups,
+ int count);
+
+ /*
+ * Function to write a stored tuple onto tape. The representation of the
+ * tuple on tape need not be the same as it is in memory.
+ */
+ void (*writetup) (Tuplesortstate *state, LogicalTape *tape,
+ SortTuple *stup);
+
+ /*
+ * Function to read a stored tuple from tape back into memory. 'len' is
+ * the already-read length of the stored tuple. The tuple is allocated
+ * from the slab memory arena, or is palloc'd, see
+ * tuplesort_readtup_alloc().
+ */
+ void (*readtup) (Tuplesortstate *state, SortTuple *stup,
+ LogicalTape *tape, unsigned int len);
+
+ /*
+ * Function to do some specific release of resources for the sort variant.
+ * In particular, this function should free everything stored in the "arg"
+ * field, which wouldn't be cleared on reset of the Tuple sort memory
+ * contexts. This can be NULL if nothing specific needs to be done.
+ */
+ void (*freestate) (Tuplesortstate *state);
+
+ /*
+ * The subsequent fields are used in the implementations of the functions
+ * above.
+ */
+ MemoryContext maincontext; /* memory context for tuple sort metadata that
+ * persists across multiple batches */
+ MemoryContext sortcontext; /* memory context holding most sort data */
+ MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
+
+ /*
+ * Whether SortTuple's datum1 and isnull1 members are maintained by the
+ * above routines. If not, some sort specializations are disabled.
+ */
+ bool haveDatum1;
+
+ /*
+ * The sortKeys variable is used by every case other than the hash index
+ * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the
+ * MinimalTuple and CLUSTER routines, though.
+ */
+ int nKeys; /* number of columns in sort key */
+ SortSupport sortKeys; /* array of length nKeys */
+
+ /*
+ * This variable is shared by the single-key MinimalTuple case and the
+ * Datum case (which both use qsort_ssup()). Otherwise, it's NULL. The
+ * presence of a value in this field is also checked by various sort
+ * specialization functions as an optimization when comparing the leading
+ * key in a tiebreak situation to determine if there are any subsequent
+ * keys to sort on.
+ */
+ SortSupport onlyKey;
+
+ int sortopt; /* Bitmask of flags used to setup sort */
+
+ bool tuples; /* Can SortTuple.tuple ever be set? */
+
+ void *arg; /* Specific information for the sort variant */
+} TuplesortPublic;
+
+/* Sort parallel code from state for sort__start probes */
+#define PARALLEL_SORT(coordinate) (coordinate == NULL || \
+ (coordinate)->sharedsort == NULL ? 0 : \
+ (coordinate)->isWorker ? 1 : 2)
+
+#define TuplesortstateGetPublic(state) ((TuplesortPublic *) state)
+
+/* When using this macro, beware of double evaluation of len */
+#define LogicalTapeReadExact(tape, ptr, len) \
+ do { \
+ if (LogicalTapeRead(tape, ptr, len) != (size_t) (len)) \
+ elog(ERROR, "unexpected end of data"); \
+ } while(0)
/*
* We provide multiple interfaces to what is essentially the same code,
@@ -205,6 +349,50 @@ typedef struct TuplesortInstrumentation
* generated (typically, caller uses a parallel heap scan).
*/
+
+extern Tuplesortstate *tuplesort_begin_common(int workMem,
+ SortCoordinate coordinate,
+ int sortopt);
+extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
+extern bool tuplesort_used_bound(Tuplesortstate *state);
+extern void tuplesort_puttuple_common(Tuplesortstate *state,
+ SortTuple *tuple, bool useAbbrev);
+extern void tuplesort_performsort(Tuplesortstate *state);
+extern bool tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
+ SortTuple *stup);
+extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
+ bool forward);
+extern void tuplesort_end(Tuplesortstate *state);
+extern void tuplesort_reset(Tuplesortstate *state);
+
+extern void tuplesort_get_stats(Tuplesortstate *state,
+ TuplesortInstrumentation *stats);
+extern const char *tuplesort_method_name(TuplesortMethod m);
+extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
+
+extern int tuplesort_merge_order(int64 allowedMem);
+
+extern Size tuplesort_estimate_shared(int nworkers);
+extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
+ dsm_segment *seg);
+extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
+
+/*
+ * These routines may only be called if randomAccess was specified 'true'.
+ * Likewise, backwards scan in gettuple/getdatum is only allowed if
+ * randomAccess was specified. Note that parallel sorts do not support
+ * randomAccess.
+ */
+
+extern void tuplesort_rescan(Tuplesortstate *state);
+extern void tuplesort_markpos(Tuplesortstate *state);
+extern void tuplesort_restorepos(Tuplesortstate *state);
+
+extern void *tuplesort_readtup_alloc(Tuplesortstate *state, Size tuplen);
+
+
+/* tuplesortvariants.c */
+
extern Tuplesortstate *tuplesort_begin_heap(TupleDesc tupDesc,
int nkeys, AttrNumber *attNums,
Oid *sortOperators, Oid *sortCollations,
@@ -238,9 +426,6 @@ extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern void tuplesort_set_bound(Tuplesortstate *state, int64 bound);
-extern bool tuplesort_used_bound(Tuplesortstate *state);
-
extern void tuplesort_puttupleslot(Tuplesortstate *state,
TupleTableSlot *slot);
extern void tuplesort_putheaptuple(Tuplesortstate *state, HeapTuple tup);
@@ -250,8 +435,6 @@ extern void tuplesort_putindextuplevalues(Tuplesortstate *state,
extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
bool isNull);
-extern void tuplesort_performsort(Tuplesortstate *state);
-
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
@@ -259,34 +442,5 @@ extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
-extern bool tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples,
- bool forward);
-
-extern void tuplesort_end(Tuplesortstate *state);
-
-extern void tuplesort_reset(Tuplesortstate *state);
-
-extern void tuplesort_get_stats(Tuplesortstate *state,
- TuplesortInstrumentation *stats);
-extern const char *tuplesort_method_name(TuplesortMethod m);
-extern const char *tuplesort_space_type_name(TuplesortSpaceType t);
-
-extern int tuplesort_merge_order(int64 allowedMem);
-
-extern Size tuplesort_estimate_shared(int nworkers);
-extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
- dsm_segment *seg);
-extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
-
-/*
- * These routines may only be called if randomAccess was specified 'true'.
- * Likewise, backwards scan in gettuple/getdatum is only allowed if
- * randomAccess was specified. Note that parallel sorts do not support
- * randomAccess.
- */
-
-extern void tuplesort_rescan(Tuplesortstate *state);
-extern void tuplesort_markpos(Tuplesortstate *state);
-extern void tuplesort_restorepos(Tuplesortstate *state);
#endif /* TUPLESORT_H */
--
2.24.3 (Apple Git-128)
tuplesort_patch_test.zipapplication/zip; name=tuplesort_patch_test.zipDownload
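
To make the intended extension-side usage concrete, here is a rough sketch
(not part of the patchset, written against the v1 interface exposed above;
all of the *_int32 names are invented purely for illustration) of a sort
variant for plain int32 keys that an extension could define once
TuplesortPublic is available from tuplesort.h:

#include "postgres.h"
#include "utils/tuplesort.h"

/* Compare two in-memory keys; datum1 always holds the int32 value. */
static int
comparetup_int32(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
	int32		v1 = DatumGetInt32(a->datum1);
	int32		v2 = DatumGetInt32(b->datum1);

	return (v1 < v2) ? -1 : ((v1 > v2) ? 1 : 0);
}

/* No abbreviated keys are used, so there is nothing to restore. */
static void
removeabbrev_int32(Tuplesortstate *state, SortTuple *stups, int count)
{
}

/* Spill a key to tape: length word, then the bare Datum (cf. writetup_datum). */
static void
writetup_int32(Tuplesortstate *state, LogicalTape *tape, SortTuple *stup)
{
	unsigned int tuplen = sizeof(Datum) + sizeof(unsigned int);

	LogicalTapeWrite(tape, &tuplen, sizeof(tuplen));
	LogicalTapeWrite(tape, &stup->datum1, sizeof(Datum));
	/* TUPLESORT_RANDOMACCESS is not supported by this sketch */
}

/* Read a key back from tape into the caller-supplied SortTuple. */
static void
readtup_int32(Tuplesortstate *state, SortTuple *stup,
			  LogicalTape *tape, unsigned int len)
{
	LogicalTapeReadExact(tape, &stup->datum1, sizeof(Datum));
	stup->isnull1 = false;
	stup->tuple = NULL;
}

Tuplesortstate *
tuplesort_begin_int32(int workMem)
{
	Tuplesortstate *state = tuplesort_begin_common(workMem, NULL, TUPLESORT_NONE);
	TuplesortPublic *base = TuplesortstateGetPublic(state);

	/*
	 * A real variant would switch to base->maincontext here before
	 * allocating sortKeys or an "arg" structure; this sketch needs neither.
	 */
	base->removeabbrev = removeabbrev_int32;
	base->comparetup = comparetup_int32;
	base->writetup = writetup_int32;
	base->readtup = readtup_int32;
	base->freestate = NULL;		/* nothing variant-specific to release */
	base->haveDatum1 = true;
	base->nKeys = 1;
	base->tuples = false;		/* SortTuple.tuple is never set */

	return state;
}

void
tuplesort_putint32(Tuplesortstate *state, int32 val)
{
	SortTuple	stup;

	stup.tuple = NULL;
	stup.datum1 = Int32GetDatum(val);
	stup.isnull1 = false;

	tuplesort_puttuple_common(state, &stup, false);
}

Retrieval would then go through tuplesort_performsort() and
tuplesort_gettuple_common(), much like tuplesort_getdatum() does.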