tuplesort_gettuple_common() and *should_free argument
I noticed that the routine tuplesort_gettuple_common() fails to set
*should_free in all paths in master branch (no bug in 9.6). Consider
the TSS_SORTEDONTAPE case, where we can return false without also
setting "*should_free = false" to instruct caller not to pfree() tuple
memory that tuplesort still owns. I suppose that this is okay because
caller should never pfree() a tuple when NULL pointer was returned at
higher level (as "pointer-to-tuple"). Even still, it seems like a bad
idea to rely on caller testing things such that caller doesn't go on
to pfree() a NULL pointer when seemingly instructed to do so by
tuplesort "get tuple" interface routine (that is, some higher level
routine that calls tuplesort_gettuple_common()).
More importantly, there are no remaining cases where
tuplesort_gettuple_common() sets "*should_free = true", because there
is no remaining need for caller to *ever* pfree() tuple. Moreover, I
don't anticipate any future need for this -- the scheme recently
established around per-tape "slab slots" makes it seem unlikely to me
that the need to set "*should_free = true" will ever arise again. So, for
Postgres 10, why not rip out the "*should_free" arguments that appear
in a bunch of places? This lets us slightly simplify most tuplesort.c
callers.
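To make the interface change concrete, here is a minimal, self-contained sketch (a toy model with invented names like `Sorter`, not PostgreSQL code) of the before/after caller contract:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tuplesort whose returned tuples it always owns. */
typedef struct Sorter
{
    const char **tuples;        /* sorter-owned tuple storage */
    int          count;
    int          current;
} Sorter;

/*
 * Old-style interface: an out-parameter tells the caller whether it must
 * pfree() the result.  Since every code path now reports false, the
 * argument only invites caller mistakes.
 */
static const char *
sorter_gettuple_old(Sorter *s, bool *should_free)
{
    *should_free = false;       /* always false on every path today */
    if (s->current >= s->count)
        return NULL;
    return s->tuples[s->current++];
}

/*
 * New-style interface: the ownership rule ("never free; invalid after the
 * next call") lives in the documented contract, not in a runtime flag.
 */
static const char *
sorter_gettuple(Sorter *s)
{
    if (s->current >= s->count)
        return NULL;
    return s->tuples[s->current++];
}
```

The flag never varies, so deleting it loses nothing and removes the temptation for callers to pfree() sorter-owned memory.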
Now, it is still true that at least some callers to
tuplesort_gettupleslot() (but not any other "get tuple" interface
routine) are going to *require* ownership of memory for returned
tuples, which means a call to heap_copy_minimal_tuple() remains
necessary there (see recent commit
d8589946ddd5c43e1ebd01c5e92d0e177cbfc198 for details). But, that's
beside the point. In the master branch only, the
tuplesort_gettupleslot() test "if (!should_free)" seems tautological,
just as similar tests are for every other tuplesort_gettuple_common()
caller. So, tuplesort_gettupleslot() can safely assume it should
*always* heap_copy_minimal_tuple() in the master branch. (BTW, I'm
going to teach tuplesort_gettuple_common() to avoid this
heap_copy_minimal_tuple() call for callers that don't actually need
it, but that's a discussion for another thread).
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/22/2016 02:45 AM, Peter Geoghegan wrote:
I noticed that the routine tuplesort_gettuple_common() fails to set
*should_free in all paths in master branch (no bug in 9.6). Consider
the TSS_SORTEDONTAPE case, where we can return false without also
setting "*should_free = false" to instruct caller not to pfree() tuple
memory that tuplesort still owns. I suppose that this is okay because
caller should never pfree() a tuple when NULL pointer was returned at
higher level (as "pointer-to-tuple"). Even still, it seems like a bad
idea to rely on caller testing things such that caller doesn't go on
to pfree() a NULL pointer when seemingly instructed to do so by
tuplesort "get tuple" interface routine (that is, some higher level
routine that calls tuplesort_gettuple_common()).

More importantly, there are no remaining cases where
tuplesort_gettuple_common() sets "*should_free = true", because there
is no remaining need for caller to *ever* pfree() tuple. Moreover, I
don't anticipate any future need for this -- the scheme recently
established around per-tape "slab slots" makes it seem unlikely to me
that the need to set "*should_free = true" will ever arise again. So, for
Postgres 10, why not rip out the "*should_free" arguments that appear
in a bunch of places? This lets us slightly simplify most tuplesort.c
callers.
Yeah, I agree we should just remove the *should_free arguments.
Should we be worried about breaking the API of tuplesort_get* functions?
They might be used by extensions. I think that's OK, but wanted to bring
it up. This would be only for master, of course, and any breakage would
be straightforward to fix.
Now, it is still true that at least some callers to
tuplesort_gettupleslot() (but not any other "get tuple" interface
routine) are going to *require* ownership of memory for returned
tuples, which means a call to heap_copy_minimal_tuple() remains
necessary there (see recent commit
d8589946ddd5c43e1ebd01c5e92d0e177cbfc198 for details). But, that's
beside the point. In the master branch only, the
tuplesort_gettupleslot() test "if (!should_free)" seems tautological,
just as similar tests are for every other tuplesort_gettuple_common()
caller. So, tuplesort_gettupleslot() can safely assume it should
*always* heap_copy_minimal_tuple() in the master branch. (BTW, I'm
going to teach tuplesort_gettuple_common() to avoid this
heap_copy_minimal_tuple() call for callers that don't actually need
it, but that's a discussion for another thread).
Yep.
- Heikki
On Wed, Dec 7, 2016 at 11:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Should we be worried about breaking the API of tuplesort_get* functions?
They might be used by extensions. I think that's OK, but wanted to bring it
up. This would be only for master, of course, and any breakage would be
straightforward to fix.
I don't think so. I'm not aware of any third party extensions that
call tuplesort.c routines, despite having looked for them. I noticed
that pg_repack doesn't. For any that do, they'll break in a
predictable, obvious way.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
On Wed, Dec 7, 2016 at 11:55 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Should we be worried about breaking the API of tuplesort_get* functions?
They might be used by extensions. I think that's OK, but wanted to bring it
up. This would be only for master, of course, and any breakage would be
straightforward to fix.
I don't think so. I'm not aware of any third party extensions that
call tuplesort.c routines, despite having looked for them. I noticed
that pg_repack doesn't. For any that do, they'll break in a
predictable, obvious way.
Adding or subtracting function arguments is something we do all the time.
As long as it's not proposed for back-patch, I don't see a problem.
regards, tom lane
On Fri, Oct 21, 2016 at 4:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
More importantly, there are no remaining cases where
tuplesort_gettuple_common() sets "*should_free = true", because there
is no remaining need for caller to *ever* pfree() tuple. Moreover, I
don't anticipate any future need for this -- the scheme recently
established around per-tape "slab slots" makes it seem unlikely to me
that the need to set "*should_free = true" will ever arise again.
Attached patch 0001-* removes all should_free arguments. To reiterate,
this is purely a refactoring patch.
Now, it is still true that at least some callers to
tuplesort_gettupleslot() (but not any other "get tuple" interface
routine) are going to *require* ownership of memory for returned
tuples, which means a call to heap_copy_minimal_tuple() remains
necessary there (see recent commit
d8589946ddd5c43e1ebd01c5e92d0e177cbfc198 for details). But, that's
beside the point. In the master branch only, the
tuplesort_gettupleslot() test "if (!should_free)" seems tautological,
just as similar tests are for every other tuplesort_gettuple_common()
caller. So, tuplesort_gettupleslot() can safely assume it should
*always* heap_copy_minimal_tuple() in the master branch. (BTW, I'm
going to teach tuplesort_gettuple_common() to avoid this
heap_copy_minimal_tuple() call for callers that don't actually need
it, but that's a discussion for another thread).
I decided that it made sense to address this at the same time after
all. So, the second patch attached here, 0002-*, goes on to do so.
This is a performance optimization that we already agreed was a good
idea for Postgres 10 when d8589946ddd5c43e1ebd01c5e92d0e177cbfc198
went in.
Note that the call sites that now opt out of receiving their own
personal copy are essentially returning to their behavior for the
entire beta phase of PostgreSQL 9.6. The problem with that accidental
state of affairs was that it wasn't always safe. But, it was still
generally safe for the majority of callers, I believe, not least
because it took months to hear any complaints at all. That majority of
callers now opt out.
The main question for reviewers of 0002-*, then, is whether or not I
have correctly determined which particular callers can safely opt out.
Finally, 0003-* is a Valgrind suppression borrowed from my parallel
CREATE INDEX patch. It's self-explanatory.
--
Peter Geoghegan
Attachments:
0003-Add-valgrind-suppression-for-logtape_write.patch (text/x-patch)
From 440f9724362c376807c36d814046015285e5c237 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Fri, 9 Dec 2016 13:49:18 -0800
Subject: [PATCH 3/3] Add valgrind suppression for logtape_write
This avoids having Valgrind report that logtape.c writes out
uninitialized memory.
---
src/tools/valgrind.supp | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/tools/valgrind.supp b/src/tools/valgrind.supp
index af03051..775db0d 100644
--- a/src/tools/valgrind.supp
+++ b/src/tools/valgrind.supp
@@ -120,6 +120,15 @@
fun:RelationMapFinishBootstrap
}
+{
+ logtape_write
+ Memcheck:Param
+ write(buf)
+
+ ...
+ fun:LogicalTape*
+}
+
# gcc on ppc64 can generate a four-byte read to fetch the final "char" fields
# of a FormData_pg_cast. This is valid compiler behavior, because a proper
--
2.7.4
0002-Avoid-copying-within-tuplesort_gettupleslot.patch (text/x-patch)
From 3e3ef067713527e210ce76511955a5220fbbf15d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 13 Oct 2016 10:54:31 -0700
Subject: [PATCH 2/3] Avoid copying within tuplesort_gettupleslot()
Add a "copy" argument to make it optional to receive a copy of caller
tuple that is safe to use across tuplesort_gettupleslot() calls, as a
performance optimization. Existing tuplesort_gettupleslot() callers are
made to opt out of copying where that has been determined to be safe.
This brings tuplesort_gettupleslot() in line with
tuplestore_gettupleslot().
In the future, a "copy" tuplesort_getdatum() argument may be added, that
similarly allows callers to opt out of receiving a new copy of tuple
under their direct ownership.
---
src/backend/executor/nodeAgg.c | 9 ++++++---
src/backend/executor/nodeSort.c | 5 +++--
src/backend/utils/adt/orderedsetaggs.c | 5 +++--
src/backend/utils/sort/tuplesort.c | 17 +++++++++++------
src/include/utils/tuplesort.h | 2 +-
5 files changed, 24 insertions(+), 14 deletions(-)
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index eefb3d6..16a1263 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -570,6 +570,9 @@ initialize_phase(AggState *aggstate, int newphase)
* Fetch a tuple from either the outer plan (for phase 0) or from the sorter
* populated by the previous phase. Copy it to the sorter for the next phase
* if any.
+ *
+ * Callers cannot rely on memory for tuple in returned slot remaining allocated
+ * past any subsequent call here. (The sorter may recycle the memory.)
*/
static TupleTableSlot *
fetch_input_tuple(AggState *aggstate)
@@ -578,8 +581,8 @@ fetch_input_tuple(AggState *aggstate)
if (aggstate->sort_in)
{
- if (!tuplesort_gettupleslot(aggstate->sort_in, true, aggstate->sort_slot,
- NULL))
+ if (!tuplesort_gettupleslot(aggstate->sort_in, true, false,
+ aggstate->sort_slot, NULL))
return NULL;
slot = aggstate->sort_slot;
}
@@ -1273,7 +1276,7 @@ process_ordered_aggregate_multi(AggState *aggstate,
ExecClearTuple(slot2);
while (tuplesort_gettupleslot(pertrans->sortstates[aggstate->current_set],
- true, slot1, &newAbbrevVal))
+ true, true, slot1, &newAbbrevVal))
{
/*
* Extract the first numTransInputs columns as datums to pass to the
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index a34dcc5..8fc7106 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -132,12 +132,13 @@ ExecSort(SortState *node)
/*
* Get the first or next tuple from tuplesort. Returns NULL if no more
- * tuples.
+ * tuples. Note that we rely on memory for tuple in slot remaining
+ * allocated only until subsequent call here.
*/
slot = node->ss.ps.ps_ResultTupleSlot;
(void) tuplesort_gettupleslot(tuplesortstate,
ScanDirectionIsForward(dir),
- slot, NULL);
+ false, slot, NULL);
return slot;
}
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index fe44d56..3c23329 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -1189,7 +1189,7 @@ hypothetical_rank_common(FunctionCallInfo fcinfo, int flag,
tuplesort_performsort(osastate->sortstate);
/* iterate till we find the hypothetical row */
- while (tuplesort_gettupleslot(osastate->sortstate, true, slot, NULL))
+ while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot, NULL))
{
bool isnull;
Datum d = slot_getattr(slot, nargs + 1, &isnull);
@@ -1352,7 +1352,8 @@ hypothetical_dense_rank_final(PG_FUNCTION_ARGS)
slot2 = extraslot;
/* iterate till we find the hypothetical row */
- while (tuplesort_gettupleslot(osastate->sortstate, true, slot, &abbrevVal))
+ while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot,
+ &abbrevVal))
{
bool isnull;
Datum d = slot_getattr(slot, nargs + 1, &isnull);
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 0320f0d..d60159a 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2083,12 +2083,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
* NULL value in leading attribute will set abbreviated value to zeroed
* representation, which caller may rely on in abbreviated inequality check.
*
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot may just receive a pointer
+ * to a tuple held within the tuplesort. The latter is more efficient, but
+ * the slot contents may be corrupted if there is another call here before
+ * previous slot contents are used.
*/
bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
@@ -2105,8 +2108,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
if (state->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, false);
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
return true;
}
else
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 38af746..ad7adca 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -93,7 +93,7 @@ extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
extern void tuplesort_performsort(Tuplesortstate *state);
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
- TupleTableSlot *slot, Datum *abbrev);
+ bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
--
2.7.4
0001-Remove-should_free-tuplesort-routine-arguments.patch (text/x-patch)
From 6ca426173e0de6499ebb1d136a2442a518a46215 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 8 Dec 2016 14:37:54 -0800
Subject: [PATCH 1/3] Remove should_free tuplesort routine arguments
Tuplesort routine should_free status arguments are redundant, as memory
for tuples returned from a sort are always owned by tuplesort in
practice. This has been the case since commit e94568ecc removed any
instances of caller tuples being returned as anything other than
pointers into tuplesort-owned buffers. Refactor interface of certain
tuplesort routines to reflect this general state of affairs.
There are two remaining cases where we must always copy caller tuples
because callers are likely to require ownership of memory: when the
caller passes a TupleTableSlot which tuplesort stores a heap tuple in
for caller, and when the caller gets a pass-by-reference Datum tuple
from tuplesort. In the future, it may be possible for some of the
callers of these two remaining functions to opt out of having tuplesort
produce a new copy for them, as a performance optimization, but that has
nothing to do with tuplesort memory management, and everything to do
with the requirements of those callers.
---
src/backend/access/hash/hashsort.c | 6 +---
src/backend/access/nbtree/nbtsort.c | 24 ++++----------
src/backend/commands/cluster.c | 6 +---
src/backend/utils/sort/tuplesort.c | 62 +++++++++++++------------------------
src/include/utils/tuplesort.h | 6 ++--
5 files changed, 31 insertions(+), 73 deletions(-)
diff --git a/src/backend/access/hash/hashsort.c b/src/backend/access/hash/hashsort.c
index 8938ab5..12fb78a 100644
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -104,15 +104,13 @@ void
_h_indexbuild(HSpool *hspool)
{
IndexTuple itup;
- bool should_free;
#ifdef USE_ASSERT_CHECKING
uint32 hashkey = 0;
#endif
tuplesort_performsort(hspool->sortstate);
- while ((itup = tuplesort_getindextuple(hspool->sortstate,
- true, &should_free)) != NULL)
+ while ((itup = tuplesort_getindextuple(hspool->sortstate, true)) != NULL)
{
/*
* Technically, it isn't critical that hash keys be found in sorted
@@ -129,7 +127,5 @@ _h_indexbuild(HSpool *hspool)
#endif
_hash_doinsert(hspool->index, itup);
- if (should_free)
- pfree(itup);
}
}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 99a014e..7f65b5a 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -680,9 +680,7 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
bool merge = (btspool2 != NULL);
IndexTuple itup,
itup2 = NULL;
- bool should_free,
- should_free2,
- load1;
+ bool load1;
TupleDesc tupdes = RelationGetDescr(wstate->index);
int i,
keysz = RelationGetNumberOfAttributes(wstate->index);
@@ -697,10 +695,8 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
*/
/* the preparation of merge */
- itup = tuplesort_getindextuple(btspool->sortstate,
- true, &should_free);
- itup2 = tuplesort_getindextuple(btspool2->sortstate,
- true, &should_free2);
+ itup = tuplesort_getindextuple(btspool->sortstate, true);
+ itup2 = tuplesort_getindextuple(btspool2->sortstate, true);
indexScanKey = _bt_mkscankey_nodata(wstate->index);
/* Prepare SortSupport data for each column */
@@ -775,18 +771,12 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
if (load1)
{
_bt_buildadd(wstate, state, itup);
- if (should_free)
- pfree(itup);
- itup = tuplesort_getindextuple(btspool->sortstate,
- true, &should_free);
+ itup = tuplesort_getindextuple(btspool->sortstate, true);
}
else
{
_bt_buildadd(wstate, state, itup2);
- if (should_free2)
- pfree(itup2);
- itup2 = tuplesort_getindextuple(btspool2->sortstate,
- true, &should_free2);
+ itup2 = tuplesort_getindextuple(btspool2->sortstate, true);
}
}
pfree(sortKeys);
@@ -795,15 +785,13 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
{
/* merge is unnecessary */
while ((itup = tuplesort_getindextuple(btspool->sortstate,
- true, &should_free)) != NULL)
+ true)) != NULL)
{
/* When we see first tuple, create first index page */
if (state == NULL)
state = _bt_pagestate(wstate, 0);
_bt_buildadd(wstate, state, itup);
- if (should_free)
- pfree(itup);
}
}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index dc1f79f..2131226 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1057,11 +1057,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
for (;;)
{
HeapTuple tuple;
- bool shouldfree;
CHECK_FOR_INTERRUPTS();
- tuple = tuplesort_getheaptuple(tuplesort, true, &shouldfree);
+ tuple = tuplesort_getheaptuple(tuplesort, true);
if (tuple == NULL)
break;
@@ -1069,9 +1068,6 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
oldTupDesc, newTupDesc,
values, isnull,
NewHeap->rd_rel->relhasoids, rwstate);
-
- if (shouldfree)
- heap_freetuple(tuple);
}
tuplesort_end(tuplesort);
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index baf87b3..0320f0d 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1841,12 +1841,12 @@ tuplesort_performsort(Tuplesortstate *state)
/*
* Internal routine to fetch the next tuple in either forward or back
* direction into *stup. Returns FALSE if no more tuples.
- * If *should_free is set, the caller must pfree stup.tuple when done with it.
- * Otherwise, caller should not use tuple following next call here.
+ * Returned tuple belongs to tuplesort memory context, and must not be freed
+ * by caller. Caller should not use tuple following next call here.
*/
static bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
- SortTuple *stup, bool *should_free)
+ SortTuple *stup)
{
unsigned int tuplen;
@@ -1855,7 +1855,6 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
case TSS_SORTEDINMEM:
Assert(forward || state->randomAccess);
Assert(!state->slabAllocatorUsed);
- *should_free = false;
if (forward)
{
if (state->current < state->memtupcount)
@@ -1927,7 +1926,6 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
*/
state->lastReturnedTuple = stup->tuple;
- *should_free = false;
return true;
}
else
@@ -2008,14 +2006,12 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
*/
state->lastReturnedTuple = stup->tuple;
- *should_free = false;
return true;
case TSS_FINALMERGE:
Assert(forward);
/* We are managing memory ourselves, with the slab allocator. */
Assert(state->slabAllocatorUsed);
- *should_free = false;
/*
* The slab slot holding the tuple that we returned in previous
@@ -2097,9 +2093,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- bool should_free;
- if (!tuplesort_gettuple_common(state, forward, &stup, &should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
MemoryContextSwitchTo(oldcontext);
@@ -2110,12 +2105,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
if (state->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (!should_free)
- {
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
- should_free = true;
- }
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, should_free);
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, false);
return true;
}
else
@@ -2127,18 +2118,17 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
/*
* Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. If *should_free is set, the
- * caller must pfree the returned tuple when done with it.
- * If it is not set, caller should not use tuple following next
- * call here.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller should not use tuple
+ * following next call here.
*/
HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- if (!tuplesort_gettuple_common(state, forward, &stup, should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
MemoryContextSwitchTo(oldcontext);
@@ -2148,19 +2138,17 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
/*
* Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. If *should_free is set, the
- * caller must pfree the returned tuple when done with it.
- * If it is not set, caller should not use tuple following next
- * call here.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller should not use tuple
+ * following next call here.
*/
IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward,
- bool *should_free)
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- if (!tuplesort_gettuple_common(state, forward, &stup, should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
MemoryContextSwitchTo(oldcontext);
@@ -2173,7 +2161,8 @@ tuplesort_getindextuple(Tuplesortstate *state, bool forward,
* Returns FALSE if no more datums.
*
* If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * and is now owned by the caller.
+ * and is now owned by the caller (this differs from similar routines for
+ * other types of tuplesorts).
*
* Caller may optionally be passed back abbreviated value (on TRUE return
* value) when abbreviation was used, which can be used to cheaply avoid
@@ -2188,9 +2177,8 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- bool should_free;
- if (!tuplesort_gettuple_common(state, forward, &stup, &should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
{
MemoryContextSwitchTo(oldcontext);
return false;
@@ -2208,11 +2196,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
-
- if (should_free)
- *val = PointerGetDatum(stup.tuple);
- else
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
*isNull = false;
}
@@ -2270,16 +2254,12 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
while (ntuples-- > 0)
{
SortTuple stup;
- bool should_free;
- if (!tuplesort_gettuple_common(state, forward,
- &stup, &should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
{
MemoryContextSwitchTo(oldcontext);
return false;
}
- if (should_free && stup.tuple)
- pfree(stup.tuple);
CHECK_FOR_INTERRUPTS();
}
MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 5cecd6d..38af746 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -94,10 +94,8 @@ extern void tuplesort_performsort(Tuplesortstate *state);
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
TupleTableSlot *slot, Datum *abbrev);
-extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward,
- bool *should_free);
-extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward,
- bool *should_free);
+extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
+extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
--
2.7.4
On Fri, Dec 9, 2016 at 5:59 PM, Peter Geoghegan <pg@heroku.com> wrote:
Attached patch 0001-* removes all should_free arguments. To reiterate,
this is purely a refactoring patch.
I think this patch might have a bug. In the existing code,
tuplesort_gettupleslot sets should_free = true if it isn't already
just before calling ExecStoreMinimalTuple((MinimalTuple) stup.tuple,
slot, should_free), so it seems that ExecStoreMinimalTuple() will
always get "true" as the fourth argument. However the patch changes
that line of code like this:
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, false);
So the patch seems to have the effect of changing the fourth argument
to this call to ExecStoreMinimalTuple() from always-true to
always-false. I might be missing something, but my guess is that's
not right.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Dec 12, 2016 at 9:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I think this patch might have a bug. In the existing code,
tuplesort_gettupleslot sets should_free = true if it isn't already
just before calling ExecStoreMinimalTuple((MinimalTuple) stup.tuple,
slot, should_free), so it seems that ExecStoreMinimalTuple() will
always get "true" as the fourth argument. However the patch changes
that line of code like this:

+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, false);
So the patch seems to have the effect of changing the fourth argument
to this call to ExecStoreMinimalTuple() from always-true to
always-false. I might be missing something, but my guess is that's
not right.
There was a memory leak added by 0001-*, but then fixed by 0002-*. I
should have done more testing of 0001-* alone. Oops.
Attached revision of 0001-* fixes this. A revised 0002-* is also
attached, just as a convenience for reviewers (they won't need to
resolve the conflict themselves).
--
Peter Geoghegan
Attachments:
0002-Avoid-copying-within-tuplesort_gettupleslot.patch (text/x-patch)
From 541a7f0ae6060763cd4448359159c1a2c5980a68 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 13 Oct 2016 10:54:31 -0700
Subject: [PATCH 2/3] Avoid copying within tuplesort_gettupleslot()
Add a "copy" argument to make it optional to receive a copy of caller
tuple that is safe to use across tuplesort_gettupleslot() calls, as a
performance optimization. Existing tuplesort_gettupleslot() callers are
made to opt out of copying where that has been determined to be safe.
This brings tuplesort_gettupleslot() in line with
tuplestore_gettupleslot().
In the future, a "copy" tuplesort_getdatum() argument may be added, that
similarly allows callers to opt out of receiving a new copy of tuple
under their direct ownership.
---
src/backend/executor/nodeAgg.c | 9 ++++++---
src/backend/executor/nodeSort.c | 5 +++--
src/backend/utils/adt/orderedsetaggs.c | 5 +++--
src/backend/utils/sort/tuplesort.c | 17 +++++++++++------
src/include/utils/tuplesort.h | 2 +-
5 files changed, 24 insertions(+), 14 deletions(-)
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index eefb3d6..16a1263 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -570,6 +570,9 @@ initialize_phase(AggState *aggstate, int newphase)
* Fetch a tuple from either the outer plan (for phase 0) or from the sorter
* populated by the previous phase. Copy it to the sorter for the next phase
* if any.
+ *
+ * Callers cannot rely on memory for tuple in returned slot remaining allocated
+ * past any subsequent call here. (The sorter may recycle the memory.)
*/
static TupleTableSlot *
fetch_input_tuple(AggState *aggstate)
@@ -578,8 +581,8 @@ fetch_input_tuple(AggState *aggstate)
if (aggstate->sort_in)
{
- if (!tuplesort_gettupleslot(aggstate->sort_in, true, aggstate->sort_slot,
- NULL))
+ if (!tuplesort_gettupleslot(aggstate->sort_in, true, false,
+ aggstate->sort_slot, NULL))
return NULL;
slot = aggstate->sort_slot;
}
@@ -1273,7 +1276,7 @@ process_ordered_aggregate_multi(AggState *aggstate,
ExecClearTuple(slot2);
while (tuplesort_gettupleslot(pertrans->sortstates[aggstate->current_set],
- true, slot1, &newAbbrevVal))
+ true, true, slot1, &newAbbrevVal))
{
/*
* Extract the first numTransInputs columns as datums to pass to the
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index a34dcc5..8fc7106 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -132,12 +132,13 @@ ExecSort(SortState *node)
/*
* Get the first or next tuple from tuplesort. Returns NULL if no more
- * tuples.
+ * tuples. Note that we rely on memory for tuple in slot remaining
+ * allocated only until subsequent call here.
*/
slot = node->ss.ps.ps_ResultTupleSlot;
(void) tuplesort_gettupleslot(tuplesortstate,
ScanDirectionIsForward(dir),
- slot, NULL);
+ false, slot, NULL);
return slot;
}
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index fe44d56..3c23329 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -1189,7 +1189,7 @@ hypothetical_rank_common(FunctionCallInfo fcinfo, int flag,
tuplesort_performsort(osastate->sortstate);
/* iterate till we find the hypothetical row */
- while (tuplesort_gettupleslot(osastate->sortstate, true, slot, NULL))
+ while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot, NULL))
{
bool isnull;
Datum d = slot_getattr(slot, nargs + 1, &isnull);
@@ -1352,7 +1352,8 @@ hypothetical_dense_rank_final(PG_FUNCTION_ARGS)
slot2 = extraslot;
/* iterate till we find the hypothetical row */
- while (tuplesort_gettupleslot(osastate->sortstate, true, slot, &abbrevVal))
+ while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot,
+ &abbrevVal))
{
bool isnull;
Datum d = slot_getattr(slot, nargs + 1, &isnull);
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 71524c2..d60159a 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2083,12 +2083,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
* NULL value in leading attribute will set abbreviated value to zeroed
* representation, which caller may rely on in abbreviated inequality check.
*
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot may just receive a pointer
+ * to a tuple held within the tuplesort. The latter is more efficient, but
+ * the slot contents may be corrupted if there is another call here before
+ * previous slot contents are used.
*/
bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
@@ -2105,8 +2108,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
if (state->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
return true;
}
else
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 38af746..ad7adca 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -93,7 +93,7 @@ extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
extern void tuplesort_performsort(Tuplesortstate *state);
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
- TupleTableSlot *slot, Datum *abbrev);
+ bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
--
2.7.4
Attachment: 0001-Remove-should_free-tuplesort-routine-arguments.patch (text/x-patch)
From 7b67ec71b8291fb3051a924919f2f1c82e8fe121 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 8 Dec 2016 14:37:54 -0800
Subject: [PATCH 1/3] Remove should_free tuplesort routine arguments
Tuplesort routine should_free status arguments are redundant, as memory
for tuples returned from a sort is always owned by tuplesort in
practice. This has been the case since commit e94568ecc removed any
instances of caller tuples being returned as anything other than
pointers into tuplesort-owned buffers. Refactor interface of certain
tuplesort routines to reflect this general state of affairs.
There are two remaining cases where we must always copy caller tuples
because callers are likely to require ownership of memory: when the
caller passes a TupleTableSlot which tuplesort stores a heap tuple in
for caller, and when the caller gets a pass-by-reference Datum tuple
from tuplesort. In the future, it may be possible for some of the
callers of these two remaining functions to opt out of having tuplesort
produce a new copy for them, as a performance optimization, but that has
nothing to do with tuplesort memory management, and everything to do
with the requirements of those callers.
---
src/backend/access/hash/hashsort.c | 6 +---
src/backend/access/nbtree/nbtsort.c | 24 ++++----------
src/backend/commands/cluster.c | 6 +---
src/backend/utils/sort/tuplesort.c | 62 +++++++++++++------------------------
src/include/utils/tuplesort.h | 6 ++--
5 files changed, 31 insertions(+), 73 deletions(-)
diff --git a/src/backend/access/hash/hashsort.c b/src/backend/access/hash/hashsort.c
index 8938ab5..12fb78a 100644
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -104,15 +104,13 @@ void
_h_indexbuild(HSpool *hspool)
{
IndexTuple itup;
- bool should_free;
#ifdef USE_ASSERT_CHECKING
uint32 hashkey = 0;
#endif
tuplesort_performsort(hspool->sortstate);
- while ((itup = tuplesort_getindextuple(hspool->sortstate,
- true, &should_free)) != NULL)
+ while ((itup = tuplesort_getindextuple(hspool->sortstate, true)) != NULL)
{
/*
* Technically, it isn't critical that hash keys be found in sorted
@@ -129,7 +127,5 @@ _h_indexbuild(HSpool *hspool)
#endif
_hash_doinsert(hspool->index, itup);
- if (should_free)
- pfree(itup);
}
}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 99a014e..7f65b5a 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -680,9 +680,7 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
bool merge = (btspool2 != NULL);
IndexTuple itup,
itup2 = NULL;
- bool should_free,
- should_free2,
- load1;
+ bool load1;
TupleDesc tupdes = RelationGetDescr(wstate->index);
int i,
keysz = RelationGetNumberOfAttributes(wstate->index);
@@ -697,10 +695,8 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
*/
/* the preparation of merge */
- itup = tuplesort_getindextuple(btspool->sortstate,
- true, &should_free);
- itup2 = tuplesort_getindextuple(btspool2->sortstate,
- true, &should_free2);
+ itup = tuplesort_getindextuple(btspool->sortstate, true);
+ itup2 = tuplesort_getindextuple(btspool2->sortstate, true);
indexScanKey = _bt_mkscankey_nodata(wstate->index);
/* Prepare SortSupport data for each column */
@@ -775,18 +771,12 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
if (load1)
{
_bt_buildadd(wstate, state, itup);
- if (should_free)
- pfree(itup);
- itup = tuplesort_getindextuple(btspool->sortstate,
- true, &should_free);
+ itup = tuplesort_getindextuple(btspool->sortstate, true);
}
else
{
_bt_buildadd(wstate, state, itup2);
- if (should_free2)
- pfree(itup2);
- itup2 = tuplesort_getindextuple(btspool2->sortstate,
- true, &should_free2);
+ itup2 = tuplesort_getindextuple(btspool2->sortstate, true);
}
}
pfree(sortKeys);
@@ -795,15 +785,13 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
{
/* merge is unnecessary */
while ((itup = tuplesort_getindextuple(btspool->sortstate,
- true, &should_free)) != NULL)
+ true)) != NULL)
{
/* When we see first tuple, create first index page */
if (state == NULL)
state = _bt_pagestate(wstate, 0);
_bt_buildadd(wstate, state, itup);
- if (should_free)
- pfree(itup);
}
}
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index dc1f79f..2131226 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1057,11 +1057,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
for (;;)
{
HeapTuple tuple;
- bool shouldfree;
CHECK_FOR_INTERRUPTS();
- tuple = tuplesort_getheaptuple(tuplesort, true, &shouldfree);
+ tuple = tuplesort_getheaptuple(tuplesort, true);
if (tuple == NULL)
break;
@@ -1069,9 +1068,6 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
oldTupDesc, newTupDesc,
values, isnull,
NewHeap->rd_rel->relhasoids, rwstate);
-
- if (shouldfree)
- heap_freetuple(tuple);
}
tuplesort_end(tuplesort);
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index baf87b3..71524c2 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1841,12 +1841,12 @@ tuplesort_performsort(Tuplesortstate *state)
/*
* Internal routine to fetch the next tuple in either forward or back
* direction into *stup. Returns FALSE if no more tuples.
- * If *should_free is set, the caller must pfree stup.tuple when done with it.
- * Otherwise, caller should not use tuple following next call here.
+ * Returned tuple belongs to tuplesort memory context, and must not be freed
+ * by caller. Caller should not use tuple following next call here.
*/
static bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
- SortTuple *stup, bool *should_free)
+ SortTuple *stup)
{
unsigned int tuplen;
@@ -1855,7 +1855,6 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
case TSS_SORTEDINMEM:
Assert(forward || state->randomAccess);
Assert(!state->slabAllocatorUsed);
- *should_free = false;
if (forward)
{
if (state->current < state->memtupcount)
@@ -1927,7 +1926,6 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
*/
state->lastReturnedTuple = stup->tuple;
- *should_free = false;
return true;
}
else
@@ -2008,14 +2006,12 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
*/
state->lastReturnedTuple = stup->tuple;
- *should_free = false;
return true;
case TSS_FINALMERGE:
Assert(forward);
/* We are managing memory ourselves, with the slab allocator. */
Assert(state->slabAllocatorUsed);
- *should_free = false;
/*
* The slab slot holding the tuple that we returned in previous
@@ -2097,9 +2093,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- bool should_free;
- if (!tuplesort_gettuple_common(state, forward, &stup, &should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
MemoryContextSwitchTo(oldcontext);
@@ -2110,12 +2105,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
if (state->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- if (!should_free)
- {
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
- should_free = true;
- }
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, should_free);
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
return true;
}
else
@@ -2127,18 +2118,17 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
/*
* Fetch the next tuple in either forward or back direction.
- * Returns NULL if no more tuples. If *should_free is set, the
- * caller must pfree the returned tuple when done with it.
- * If it is not set, caller should not use tuple following next
- * call here.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller should not use tuple
+ * following next call here.
*/
HeapTuple
-tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
+tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- if (!tuplesort_gettuple_common(state, forward, &stup, should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
MemoryContextSwitchTo(oldcontext);
@@ -2148,19 +2138,17 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
/*
* Fetch the next index tuple in either forward or back direction.
- * Returns NULL if no more tuples. If *should_free is set, the
- * caller must pfree the returned tuple when done with it.
- * If it is not set, caller should not use tuple following next
- * call here.
+ * Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
+ * context, and must not be freed by caller. Caller should not use tuple
+ * following next call here.
*/
IndexTuple
-tuplesort_getindextuple(Tuplesortstate *state, bool forward,
- bool *should_free)
+tuplesort_getindextuple(Tuplesortstate *state, bool forward)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- if (!tuplesort_gettuple_common(state, forward, &stup, should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
stup.tuple = NULL;
MemoryContextSwitchTo(oldcontext);
@@ -2173,7 +2161,8 @@ tuplesort_getindextuple(Tuplesortstate *state, bool forward,
* Returns FALSE if no more datums.
*
* If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * and is now owned by the caller.
+ * and is now owned by the caller (this differs from similar routines for
+ * other types of tuplesorts).
*
* Caller may optionally be passed back abbreviated value (on TRUE return
* value) when abbreviation was used, which can be used to cheaply avoid
@@ -2188,9 +2177,8 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
SortTuple stup;
- bool should_free;
- if (!tuplesort_gettuple_common(state, forward, &stup, &should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
{
MemoryContextSwitchTo(oldcontext);
return false;
@@ -2208,11 +2196,7 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
else
{
/* use stup.tuple because stup.datum1 may be an abbreviation */
-
- if (should_free)
- *val = PointerGetDatum(stup.tuple);
- else
- *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+ *val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
*isNull = false;
}
@@ -2270,16 +2254,12 @@ tuplesort_skiptuples(Tuplesortstate *state, int64 ntuples, bool forward)
while (ntuples-- > 0)
{
SortTuple stup;
- bool should_free;
- if (!tuplesort_gettuple_common(state, forward,
- &stup, &should_free))
+ if (!tuplesort_gettuple_common(state, forward, &stup))
{
MemoryContextSwitchTo(oldcontext);
return false;
}
- if (should_free && stup.tuple)
- pfree(stup.tuple);
CHECK_FOR_INTERRUPTS();
}
MemoryContextSwitchTo(oldcontext);
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 5cecd6d..38af746 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -94,10 +94,8 @@ extern void tuplesort_performsort(Tuplesortstate *state);
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
TupleTableSlot *slot, Datum *abbrev);
-extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward,
- bool *should_free);
-extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward,
- bool *should_free);
+extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
+extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
Datum *val, bool *isNull, Datum *abbrev);
--
2.7.4
On Mon, Dec 12, 2016 at 2:30 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Dec 12, 2016 at 9:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I think this patch might have a bug. In the existing code,
tuplesort_gettupleslot sets should_free = true if it isn't already
just before calling ExecStoreMinimalTuple((MinimalTuple) stup.tuple,
slot, should_free), so it seems that ExecStoreMinimalTuple() will
always get "true" as the fourth argument. However the patch changes
that line of code like this:
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, false);
So the patch seems to have the effect of changing the fourth argument
to this call to ExecStoreMinimalTuple() from always-true to
always-false. I might be missing something, but my guess is that's
not right.
There was a memory leak added by 0001-*, but then fixed by 0002-*. I
should have done more testing of 0001-* alone. Oops.
Attached revision of 0001-* fixes this. A revised 0002-* is also
attached, just as a convenience for reviewers (they won't need to
resolve the conflict themselves).
Committed 0001.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Dec 12, 2016 at 12:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Committed 0001.
Thanks.
I should have already specifically pointed out that the original
discussion on what became 0002-* is here:
postgr.es/m/7256.1476711733@sss.pgh.pa.us
As I said already, the general idea seems uncontroversial.
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
I should have already specifically pointed out that the original
discussion on what became 0002-* is here:
postgr.es/m/7256.1476711733@sss.pgh.pa.us
As I said already, the general idea seems uncontroversial.
I looked at the 0002 patch, and while the code is probably OK, I am
dissatisfied with this API spec:
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot may just receive a pointer
+ * to a tuple held within the tuplesort. The latter is more efficient, but
+ * the slot contents may be corrupted if there is another call here before
+ * previous slot contents are used.
What does "here" mean? If that means specifically "another call of
tuplesort_gettupleslot", say so. If "here" refers to the whole module,
it would be better to say something like "the slot contents may be
invalidated by any subsequent manipulation of the tuplesort's state".
In any case it'd be a good idea to delineate safe usage patterns, perhaps
"copy=FALSE is recommended only when the next tuplesort manipulation will
be another tuplesort_gettupleslot fetch into the same slot."
There are several other uses of "call here", both in this patch and
pre-existing in tuplesort.c, that I find equally vague and unsatisfactory.
Let's try to improve that.
regards, tom lane
On Wed, Jan 25, 2017 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I looked at the 0002 patch, and while the code is probably OK, I am
dissatisfied with this API spec:
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state.  Memory is
+ * owned by the caller.  If copy is FALSE, the slot may just receive a pointer
+ * to a tuple held within the tuplesort.  The latter is more efficient, but
+ * the slot contents may be corrupted if there is another call here before
+ * previous slot contents are used.
What does "here" mean? If that means specifically "another call of
tuplesort_gettupleslot", say so. If "here" refers to the whole module,
it would be better to say something like "the slot contents may be
invalidated by any subsequent manipulation of the tuplesort's state".
In any case it'd be a good idea to delineate safe usage patterns, perhaps
"copy=FALSE is recommended only when the next tuplesort manipulation will
be another tuplesort_gettupleslot fetch into the same slot."
I agree with your analysis.
It means "another call to tuplesort_gettupleslot", but I believe that
it would be safer (more future-proof) to actually specify "the slot
contents may be invalidated by any subsequent manipulation of the
tuplesort's state" instead.
There are several other uses of "call here", both in this patch and
pre-existing in tuplesort.c, that I find equally vague and unsatisfactory.
Let's try to improve that.
Should I write a patch along those lines?
--
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
It means "another call to tuplesort_gettupleslot", but I believe that
it would be safer (more future-proof) to actually specify "the slot
contents may be invalidated by any subsequent manipulation of the
tuplesort's state" instead.
WFM.
There are several other uses of "call here", both in this patch and
pre-existing in tuplesort.c, that I find equally vague and unsatisfactory.
Let's try to improve that.
Should I write a patch along those lines?
Please. You might want to hit the existing ones with a separate patch,
but it doesn't much matter; I'd be just as happy with a patch that did
both things.
regards, tom lane
On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Please. You might want to hit the existing ones with a separate patch,
but it doesn't much matter; I'd be just as happy with a patch that did
both things.
Got it.
--
Peter Geoghegan
[ in the service of closing out this thread... ]
Peter Geoghegan <pg@heroku.com> writes:
Finally, 0003-* is a Valgrind suppression borrowed from my parallel
CREATE INDEX patch. It's self-explanatory.
Um, I didn't find it all that self-explanatory. Why wouldn't we want
to avoid writing undefined data? I think the comment at least needs
to explain exactly what part of the written data might be uninitialized.
And I'd put the comment into valgrind.supp, too, not in the commit msg.
Also, the suppression seems far too broad. It would for instance
block any complaint about a write() invoked via an elog call from
any function invoked from any LogicalTape* function, no matter
how far removed.
regards, tom lane
On Thu, Jan 26, 2017 at 8:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ in the service of closing out this thread... ]
Peter Geoghegan <pg@heroku.com> writes:
Finally, 0003-* is a Valgrind suppression borrowed from my parallel
CREATE INDEX patch. It's self-explanatory.
Um, I didn't find it all that self-explanatory. Why wouldn't we want
to avoid writing undefined data? I think the comment at least needs
to explain exactly what part of the written data might be uninitialized.
And I'd put the comment into valgrind.supp, too, not in the commit msg.
Also, the suppression seems far too broad. It would for instance
block any complaint about a write() invoked via an elog call from
any function invoked from any LogicalTape* function, no matter
how far removed.
It seems like a new patch will be provided, so moved to next CF with
"waiting on author".
--
Michael
Peter,
On 2/1/17 12:59 AM, Michael Paquier wrote:
On Thu, Jan 26, 2017 at 8:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ in the service of closing out this thread... ]
Peter Geoghegan <pg@heroku.com> writes:
Finally, 0003-* is a Valgrind suppression borrowed from my parallel
CREATE INDEX patch. It's self-explanatory.
Um, I didn't find it all that self-explanatory. Why wouldn't we want
to avoid writing undefined data? I think the comment at least needs
to explain exactly what part of the written data might be uninitialized.
And I'd put the comment into valgrind.supp, too, not in the commit msg.
Also, the suppression seems far too broad. It would for instance
block any complaint about a write() invoked via an elog call from
any function invoked from any LogicalTape* function, no matter
how far removed.
It looks like we are waiting on a new patch. Do you know when you will
have that ready?
Thanks,
--
-David
david@pgmasters.net
Hi Peter,
On 3/2/17 9:43 AM, David Steele wrote:
Peter,
On 2/1/17 12:59 AM, Michael Paquier wrote:
On Thu, Jan 26, 2017 at 8:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ in the service of closing out this thread... ]
Peter Geoghegan <pg@heroku.com> writes:
Finally, 0003-* is a Valgrind suppression borrowed from my parallel
CREATE INDEX patch. It's self-explanatory.
Um, I didn't find it all that self-explanatory. Why wouldn't we want
to avoid writing undefined data? I think the comment at least needs
to explain exactly what part of the written data might be uninitialized.
And I'd put the comment into valgrind.supp, too, not in the commit msg.
Also, the suppression seems far too broad. It would for instance
block any complaint about a write() invoked via an elog call from
any function invoked from any LogicalTape* function, no matter
how far removed.
It looks like we are waiting on a new patch. Do you know when you will
have that ready?
It's been a while since there was a new patch or any activity on this
thread.
If you need more time to produce a patch, please post an explanation for
the delay and a schedule for the new patch. If no patch or explanation
is posted by 2017-03-16 AoE I will mark this submission
"Returned with Feedback".
Thanks,
--
-David
david@pgmasters.net
On Mon, Mar 13, 2017 at 8:23 AM, David Steele <david@pgmasters.net> wrote:
It's been a while since there was a new patch or any activity on this
thread.
If you need more time to produce a patch, please post an explanation for
the delay and a schedule for the new patch. If no patch or explanation
is posted by 2017-03-16 AoE I will mark this submission
"Returned with Feedback".
Apologies for the delay on this. I intend to get back to it before that time.
--
Peter Geoghegan
On Wed, Jan 25, 2017 at 3:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Um, I didn't find it all that self-explanatory. Why wouldn't we want
to avoid writing undefined data?
For roughly the same reason we'd want to avoid it in existing cases
that are next to the proposed new suppression. We happen to not need
to initialize the data, but the interface we're using still requires
that we write at a coarser granularity than bytes. logtape.c always
writes whole blocks at a time.
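In other words, the pattern at issue is block-granularity I/O. A minimal sketch (BLOCK_SIZE and flush_block are invented stand-ins for BLCKSZ and ltsWriteBlock, not the real code): the entire block is handed to write() even when only a leading portion of it was ever initialized, and that uninitialized tail is what Valgrind flags.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192			/* stands in for BLCKSZ */

/*
 * Sketch of block-granularity writing: the full block is written even
 * if only a leading portion of it was initialized by the caller.
 */
static ssize_t
flush_block(int fd, const char *block)
{
	return write(fd, block, BLOCK_SIZE);
}
```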
Also, the suppression seems far too broad. It would for instance
block any complaint about a write() invoked via an elog call from
any function invoked from any LogicalTape* function, no matter
how far removed.
Writing blocks doesn't just happen in what tuplesort.c or even
logtape.c would consider to be a write path -- it happens when
logtape.c has a dirty buffer that it cleans. Plus you have double
buffering here -- buffile.c manages its own BLCKSZ buffer at the end
of the BufFile struct.
IIRC the reason I did things this way is because both
LogicalTapeRead() and LogicalTapeWrite() both appeared in offending
stack traces, and ltsWriteBlock() would not have plugged that, because
ltsReadBlock() could be involved instead. I don't remember the exact
details offhand, so I will have to look into it again.
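For reference, a write-path suppression of the kind under discussion would have roughly this shape in valgrind.supp (a reconstruction from this thread, not the actual patch; the `...` frame matches any number of intervening stack frames, which is what makes the rule so broad):

```
{
	logtape_block_write_undefined
	Memcheck:Param
	write(buf)
	...
	fun:LogicalTape*
}
```

Pinning the bottom frame to something narrower would tighten the rule, but as described above, reads can also flush a dirty buffer, so the right anchor frame isn't obvious.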
--
Peter Geoghegan
On Wed, Jan 25, 2017 at 3:11 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Please. You might want to hit the existing ones with a separate patch,
but it doesn't much matter; I'd be just as happy with a patch that did
both things.
Got it.
Attached is a patch that does both things at once. The internal
function tuplesort_gettuple_common() still mentions repeated fetches,
since that context matters to its internal callers. The various
external-facing functions have a simpler, stricter contract, as
discussed.
I didn't end up adding a line like "copy=FALSE is recommended only
when the next tuplesort manipulation will be another
tuplesort_gettupleslot fetch into the same slot", which you suggested.
When the tuplesort state machine is in any state following
"performing" a sort, there are very few remaining sane manipulations.
Just things like skipping or seeking around for tuples, and actually
ending the tuplesort and releasing resources.
--
Peter Geoghegan
Attachments:
Attachment: 0001-Avoid-copying-within-tuplesort_gettupleslot.patch (text/x-patch)
From 5351b5db257cb39832647d9117465c0217e6268b Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 13 Oct 2016 10:54:31 -0700
Subject: [PATCH 1/2] Avoid copying within tuplesort_gettupleslot().
Add a "copy" argument to make it optional to receive a copy of caller
tuple that is safe to use following subsequent manipulation of
tuplesort's state. This is a performance optimization. Most existing
tuplesort_gettupleslot() callers are made to opt out of copying.
Existing callers that happen to rely on the validity of tuple memory
beyond subsequent manipulations of the tuplesort request their own copy.
This brings tuplesort_gettupleslot() in line with
tuplestore_gettupleslot(). In the future, a "copy" tuplesort_getdatum()
argument may be added, that similarly allows callers to opt out of
receiving their own copy of tuple.
In passing, clarify assumptions that callers of other tuplesort fetch
routines may make about tuple memory validity, per gripe from Tom Lane.
---
src/backend/executor/nodeAgg.c | 10 +++++++---
src/backend/executor/nodeSort.c | 5 +++--
src/backend/utils/adt/orderedsetaggs.c | 5 +++--
src/backend/utils/sort/tuplesort.c | 28 +++++++++++++++++-----------
src/include/utils/tuplesort.h | 2 +-
5 files changed, 31 insertions(+), 19 deletions(-)
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index aa08152..b650f71 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -570,6 +570,10 @@ initialize_phase(AggState *aggstate, int newphase)
* Fetch a tuple from either the outer plan (for phase 0) or from the sorter
* populated by the previous phase. Copy it to the sorter for the next phase
* if any.
+ *
+ * Callers cannot rely on memory for tuple in returned slot remaining valid
+ * past any subsequent manipulation of the sorter, such as another fetch of
+ * input from sorter. (The sorter may recycle memory.)
*/
static TupleTableSlot *
fetch_input_tuple(AggState *aggstate)
@@ -578,8 +582,8 @@ fetch_input_tuple(AggState *aggstate)
if (aggstate->sort_in)
{
- if (!tuplesort_gettupleslot(aggstate->sort_in, true, aggstate->sort_slot,
- NULL))
+ if (!tuplesort_gettupleslot(aggstate->sort_in, true, false,
+ aggstate->sort_slot, NULL))
return NULL;
slot = aggstate->sort_slot;
}
@@ -1272,7 +1276,7 @@ process_ordered_aggregate_multi(AggState *aggstate,
ExecClearTuple(slot2);
while (tuplesort_gettupleslot(pertrans->sortstates[aggstate->current_set],
- true, slot1, &newAbbrevVal))
+ true, true, slot1, &newAbbrevVal))
{
/*
* Extract the first numTransInputs columns as datums to pass to the
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 591a31a..d901034 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -132,12 +132,13 @@ ExecSort(SortState *node)
/*
* Get the first or next tuple from tuplesort. Returns NULL if no more
- * tuples.
+ * tuples. Note that we rely on memory for tuple in slot remaining valid
+ * only until some subsequent manipulation of tuplesort.
*/
slot = node->ss.ps.ps_ResultTupleSlot;
(void) tuplesort_gettupleslot(tuplesortstate,
ScanDirectionIsForward(dir),
- slot, NULL);
+ false, slot, NULL);
return slot;
}
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index e462fbd..8502fcf 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -1190,7 +1190,7 @@ hypothetical_rank_common(FunctionCallInfo fcinfo, int flag,
tuplesort_performsort(osastate->sortstate);
/* iterate till we find the hypothetical row */
- while (tuplesort_gettupleslot(osastate->sortstate, true, slot, NULL))
+ while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot, NULL))
{
bool isnull;
Datum d = slot_getattr(slot, nargs + 1, &isnull);
@@ -1353,7 +1353,8 @@ hypothetical_dense_rank_final(PG_FUNCTION_ARGS)
slot2 = extraslot;
/* iterate till we find the hypothetical row */
- while (tuplesort_gettupleslot(osastate->sortstate, true, slot, &abbrevVal))
+ while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot,
+ &abbrevVal))
{
bool isnull;
Datum d = slot_getattr(slot, nargs + 1, &isnull);
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index e1e692d..af79ab3 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1842,7 +1842,8 @@ tuplesort_performsort(Tuplesortstate *state)
* Internal routine to fetch the next tuple in either forward or back
* direction into *stup. Returns FALSE if no more tuples.
* Returned tuple belongs to tuplesort memory context, and must not be freed
- * by caller. Caller should not use tuple following next call here.
+ * by caller. Note that fetched tuple is stored in memory that may be
+ * recycled by any future fetch.
*/
static bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
@@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
* NULL value in leading attribute will set abbreviated value to zeroed
* representation, which caller may rely on in abbreviated inequality check.
*
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot will just receive a
+ * pointer to a tuple held within the tuplesort, which is more efficient,
+ * but only safe for callers that are prepared to have any subsequent
+ * manipulation of the tuplesort's state invalidate slot contents.
*/
bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
@@ -2113,8 +2117,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
if (state->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
return true;
}
else
@@ -2127,8 +2133,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
/*
* Fetch the next tuple in either forward or back direction.
* Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller should not use tuple
- * following next call here.
+ * context, and must not be freed by caller. Caller should not rely on tuple
+ * memory remaining valid after any further manipulation of tuplesort.
*/
HeapTuple
tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
@@ -2147,8 +2153,8 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward)
/*
* Fetch the next index tuple in either forward or back direction.
* Returns NULL if no more tuples. Returned tuple belongs to tuplesort memory
- * context, and must not be freed by caller. Caller should not use tuple
- * following next call here.
+ * context, and must not be freed by caller. Caller should not rely on tuple
+ * memory remaining valid after any further manipulation of tuplesort.
*/
IndexTuple
tuplesort_getindextuple(Tuplesortstate *state, bool forward)
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 5b3f475..32b287c 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -93,7 +93,7 @@ extern void tuplesort_putdatum(Tuplesortstate *state, Datum val,
extern void tuplesort_performsort(Tuplesortstate *state);
extern bool tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
- TupleTableSlot *slot, Datum *abbrev);
+ bool copy, TupleTableSlot *slot, Datum *abbrev);
extern HeapTuple tuplesort_getheaptuple(Tuplesortstate *state, bool forward);
extern IndexTuple tuplesort_getindextuple(Tuplesortstate *state, bool forward);
extern bool tuplesort_getdatum(Tuplesortstate *state, bool forward,
--
2.7.4
Hi Anastasia,
On 3/13/17 9:14 PM, Peter Geoghegan wrote:
On Wed, Jan 25, 2017 at 3:11 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Please. You might want to hit the existing ones with a separate patch,
but it doesn't much matter; I'd be just as happy with a patch that did
both things.

Got it.
Attached is a patch that does both things at once. The internal
function tuplesort_gettuple_common() still mentions repeated fetches,
since that context matters to its internal callers. The various
external-facing functions have a simpler, stricter contract, as
discussed.

I didn't end up adding a line like "copy=FALSE is recommended only
when the next tuplesort manipulation will be another
tuplesort_gettupleslot fetch into the same slot", which you suggested.
When the tuplesort state machine is in any state following
"performing" a sort, there are very few remaining sane manipulations.
Just things like skipping or seeking around for tuples, and actually
ending the tuplesort and releasing resources.
You are signed up to review this patch. Do you know when you'll have a
chance to do that?
Thanks,
--
-David
david@pgmasters.net
On 2017-03-13 18:14:07 -0700, Peter Geoghegan wrote:
From 5351b5db257cb39832647d9117465c0217e6268b Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Thu, 13 Oct 2016 10:54:31 -0700
Subject: [PATCH 1/2] Avoid copying within tuplesort_gettupleslot().
s/Avoid/Allow to avoid/
Add a "copy" argument to make it optional to receive a copy of the caller's
tuple that is safe to use following any subsequent manipulation of the
tuplesort's state. This is a performance optimization. Most existing
tuplesort_gettupleslot() callers are made to opt out of copying.
Existing callers that happen to rely on the validity of tuple memory
beyond subsequent manipulations of the tuplesort request their own copy.

This brings tuplesort_gettupleslot() in line with
tuplestore_gettupleslot(). In the future, a "copy" tuplesort_getdatum()
argument may be added, that similarly allows callers to opt out of
receiving their own copy of tuple.In passing, clarify assumptions that callers of other tuplesort fetch
routines may make about tuple memory validity, per gripe from Tom Lane.
---
src/backend/executor/nodeAgg.c | 10 +++++++---
src/backend/executor/nodeSort.c | 5 +++--
src/backend/utils/adt/orderedsetaggs.c | 5 +++--
src/backend/utils/sort/tuplesort.c | 28 +++++++++++++++++-----------
src/include/utils/tuplesort.h | 2 +-
5 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index aa08152..b650f71 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -570,6 +570,10 @@ initialize_phase(AggState *aggstate, int newphase)
* Fetch a tuple from either the outer plan (for phase 0) or from the sorter
* populated by the previous phase. Copy it to the sorter for the next phase
* if any.
+ *
+ * Callers cannot rely on memory for tuple in returned slot remaining valid
+ * past any subsequent manipulation of the sorter, such as another fetch of
+ * input from sorter. (The sorter may recycle memory.)
*/
It's not just the sorter, right? I'd just rephrase that the caller
cannot rely on tuple to stay valid beyond a single call.
static bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
@@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
* NULL value in leading attribute will set abbreviated value to zeroed
* representation, which caller may rely on in abbreviated inequality check.
*
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot will just receive a
+ * pointer to a tuple held within the tuplesort, which is more efficient,
+ * but only safe for callers that are prepared to have any subsequent
+ * manipulation of the tuplesort's state invalidate slot contents.
*/
Hm. Do we need to note anything about how long caller memory needs to
live? I think we'd get into trouble if the caller does a memory context
reset, without also clearing the slot?
bool
-tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
+tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
TupleTableSlot *slot, Datum *abbrev)
{
MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
@@ -2113,8 +2117,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
if (state->sortKeys->abbrev_converter && abbrev)
*abbrev = stup.datum1;
- stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
- ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
+ if (copy)
+ stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
+
+ ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
return true;
Other than these minimal adjustments, this looks good to go to me.
- Andres
On Tue, Apr 4, 2017 at 1:32 PM, Andres Freund <andres@anarazel.de> wrote:
s/Avoid/Allow to avoid/
WFM.
+ *
+ * Callers cannot rely on memory for tuple in returned slot remaining valid
+ * past any subsequent manipulation of the sorter, such as another fetch of
+ * input from sorter. (The sorter may recycle memory.)
*/

It's not just the sorter, right? I'd just rephrase that the caller
cannot rely on tuple to stay valid beyond a single call.
It started that way, but changed on the basis of Tom's feedback. I
would be equally happy with either wording.
static bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
@@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
* NULL value in leading attribute will set abbreviated value to zeroed
* representation, which caller may rely on in abbreviated inequality check.
*
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot will just receive a
+ * pointer to a tuple held within the tuplesort, which is more efficient,
+ * but only safe for callers that are prepared to have any subsequent
+ * manipulation of the tuplesort's state invalidate slot contents.
*/

Hm. Do we need to note anything about how long caller memory needs to
live? I think we'd get into trouble if the caller does a memory context
reset, without also clearing the slot?
Since we arrange to have caller explicitly own memory (it's always in
their own memory context (current context), which is not the case with
other similar routines), it's 100% the caller's problem. As I assume
you know, convention in executor around resource management of memory
like this is pretty centralized within ExecStoreTuple(), and nearby
routines. In general, the executor is more or less used to having this
be its problem alone, and is pessimistic about memory lifetime unless
otherwise noted.
Other than these minimal adjustments, this looks good to go to me.
Cool.
I'll try to get out a revision soon, maybe later today, including an
updated 0002-* (Valgrind suppression), which I have not forgotten
about.
--
Peter Geoghegan
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
The patch looks fine to me. Changes are clear and all functions are commented properly.
So I marked it "Ready for committer".
The new status of this patch is: Ready for Committer
On 2017-04-04 13:49:11 -0700, Peter Geoghegan wrote:
On Tue, Apr 4, 2017 at 1:32 PM, Andres Freund <andres@anarazel.de> wrote:
static bool
tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
@@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
* NULL value in leading attribute will set abbreviated value to zeroed
* representation, which caller may rely on in abbreviated inequality check.
*
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * If copy is TRUE, the slot receives a copied tuple that will stay valid
+ * regardless of future manipulations of the tuplesort's state. Memory is
+ * owned by the caller. If copy is FALSE, the slot will just receive a
+ * pointer to a tuple held within the tuplesort, which is more efficient,
+ * but only safe for callers that are prepared to have any subsequent
+ * manipulation of the tuplesort's state invalidate slot contents.
*/

Hm. Do we need to note anything about how long caller memory needs to
live? I think we'd get into trouble if the caller does a memory context
reset, without also clearing the slot?

Since we arrange to have caller explicitly own memory (it's always in
their own memory context (current context), which is not the case with
other similar routines), it's 100% the caller's problem. As I assume
you know, convention in executor around resource management of memory
like this is pretty centralized within ExecStoreTuple(), and nearby
routines. In general, the executor is more or less used to having this
be its problem alone, and is pessimistic about memory lifetime unless
otherwise noted.
I'm not sure you entirely got my point here. If tuplesort_gettupleslot
is called with copy = true, it'll store that tuple w/
ExecStoreMinimalTuple passing shouldFree = copy = true. If the caller
is in a short-lived context, e.g. some expression context, and resets
that context before the slot is cleared (either by storing another tuple
or ExecClearTuple()) you'll get a double free, because the context has
freed the tuple in bulk, and then
if (slot->tts_shouldFree)
heap_freetuple(slot->tts_tuple);
will do its work.
Now, that's obviously not a problem for existing code, because it worked
that way for a long time - I just was wondering whether that needs to be
called out.
- Andres
On Thu, Apr 6, 2017 at 2:05 PM, Andres Freund <andres@anarazel.de> wrote:
I'm not sure you entirely got my point here. If tuplesort_gettupleslot
is called with copy = true, it'll store that tuple w/
ExecStoreMinimalTuple passing shouldFree = copy = true. If the caller
is in a short-lived context, e.g. some expression context, and resets
that context before the slot is cleared (either by storing another tuple
or ExecClearTuple()) you'll get a double free, because the context has
freed the tuple in bulk, and then
if (slot->tts_shouldFree)
heap_freetuple(slot->tts_tuple);
will do its work.
Calling ExecClearTuple() will mark the slot "tts_shouldFree = false"
-- you only have a problem when a memory context is reset, which
obviously cannot be accounted for by ExecClearTuple(). ISTM that
ResetExprContext() is kind of called to "make sure" that memory is
freed in a short-lived expression context, at a level up from any
ExecStoreMinimalTuple()/ExecClearTuple() style memory management. The
conventions are not centrally documented, AFAIK, but this still seems
natural enough to me. Per-tuple contexts tend to be associated with
expression contexts.
In any case, I'm not sure where you'd centrally document the
conventions. Although, it seems clear that it wouldn't be anywhere
this patch currently touches. The executor README, perhaps?
--
Peter Geoghegan
VMware vCenter Server
https://www.vmware.com/
On 2017-03-13 18:14:07 -0700, Peter Geoghegan wrote:
On Wed, Jan 25, 2017 at 3:11 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Please. You might want to hit the existing ones with a separate patch,
but it doesn't much matter; I'd be just as happy with a patch that did
both things.Got it.
Attached is a patch that does both things at once.
Pushed with very minor wording changes.
Thanks,
Andres
On Thu, Apr 6, 2017 at 2:50 PM, Andres Freund <andres@anarazel.de> wrote:
Pushed with very minor wording changes.
Thanks Andres.
--
Peter Geoghegan
On Thu, Apr 6, 2017 at 2:50 PM, Andres Freund <andres@anarazel.de> wrote:
Pushed with very minor wording changes.
This had a typo:
+ * If copy is true, the slot receives a copied tuple that'll that will stay
--
Peter Geoghegan
On 2017-04-06 14:57:56 -0700, Peter Geoghegan wrote:
On Thu, Apr 6, 2017 at 2:50 PM, Andres Freund <andres@anarazel.de> wrote:
Pushed with very minor wording changes.
This had a typo:
+ * If copy is true, the slot receives a copied tuple that'll that will stay
Belatedly fixed.