failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

Started by Tomas Vondraalmost 11 years ago7 messages

tomas.vondra@2ndquadrant.com

almost 11 years ago

Hi,

while reviewing one of the 'abbreviated keys' patches, I noticed that
the ordered set aggregates are broken when using tuplesort with multiple
runs.

ISTM this got broken by 5cefbf5a6c4466ac6b1cc2a4316b4eba9108c802:

Don't use abbreviated keys for the final merge pass.

When we write tuples out to disk and read them back in, the
abbreviated keys become non-abbreviated, because the readtup
routines don't know anything about abbreviation. But without
this fix, the rest of the code still thinks the
abbreviation-aware compartor should be used, so chaos ensues.

Triggering it is quite simple:

CREATE TABLE stuff AS SELECT random()::text AS randtext
FROM generate_series(1,1000000);
ANALYZE stuff;

SET work_mem = '1MB';

SELECT percentile_disc(0) WITHIN GROUP (ORDER BY randtext)
FROM stuff;

which fails like this:

----------------------------------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x0000000000931bf6 in mergeruns (state=0x1dd3b58) at tuplesort.c:2175
2175 if (state->sortKeys->abbrev_converter)
(gdb) print state
$1 = (Tuplesortstate *) 0x1dd3b58
(gdb) print state->sortKeys
$2 = (SortSupport) 0x0
(gdb) print state->onlyKey
$3 = (SortSupport) 0x1dd3d70
----------------------------------------------------------------------
(gdb) bt
#0 0x0000000000931bf6 in mergeruns (state=0x1dd3b58) at tuplesort.c:2175
#1 0x00000000009309c1 in tuplesort_performsort (state=0x1dd3b58) at
tuplesort.c:1563
#2 0x0000000000864038 in percentile_disc_final (fcinfo=0x7fff2c9f4bc0)
at orderedsetaggs.c:443
#3 0x0000000000664f23 in finalize_aggregate (aggstate=0x1dcbc40,
peraggstate=0x1dcdb38, pergroupstate=0x1dce350, resultVal=0x1dcd308,
resultIsNull=0x1dcd328 "") at nodeAgg.c:865
#4 0x0000000000665826 in agg_retrieve_direct (aggstate=0x1dcbc40) at
nodeAgg.c:1295
#5 0x000000000066551a in ExecAgg (node=0x1dcbc40) at nodeAgg.c:1119
----------------------------------------------------------------------

This seems to happen because ordered_set_startup() calls
tuplesort_begin_datum() when (use_tuples == true), which only sets
'onlyKey' and leaves (sortKeys == NULL). So 'mergeruns' fails because it
does not expect that.

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@heroku.com

almost 11 years ago

In reply to: Tomas Vondra (#1)

1 attachment(s)

Re: failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

On Fri, Feb 20, 2015 at 11:58 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

This seems to happen because ordered_set_startup() calls
tuplesort_begin_datum() when (use_tuples == true), which only sets
'onlyKey' and leaves (sortKeys == NULL). So 'mergeruns' fails because it
does not expect that.

Oops. You're right. Attached patch fixes the issue, by making
tuplesort_begin_datum() consistent with other comparable routines for
other tuple cases. I think it makes sense to have the 'sortKeys' field
always set, and the 'onlyKey' field also set when that additional
optimization makes sense. That was what I understood the API to be, so
arguably this is a pre-existing issue with tuplesort_begin_datum().

Thanks for the report!
--
Peter Geoghegan

Attachments:

fix_datumsort_fieled.patchtext/x-patch; charset=US-ASCII; name=fix_datumsort_fieled.patchDownload

diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index b8494a2..3074e8e 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -905,18 +905,21 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
 	state->datumType = datumType;
 
 	/* Prepare SortSupport data */
-	state->onlyKey = (SortSupport) palloc0(sizeof(SortSupportData));
+	state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
 
-	state->onlyKey->ssup_cxt = CurrentMemoryContext;
-	state->onlyKey->ssup_collation = sortCollation;
-	state->onlyKey->ssup_nulls_first = nullsFirstFlag;
+	state->sortKeys->ssup_cxt = CurrentMemoryContext;
+	state->sortKeys->ssup_collation = sortCollation;
+	state->sortKeys->ssup_nulls_first = nullsFirstFlag;
 	/*
 	 * Conversion to abbreviated representation infeasible in the Datum case.
 	 * It must be possible to subsequently fetch original datum values within
 	 * tuplesort_getdatum(), which would require special-case preservation of
 	 * original values.
 	 */
-	state->onlyKey->abbreviate = false;
+	state->sortKeys->abbreviate = false;
+
+	/* "onlyKey" optimization can always be used */
+	state->onlyKey = state->sortKeys;
 
 	PrepareSortSupportFromOrderingOp(sortOperator, state->onlyKey);

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Peter Geoghegan (#2)

1 attachment(s)

Re: failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

On Fri, Feb 20, 2015 at 4:01 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Fri, Feb 20, 2015 at 11:58 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

This seems to happen because ordered_set_startup() calls
tuplesort_begin_datum() when (use_tuples == true), which only sets
'onlyKey' and leaves (sortKeys == NULL). So 'mergeruns' fails because it
does not expect that.

Oops. You're right. Attached patch fixes the issue, by making
tuplesort_begin_datum() consistent with other comparable routines for
other tuple cases. I think it makes sense to have the 'sortKeys' field
always set, and the 'onlyKey' field also set when that additional
optimization makes sense. That was what I understood the API to be, so
arguably this is a pre-existing issue with tuplesort_begin_datum().

This doesn't completely fix the issue. Even with this patch applied:

CREATE TABLE stuff AS SELECT random()::text AS randtext FROM
generate_series(1,50000000);
set maintenance_work_mem='1024';
create index on stuff using hash (randtext);
...wait for a bit, server crash...

I find your statement that this is a pre-existing issue in
tuplesort_begin_datum() to be pretty misleading, unless what you mean
by it is "pre-existing since November, when an earlier patch by Peter
Geoghegan changed the comments without changing the code to match".
Commit 5ea86e6e65dd2da3e9a3464484985d48328e7fe3, changed the comments
for the sortKey field to say that it would be set in all cases except
for the hash index case, but it did not make that statement true.
Subsequently, another of your patches, which I committed as
5cefbf5a6c4466ac6b1cc2a4316b4eba9108c802, added a dependency on
state->sortKeys always being set. Now you've got this patch here,
which you claim makes sure that sortKeys is always set, but it
actually doesn't, which is why the above still crashes. There are not
so many tuplesort_begin_xxx routines that you couldn't audit them all
before submitting your patches.

Anyway, I propose the attached instead, which makes both Tomas's test
case and the one above pass.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

tuplesort-fixes.patchbinary/octet-stream; name=tuplesort-fixes.patchDownload

diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index b8494a2..8ad5635 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -336,9 +336,9 @@ struct Tuplesortstate
 	bool		markpos_eof;	/* saved "eof_reached" */
 
 	/*
-	 * The sortKeys variable is used by every case other than the hash index
-	 * case; it is set by tuplesort_begin_xxx.  tupDesc is only used by the
-	 * MinimalTuple and CLUSTER routines, though.
+	 * The sortKeys variable is used by every case other than the datum and
+	 * hash index cases; it is set by tuplesort_begin_xxx.  tupDesc is only
+	 * used by the MinimalTuple and CLUSTER routines, though.
 	 */
 	TupleDesc	tupDesc;
 	SortSupport sortKeys;		/* array of length nKeys */
@@ -1246,7 +1246,7 @@ tuplesort_putindextuplevalues(Tuplesortstate *state, Relation rel,
 							 RelationGetDescr(state->indexRel),
 							 &stup.isnull1);
 
-	if (!state->sortKeys->abbrev_converter || stup.isnull1)
+	if (!state->sortKeys || !state->sortKeys->abbrev_converter || stup.isnull1)
 	{
 		/*
 		 * Store ordinary Datum representation, or NULL value.  If there is a
@@ -2172,7 +2172,7 @@ mergeruns(Tuplesortstate *state)
 		return;
 	}
 
-	if (state->sortKeys->abbrev_converter)
+	if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
 	{
 		/*
 		 * If there are multiple runs to be merged, when we go to read back

Peter Geoghegan

pg@heroku.com

almost 11 years ago

In reply to: Robert Haas (#3)

Re: failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

On Tue, Mar 3, 2015 at 3:53 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I find your statement that this is a pre-existing issue in
tuplesort_begin_datum() to be pretty misleading, unless what you mean
by it is "pre-existing since November, when an earlier patch by Peter
Geoghegan changed the comments without changing the code to match".
Commit 5ea86e6e65dd2da3e9a3464484985d48328e7fe3, changed the comments
for the sortKey field to say that it would be set in all cases except
for the hash index case, but it did not make that statement true.

Agreed. I believe that is in fact what I meant. I'm not trying to
deflect blame - just to explain my understanding of the problem.

Subsequently, another of your patches, which I committed as
5cefbf5a6c4466ac6b1cc2a4316b4eba9108c802, added a dependency on
state->sortKeys always being set. Now you've got this patch here,
which you claim makes sure that sortKeys is always set, but it
actually doesn't, which is why the above still crashes. There are not
so many tuplesort_begin_xxx routines that you couldn't audit them all
before submitting your patches.

Sorry. I should have caught the hash index case too.

Anyway, I propose the attached instead, which makes both Tomas's test
case and the one above pass.

My patch actually matches Andrew Gierth's datumsort patch, in that it
also uses this convention, as I believe it should. For that reason,
I'd prefer to make the comment added in November true, rather than
changing the comment...I feel it ought to be true anyway (which is
what I meant about a pre-existing issue). You'll still need the
defenses you've added for the the hash case, of course. You might want
to also put a comment specifying why it's necessary, since the hash
tuple case is the oddball cases that doesn't always have "sortKeys"
set.

In other words, I suggest that you commit the union of our two
patches, less your changes to the comments around "sortKeys'.

Thanks
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Peter Geoghegan (#4)

Re: failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

On Tue, Mar 3, 2015 at 7:14 PM, Peter Geoghegan <pg@heroku.com> wrote:

My patch actually matches Andrew Gierth's datumsort patch, in that it
also uses this convention, as I believe it should. For that reason,
I'd prefer to make the comment added in November true, rather than
changing the comment...I feel it ought to be true anyway (which is
what I meant about a pre-existing issue). You'll still need the
defenses you've added for the the hash case, of course. You might want
to also put a comment specifying why it's necessary, since the hash
tuple case is the oddball cases that doesn't always have "sortKeys"
set.

In other words, I suggest that you commit the union of our two
patches, less your changes to the comments around "sortKeys'.

I think we should commit my patch, and if a future patch needs
sortKeys set in more places, it can make that change itself. There's
no reason why it's needed with the code as it is today, and no reason
to let bits of future changes leak into a bug-fix patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@heroku.com

almost 11 years ago

In reply to: Robert Haas (#5)

Re: failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

On Wed, Mar 4, 2015 at 8:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we should commit my patch, and if a future patch needs
sortKeys set in more places, it can make that change itself. There's
no reason why it's needed with the code as it is today, and no reason
to let bits of future changes leak into a bug-fix patch.

It's not as if I feel strongly about it. It's trivial to change the
code to make the comment true, rather than change the comment to make
the code true, and we need to do so anyway. I defer to you.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Peter Geoghegan (#6)

Re: failures with tuplesort and ordered set aggregates (due to 5cefbf5a6c44)

On Wed, Mar 4, 2015 at 1:55 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Wed, Mar 4, 2015 at 8:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I think we should commit my patch, and if a future patch needs
sortKeys set in more places, it can make that change itself. There's
no reason why it's needed with the code as it is today, and no reason
to let bits of future changes leak into a bug-fix patch.

It's not as if I feel strongly about it. It's trivial to change the
code to make the comment true, rather than change the comment to make
the code true, and we need to do so anyway. I defer to you.

I did it my way. Thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers