Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

Started by Lukas Ederover 8 years ago12 messagesbugs
Jump to latest
#1Lukas Eder
lukas.eder@gmail.com

When running the following query:

select
cume_dist(1) within group (order by a desc),
rank(1) within group (order by a desc),
dense_rank(1) within group (order by a asc),
percent_rank(1) within group (order by a asc)
from (values(1)) t(a);

My JDBC connection is immediately terminated:

SQL Error [08006]: An I/O error occurred while sending to the backend.
An I/O error occurred while sending to the backend.
Connection reset

The issue depends on a certain set of combinations of the above function
calls. Each function can be called individually without problems. Some
functions can be combined without problems as well.

The issue can be reproduced in pgAdmin III and pgAdmin 4.

I'm using PostgreSQL 9.6.5 on Windows 10 x86-64

SELECT version();

version |
------------------------------------------------------------|
PostgreSQL 9.6.5, compiled by Visual C++ build 1800, 64-bit |

Thanks,
Lukas

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Lukas Eder (#1)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

2017-10-11 10:49 GMT+02:00 Lukas Eder <lukas.eder@gmail.com>:

When running the following query:

select
cume_dist(1) within group (order by a desc),
rank(1) within group (order by a desc),
dense_rank(1) within group (order by a asc),
percent_rank(1) within group (order by a asc)
from (values(1)) t(a);

My JDBC connection is immediately terminated:

SQL Error [08006]: An I/O error occurred while sending to the backend.
An I/O error occurred while sending to the backend.
Connection reset

The issue depends on a certain set of combinations of the above function
calls. Each function can be called individually without problems. Some
functions can be combined without problems as well.

The issue can be reproduced in pgAdmin III and pgAdmin 4.

I'm using PostgreSQL 9.6.5 on Windows 10 x86-64

SELECT version();

version |
------------------------------------------------------------|
PostgreSQL 9.6.5, compiled by Visual C++ build 1800, 64-bit |

Thanks,
Lukas

yes. It is PostgreSQL bug

Program received signal SIGSEGV, Segmentation fault.
tuplesort_puttupleslot (state=0x0, slot=slot@entry=0x2886f50) at
tuplesort.c:1303
1303 MemoryContext oldcontext =
MemoryContextSwitchTo(state->sortcontext);
(gdb) bt
#0 tuplesort_puttupleslot (state=0x0, slot=slot@entry=0x2886f50) at
tuplesort.c:1303
#1 0x00000000007ddca7 in hypothetical_dense_rank_final (fcinfo=<optimized
out>) at orderedsetaggs.c:1344
#2 0x00000000006244a5 in finalize_aggregate
(aggstate=aggstate@entry=0x286dcf8,
peragg=peragg@entry=0x286f630,
pergroupstate=0x286f800, resultVal=0x286f598, resultIsNull=0x286f5c9
"") at nodeAgg.c:1562
#3 0x0000000000624f9b in finalize_aggregates
(aggstate=aggstate@entry=0x286dcf8,
peraggs=peraggs@entry=0x286f5e8,
pergroup=pergroup@entry=0x286f800) at nodeAgg.c:1769
#4 0x0000000000625c6d in agg_retrieve_direct (aggstate=0x286dcf8) at
nodeAgg.c:2475
#5 ExecAgg (pstate=0x286dcf8) at nodeAgg.c:2128
#6 0x00000000006175ea in ExecProcNode (node=0x286dcf8) at
../../../src/include/executor/executor.h:251
#7 ExecutePlan (execute_once=<optimized out>, dest=0x28678d0,
direction=<optimized out>, numberTuples=0,
sendTuples=<optimized out>, operation=CMD_SELECT,
use_parallel_mode=<optimized out>, planstate=0x286dcf8,
estate=0x286dae0) at execMain.c:1719
#8 standard_ExecutorRun (queryDesc=0x27ccc60, direction=<optimized out>,
count=0, execute_once=<optimized out>)
at execMain.c:363
#9 0x0000000000751435 in PortalRunSelect (portal=portal@entry=0x286bad0,
forward=forward@entry=1 '\001', count=0,
count@entry=9223372036854775807, dest=dest@entry=0x28678d0) at
pquery.c:932
#10 0x0000000000752a60 in PortalRun (portal=portal@entry=0x286bad0,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=1 '\001', run_once=run_once@entry=1 '\001',
dest=dest@entry=0x28678d0,
altdest=altdest@entry=0x28678d0, completionTag=0x7fff09d671c0 "") at
pquery.c:773
#11 0x000000000074e6e8 in exec_simple_query (

Regards

Pavel

#3Pantelis Theodosiou
ypercube@gmail.com
In reply to: Pavel Stehule (#2)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

Just some notes, they may help find the cause:

- happens in 9.65. and in 10
- it can be seen with only rank and dense_rank, with any order by (asc,
desc, null):

select
rank(1) within group (order by a),
dense_rank(1) within group (order by a)
from (values (1)) t(a) ;

select
rank(1) within group (order by null),
dense_rank(1) within group (order by null)
from(values (1)) t(a) ;

- but it doesn't happen if (values (1)) is replaced with a single row
table.

On Wed, Oct 11, 2017 at 3:25 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Show quoted text

2017-10-11 10:49 GMT+02:00 Lukas Eder <lukas.eder@gmail.com>:

When running the following query:

select
cume_dist(1) within group (order by a desc),
rank(1) within group (order by a desc),
dense_rank(1) within group (order by a asc),
percent_rank(1) within group (order by a asc)
from (values(1)) t(a);

My JDBC connection is immediately terminated:

SQL Error [08006]: An I/O error occurred while sending to the backend.
An I/O error occurred while sending to the backend.
Connection reset

The issue depends on a certain set of combinations of the above function
calls. Each function can be called individually without problems. Some
functions can be combined without problems as well.

The issue can be reproduced in pgAdmin III and pgAdmin 4.

I'm using PostgreSQL 9.6.5 on Windows 10 x86-64

SELECT version();

version |
------------------------------------------------------------|
PostgreSQL 9.6.5, compiled by Visual C++ build 1800, 64-bit |

Thanks,
Lukas

yes. It is PostgreSQL bug

Program received signal SIGSEGV, Segmentation fault.
tuplesort_puttupleslot (state=0x0, slot=slot@entry=0x2886f50) at
tuplesort.c:1303
1303 MemoryContext oldcontext = MemoryContextSwitchTo(state->
sortcontext);
(gdb) bt
#0 tuplesort_puttupleslot (state=0x0, slot=slot@entry=0x2886f50) at
tuplesort.c:1303
#1 0x00000000007ddca7 in hypothetical_dense_rank_final (fcinfo=<optimized
out>) at orderedsetaggs.c:1344
#2 0x00000000006244a5 in finalize_aggregate (aggstate=aggstate@entry=0x286dcf8,
peragg=peragg@entry=0x286f630,
pergroupstate=0x286f800, resultVal=0x286f598, resultIsNull=0x286f5c9
"") at nodeAgg.c:1562
#3 0x0000000000624f9b in finalize_aggregates (aggstate=aggstate@entry=0x286dcf8,
peraggs=peraggs@entry=0x286f5e8,
pergroup=pergroup@entry=0x286f800) at nodeAgg.c:1769
#4 0x0000000000625c6d in agg_retrieve_direct (aggstate=0x286dcf8) at
nodeAgg.c:2475
#5 ExecAgg (pstate=0x286dcf8) at nodeAgg.c:2128
#6 0x00000000006175ea in ExecProcNode (node=0x286dcf8) at
../../../src/include/executor/executor.h:251
#7 ExecutePlan (execute_once=<optimized out>, dest=0x28678d0,
direction=<optimized out>, numberTuples=0,
sendTuples=<optimized out>, operation=CMD_SELECT,
use_parallel_mode=<optimized out>, planstate=0x286dcf8,
estate=0x286dae0) at execMain.c:1719
#8 standard_ExecutorRun (queryDesc=0x27ccc60, direction=<optimized out>,
count=0, execute_once=<optimized out>)
at execMain.c:363
#9 0x0000000000751435 in PortalRunSelect (portal=portal@entry=0x286bad0,
forward=forward@entry=1 '\001', count=0,
count@entry=9223372036854775807, dest=dest@entry=0x28678d0) at
pquery.c:932
#10 0x0000000000752a60 in PortalRun (portal=portal@entry=0x286bad0,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=1 '\001', run_once=run_once@entry=1
'\001', dest=dest@entry=0x28678d0,
altdest=altdest@entry=0x28678d0, completionTag=0x7fff09d671c0 "") at
pquery.c:773
#11 0x000000000074e6e8 in exec_simple_query (

Regards

Pavel

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pantelis Theodosiou (#3)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

Pantelis Theodosiou <ypercube@gmail.com> writes:

- it can be seen with only rank and dense_rank, with any order by (asc,
desc, null):
select
rank(1) within group (order by a),
dense_rank(1) within group (order by a)
from (values (1)) t(a) ;

Check ...

- but it doesn't happen if (values (1)) is replaced with a single row
table.

It did for me. I'm using a debug-enabled build, which typically helps
to make this sort of thing more reproducible.

regression=# create table t(a int) ;
CREATE TABLE
regression=# insert into t values(1);
INSERT 0 1
regression=# select
rank(1) within group (order by a),
dense_rank(1) within group (order by a)
from t;
server closed the connection unexpectedly

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lukas Eder (#1)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

Lukas Eder <lukas.eder@gmail.com> writes:

[ this crashes: ]
select
cume_dist(1) within group (order by a desc),
rank(1) within group (order by a desc),
dense_rank(1) within group (order by a asc),
percent_rank(1) within group (order by a asc)
from (values(1)) t(a);

The issue depends on a certain set of combinations of the above function
calls. Each function can be called individually without problems. Some
functions can be combined without problems as well.

So the problem arises when nodeAgg.c decides it can combine the transition
calculations for two different ordered-set aggregates, leading to the
final functions for those OSAs being invoked successively on the same
transition state. The finalfns are not expecting that and the second
one crashes. (Concretely, this happens because osastate->sortstate
has already been reset to null, after closing down the contained
tuplesort object.)

It seems like this is probably fixable by having the finalfns not do
tuplesort_end immediately, but rather track whether anyone's yet
done the sort, and then do something like

if (already_sorted)
tuplesort_rescan(osastate->sortstate);
else
tuplesort_performsort(osastate->sortstate);

However, in order to make use of tuplesort_rescan, we'd have had
to pass randomAccess = true to tuplesort_begin_xxx, which would
be rather an annoying overhead for the majority case where there
isn't a potential for reuse.

What I think we should do about this is introduce another aggregate
API function, a bit like AggGetAggref or AggCheckCallContext,
that an aggregate function could call to find out whether there is
any possibility of multiple invocation of finalfns on the same
transition state. For the moment I'd just be worried about making
it work for ordered-set aggs, which are the only case where we don't
(er, didn't) require that to work anyway. But potentially we could
extend it to work for all agg cases and then finalfns could be
optimized in cases where it's safe for them to be destructive
of the transition state value.

Speaking of AggGetAggref, there's another thing that I think 804163bc2
did wrong for ordered-set aggregates: it can return the wrong Aggref
when two aggregates' intermediate states have been merged. I do not
think it's appropriate to say "well, you shouldn't care which of the
Aggrefs you get". It looks like this accidentally fails to fail
for the current OSAs, because indeed they do only look at the input-
related fields of the Aggref, but surely that's not something to
rely on. It's most certainly not acceptable that the function's
documentation doesn't mention that its result may be a lie.

This might be a bigger change than we want to push into the back
branches. In that case probably a back-patchable fix is to hack
nodeAgg.c so it will never combine input states for OSAs.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#6David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#5)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

On 12 October 2017 at 08:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

So the problem arises when nodeAgg.c decides it can combine the transition
calculations for two different ordered-set aggregates, leading to the
final functions for those OSAs being invoked successively on the same
transition state. The finalfns are not expecting that and the second
one crashes. (Concretely, this happens because osastate->sortstate
has already been reset to null, after closing down the contained
tuplesort object.)

Hmm, yeah. It was all coded with the assumption that final functions
never modify the transaction state.

It seems like this is probably fixable by having the finalfns not do
tuplesort_end immediately, but rather track whether anyone's yet
done the sort, and then do something like

if (already_sorted)
tuplesort_rescan(osastate->sortstate);
else
tuplesort_performsort(osastate->sortstate);

However, in order to make use of tuplesort_rescan, we'd have had
to pass randomAccess = true to tuplesort_begin_xxx, which would
be rather an annoying overhead for the majority case where there
isn't a potential for reuse.

What I think we should do about this is introduce another aggregate
API function, a bit like AggGetAggref or AggCheckCallContext,
that an aggregate function could call to find out whether there is
any possibility of multiple invocation of finalfns on the same
transition state. For the moment I'd just be worried about making
it work for ordered-set aggs, which are the only case where we don't
(er, didn't) require that to work anyway. But potentially we could
extend it to work for all agg cases and then finalfns could be
optimized in cases where it's safe for them to be destructive
of the transition state value.

Yeah maybe if core tracked the total number of references in
AggStatePerTrans, and finalize_aggregate() incremented another counter
to track how many times the final function had been called on this
state, then if there was some way to expose that information to the
final function, it would know if it was the first or the last final
function to use the state. I'm just not sure how much we'd want to
allow the final function to see. Would we expose the whole
AggStatePerTrans? or just invent API functions to allow them to know
if they're the first or last?

Speaking of AggGetAggref, there's another thing that I think 804163bc2
did wrong for ordered-set aggregates: it can return the wrong Aggref
when two aggregates' intermediate states have been merged. I do not
think it's appropriate to say "well, you shouldn't care which of the
Aggrefs you get". It looks like this accidentally fails to fail
for the current OSAs, because indeed they do only look at the input-
related fields of the Aggref, but surely that's not something to
rely on. It's most certainly not acceptable that the function's
documentation doesn't mention that its result may be a lie.

This might be a bigger change than we want to push into the back
branches. In that case probably a back-patchable fix is to hack
nodeAgg.c so it will never combine input states for OSAs.

I've attached a patch which does this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

shared_aggregate_state_fix.patchapplication/octet-stream; name=shared_aggregate_state_fix.patchDownload+11-0
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#6)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

David Rowley <david.rowley@2ndquadrant.com> writes:

On 12 October 2017 at 08:51, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, in order to make use of tuplesort_rescan, we'd have had
to pass randomAccess = true to tuplesort_begin_xxx, which would
be rather an annoying overhead for the majority case where there
isn't a potential for reuse.

Yeah maybe if core tracked the total number of references in
AggStatePerTrans, and finalize_aggregate() incremented another counter
to track how many times the final function had been called on this
state, then if there was some way to expose that information to the
final function, it would know if it was the first or the last final
function to use the state.

That seems kind of irrelevant, at least for the existing OSAs.
To know what value of randomAccess to pass to the tuplesort setup,
we have to know *at the first transition-function call* whether
there may be multiple final-function calls coming up. So what
what I'm imagining is a simple boolean result "yes, there will be
only one finalfn call, so it can destructively modify the transition
state", or "there might be more than one finalfn call, so the finalfn(s)
must preserve transition state". And this info has to be available
throughout the aggregate run.

This might be a bigger change than we want to push into the back
branches. In that case probably a back-patchable fix is to hack
nodeAgg.c so it will never combine input states for OSAs.

I've attached a patch which does this.

Needs to reject plain OSAs too, not just hypotheticals. Pushed
with that fix and some test cases.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#8Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#7)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

On 10/12/2017 05:27 AM, Tom Lane wrote:

This might be a bigger change than we want to push into the back
branches. In that case probably a back-patchable fix is to hw ck
nodeAgg.c so it will never combine input states for OSAs.

I've attached a patch which does this.

Needs to reject plain OSAs too, not just hypotheticals. Pushed
with that fix and some test cases.

Thanks!

Speaking of AggGetAggref, there's another thing that I think 804163bc2
did wrong for ordered-set aggregates: it can return the wrong Aggref
when two aggregates' intermediate states have been merged. I do not
think it's appropriate to say "well, you shouldn't care which of the
Aggrefs you get". It looks like this accidentally fails to fail
for the current OSAs, because indeed they do only look at the input-
related fields of the Aggref, but surely that's not something to
rely on. It's most certainly not acceptable that the function's
documentation doesn't mention that its result may be a lie.

Hmm. All the fields except for aggfnoid, aggtype and aggcollid are
related to the inputs or the transition function, so all the other
fields would be the same between two shared transition states. But yes,
this really should be documented. Perhaps AggGetAggref() should return
an Aggref with those fields set to InvalidOid, to make it clear that
they should not be looked at?

Conceivably we could have another function like AggGetAggref() that
returns all of the Aggrefs. But I don't think it's worth the
complication. If the transition function needs to do something different
depending on the aggregate it's for, well, don't do that. Define a
different transition function for both aggregates.

- Heikki

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#8)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 10/12/2017 05:27 AM, Tom Lane wrote:

Speaking of AggGetAggref, there's another thing that I think 804163bc2
did wrong for ordered-set aggregates: it can return the wrong Aggref
when two aggregates' intermediate states have been merged.

Conceivably we could have another function like AggGetAggref() that
returns all of the Aggrefs. But I don't think it's worth the
complication. If the transition function needs to do something different
depending on the aggregate it's for, well, don't do that. Define a
different transition function for both aggregates.

Thinking about it more clearly, if a transition function is being run
on behalf of several different Aggrefs (with identical input states),
then no, the transition function should not care which Aggref it
looks at. As you say it mustn't do anything different on the basis of
the finalfn-related fields. The problem occurs when a finalfn calls
AggGetAggref --- then, I think that the finalfn is entirely entitled
to expect that it will see its own Aggref, not some other one that
happens to share input+transition.

So the issue is that we need some different behavior during
finalize_aggregate than during the transition function calls.
I'm inclined to put back the curperagg field we had before,
and populate that during finalize_aggregate. Then, AggGetAggref
would look at either curperagg or curpertrans, whichever is set.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#9)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

I wrote:

Thinking about it more clearly, if a transition function is being run
on behalf of several different Aggrefs (with identical input states),
then no, the transition function should not care which Aggref it
looks at. As you say it mustn't do anything different on the basis of
the finalfn-related fields. The problem occurs when a finalfn calls
AggGetAggref --- then, I think that the finalfn is entirely entitled
to expect that it will see its own Aggref, not some other one that
happens to share input+transition.

Concretely, I think we need to do the attached. This seems like
a bug fix to me, so I'm inclined to back-patch it. In the back
branches we could put the extra AggState field at the end, to
minimize ABI-break hazards.

BTW ... I was quite surprised to notice that the aggdirectargs
are treated as a property that has to be matched in order to
combine transition states. They aren't used during the transition
phase, so this seems like a pointless constraint. We could move
the aggdirectargs ExprState list to AggStatePerAggData and treat
the aggdirectargs as part of the finalfn-related data, instead.
As long as we're not merging states at all for OSAs, this is
moot, but it seems like something to fix along with that.

regards, tom lane

Attachments:

make-AggGetAggref-honest.patchtext/x-diff; charset=us-ascii; name=make-AggGetAggref-honest.patchDownload+25-9
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

I wrote:

To know what value of randomAccess to pass to the tuplesort setup,
we have to know *at the first transition-function call* whether
there may be multiple final-function calls coming up. So what
what I'm imagining is a simple boolean result "yes, there will be
only one finalfn call, so it can destructively modify the transition
state", or "there might be more than one finalfn call, so the finalfn(s)
must preserve transition state". And this info has to be available
throughout the aggregate run.

Attached is a proposed patch to make the ordered-set aggregates
safe for state merging. I've not tested it really thoroughly,
but it passes the regression cases added in 52328727b.

regards, tom lane

Attachments:

allow-state-merging-in-builtin-OSAs.patchtext/x-diff; charset=us-ascii; name=allow-state-merging-in-builtin-OSAs.patchDownload+161-131
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#10)
Re: Combination of ordered-set aggregate function terminates JDBC connection on PostgreSQL 9.6.5

I wrote:

BTW ... I was quite surprised to notice that the aggdirectargs
are treated as a property that has to be matched in order to
combine transition states. They aren't used during the transition
phase, so this seems like a pointless constraint. We could move
the aggdirectargs ExprState list to AggStatePerAggData and treat
the aggdirectargs as part of the finalfn-related data, instead.

Here's a pretty-lightly-tested patch for that.

regards, tom lane

Attachments:

direct-args-are-not-transition-state.patchtext/x-diff; charset=us-ascii; name=direct-args-are-not-transition-state.patchDownload+19-22