libpq async duplicate error results

Started by Peter Eisentrautabout 4 years ago19 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread. I'm
posting it here separately, because I think it ought to be addressed in
libpq rather than with a workaround in psql, as proposed over there.

When using PQsendQuery() + PQgetResult() and the server crashes during
the execution of the command, PQgetResult() then returns two result sets
with partially duplicated error messages, like this from the attached
test program:

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL: terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

This is hidden in normal use because PQexec() throws away all but the
last result set, but the extra one is still generated internally.

Apparently, this has changed between PG13 and PG14. In PG13 and
earlier, the output is

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL: terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

In PG13, PQexec() concatenates all the error messages from multiple
results, so a user of PQexec() sees the same output before and after.
But for users of the lower-level APIs, things have become a bit more
confusing.

Also, why are there multiple results being generated in the first place?

[0]: /messages/by-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo
/messages/by-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo

Attachments:

test.ctext/plain; charset=UTF-8; name=test.cDownload
#2Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Peter Eisentraut (#1)
Re: libpq async duplicate error results

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL: terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

Also, why are there multiple results being generated in the first place?

My interpretation is that the first message is a close message issued by
the server before actually severing the connection, and the second message
is generated by libpq when it notices that the connection has been closed,
so there is some sense in having to results to report these two
consecutive errors, and the question might be about when the buffer should
be reset.

--
Fabien.

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Fabien COELHO (#2)
Re: libpq async duplicate error results

On 26.01.22 14:52, Fabien COELHO wrote:

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator
command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator
command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
"

Also, why are there multiple results being generated in the first place?

My interpretation is that the first message is a close message issued by
the server before actually severing the connection, and the second
message is generated by libpq when it notices that the connection has
been closed, so there is some sense in having to results to report these
two consecutive errors, and the question might be about when the buffer
should be reset.

I see. If we stipulate for a moment that getting two results is what we
want here, I see a conceptual problem. Users of PQexec() don't care
whether the error message they are ultimately getting is a concatenation
of two messages from two internal results (old behavior) or just an
appropriately accumulated message from the last internal result (new
behavior). But users of PQgetResult() don't have a principled way to
deal with this anymore. They can't just ignore one or the other result,
because they could after all be two separate errors from two separate
commands. So the only proper behavior is to show all errors from all
results. But that results in this misbehavior because the last error
message must include the messages from the previous messages in order to
accommodate PQexec() users.

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: libpq async duplicate error results

Tom, you worked on reorganzing the error message handling in libpq in
PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f). Any
thoughts on this?

Show quoted text

On 25.01.22 09:32, Peter Eisentraut wrote:

This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread.  I'm
posting it here separately, because I think it ought to be addressed in
libpq rather than with a workaround in psql, as proposed over there.

When using PQsendQuery() + PQgetResult() and the server crashes during
the execution of the command, PQgetResult() then returns two result sets
with partially duplicated error messages, like this from the attached
test program:

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator
command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator
command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

This is hidden in normal use because PQexec() throws away all but the
last result set, but the extra one is still generated internally.

Apparently, this has changed between PG13 and PG14.  In PG13 and
earlier, the output is

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator
command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

In PG13, PQexec() concatenates all the error messages from multiple
results, so a user of PQexec() sees the same output before and after.
But for users of the lower-level APIs, things have become a bit more
confusing.

Also, why are there multiple results being generated in the first place?

[0]:
/messages/by-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: libpq async duplicate error results

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

Tom, you worked on reorganzing the error message handling in libpq in
PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f). Any
thoughts on this?

Ah, sorry, I'd not noticed this thread.

I concur with Fabien's analysis: we report the FATAL message from
the server during the first PQgetResult, and then the second call
discovers that the connection is gone and reports "server closed
the connection unexpectedly". Those are two independent events
(in libpq's view anyway) so reporting them separately is correct,
or at least necessary unless you want to engage in seriously
major redesign and behavioral changes.

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things. In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards. So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing. Buffer resets would occur where they do now.

It'd probably be necessary to make a user-visible PQgetResult
that becomes a wrapper around PQgetResultInternal, because
internal callers such as PQexecFinish will need the old behavior,
or at least some of them will.

regards, tom lane

#6Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tom Lane (#5)
Re: libpq async duplicate error results

Hello Tom,

I concur with Fabien's analysis: we report the FATAL message from
the server during the first PQgetResult, and then the second call
discovers that the connection is gone and reports "server closed
the connection unexpectedly". Those are two independent events
(in libpq's view anyway) so reporting them separately is correct,
or at least necessary unless you want to engage in seriously
major redesign and behavioral changes.

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things. In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards. So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing. Buffer resets would occur where they do now.

It'd probably be necessary to make a user-visible PQgetResult
that becomes a wrapper around PQgetResultInternal, because
internal callers such as PQexecFinish will need the old behavior,
or at least some of them will.

I agree that the message reset is not convincing, but it was the only way
to get the expected behavior without changing the existing functions.

I see two paths:

(1) keep the duplicate message in this corner case.

(2) do the hocus pocus you suggest around PQerrorMessage and PQgetResult
to work around the duplication, just for this corner case.

I'd tend to choose (1) to keep it simple, even if (2) is feasible.

--
Fabien.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#6)
Re: libpq async duplicate error results

[ cc'ing Alvaro for pipeline questions ]

Fabien COELHO <coelho@cri.ensmp.fr> writes:

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things. In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards. So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing. Buffer resets would occur where they do now.

I agree that the message reset is not convincing, but it was the only way
to get the expected behavior without changing the existing functions.

I spent a bit more time poking at this issue. I no longer like the
idea of reporting only the tail part of conn->errorMessage; I'm
afraid that would convert "edge cases sometimes return redundant text"
into "edge cases sometimes miss returning text at all", which is not
an improvement.

In any case, the particular example we're looking at here is connection
loss, which is not something that should greatly concern us as far as
pipeline semantics are concerned, because you're not getting any more
pipelined results anyway if that happens. In the non-I/O-error case,
I see that PQgetResult does a hacky "resetPQExpBuffer(&conn->errorMessage)"
in between pipelined queries. It seems like if there are real usability
issues in this area, they probably come from that not being well placed.
In particular, I wonder why that's done with the pqPipelineProcessQueue
call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
result). It kinda looks to me like maybe pqPipelineProcessQueue
ought to have the responsibility for doing that, rather than having
two out of the three call sites do it. Also it would seem more natural
to associate it with the reinitialization that happens inside
pqPipelineProcessQueue.

regards, tom lane

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#7)
Re: libpq async duplicate error results

On 2022-Feb-07, Tom Lane wrote:

In any case, the particular example we're looking at here is connection
loss, which is not something that should greatly concern us as far as
pipeline semantics are concerned, because you're not getting any more
pipelined results anyway if that happens. In the non-I/O-error case,
I see that PQgetResult does a hacky "resetPQExpBuffer(&conn->errorMessage)"
in between pipelined queries. It seems like if there are real usability
issues in this area, they probably come from that not being well placed.
In particular, I wonder why that's done with the pqPipelineProcessQueue
call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
result). It kinda looks to me like maybe pqPipelineProcessQueue
ought to have the responsibility for doing that, rather than having
two out of the three call sites do it. Also it would seem more natural
to associate it with the reinitialization that happens inside
pqPipelineProcessQueue.

Yeah, I agree that the placement of error message reset in pipelined
libpq is more ad-hoc to the testing I was doing than carefully
principled. I didn't test changing it, but on a quick look your
proposed change seems reasonable to me.

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: libpq async duplicate error results

I wrote:

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing. Buffer resets would occur where they do now.

I spent a bit more time poking at this issue. I no longer like the
idea of reporting only the tail part of conn->errorMessage; I'm
afraid that would convert "edge cases sometimes return redundant text"
into "edge cases sometimes miss returning text at all", which is not
an improvement.

I figured out what seems like an adequately bulletproof solution to
this problem. The thing that was scaring me before is that there are
code paths in which we'll discard a pending conn->result PGresult
(which could be an error) and make a new one; it wasn't entirely
clear that we could track which result represents what text.
The solution to that has two components:

1. Never advance the how-much-text-has-been-reported counter until
we are just about to return an error PGresult to the application.
This reduces the risk of missing or duplicated text to what seems
an acceptable level. I found that pqPrepareAsyncResult can be used
to handle this logic -- it's only called at outer levels, and it
already has some state-updating behavior, so we'd already have bugs
if it were called improperly.

2. Don't make error PGresults at all until we reach pqPrepareAsyncResult.
This saves some cycles in the paths where we previously replaced one
error with another. Importantly, it also means that we can now have
a bulletproof solution to the problem of how to return an error when
we get OOM on any attempt to make a PGresult. The attached patch
includes a statically-declared PGresult that we can return in that
case.

It's not very practical to do #2 exactly, because in the case of
an error received from the server, we need to build a PGresult that
includes the broken-down error fields. But all internally-generated
errors can be handled in this style.

This seems workable, and you'll notice it fixes the duplicated text
in the test case Andres was worried about.

Note that this patch is on top of the event-proc changes I posted
in [1]/messages/by-id/3390587.1645035110@sss.pgh.pa.us. Without that change, it's not very clear how to cope
with PGEVT_RESULTCREATE failure after we've already updated the
error status.

regards, tom lane

[1]: /messages/by-id/3390587.1645035110@sss.pgh.pa.us

Attachments:

dont-return-same-error-multiple-times-1.patchtext/x-diff; charset=us-ascii; name=dont-return-same-error-multiple-times-1.patchDownload+210-63
#10Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#9)
Re: libpq async duplicate error results

Hi,

On 2022-02-16 18:51:37 -0500, Tom Lane wrote:

This seems workable, and you'll notice it fixes the duplicated text
in the test case Andres was worried about.

Cool.

I find it mildly scary that we didn't have any other tests verifying the libpq
side of connection termination. Seems like we we maybe should add a few more?
Some simple cases we can do via isolationtester. But some others would
probably require a C test program to be robust...

+	/* Also, do nothing if the argument is OOM_result */
+	if (res == unconstify(PGresult *, &OOM_result))
+		return;

Wouldn't it make more sense to make res const, rather than unconstifying
&OOM_result?

Greetings,

Andres Freund

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#10)
Re: libpq async duplicate error results

Andres Freund <andres@anarazel.de> writes:

On 2022-02-16 18:51:37 -0500, Tom Lane wrote:

+	/* Also, do nothing if the argument is OOM_result */
+	if (res == unconstify(PGresult *, &OOM_result))
+		return;

Wouldn't it make more sense to make res const, rather than unconstifying
&OOM_result?

Uh ... then we'd have to cast away the const to do free().

regards, tom lane

#12Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#11)
Re: libpq async duplicate error results

Hi,

On 2022-02-16 20:28:02 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2022-02-16 18:51:37 -0500, Tom Lane wrote:

+	/* Also, do nothing if the argument is OOM_result */
+	if (res == unconstify(PGresult *, &OOM_result))
+		return;

Wouldn't it make more sense to make res const, rather than unconstifying
&OOM_result?

Uh ... then we'd have to cast away the const to do free().

I was just thinking of

if ((const PGresult *) res == &OOM_result)

It's not important, I just find it stylistically nicer (making a pointer const
from an non-const pointer is safe, the other way round not generally).

- Andres

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#12)
Re: libpq async duplicate error results

Andres Freund <andres@anarazel.de> writes:

On 2022-02-16 20:28:02 -0500, Tom Lane wrote:

Uh ... then we'd have to cast away the const to do free().

I was just thinking of
if ((const PGresult *) res == &OOM_result)

Oh, I see. Sure, can do it like that.

regards, tom lane

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#8)
Re: libpq async duplicate error results

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2022-Feb-07, Tom Lane wrote:

I see that PQgetResult does a hacky "resetPQExpBuffer(&conn->errorMessage)"
in between pipelined queries. It seems like if there are real usability
issues in this area, they probably come from that not being well placed.
In particular, I wonder why that's done with the pqPipelineProcessQueue
call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
result). It kinda looks to me like maybe pqPipelineProcessQueue
ought to have the responsibility for doing that, rather than having
two out of the three call sites do it. Also it would seem more natural
to associate it with the reinitialization that happens inside
pqPipelineProcessQueue.

Yeah, I agree that the placement of error message reset in pipelined
libpq is more ad-hoc to the testing I was doing than carefully
principled. I didn't test changing it, but on a quick look your
proposed change seems reasonable to me.

Here's a patch to do that. It passes check-world (particularly
src/test/modules/libpq_pipeline), for whatever that's worth.

However ... in the wake of 618c16707, I wonder if we should consider
an alternative definition, which is to NOT clear errorMessage when
starting a new pipelined query. (That would be this patch minus
the addition to pqPipelineProcessQueue.) Thanks to 618c16707,
the returned error PGresult(s) should bear exactly the same contents
either way. What would change is that PQerrorMessage(conn) would return
the concatenation of all errors that have occurred during the current
pipeline sequence. I think that's arguably a more useful definition
than what we have now --- though obviously it'd require a docs change.
It seems to fit with the spirit of the v14 changes to ensure that
PQerrorMessage tells you about everything that happened during a
failed connection attempt, not only the last thing.

I've tested that variant, and it also passes check-world, which may
say something about libpq_pipeline's coverage of error cases ...
I didn't look closely.

regards, tom lane

Attachments:

centralize-pipeline-errorMessage-reset.patchtext/x-diff; charset=us-ascii; name=centralize-pipeline-errorMessage-reset.patchDownload+4-7
#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#14)
Re: libpq async duplicate error results

I wrote:

However ... in the wake of 618c16707, I wonder if we should consider
an alternative definition, which is to NOT clear errorMessage when
starting a new pipelined query. (That would be this patch minus
the addition to pqPipelineProcessQueue.) Thanks to 618c16707,
the returned error PGresult(s) should bear exactly the same contents
either way. What would change is that PQerrorMessage(conn) would return
the concatenation of all errors that have occurred during the current
pipeline sequence. I think that's arguably a more useful definition
than what we have now --- though obviously it'd require a docs change.
It seems to fit with the spirit of the v14 changes to ensure that
PQerrorMessage tells you about everything that happened during a
failed connection attempt, not only the last thing.

After studying the pipeline mode some more, I no longer think that's
a great idea. ISTM we want pipeline mode to act as nearly as possible
the same as issuing the same series of queries non-pipelined. But
this redefinition would make it different. So, barring objection,
I'll push that cleanup patch as-posted.

regards, tom lane

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#15)
Re: libpq async duplicate error results

Actually ... it now seems to me that there's live bugs in the
interaction between pipeline mode and error-buffer clearing.
Namely, that places like PQsendQueryStart will unconditionally
clear the buffer even though we may be in the middle of a pipeline
sequence, and the buffer might hold an error that's yet to be
reported to the application. So I think we need something
more like the attached.

regards, tom lane

Attachments:

fix-pipeline-errorMessage-reset-2.patchtext/x-diff; charset=us-ascii; name=fix-pipeline-errorMessage-reset-2.patchDownload+32-19
#17Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#16)
Re: libpq async duplicate error results

I took another look at this. The output from the test program at the
beginning of the thread is now (master branch):

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL: terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: FATAL: terminating connection due to administrator
command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
no connection to the server

It appears the "after" query is getting the error message from the
previous cycle somehow.

The output in PG14 and PG13 is:

command = SELECT 'after';
PQsendQuery() error: no connection to the server

Is the change intended or do we need to think about more tweaking?

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#17)
Re: libpq async duplicate error results

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

I took another look at this. The output from the test program at the
beginning of the thread is now (master branch):
...
command = SELECT 'after';
PQsendQuery() error: FATAL: terminating connection due to administrator
command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
no connection to the server

It appears the "after" query is getting the error message from the
previous cycle somehow.

What is happening is that this bit in PQsendQueryStart is deciding
not to clear the message buffer because conn->cmd_queue_head isn't
NULL:

/*
* If this is the beginning of a query cycle, reset the error state.
* However, in pipeline mode with something already queued, the error
* buffer belongs to that command and we shouldn't clear it.
*/
if (newQuery && conn->cmd_queue_head == NULL)
pqClearConnErrorState(conn);

So apparently the fault is with the pipelined-query logic. I'm
not sure what cleanup step is missing though.

regards, tom lane

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#18)
Re: libpq async duplicate error results

I wrote:

What is happening is that this bit in PQsendQueryStart is deciding
not to clear the message buffer because conn->cmd_queue_head isn't
NULL:

/*
* If this is the beginning of a query cycle, reset the error state.
* However, in pipeline mode with something already queued, the error
* buffer belongs to that command and we shouldn't clear it.
*/
if (newQuery && conn->cmd_queue_head == NULL)
pqClearConnErrorState(conn);

So apparently the fault is with the pipelined-query logic. I'm
not sure what cleanup step is missing though.

I experimented with the attached patch, which is based on the idea
that clearing the command queue is being done in entirely the wrong
place. It's more appropriate to do it where we drop unsent output
data, ie in pqDropConnection not pqDropServerData. The fact that
it was jammed into the latter without any commentary doesn't do much
to convince me that that placement was thought about carefully.

This fixes the complained-of symptom and still passes check-world.
However, I wonder whether cutting the queue down to completely empty
is the right thing. Why is the queue set up like this in the first
place --- that is, why don't we remove an entry the moment the
command is sent? I also wonder whether pipelineStatus ought to
get reset here. We aren't resetting asyncStatus here, so maybe it's
fine, but I'm not quite sure.

regards, tom lane

Attachments:

reset-libpq-command-queue-more-correctly-wip.patchtext/x-diff; charset=us-ascii; name=reset-libpq-command-queue-more-correctly-wip.patchDownload+7-6