Using PQexecQuery in pipeline mode produces unexpected Close messages

Started by Daniele Varrazzoalmost 4 years ago23 messageshackersbugs
Jump to latest
#1Daniele Varrazzo
daniele.varrazzo@gmail.com
hackersbugs

Hello,

Experimenting with pipeline mode, with libpq 14.2, sometimes we
receive the notice "message type 0x33 arrived from server while idle".
Tested with Postgres server 12 and 14.

This notice is generated by libpq upon receiving messages after using
PQsendQuery(). The libpq trace shows:

F 101 Parse "" "INSERT INTO pq_pipeline_demo(itemno,
int8filler) VALUES (1, 4611686018427387904) RETURNING id" 0
F 12 Bind "" "" 0 0 0
F 6 Describe P ""
F 9 Execute "" 0
F 6 Close P ""
F 4 Flush
B 4 ParseComplete
B 4 BindComplete
B 27 RowDescription 1 "id" 561056 1 23 4 -1 0
B 11 DataRow 1 1 '3'
B 15 CommandComplete "INSERT 0 1"
B 4 CloseComplete
F 4 Sync
B 5 ReadyForQuery I

in the state the server messages are received, CloseComplete is unexpected.

For comparison, PQsendQueryParams() produces the trace:

F 93 Parse "" "INSERT INTO pq_pipeline_demo(itemno,
int8filler) VALUES ($1, $2) RETURNING id" 2 21 20
F 36 Bind "" "" 2 1 1 2 2 '\x00\x01' 8
'@\x00\x00\x00\x00\x00\x00\x00' 1 0
F 6 Describe P ""
F 9 Execute "" 0
F 4 Flush
B 4 ParseComplete
B 4 BindComplete
B 27 RowDescription 1 "id" 561056 1 23 4 -1 0
B 11 DataRow 1 1 '4'
B 15 CommandComplete "INSERT 0 1"
F 4 Sync
B 5 ReadyForQuery I

where no Close is sent.

Is this a problem with PQexecQuery which should not send the Close, or
with receiving in IDLE mode which should expect a CloseComplete?

Should we avoid using PQexecQuery in pipeline mode altogether?

A playground to reproduce the issue is available at
https://github.com/psycopg/psycopg/issues/314

Cheers

-- Daniele

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Daniele Varrazzo (#1)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

On 2022-Jun-08, Daniele Varrazzo wrote:

Is this a problem with PQexecQuery which should not send the Close, or
with receiving in IDLE mode which should expect a CloseComplete?

Interesting.

What that Close message is doing is closing the unnamed portal, which
is otherwise closed implicitly when the next one is opened. That's how
single-query mode works: if you run a single portal, it'll be kept open.

I believe that the right fix is to not send that Close message in
PQsendQuery.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Para tener más hay que desear menos"

#3Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Alvaro Herrera (#2)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

(Moved to -hackers)

At Wed, 8 Jun 2022 17:08:47 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

What that Close message is doing is closing the unnamed portal, which
is otherwise closed implicitly when the next one is opened. That's how
single-query mode works: if you run a single portal, it'll be kept open.

I believe that the right fix is to not send that Close message in
PQsendQuery.

Agreed. At least Close message in that context is useless and
PQsendQueryGuts doesn't send it. And removes the Close message surely
fixes the issue.

The doc [1]https://www.postgresql.org/docs/14/protocol-flow.html says:

[1]: https://www.postgresql.org/docs/14/protocol-flow.html

The simple Query message is approximately equivalent to the series
Parse, Bind, portal Describe, Execute, Close, Sync, using the
unnamed prepared statement and portal objects and no parameters. One

The current implement of PQsendQueryInternal looks like the result of
a misunderstanding of the doc. In the regression tests, that path is
excercised only for an error case, where no CloseComplete comes.

The attached adds a test for the normal-path of pipelined
PQsendQuery() to simple_pipeline test then modifies that function not
to send Close message. Without the fix, the test fails by "unexpected
notice" even if the trace matches the "expected" content.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Remove-useless-Close-message-from-PQsendQuery-in-pip.patchtext/x-patch; charset=us-asciiDownload+68-8
#4Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#3)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Fri, 10 Jun 2022 15:25:44 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

The current implement of PQsendQueryInternal looks like the result of
a misunderstanding of the doc. In the regression tests, that path is
excercised only for an error case, where no CloseComplete comes.

The attached adds a test for the normal-path of pipelined
PQsendQuery() to simple_pipeline test then modifies that function not
to send Close message. Without the fix, the test fails by "unexpected
notice" even if the trace matches the "expected" content.

And, as a matter of course, this fix should be back-patched to 14.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#3)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

On Fri, Jun 10, 2022, at 8:25 AM, Kyotaro Horiguchi wrote:

The current implement of PQsendQueryInternal looks like the result of
a misunderstanding of the doc. In the regression tests, that path is
excercised only for an error case, where no CloseComplete comes.

The attached adds a test for the normal-path of pipelined
PQsendQuery() to simple_pipeline test then modifies that function not
to send Close message. Without the fix, the test fails by "unexpected
notice" even if the trace matches the "expected" content.

Hah, the patch I wrote is almost identical to yours, down to the notice processor counting the number of notices received. The only difference is that I put my test in pipeline_abort.

Sadly, it looks like I won't be able to get this patched pushed for 14.4.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#5)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

=?UTF-8?Q?=C3=81lvaro_Herrera?= <alvherre@alvh.no-ip.org> writes:

Sadly, it looks like I won't be able to get this patched pushed for 14.4.

I think that's a good thing actually; this isn't urgent enough to
risk a last-minute commit. Please wait till the release freeze
lifts.

regards, tom lane

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#3)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

On 2022-Jun-10, Kyotaro Horiguchi wrote:

(Moved to -hackers)

At Wed, 8 Jun 2022 17:08:47 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

What that Close message is doing is closing the unnamed portal, which
is otherwise closed implicitly when the next one is opened. That's how
single-query mode works: if you run a single portal, it'll be kept open.

I believe that the right fix is to not send that Close message in
PQsendQuery.

Agreed. At least Close message in that context is useless and
PQsendQueryGuts doesn't send it. And removes the Close message surely
fixes the issue.

So, git archaeology led me to this thread
/messages/by-id/202106072107.d4i55hdscxqj@alvherre.pgsql
which is why we added that message in the first place.

I was about to push the attached patch (a merged version of Kyotaro's
and mine), but now I'm wondering if that's the right approach.

Alternatives:

- Have the client not complain if it gets CloseComplete in idle state.
(After all, it's a pretty useless message, since we already do nothing
with it if we get it in BUSY state.)

- Have the server not send CloseComplete at all in the cases where
the client is not expecting it. Not sure how this would be
implemented.

- Other ideas?

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"That sort of implies that there are Emacs keystrokes which aren't obscure.
I've been using it daily for 2 years now and have yet to discover any key
sequence which makes any sense." (Paul Thomas)

Attachments:

v2-0001-PQsendQuery-Don-t-send-Close-in-pipeline-mode.patchtext/x-diff; charset=us-asciiDownload+74-8
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#7)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

So, git archaeology led me to this thread
/messages/by-id/202106072107.d4i55hdscxqj@alvherre.pgsql
which is why we added that message in the first place.

Um. Good thing you looked. I doubt we want to revert that change now.

Alternatives:
- Have the client not complain if it gets CloseComplete in idle state.
(After all, it's a pretty useless message, since we already do nothing
with it if we get it in BUSY state.)

ISTM the actual problem here is that we're reverting to IDLE state too
soon. I didn't try to trace down exactly where that's happening, but
I notice that in the non-pipeline case we don't go to IDLE till we've
seen 'Z' (Sync). Something in the pipeline logic must be jumping the
gun on that state transition.

regards, tom lane

#9Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Tom Lane (#8)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Wed, 15 Jun 2022 14:56:42 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

So, git archaeology led me to this thread
/messages/by-id/202106072107.d4i55hdscxqj@alvherre.pgsql
which is why we added that message in the first place.

Um. Good thing you looked. I doubt we want to revert that change now.

Alternatives:
- Have the client not complain if it gets CloseComplete in idle state.
(After all, it's a pretty useless message, since we already do nothing
with it if we get it in BUSY state.)

ISTM the actual problem here is that we're reverting to IDLE state too
soon. I didn't try to trace down exactly where that's happening, but

Yes. I once visited that fact but also I thought that in the
comparison with non-pipelined PQsendQuery, the three messages look
extra. Thus I concluded (at the time) that removing Close is enough
here.

I notice that in the non-pipeline case we don't go to IDLE till we've
seen 'Z' (Sync). Something in the pipeline logic must be jumping the
gun on that state transition.

PQgetResult() resets the state to IDLE when not in pipeline mode.

fe-exec.c:2171

if (conn->pipelineStatus != PQ_PIPELINE_OFF)
{
/*
* We're about to send the results of the current query. Set
* us idle now, and ...
*/
conn->asyncStatus = PGASYNC_IDLE;

And actually that code let the connection state enter to IDLE before
CloseComplete. In the test case I posted, the following happens.

PQsendQuery(conn, "SELECT 1;");
PQsendFlushRequest(conn);
PQgetResult(conn); // state enters IDLE, reads down to <CommandComplete>
PQgetResult(conn); // reads <CloseComplete comes>
PQpipelineSync(conn); // sync too late

Pipeline feature seems intending to allow PQgetResult called before
PQpipelineSync. And also seems allowing to call QPpipelineSync() after
PQgetResult().

I haven't come up with a valid *fix* of this flow..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#10Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#9)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Thu, 16 Jun 2022 10:34:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Wed, 15 Jun 2022 14:56:42 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

So, git archaeology led me to this thread
/messages/by-id/202106072107.d4i55hdscxqj@alvherre.pgsql
which is why we added that message in the first place.

Um. Good thing you looked. I doubt we want to revert that change now.

Alternatives:
- Have the client not complain if it gets CloseComplete in idle state.
(After all, it's a pretty useless message, since we already do nothing
with it if we get it in BUSY state.)

ISTM the actual problem here is that we're reverting to IDLE state too
soon. I didn't try to trace down exactly where that's happening, but

Yes. I once visited that fact but also I thought that in the
comparison with non-pipelined PQsendQuery, the three messages look
extra. Thus I concluded (at the time) that removing Close is enough
here.

I notice that in the non-pipeline case we don't go to IDLE till we've
seen 'Z' (Sync). Something in the pipeline logic must be jumping the
gun on that state transition.

- PQgetResult() resets the state to IDLE when not in pipeline mode.

D... the "not" should not be there.

+ PQgetResult() resets the state to IDLE while in pipeline mode.

fe-exec.c:2171

if (conn->pipelineStatus != PQ_PIPELINE_OFF)
{
/*
* We're about to send the results of the current query. Set
* us idle now, and ...
*/
conn->asyncStatus = PGASYNC_IDLE;

And actually that code let the connection state enter to IDLE before
CloseComplete. In the test case I posted, the following happens.

PQsendQuery(conn, "SELECT 1;");
PQsendFlushRequest(conn);
PQgetResult(conn); // state enters IDLE, reads down to <CommandComplete>
PQgetResult(conn); // reads <CloseComplete comes>
PQpipelineSync(conn); // sync too late

Pipeline feature seems intending to allow PQgetResult called before
PQpipelineSync. And also seems allowing to call QPpipelineSync() after
PQgetResult().

I haven't come up with a valid *fix* of this flow..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#11Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#9)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Thu, 16 Jun 2022 10:34:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

PQgetResult() resets the state to IDLE while in pipeline mode.

fe-exec.c:2171

if (conn->pipelineStatus != PQ_PIPELINE_OFF)
{
/*
* We're about to send the results of the current query. Set
* us idle now, and ...
*/
conn->asyncStatus = PGASYNC_IDLE;

And actually that code let the connection state enter to IDLE before
CloseComplete. In the test case I posted, the following happens.

PQsendQuery(conn, "SELECT 1;");
PQsendFlushRequest(conn);
PQgetResult(conn); // state enters IDLE, reads down to <CommandComplete>
PQgetResult(conn); // reads <CloseComplete comes>
PQpipelineSync(conn); // sync too late

Pipeline feature seems intending to allow PQgetResult called before
PQpipelineSync. And also seems allowing to call QPpipelineSync() after
PQgetResult().

I haven't come up with a valid *fix* of this flow..

The attached is a crude patch to separate the state for PIPELINE-IDLE
from PGASYNC_IDLE. I haven't found a better way..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

libpqpline-test.difftext/x-patch; charset=us-asciiDownload+72-1
libpqpline-separte-pipelineidle-state.difftext/x-patch; charset=us-asciiDownload+25-25
#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#11)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

On 2022-Jun-16, Kyotaro Horiguchi wrote:

The attached is a crude patch to separate the state for PIPELINE-IDLE
from PGASYNC_IDLE. I haven't found a better way..

Ah, yeah, this might be a way to fix this.

Something very similar to a PIPELINE_IDLE mode was present in Craig's
initial patch for pipeline mode. However, I fought very hard to remove
it, because it seemed to me that failing to handle it correctly
everywhere would lead to more bugs than not having it. (Indeed, there
were some.)

However, I see now that your patch would not only fix this bug, but also
let us remove the ugly "notionally not-idle" bit in fe-protocol3.c,
which makes me ecstatic. So let's push forward with this. However,
this means we'll have to go over all places that use asyncStatus to
ensure that they all handle the new value correctly.

I did found one bug in your patch: in the switch for asyncStatus in
PQsendQueryStart, you introduce a new error message. With the current
tests, that never fires, which is telling us that our coverage is not
complete. But with the right sequence of libpq calls, which the
attached adds (note that it's for REL_14_STABLE), that can be hit, and
it's easy to see that throwing an error there is a mistake. The right
action to take there is to let the action through.

Others to think about:

PQisBusy (I think no changes are needed),
PQfn (I think it should accept a call in PGASYNC_PIPELINE_IDLE mode;
fully untested in pipeline mode),
PQexitPipelineMode (I think it needs to return error; needs test case),
PQsendFlushRequest (I think it should let through; ditto).

I also attach a patch to make the test suite use Test::Differences, if
available. It makes the diffs of the traces much easier to read, when
they fail. (I wish for a simple way to set the context size, but that
would need a shim routine that I'm currently too lazy to write.)

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

Attachments:

v5-0001-Avoid-going-IDLE-in-pipeline-mode.patchtext/x-diff; charset=us-asciiDownload+166-32
v5-0002-Use-Test-Differences-if-available.patchtext/x-diff; charset=us-asciiDownload+15-2
#13Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Alvaro Herrera (#12)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Fri, 17 Jun 2022 20:31:50 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2022-Jun-16, Kyotaro Horiguchi wrote:

The attached is a crude patch to separate the state for PIPELINE-IDLE
from PGASYNC_IDLE. I haven't found a better way..

Ah, yeah, this might be a way to fix this.

Something very similar to a PIPELINE_IDLE mode was present in Craig's
initial patch for pipeline mode. However, I fought very hard to remove
it, because it seemed to me that failing to handle it correctly
everywhere would lead to more bugs than not having it. (Indeed, there
were some.)

However, I see now that your patch would not only fix this bug, but also
let us remove the ugly "notionally not-idle" bit in fe-protocol3.c,
which makes me ecstatic. So let's push forward with this. However,

Yey.

this means we'll have to go over all places that use asyncStatus to
ensure that they all handle the new value correctly.

Sure.

I did found one bug in your patch: in the switch for asyncStatus in
PQsendQueryStart, you introduce a new error message. With the current
tests, that never fires, which is telling us that our coverage is not
complete. But with the right sequence of libpq calls, which the
attached adds (note that it's for REL_14_STABLE), that can be hit, and

# (ah, I wondered why it failed to apply..)

it's easy to see that throwing an error there is a mistake. The right
action to take there is to let the action through.

Yeah.. Actulallly I really did it carelessly.. Thanks!

Others to think about:

PQisBusy (I think no changes are needed),

Agreed.

PQfn (I think it should accept a call in PGASYNC_PIPELINE_IDLE mode;
fully untested in pipeline mode),

Does a PQ_PIPELINE_OFF path need that? Rather I thought that we need
Assert(!conn->asyncStatus != PGASYNC_PIPELINE_IDLE) there. But anyway
we might need a test for this path.

PQexitPipelineMode (I think it needs to return error; needs test case),

Agreed about test case. Currently the function doesn't handle
PGASYNC_IDLE so I thought that PGASYNC_PIPELINE_IDLE also don't need a
care. If the function treats PGASYNC_PIPELINE_IDLE state as error,
the regression test fails (but I haven't examine it furtuer.)

PQsendFlushRequest (I think it should let through; ditto).

Does that mean exit without pushing 'H' message?

I also attach a patch to make the test suite use Test::Differences, if
available. It makes the diffs of the traces much easier to read, when
they fail. (I wish for a simple way to set the context size, but that
would need a shim routine that I'm currently too lazy to write.)

Yeah, it was annoying that the script prints expected and result trace
separately. It looks pretty good with the patch. I don't think
there's much use of context size here.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#14Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#13)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Tue, 21 Jun 2022 11:42:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Fri, 17 Jun 2022 20:31:50 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

Others to think about:

PQisBusy (I think no changes are needed),

Agreed.

PQfn (I think it should accept a call in PGASYNC_PIPELINE_IDLE mode;
fully untested in pipeline mode),

Does a PQ_PIPELINE_OFF path need that? Rather I thought that we need
Assert(!conn->asyncStatus != PGASYNC_PIPELINE_IDLE) there. But anyway
we might need a test for this path.

In the attached, PQfn() is used while in pipeline mode and
PGASYNC_PIPELINE_IDLE. Both error out and effectivelly no-op.

PQexitPipelineMode (I think it needs to return error; needs test case),

Agreed about test case. Currently the function doesn't handle
PGASYNC_IDLE so I thought that PGASYNC_PIPELINE_IDLE also don't need a
care. If the function treats PGASYNC_PIPELINE_IDLE state as error,
the regression test fails (but I haven't examine it furtuer.)

PQexitPipelineMode() is called while PGASYNC_PIPELINE_IDLE.

PQsendFlushRequest (I think it should let through; ditto).

Does that mean exit without pushing 'H' message?

I didn't do anything on this in the sttached.

By the way, I noticed that "libpq_pipeline uniqviol" intermittently
fails for uncertain reasons.

result 574/575: pipeline aborted
...........................................................
done writing

libpq_pipeline:1531: got unexpected NULL

The "...........done writing" is printed too late in the error cases.

This causes the TAP test fail, but I haven't find what's happnening at
the time.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

libpq_pline_add_some_tests.difftext/x-patch; charset=us-asciiDownload+48-0
#15Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#14)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Tue, 21 Jun 2022 14:56:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

By the way, I noticed that "libpq_pipeline uniqviol" intermittently
fails for uncertain reasons.

result 574/575: pipeline aborted
...........................................................
done writing

libpq_pipeline:1531: got unexpected NULL

The "...........done writing" is printed too late in the error cases.

This causes the TAP test fail, but I haven't find what's happnening at
the time.

Just to make sure, I see this with the master branch

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#15)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Tue, 21 Jun 2022 14:59:07 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

At Tue, 21 Jun 2022 14:56:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

By the way, I noticed that "libpq_pipeline uniqviol" intermittently
fails for uncertain reasons.

result 574/575: pipeline aborted
...........................................................
done writing

libpq_pipeline:1531: got unexpected NULL

The "...........done writing" is printed too late in the error cases.

This causes the TAP test fail, but I haven't find what's happnening at
the time.

Just to make sure, I see this with the master branch

No. It *is* caused by the fix. Sorry for the mistake. The test module
linked to the wrong binary..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#17Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#14)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

At Tue, 21 Jun 2022 14:56:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

By the way, I noticed that "libpq_pipeline uniqviol" intermittently
fails for uncertain reasons.

result 574/575: pipeline aborted
...........................................................
done writing

libpq_pipeline:1531: got unexpected NULL

PQsendQueryPrepared() is called after the conection's state has moved
to PGASYNC_IDLE so PQgetResult returns NULL. But actually there are
results. So, if pqPipelineProcessorQueue() doesn't move the async
state to PGASYNC_IDLE when queue is emtpy, uniqviol can run till the
end. But that change breaks almost all of other test items.

Finally, I found that the change in pqPipelineProcessorQueue() as
attached fixes the uniqviol failure and doesn't break other tests.
However, I don't understand what I did by the change for now... X(
It seems to me something's wrong in the PQ_PIPELINE_ABORTED mode..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v6-0001-Avoid-going-IDLE-in-pipeline-mode.patchtext/x-patch; charset=us-asciiDownload+175-34
#18Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#17)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

So I wrote some more test scenarios for this, and as I wrote in some
other thread, I realized that there are more problems than just some
NOTICE trouble. For instance, if you send a query, then read the result
but not the terminating NULL then send another query, everything gets
confused; the next thing you'll read is the result for the second query,
without having read the NULL that terminates the results of the first
query. Any application that expects the usual flow of results will be
confused. Kyotaro's patch to add PIPELINE_IDLE fixes this bit too, as
far as I can tell.

However, another problem case, not fixed by PIPELINE_IDLE, occurs if you
exit pipeline mode after PQsendQuery() and then immediately use
PQexec(). The CloseComplete will be received at the wrong time, and a
notice is emitted nevertheless.

I spent a lot of time trying to understand how to fix this last bit, and
the solution I came up with is that PQsendQuery() must add a second
entry to the command queue after the PGQUERY_EXTENDED one, to match the
CloseComplete message being expected; with that entry in the queue,
PQgetResult will really only get to IDLE mode after the Close has been
seen, which is what we want. I named it PGQUERY_CLOSE.

Sadly, some hacks are needed to make this work fully:

1. the client is never expecting that PQgetResult() would return
anything for the CloseComplete, so something needs to consume the
CloseComplete silently (including the queue entry for it) when it is
received; I chose to do this directly in pqParseInput3. I tried to
make PQgetResult itself do it, but it became a pile of hacks until I
was no longer sure what was going on. Putting it in fe-protocol3.c
ends up a lot cleaner. However, we still need PQgetResult to invoke
parseInput() at the point where Close is expected.

2. if an error occurs while executing the query, the CloseComplete will
of course never arrive, so we need to erase it from the queue
silently if we're returning an error.

I toyed with the idea of having parseInput() produce a PGresult that
encodes the Close message, and have PQgetResult consume and discard
that, then read some further message to have something to return. But
it seemed inefficient and equally ugly and I'm not sure that flow
control is any simpler.

I think another possibility would be to make PQexitPipelineMode
responsible for /something/, but I'm not sure what that would be.
Checking the queue and seeing if the next message is CloseComplete, then
eating that message before exiting pipeline mode? That seems way too
magical. I didn't attempt this.

I attach a patch series that implements the proposed fix (again for
REL_14_STABLE) in steps, to make it easy to read. I intend to squash
the whole lot into a single commit before pushing. But while writing
this email it occurred to me that I need to add at least one more test,
to receive a WARNING while waiting for CloseComplete. AFAICT it should
work, but better make sure.

I produced pipeline_idle.trace file by running the test in the fully
fixed tree, then used it to verify that all tests fail in different ways
when run without the fixes. The first fix with PIPELINE_IDLE fixes some
of these failures, and the PGQUERY_CLOSE one fixes the remaining one.
Reading the trace file, it looks correct to me.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem." (Tom Lane)

Attachments:

v7-0001-Use-Test-Differences-if-available.patchtext/x-diff; charset=us-asciiDownload+15-2
v7-0002-Allow-tracefile-to-be-stdout.patchtext/x-diff; charset=us-asciiDownload+4-2
v7-0003-if-the-queue-becomes-empty-reset-the-tail-too.patchtext/x-diff; charset=us-asciiDownload+4-1
v7-0004-add-some-tests-maybe-wrong.patchtext/x-diff; charset=us-asciiDownload+235-1
v7-0005-add-PIPELINE_IDLE-state.patchtext/x-diff; charset=us-asciiDownload+44-31
v7-0006-Add-the-CLOSE-message-fix.patchtext/x-diff; charset=us-asciiDownload+56-3
#19Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Alvaro Herrera (#18)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

Thanks for the further testing scenario.

At Wed, 29 Jun 2022 14:09:17 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

So I wrote some more test scenarios for this, and as I wrote in some
other thread, I realized that there are more problems than just some
NOTICE trouble. For instance, if you send a query, then read the result
but not the terminating NULL then send another query, everything gets
confused; the next thing you'll read is the result for the second query,
without having read the NULL that terminates the results of the first
query. Any application that expects the usual flow of results will be
confused. Kyotaro's patch to add PIPELINE_IDLE fixes this bit too, as
far as I can tell.

However, another problem case, not fixed by PIPELINE_IDLE, occurs if you
exit pipeline mode after PQsendQuery() and then immediately use
PQexec(). The CloseComplete will be received at the wrong time, and a
notice is emitted nevertheless.

Mmm. My patch moves the point of failure of the scenario a bit but
still a little short. However, as my understanding, it seems like the
task of the PQpipelineSync()-PQgetResult() pair to consume the
CloseComplete. If Iinserted PQpipelineSync() just after PQsendQuery()
and called PQgetResult() for PGRES_PIPELINE_SYNC before
PQexitPipelineMode(), the out-of-sync CloseComplete is not seen in the
scenario. But if it is right, I'd like to complain about the
obscure-but-stiff protocol of pipleline mode..

I spent a lot of time trying to understand how to fix this last bit, and
the solution I came up with is that PQsendQuery() must add a second
entry to the command queue after the PGQUERY_EXTENDED one, to match the
CloseComplete message being expected; with that entry in the queue,
PQgetResult will really only get to IDLE mode after the Close has been
seen, which is what we want. I named it PGQUERY_CLOSE.

Sadly, some hacks are needed to make this work fully:

1. the client is never expecting that PQgetResult() would return
anything for the CloseComplete, so something needs to consume the
CloseComplete silently (including the queue entry for it) when it is
received; I chose to do this directly in pqParseInput3. I tried to
make PQgetResult itself do it, but it became a pile of hacks until I
was no longer sure what was going on. Putting it in fe-protocol3.c
ends up a lot cleaner. However, we still need PQgetResult to invoke
parseInput() at the point where Close is expected.

2. if an error occurs while executing the query, the CloseComplete will
of course never arrive, so we need to erase it from the queue
silently if we're returning an error.

I toyed with the idea of having parseInput() produce a PGresult that
encodes the Close message, and have PQgetResult consume and discard
that, then read some further message to have something to return. But
it seemed inefficient and equally ugly and I'm not sure that flow
control is any simpler.

I think another possibility would be to make PQexitPipelineMode
responsible for /something/, but I'm not sure what that would be.
Checking the queue and seeing if the next message is CloseComplete, then
eating that message before exiting pipeline mode? That seems way too
magical. I didn't attempt this.

I attach a patch series that implements the proposed fix (again for
REL_14_STABLE) in steps, to make it easy to read. I intend to squash
the whole lot into a single commit before pushing. But while writing
this email it occurred to me that I need to add at least one more test,
to receive a WARNING while waiting for CloseComplete. AFAICT it should
work, but better make sure.

I produced pipeline_idle.trace file by running the test in the fully

By the perl script doesn't produce the trace file since the list in
$cmptrace line doesn't contain pipleline_idle..

fixed tree, then used it to verify that all tests fail in different ways
when run without the fixes. The first fix with PIPELINE_IDLE fixes some
of these failures, and the PGQUERY_CLOSE one fixes the remaining one.
Reading the trace file, it looks correct to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#20Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kyotaro Horiguchi (#19)
hackersbugs
Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

On 2022-Jul-04, Kyotaro Horiguchi wrote:

At Wed, 29 Jun 2022 14:09:17 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

However, another problem case, not fixed by PIPELINE_IDLE, occurs if you
exit pipeline mode after PQsendQuery() and then immediately use
PQexec(). The CloseComplete will be received at the wrong time, and a
notice is emitted nevertheless.

Mmm. My patch moves the point of failure of the scenario a bit but
still a little short. However, as my understanding, it seems like the
task of the PQpipelineSync()-PQgetResult() pair to consume the
CloseComplete. If Iinserted PQpipelineSync() just after PQsendQuery()
and called PQgetResult() for PGRES_PIPELINE_SYNC before
PQexitPipelineMode(), the out-of-sync CloseComplete is not seen in the
scenario. But if it is right, I'd like to complain about the
obscure-but-stiff protocol of pipleline mode..

Yeah, if you introduce PQpipelineSync then I think it'll work okay, but
my point here was to make it work without requiring that; that's why I
wrote the test to use PQsendFlushRequest instead.

BTW I patch for the problem with uniqviol also (not fixed by v7). I'll
send an updated patch in a little while.

I produced pipeline_idle.trace file by running the test in the fully

By the perl script doesn't produce the trace file since the list in
$cmptrace line doesn't contain pipleline_idle..

Ouch, of course, thanks for noticing.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#20)
hackersbugs
#22Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#21)
hackersbugs
#23Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Alvaro Herrera (#20)
hackersbugs