NOTIFY in multi-statement PQexec() not sent outside of transaction

Started by John Muehlhausenalmost 6 years ago10 messagesbugs
Jump to latest

macOS 10.13.6 (17G11023)
PostgreSQL 10.5 (Debian 10.5-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

Recreate with the following script. Expected behavior with TEST_DSN set to
your database. To see the broken behavior also set TEST_BREAK=1
---

# python 3.6
# set TEST_DSN env var

import os
import sys
import select
import psycopg2 # 2.8.4
from multiprocessing import Process, Queue

def notifier(input,output):
with psycopg2.connect(os.environ['TEST_DSN']) as conn:
conn.autocommit=True
with conn.cursor() as cur:
input.get()
if 'TEST_BREAK' in os.environ:
# docs seem to indicate that an implied transaction
# will not exist if there is an explicit begin/commit?
cur.execute(("notify __test; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))
else:
# this version works but introduces a race condition
# into my test suite where I am not sure that I am "in"
# a PQexec() but might instead be between such calls
cur.execute("notify __test")
cur.execute(( "select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777)"))
print('notifier unlocked', file=sys.stderr)
output.put(())

with psycopg2.connect(os.environ['TEST_DSN']) as conn:
conn.autocommit=True
with conn.cursor() as cur:
cur.execute("listen __test; select pg_advisory_lock(7777)")

input, output = Queue(), Queue()
p = Process(target=notifier, args=(output,input))
p.start()
output.put(())

print('waiting for notification',file=sys.stderr)
select.select([conn],[],[])
conn.poll()
print('notification received',file=sys.stderr)

cur.execute("select pg_advisory_unlock(7777)")

input.get()
p.join()

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: John Muehlhausen (#1)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

On Monday, April 20, 2020, John Muehlhausen <jgm@jgm.org> wrote:

# docs seem to indicate that an implied transaction
# will not exist if there is an explicit begin/commit?
cur.execute(("notify __test; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))

A more comprehensive reading of the docs finds:

“ If the BEGIN follows some statements that were executed as an implicit
transaction block, those statements are not immediately committed; in
effect, they are retroactively included into the new regular transaction
block. “.

David J.

In reply to: David G. Johnston (#2)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

Wrapping the notify in a transaction does not fix it. Shouldn't the notify
be available as soon as the containing transaction is committed?

cur.execute(("begin; notify __test; commit; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))

On Mon, Apr 20, 2020 at 1:26 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

Show quoted text

On Monday, April 20, 2020, John Muehlhausen <jgm@jgm.org> wrote:

# docs seem to indicate that an implied transaction
# will not exist if there is an explicit begin/commit?
cur.execute(("notify __test; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))

A more comprehensive reading of the docs finds:

“ If the BEGIN follows some statements that were executed as an implicit
transaction block, those statements are not immediately committed; in
effect, they are retroactively included into the new regular transaction
block. “.

David J.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Muehlhausen (#3)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

John Muehlhausen <jgm@jgm.org> writes:

Wrapping the notify in a transaction does not fix it. Shouldn't the notify
be available as soon as the containing transaction is committed?

A notify message will be sent to the client when the backend is
(a) idle (*not* in the middle of processing a transaction string) and
(b) not in an open transaction.

You appear to be unhappy with point (a), but I think changing that
would break more clients than it fixes. It's intentional that
we don't send notifies mid-command, and this sure looks like a
variant of mid-command from here.

regards, tom lane

In reply to: Tom Lane (#4)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

Tom, thanks for the explanation!

My perspective as a libpq user is that multi-statement PQexec() should have
the same effects as multiple PQexec() calls other than for the former
dropping the results of all but the most recent statement. Rule (a)
appears to break this assumption, and at minimum this difference should be
documented?

As a practical matter, what I was trying to do is signal a test case that
Postgres is in the middle of processing a transaction string. Rule (a)
seems to mean that NOTIFY cannot be used for this. Is there some other
mechanism? I don't want to depend on fragile timer-based logic such as
"I've waited long enough so surely it is blocked in PQexec()"... that kind
of thing always gets my test cases into trouble. This particular test case
disrupts TCP during this state to ensure that reset logic works from this
state.

Why do you think that it would break more clients than it fixes? Surely it
is rare to end an explicit transaction containing a notify in a
multi-statement command? For all other cases the behavior would be
unchanged.

Thanks again,
John

On Mon, Apr 20, 2020 at 2:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

John Muehlhausen <jgm@jgm.org> writes:

Wrapping the notify in a transaction does not fix it. Shouldn't the

notify

be available as soon as the containing transaction is committed?

A notify message will be sent to the client when the backend is
(a) idle (*not* in the middle of processing a transaction string) and
(b) not in an open transaction.

You appear to be unhappy with point (a), but I think changing that
would break more clients than it fixes. It's intentional that
we don't send notifies mid-command, and this sure looks like a
variant of mid-command from here.

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Muehlhausen (#5)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

John Muehlhausen <jgm@jgm.org> writes:

My perspective as a libpq user is that multi-statement PQexec() should have
the same effects as multiple PQexec() calls other than for the former
dropping the results of all but the most recent statement.

Well, that's not even close to true, because of the rules about statements
within a multi-statement string getting merged into single transactions.

As a practical matter, what I was trying to do is signal a test case that
Postgres is in the middle of processing a transaction string.

I'm not really sure why you want to go about that in this particular way;
under ordinary circumstances the client app wouldn't see the notify until
after PQexec returned, anyway. (Yes, I realize that there are ways around
that, but still, you can't really do anything with the connection until
PQexec returns.)

Perhaps you could achieve a similar effect using single-row mode,
ie break the connection after collecting a few rows from a multi-row
query? That's not PQexec at all of course, but if you use PQsendQuery
then the server doesn't know the difference.

regards, tom lane

In reply to: Tom Lane (#6)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

My perspective as a libpq user is that multi-statement PQexec() should

have

the same effects as multiple PQexec() calls other than for the former
dropping the results of all but the most recent statement.

Well, that's not even close to true, because of the rules about statements
within a multi-statement string getting merged into single transactions.

My meaning was that with appropriate begin/commit augmentation, the user
should be able to achieve the same effects with multi-statement, and that
Rule (a) makes this unachievable, at least in the case of NOTIFY.

My goal is to write a test case where the service being tested (which is a
child process of the testing process) is known to be blocking in a PQexec
call, and during this exact state the testing process disrupts the TCP
connection between the service and postgres in various ways, to be sure
that the service recovers.

I'm not sure how to know that PQexec() has entered and not yet returned in
my service process other than to examine the effects of a partially
complete PQexec() itself. If I can't use NOTIFY then perhaps I can sniff
for the network traffic sent at the beginning of PQexec(), or perhaps there
is some extra libpq logging I could enable to know that PQexec() is in
process?

Still, even without considering my usage, I find it odd that this would not
be considered a bug, for the reason in my first sentence above. The
current behavior means that the user has to think about transactions "and
something else" when reasoning about the effects of statements. I would
vote to remove the "and something else" which would then remove the need to
augment the current documentation. As it stands, at minimum, the
documentation needs to warn the user that similar notify effects are
unachievable in the multi-statement realm?

-John

#8David G. Johnston
david.g.johnston@gmail.com
In reply to: John Muehlhausen (#5)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

Thanks again,
John

On Mon, Apr 20, 2020 at 2:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Muehlhausen <jgm@jgm.org> writes:

Wrapping the notify in a transaction does not fix it. Shouldn't the

notify

be available as soon as the containing transaction is committed?

A notify message will be sent to the client when the backend is
(a) idle (*not* in the middle of processing a transaction string) and
(b) not in an open transaction.

You appear to be unhappy with point (a), but I think changing that
would break more clients than it fixes. It's intentional that
we don't send notifies mid-command, and this sure looks like a
variant of mid-command from here.

On Mon, Apr 20, 2020 at 12:56 PM John Muehlhausen <jgm@jgm.org> wrote:

Rule (a) appears to break this assumption, and at minimum this difference

should be documented?

It is [1]https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-ASYNC though the comment itself is in a place where LISTEN is the topic
of interest, not NOTIFY. It seems like a similar note or paragraph needs
to be placed where people using LISTEN would be typically looking.

"At present, NotificationResponse can only be sent outside a transaction,
and thus it will not occur in the middle of a command-response series,
though it might occur just before ReadyForQuery."

[1]: https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-ASYNC
https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-ASYNC

Specifically, in 52.2.2.1 Multiple Statements In A Simple Query:

... "Conversely, if a BEGIN appears in a multi-statement Query message,
then it starts a regular transaction block that will only be terminated by
an explicit COMMIT or ROLLBACK, whether that appears in this Query message
or a later one. If the BEGIN follows some statements that were executed as
an implicit transaction block, those statements are not immediately
committed; in effect, they are retroactively included into the new regular
transaction block."

Add something like:

Finally, if multiple explicit transactions are present in a single Simple
Query command then statements that send events that take effect at commit
(i.e., LISTEN and NOTIFY) will not happen until just before the
ReadyToQuery event that ends the current Simple Query command.

(I'm now less certain this is a good change but I'll leave it out there for
discussion)

Then, in the NOTIFY documentation [2]https://www.postgresql.org/docs/12/sql-notify.html:

"Firstly, if a NOTIFY is executed inside a transaction, the notify events
are not delivered until and unless the transaction is committed."

It feels like "... the notify events are not sent until ..." would be
clearer. And then add:

Furthermore, if the transaction containing the NOTIFY is part of a
multi-transaction command the notify event will be sent post-command
completion.

Specifically:

NOTIFY interacts with SQL transactions in some important ways. Firstly, if
a NOTIFY is executed inside a transaction, the notify events are not
delivered until and unless the transaction is committed (Furthermore, if
the transaction containing the NOTIFY is part of a multi-transaction
command the notify event will be sent post-command completion.) This is
appropriate, since if the transaction is aborted, all the commands within
it have had no effect, including NOTIFY. But it can be disconcerting if one
is expecting the notification events to be delivered immediately. Secondly,

[2]: https://www.postgresql.org/docs/12/sql-notify.html

David J.

#9David G. Johnston
david.g.johnston@gmail.com
In reply to: John Muehlhausen (#7)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

On Mon, Apr 20, 2020 at 1:27 PM John Muehlhausen <jgm@jgm.org> wrote:

My perspective as a libpq user is that multi-statement PQexec() should

have

the same effects as multiple PQexec() calls other than for the former
dropping the results of all but the most recent statement.

Still, even without considering my usage, I find it odd that this would
not be considered a bug, for the reason in my first sentence above. The
current behavior means that the user has to think about transactions "and
something else" when reasoning about the effects of statements. I would
vote to remove the "and something else" which would then remove the need to
augment the current documentation. As it stands, at minimum, the
documentation needs to warn the user that similar notify effects are
unachievable in the multi-statement realm?

The deeper into the architecture the change the more resistant to change it
should be. This is pretty low-level and the benefits don't seem that
great. It is functional, and the usage pattern involved, multi-statement
commands, is one that is actively discouraged - so from my perspective
classifying this as a bug that the core team should fix is unappealing. If
someone wants to offer up a feature enhancement patch for critique then
we'll see.

David J.

In reply to: David G. Johnston (#9)
Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

Thanks for the thoughts David. I assume there is a regression test suite
for postgres? In addition to documentation, can a test case be added that
will ensure the present behavior remains unchanged until someone thinks
about fixing it? This will serve as another reminder about the issue?

On Mon, Apr 20, 2020 at 3:39 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

Show quoted text

On Mon, Apr 20, 2020 at 1:27 PM John Muehlhausen <jgm@jgm.org> wrote:

My perspective as a libpq user is that multi-statement PQexec() should

have

the same effects as multiple PQexec() calls other than for the former
dropping the results of all but the most recent statement.

Still, even without considering my usage, I find it odd that this would
not be considered a bug, for the reason in my first sentence above. The
current behavior means that the user has to think about transactions "and
something else" when reasoning about the effects of statements. I would
vote to remove the "and something else" which would then remove the need to
augment the current documentation. As it stands, at minimum, the
documentation needs to warn the user that similar notify effects are
unachievable in the multi-statement realm?

The deeper into the architecture the change the more resistant to change
it should be. This is pretty low-level and the benefits don't seem that
great. It is functional, and the usage pattern involved, multi-statement
commands, is one that is actively discouraged - so from my perspective
classifying this as a bug that the core team should fix is unappealing. If
someone wants to offer up a feature enhancement patch for critique then
we'll see.

David J.