BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events
The following bug has been logged on the website:
Bug reference: 15293
Logged by: Michael Powers
Email address: michael.paul.powers@gmail.com
PostgreSQL version: 10.4
Operating system: Ubuntu 18.04 Desktop x64 - Linux 4.15.0-29-generic
Description:
When using logical replication a stored procedure executed on the replica is
unable to use NOTIFY to send messages to other listeners. The stored
procedure does execute as expected but the pg_notify() doesn't appear to
have any effect. If an insert is run on the replica side the trigger
executes the stored procedure as expected and the NOTIFY correctly notifies
listeners.
Steps to Reproduce:
Set up Master:
CREATE TABLE test (id SERIAL PRIMARY KEY, msg TEXT NOT NULL);
CREATE PUBLICATION testpub FOR TABLE test;
Set up Replica:
CREATE TABLE test (id SERIAL PRIMARY KEY, msg TEXT NOT NULL);
CREATE SUBSCRIPTION testsub CONNECTION 'host=192.168.0.136 user=test
password=test' PUBLICATION testpub;
CREATE OR REPLACE FUNCTION notify_channel() RETURNS trigger AS $$
BEGIN
RAISE LOG 'Notify Triggered';
PERFORM pg_notify('testchannel', 'Testing');
RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';
DROP TRIGGER queue_insert ON TEST;
CREATE TRIGGER queue_insert AFTER INSERT ON test FOR EACH ROW EXECUTE
PROCEDURE notify_channel();
ALTER TABLE test ENABLE ALWAYS TRIGGER queue_insert;
LISTEN testchannel;
Run the following insert on the master:
INSERT INTO test (msg) VALUES ('test');
In postgresql-10-main.log I get the following:
2018-07-24 07:45:15.705 EDT [6701] LOG: 00000: Notify Triggered
2018-07-24 07:45:15.705 EDT [6701] CONTEXT: PL/pgSQL function
notify_channel() line 3 at RAISE
2018-07-24 07:45:15.705 EDT [6701] LOCATION: exec_stmt_raise,
pl_exec.c:3337
But no listeners receive the message. However if an insert is run directly
on the replica:
# INSERT INTO test VALUES (99999, 'test');
INSERT 0 1
Asynchronous notification "testchannel" with payload "Testing" received from
server process with PID 6701.
Asynchronous notification "testchannel" with payload "Testing" received from
server process with PID 6701.
Asynchronous notification "testchannel" with payload "Testing" received from
server process with PID 6701.
Asynchronous notification "testchannel" with payload "Testing" received from
server process with PID 6701.
Asynchronous notification "testchannel" with payload "Testing" received from
server process with PID 6701.
Asynchronous notification "testchannel" with payload "Testing" received from
server process with PID 9992.
Backed up notifications are received for previous NOTIFY's.
Hello
Thank you for report
I checked this bug and found reason: we do not notify backends about new events by call ProcessCompletedNotifies from logical worker.
New notify from regular backend does call ProcessCompletedNotifies: send signal to all listen backends and found new events for youself.
But i am not sure where is correct place for call ProcessCompletedNotifies in logical worker src/backend/replication/logical/worker.c and i can not provide patch.
regards, Sergei
Hi,
On 2018-07-24 16:19:45 +0300, Sergei Kornilov wrote:
I checked this bug and found reason: we do not notify backends about new events by call ProcessCompletedNotifies from logical worker.
New notify from regular backend does call ProcessCompletedNotifies: send signal to all listen backends and found new events for youself.
But i am not sure where is correct place for call ProcessCompletedNotifies in logical worker src/backend/replication/logical/worker.c and i can not provide patch.
Peter, Petr, this is the second report of this issue. Anything?
Greetings,
Andres Freund
If Sergei is correct, I would volunteer to work on the patch. I am
completely new to the codebase but this issue affects me. According to the
documentation for `ProcessCompletedNotifies()` it should be called just
before going idle... so perhaps in src/backend/replication/logical/worker.c
at the tail end of `apply_handle_commit`? Again.. just looking at the
codebase today so if this is beyond beginner level I will assist w/ testing
instead.
On Tue, Jul 24, 2018 at 11:58 AM, Andres Freund <andres@anarazel.de> wrote:
Show quoted text
Hi,
On 2018-07-24 16:19:45 +0300, Sergei Kornilov wrote:
I checked this bug and found reason: we do not notify backends about new
events by call ProcessCompletedNotifies from logical worker.
New notify from regular backend does call ProcessCompletedNotifies: send
signal to all listen backends and found new events for youself.
But i am not sure where is correct place for call
ProcessCompletedNotifies in logical worker src/backend/replication/logical/worker.c
and i can not provide patch.Peter, Petr, this is the second report of this issue. Anything?
Greetings,
Andres Freund
Hello
in fact, I've already tried to build fix. Adding ProcessCompletedNotifies to apply_handle_commit fixed this issue and i think this is right place. In src/backend/tcop/postgres.c we call ProcessCompletedNotifies similar way after commit. This change pass make check-world.
So i attach my two line patch.
regards, Sergei
Attachments:
bug15293.patchtext/x-diff; name=bug15293.patchDownload+2-0
Hi Tom, Peter,
On 2018-07-24 21:22:18 +0300, Sergei Kornilov wrote:
in fact, I've already tried to build fix. Adding ProcessCompletedNotifies to apply_handle_commit fixed this issue and i think this is right place. In src/backend/tcop/postgres.c we call ProcessCompletedNotifies similar way after commit. This change pass make check-world.
So i attach my two line patch.
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 6ca6cdc..e54bd90 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -37,6 +37,7 @@#include "commands/tablecmds.h"
#include "commands/trigger.h"
+#include "commands/async.h"#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -490,6 +491,7 @@ apply_handle_commit(StringInfo s)
replorigin_session_origin_timestamp = commit_data.committime;CommitTransactionCommand();
+ ProcessCompletedNotifies();
pgstat_report_stat(false);store_flush_position(commit_data.end_lsn);
That's probably reasonable for the back branches (although I'd put the
store_flush_position before).
But I wonder if we shouldn't actually move the signalling part of
ProcessCompletedNotifies() into CommitTransactionCommand() in v11. Given
that transactions can now commit without a ready command being sent, due
to the addition of procedures, that kind of seems necessary?
Greetings,
Andres Freund
Hi Everyone,
Thanks for the attention. I've tested Sergei's patch and it does appear to
resolve the issue for me.
Thanks!
Michael Powers
Import Notes
Resolved by subject fallback
Andres Freund <andres@anarazel.de> writes:
But I wonder if we shouldn't actually move the signalling part of
ProcessCompletedNotifies() into CommitTransactionCommand() in v11. Given
that transactions can now commit without a ready command being sent, due
to the addition of procedures, that kind of seems necessary?
Hrm. I have a nasty feeling that that code is dependent on being executed
at the outermost logic level. In particular, ProcessCompletedNotifies
calls CommitTransactionCommand itself, so your proposal will create
infinite recursion. There may be some other issues too.
Another question that needs consideration is whether an internal commit
should lead to immediate distribution of notifies to our own client.
I think it probably mustn't; from the standpoint of the client, its
originally-asked-for xact is still in progress, and it's not going to
expect any notifies until that ends. So the proposed change is just
wrong if you ask me.
I agree we need some serious rethinking here. Maybe the fix will end
up being just a few lines, but it might take significant restructuring
too.
regards, tom lane
Hi,
On 2018-07-24 17:43:30 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
But I wonder if we shouldn't actually move the signalling part of
ProcessCompletedNotifies() into CommitTransactionCommand() in v11. Given
that transactions can now commit without a ready command being sent, due
to the addition of procedures, that kind of seems necessary?Hrm. I have a nasty feeling that that code is dependent on being executed
at the outermost logic level. In particular, ProcessCompletedNotifies
calls CommitTransactionCommand itself, so your proposal will create
infinite recursion. There may be some other issues too.
Yea, I don't think we could do this without separating concerns in
ProcessCompletedNotifies().
Another question that needs consideration is whether an internal commit
should lead to immediate distribution of notifies to our own client.
I think it probably mustn't; from the standpoint of the client, its
originally-asked-for xact is still in progress, and it's not going to
expect any notifies until that ends.
Yea, I agree on that.
So the proposed change is just wrong if you ask me.
I was only proposing to move the signalling part of
ProcessCompletedNotifies() into CommitTransactionCommand(), not the part
that processes notifications for the currentbackend - so I don't think
we actually disagree?
I agree we need some serious rethinking here. Maybe the fix will end
up being just a few lines, but it might take significant restructuring
too.
Yea :(. I think we need to separate the SignalBackend() part into
transaction commit, but leave the remainder of
ProcessCompletedNotifies() to be done in outer loops like
PostgresMain(). I'm not quite sure if there's a good way to handle the
fact that currently the asyncQueueAdvanceTail() call depends on
SignalBackend()'s return value. We probably don't want to do that work
inside the CommitTransactionCommand() - i guess we could move to just
doing it independent of SignalBackend()?
One other thing, somewhat independent, I wonder is if it's actually
problematic that we don't do ProcessCompletedNotifies() in a bunch of
processes, because that means we'll not necessarily call
asyncQueueAdvanceTail(). Perhaps that means we actually *do* need to do
it around CommitTransactionCommand()?
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
One other thing, somewhat independent, I wonder is if it's actually
problematic that we don't do ProcessCompletedNotifies() in a bunch of
processes, because that means we'll not necessarily call
asyncQueueAdvanceTail(). Perhaps that means we actually *do* need to do
it around CommitTransactionCommand()?
As far as that goes, we should probably ensure that a process that hasn't
executed any LISTENs is ignored for purposes of whether to advance the
queue tail. I think it might be like that already.
regards, tom lane
On 2018-07-24 18:01:33 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
One other thing, somewhat independent, I wonder is if it's actually
problematic that we don't do ProcessCompletedNotifies() in a bunch of
processes, because that means we'll not necessarily call
asyncQueueAdvanceTail(). Perhaps that means we actually *do* need to do
it around CommitTransactionCommand()?As far as that goes, we should probably ensure that a process that hasn't
executed any LISTENs is ignored for purposes of whether to advance the
queue tail. I think it might be like that already.
It indeed is:
min = QUEUE_HEAD;
for (i = 1; i <= MaxBackends; i++)
{
if (QUEUE_BACKEND_PID(i) != InvalidPid)
min = QUEUE_POS_MIN(min, QUEUE_BACKEND_POS(i));
}
what I am wondering is what happens if there's a background worker (like
the replication worker, but it could be other things too) that queues
notifications, but no normal backends are actually listening. As far as
I can tell, in that case we'd continue to queue stuff into the slru, but
wouldn't actually clean things up until a normal session gets around to
it? Which might be a while, on e.g. a logical replica.
Greetings,
Andres Freund
I have reproduced this bug on PostgreSQL version 10.7 and 11.2 using the
steps described in Michael Powers' original report. The issue also still
seems to be present even with the patch provided by Sergei Kornilov.
Are there plans to address this issue any time soon or is there some way
I can assist in fixing it? It would be great to have notifications from
logical replication.
Regards,
Robert Welin
If you are trying to get around the issue for now, what my team did was
cron an insert statement on the database server. We have a queue table that
has some of these triggers setup so it was easy to write a no-op row to the
queue. This had the side effect of flushing the notification queue.
We haven't had any issues with this approach.
I can also volunteer to help test out patches.
Thanks,
-mdj
On Fri, Feb 22, 2019 at 11:37 AM Robert Welin <robert@vaion.com> wrote:
Show quoted text
I have reproduced this bug on PostgreSQL version 10.7 and 11.2 using the
steps described in Michael Powers' original report. The issue also still
seems to be present even with the patch provided by Sergei Kornilov.Are there plans to address this issue any time soon or is there some way
I can assist in fixing it? It would be great to have notifications from
logical replication.Regards,
Robert Welin
On 2019-Feb-22, Robert Welin wrote:
I have reproduced this bug on PostgreSQL version 10.7 and 11.2 using the
steps described in Michael Powers' original report. The issue also still
seems to be present even with the patch provided by Sergei Kornilov.Are there plans to address this issue any time soon or is there some way
I can assist in fixing it? It would be great to have notifications from
logical replication.
This issue seems largely forgotten about :-(
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Feb 22, 2019 at 11:37 AM Robert Welin <robert(at)vaion(dot)com>
wrote:
I have reproduced this bug on PostgreSQL version 10.7 and 11.2 using the
steps described in Michael Powers' original report. The issue also still
seems to be present even with the patch provided by Sergei Kornilov.Are there plans to address this issue any time soon or is there some way
I can assist in fixing it? It would be great to have notifications from
logical replication.Same, I've reproduced this on 11.4 and 10.9. Are there any plans to
address this?
If there's nothing in the short-term, can we get a caveat in the
documentation for notify/logical
replication explaining this shortcoming?
--
Alan Kleiman
alan.kleiman@ifood.com.br
On Fri, 2019-06-28 at 17:22 -0400, Alvaro Herrera wrote:
On 2019-Feb-22, Robert Welin wrote:
I have reproduced this bug on PostgreSQL version 10.7 and 11.2
using the
steps described in Michael Powers' original report. The issue also
still
seems to be present even with the patch provided by Sergei
Kornilov.Are there plans to address this issue any time soon or is there
some way
I can assist in fixing it? It would be great to have notifications
from
logical replication.This issue seems largely forgotten about :-(
Are there any news to this issue ?
I can reproduce this bug up to now with version 12.
--
Regards
Daniel Danzberger
embeDD GmbH, Alter Postplatz 2, CH-6370 Stans
Hi all! I still can reproduce it on 14beta1 version. I adapted a patch I found in this thread https://github.com/seregayoga/postgres/commit/338bc33f2cf77edde7c45bfdfb9f39a92ec57eb8 . It solved this bug for me (tested with simple Go program using https://github.com/lib/pq ). It would be nice to have this bug fixed. I’m not so familiar with postgres code base, but would glad to help with testing.
Monday, June 14, 2021 12:39 PM +03:00 from Daniel Danzberger < daniel@dd-wrt.com >:
On Fri, 2019-06-28 at 17:22 -0400, Alvaro Herrera wrote:On 2019-Feb-22, Robert Welin wrote:
I have reproduced this bug on PostgreSQL version 10.7 and 11.2
using the
steps described in Michael Powers' original report. The issue also
still
seems to be present even with the patch provided by Sergei
Kornilov.Are there plans to address this issue any time soon or is there
some way
I can assist in fixing it? It would be great to have notifications
from
logical replication.This issue seems largely forgotten about :-(
Are there any news to this issue ?
I can reproduce this bug up to now with version 12.--
RegardsDaniel Danzberger
embeDD GmbH, Alter Postplatz 2, CH-6370 Stans
Best regards, Sergey Fedchenko.
Hello hackers,
On Wed, Jul 14, 2021 at 11:30 AM Sergey Fedchenko <seregayoga@bk.ru> wrote:
Hi all! I still can reproduce it on 14beta1 version. I adapted a patch I found in this thread https://github.com/seregayoga/postgres/commit/338bc33f2cf77edde7c45bfdfb9f39a92ec57eb8 . It solved this bug for me (tested with simple Go program using https://github.com/lib/pq). It would be nice to have this bug fixed. I’m not so familiar with postgres code base, but would glad to help with testing.
In our company in one of our projects we ran into this bug too.
I attached the patch which fixes it in a different way. It calls
SignalBackends() in AtCommit_Notify(). It is possible to call SignalBackends()
outside of ProcessCompletedNotifies() after the commit
51004c7172b5c35afac050f4d5849839a06e8d3b, which removes necessity of checking
of SignalBackends()'s result.
Moving SignalBackends() to AtCommit_Notify() makes it possible to signal
backends by "non-normal" backends including logical replication workers. It
seems it is safe to call SignalBackends() in AtCommit_Notify() since
SignalBackends() doesn't raise any error except OOM.
Regarding Andres concern:
what I am wondering is what happens if there's a background worker (like
the replication worker, but it could be other things too) that queues
notifications, but no normal backends are actually listening. As far as
I can tell, in that case we'd continue to queue stuff into the slru, but
wouldn't actually clean things up until a normal session gets around to
it? Which might be a while, on e.g. a logical replica.
A backend which raises notifications calls asyncQueueAdvanceTail() sometimes,
which truncates pages up to new tail, which is equal to head if there are no
listening backends. But there will be a problem if there is a backend which
is listening but it doesn't process incoming notifications and doesn't update
its queue position. In that case asyncQueueAdvanceTail() is able to advance
tail only up to that backend's queue position.
--
Artur Zakirov
PostgreSQL Developer at Adjust
Attachments:
0001-signal-backends-on-commit.patchtext/x-patch; charset=US-ASCII; name=0001-signal-backends-on-commit.patchDownload+29-21
Should Arthur patch be included in PostgreSQL 14 - Beta 4?
Import Notes
Resolved by subject fallback
Artur Zakirov <artur.zakirov@adjust.com> writes:
I attached the patch which fixes it in a different way. It calls
SignalBackends() in AtCommit_Notify(). It is possible to call SignalBackends()
outside of ProcessCompletedNotifies() after the commit
51004c7172b5c35afac050f4d5849839a06e8d3b, which removes necessity of checking
of SignalBackends()'s result.
Hm. So that forecloses back-patching this to earlier than v13.
On the other hand, given that we've been ignoring the bug for awhile,
maybe a fix that only works in v13+ is good enough. (Or maybe by now
it'd be safe to back-patch the v13-era async.c changes? Don't really
want to, though.)
The larger problem with this patch is exactly what Andres said: if
a replication worker or other background process is sending notifies,
but no normal backend is listening, then nothing will ever call
asyncQueueAdvanceTail() so the message queue will bloat until it
overflows and things start failing. That's not OK. Perhaps it
could be fixed by moving the "if (backendTryAdvanceTail)" stanza
into AtCommit_Notify. That's not ideal, because really it's best
to not do that till after we've read our own notifies, but it might
be close enough. At that point ProcessCompletedNotifies's only
responsibility would be to call asyncQueueReadAllNotifications,
which would allow some simplifications.
There are some sort-of-cosmetic-but-important-for-future-proofing
issues too:
* I think that it's safe to move these actions to AtCommit_Notify,
given where that is called in the CommitTransaction sequence, but
there is nothing in xact.c suggesting that that call is in any way
ordering-critical (because today, it isn't). I think we need some
comments there to prevent somebody from breaking this in future.
Maybe about like
smgrDoPendingDeletes(true);
+ /*
+ * Send out notification signals to other backends (and do other
+ * post-commit NOTIFY cleanup). This must not happen until after
+ * our transaction is fully done from the viewpoint of other
+ * backends.
+ */
AtCommit_Notify();
+ /*
+ * Everything after this should be purely-internal-to-this-backend
+ * cleanup.
+ */
AtEOXact_GUC(true, 1);
* You need to spend more effort on the comments in async.c too. Some
of these changes are wrong plus there are places that should be changed
and weren't. Also, postgres.c's comment about ProcessCompletedNotifies
is invalidated by this patch.
* There is some verbiage about NOTIFY in bgworker.sgml, which looks
like it may be wrong now, and it certainly will be wrong after this
patch. We don't want to be encouraging bgworkers to call
ProcessCompletedNotifies. Maybe we should rename it, to force
the issue?
regards, tom lane