BUG #19488: Standby connection fails after dropping on login event trigger enabled always
The following bug has been logged on the website:
Bug reference: 19488
Logged by: Egor Chindyaskin
Email address: kyzevan23@mail.ru
PostgreSQL version: 18.4
Operating system: Ubuntu 26.04
Description:
Hello!
In a master + physical standby setup, connection to the standby fails after
creating a login event trigger on the master, enabling it as always, and
then dropping it without reconnecting to the master.
Also reproduces on master branch.
Steps to reproduce:
1. Run the following SQL script on the master:
    CREATE OR REPLACE FUNCTION init_session()
        RETURNS event_trigger SECURITY DEFINER
        LANGUAGE plpgsql AS
    $$
    BEGIN
        RAISE NOTICE 'init_session';
    END;
    $$;
    CREATE EVENT TRIGGER init_session
        ON login
        EXECUTE FUNCTION init_session();
    ALTER EVENT TRIGGER init_session ENABLE ALWAYS;
    DROP EVENT TRIGGER init_session;
2. Try to connect to the standby:
psql -p5433
Result:
psql: error: connection to server on socket "/tmp/.s.PGSQL.5433" failed:
FATAL: cannot acquire lock mode AccessExclusiveLock on database objects
while recovery is in progress
HINT: Only RowExclusiveLock or less can be acquired on database objects
during recovery.
--
With best regards,
Egor Chindyaskin
Postgres Professional: https://postgrespro.com
Hi,
On Wed, 20 May 2026 at 02:34, PG Bug reporting form <noreply@postgresql.org>
wrote:
> In a master + physical standby setup, connection to the standby fails after
> creating a login event trigger on the master, enabling it as always, and
> then dropping it without reconnecting to the master.
Thanks for the report and the precise repro.
The cause is in EventTriggerOnLogin(). When a session connects to a
database whose pg_database.dathasloginevt flag is set but no login
event triggers are actually present, the function tries to clear the
flag via:
    ConditionalLockSharedObject(DatabaseRelationId, MyDatabaseId, 0,
                                AccessExclusiveLock);
On a hot standby, LockAcquireExtended() refuses any lock stronger than
RowExclusiveLock on LOCKTAG_OBJECT/LOCKTAG_RELATION while
RecoveryInProgress() is true, which surfaces as a FATAL on the new
connection. The standby ends up in this state precisely after your
steps because the primary set dathasloginevt = true while the trigger
was active, then dropped the trigger but (intentionally) left the
flag set on disk for the next normal connection to clean up. That
state replicates to the standby, and any subsequent connection on the
standby then enters the cleanup path and fails with the FATAL above.
A standby should not try to clear that flag itself; the only correct
update path on a standby is WAL replay from the primary.
Attached patch adds a RecoveryInProgress() check to skip the cleanup
branch on standbys. I think this needs to be backpatched too.
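To make the effect of that check concrete, here is a small standalone C model of the login path. It is only a sketch under simplifying assumptions and is not the attached patch: ModelDb, model_lock_allowed() and model_login() are made-up names, the enum merely mirrors PostgreSQL's lock-mode ordering, and lock conflicts (the reason the real code takes the lock conditionally) are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lock modes in PostgreSQL's strength order, weakest first. */
typedef enum
{
    AccessShareLock = 1,
    RowShareLock,
    RowExclusiveLock,
    ShareUpdateExclusiveLock,
    ShareLock,
    ShareRowExclusiveLock,
    ExclusiveLock,
    AccessExclusiveLock
} ModelLockMode;

/* Hypothetical model of one database as seen by a connecting backend. */
typedef struct
{
    bool in_recovery;     /* stands in for RecoveryInProgress() */
    bool dathasloginevt;  /* stands in for pg_database.dathasloginevt */
    int  login_triggers;  /* login event triggers that really exist */
} ModelDb;

typedef enum { LOGIN_OK, LOGIN_FATAL } ModelLoginResult;

/* Hot-standby rule: nothing stronger than RowExclusiveLock in recovery. */
static bool
model_lock_allowed(const ModelDb *db, ModelLockMode mode)
{
    return !db->in_recovery || mode <= RowExclusiveLock;
}

/*
 * Model of the login path. with_guard selects the patched behaviour,
 * i.e. skipping the cleanup branch while in recovery.
 */
static ModelLoginResult
model_login(ModelDb *db, bool with_guard)
{
    assert(db != NULL);

    /* Flag clear, or triggers really present: no cleanup needed. */
    if (!db->dathasloginevt || db->login_triggers > 0)
        return LOGIN_OK;

    /* Patched behaviour: leave the stale flag for WAL replay to clear. */
    if (with_guard && db->in_recovery)
        return LOGIN_OK;

    /* Cleanup branch: needs AccessExclusiveLock on the database object. */
    if (!model_lock_allowed(db, AccessExclusiveLock))
        return LOGIN_FATAL;

    db->dathasloginevt = false;
    return LOGIN_OK;
}
```

For a standby in the state from the report (in_recovery and dathasloginevt both true, no triggers left), model_login(&db, false) returns LOGIN_FATAL, while model_login(&db, true) returns LOGIN_OK and leaves the flag set for WAL replay to clear.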
Regards,
Ayush
Attachments:
v1-0001-Skip-pg_database.dathasloginevt-cleanup-on-standb.patch (application/octet-stream, +10 -2)
On Wed, May 20, 2026 at 1:37 PM Ayush Tiwari
<ayushtiwari.slg01@gmail.com> wrote:
> Thanks for the report and the precise repro.
+1
> Attached patch adds a RecoveryInProgress() check to skip the cleanup
> branch on standbys.
Thanks for investigating this issue and for the patch!
The patch looks good to me.
> I think this needs to be backpatched too.
Yes. Seems this should be backpatched to v17, where login event triggers
were introduced.
Regards,
--
Fujii Masao
On Wed, May 20, 2026 at 1:03 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> Yes. Seems this should be backpatched to v17, where login event triggers
> were introduced.
I've added a TAP test reproducing the bug. I'm going to push and
backpatch this to v17 if there are no objections.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v2-0001-Skip-pg_database.dathasloginevt-cleanup-on-standb.patch (application/octet-stream, +88 -2)
On Wed, May 20, 2026 at 8:31 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> +# Wait for the standby to replay the CREATE/DROP catalog state. At
> +# this point the standby's pg_database.dathasloginevt is still true.
> +$primary->wait_for_replay_catchup($standby);
> +
> +# A new connection to the standby exercises EventTriggerOnLogin()'s
> +# cleanup branch. With the RecoveryInProgress() guard, that branch is
> +# skipped on the standby and the connection succeeds. Without it the
> +# session aborts with a FATAL about AccessExclusiveLock. Probing the
> +# flag itself via safe_psql is what triggers the cleanup path.
> +is( $standby->safe_psql(
> +        'postgres',
> +        "SELECT dathasloginevt FROM pg_database WHERE datname = 'postgres'"),
> +    't',
> +    'standby accepts connection and reports dangling dathasloginevt');
The test looks unstable to me. wait_for_replay_catchup() may connect to
the primary to obtain the flush LSN, which could cause dathasloginevt to
become false before the subsequent safe_psql() call on the standby.
Regards,
--
Fujii Masao
Hi,
On Wed, 20 May 2026 at 17:35, Fujii Masao <masao.fujii@gmail.com> wrote:
> The test looks unstable to me. wait_for_replay_catchup() may connect to
> the primary to obtain the flush LSN, which could cause dathasloginevt to
> become false before the subsequent safe_psql() call on the standby.
I think Fujii-san is right. Can we do something like this:
    my $drop_lsn = $primary->safe_psql(
        'postgres', q{
            BEGIN;
            DROP EVENT TRIGGER init_session;
            DROP FUNCTION init_session();
            COMMIT;
            SELECT pg_current_wal_lsn();
        });
    $primary->wait_for_catchup($standby, 'replay', $drop_lsn);
Then the following standby connection should reliably exercise the case
where dathasloginevt is still true on the standby but no login event
trigger remains.
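Conceptually, waiting on an explicit LSN means polling until the standby's replay position is at or past the captured value. Below is a minimal C sketch of that comparison, assuming the usual textual LSN form of two hexadecimal halves separated by a slash. model_parse_lsn() and model_caught_up() are hypothetical helpers, not PostgreSQL or test-framework functions, and the sample LSNs in the note that follows are made up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "hi/lo" (both hexadecimal) into a 64-bit WAL position. */
static bool
model_parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return false;
    *lsn = ((uint64_t) hi << 32) | lo;
    return true;
}

/* True once the replay position has reached the target position. */
static bool
model_caught_up(const char *replay_lsn, const char *target_lsn)
{
    uint64_t replay;
    uint64_t target;

    if (!model_parse_lsn(replay_lsn, &replay) ||
        !model_parse_lsn(target_lsn, &target))
        return false;
    return replay >= target;
}
```

For example, model_caught_up("0/3000148", "0/3000060") is true, while model_caught_up("0/FFFFFFFF", "1/0") is false because the high half is compared first.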
Regards,
Ayush
Hi,
On Wed, 20 May 2026 at 18:15, Ayush Tiwari <ayushtiwari.slg01@gmail.com>
wrote:
>> The test looks unstable to me. wait_for_replay_catchup() may connect to
>> the primary to obtain the flush LSN, which could cause dathasloginevt to
>> become false before the subsequent safe_psql() call on the standby.
I had registered this in the commitfest and could see the CI bot failing:
https://commitfest.postgresql.org/patch/6790/
> my $drop_lsn = $primary->safe_psql(
>     'postgres', q{
>         BEGIN;
>         DROP EVENT TRIGGER init_session;
>         DROP FUNCTION init_session();
>         COMMIT;
>         SELECT pg_current_wal_lsn();
>     });
> $primary->wait_for_catchup($standby, 'replay', $drop_lsn);
Attaching v3 with this change on top of Alexander's changes.
Regards,
Ayush
Attachments:
v3-0001-Skip-pg_database.dathasloginevt-cleanup-on-standby.patch (application/octet-stream, +89 -2)
On Thu, May 21, 2026 at 11:10 AM Ayush Tiwari
<ayushtiwari.slg01@gmail.com> wrote:
> Attaching v3 with this change on top of Alexander's changes.
I suggest another approach: create a separate test database and create
the event trigger in it. wait_for_catchup() and others use the 'postgres'
database and wouldn't touch our test database.
I also added a check that the flag is successfully cleared on both
primary and standby. One issue spotted there: an in-place heap update
doesn't issue a WAL flush. But I think that's minor; WAL could be
flushed by any subsequent operation.
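The difference between the two test layouts can be illustrated with a toy C model of the per-database flag. This is a sketch under simplifying assumptions, not code from the patch: ModelCluster and the model_* functions are hypothetical, "regress_db" is a made-up database name, WAL replay is reduced to copying the primary's flags, and the model takes the worst case in which the primary's cleanup is replayed before the standby is probed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_NDB 2

/* Hypothetical cluster: dathasloginevt per database, on each node. */
typedef struct
{
    const char *names[MODEL_NDB];
    bool        primary_flag[MODEL_NDB];
    bool        standby_flag[MODEL_NDB];
} ModelCluster;

static int
model_find(const ModelCluster *c, const char *name)
{
    for (int i = 0; i < MODEL_NDB; i++)
        if (strcmp(c->names[i], name) == 0)
            return i;
    return -1;
}

/* CREATE, ENABLE ALWAYS, then DROP: the flag stays set on the primary. */
static void
model_create_and_drop_trigger(ModelCluster *c, const char *db)
{
    int i = model_find(c, db);

    assert(i >= 0);
    c->primary_flag[i] = true;
}

/*
 * A normal connection to the primary clears a stale flag, but only in the
 * database it connects to. No triggers survive in this model, so any set
 * flag is stale.
 */
static void
model_connect_primary(ModelCluster *c, const char *db)
{
    int i = model_find(c, db);

    assert(i >= 0);
    c->primary_flag[i] = false;
}

/* WAL replay: the standby's flags follow the primary's. */
static void
model_replay(ModelCluster *c)
{
    memcpy(c->standby_flag, c->primary_flag, sizeof(c->standby_flag));
}

/* Flag value the standby probe sees after a helper polled the primary. */
static bool
model_standby_flag_after_catchup(const char *trigger_db)
{
    ModelCluster c = {
        { "postgres", "regress_db" }, { false, false }, { false, false }
    };

    model_create_and_drop_trigger(&c, trigger_db);
    model_replay(&c);
    model_connect_primary(&c, "postgres");  /* helper connects to 'postgres' */
    model_replay(&c);                       /* worst case: cleanup replayed */
    return c.standby_flag[model_find(&c, trigger_db)];
}
```

With the trigger in 'postgres', model_standby_flag_after_catchup("postgres") is false, so a test expecting 't' can fail. With the trigger in the separate database, model_standby_flag_after_catchup("regress_db") stays true because the helper connection never touches that database.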
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v4-0001-Skip-pg_database.dathasloginevt-cleanup-on-standb.patch (application/octet-stream, +135 -2)
Hi,
On Thu, 21 May 2026 at 17:01, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
> I suggest another approach: create a separate test database and create
> the event trigger in it. wait_for_catchup() and others use the 'postgres'
> database and wouldn't touch our test database.
I agree the approach you are suggesting is better.
Patch looks good to me!
Regards,
Ayush
On Thu, May 21, 2026 at 2:48 PM Ayush Tiwari
<ayushtiwari.slg01@gmail.com> wrote:
> I agree the approach you are suggesting is better.
> Patch looks good to me!
Thank you. I'm going to push and backpatch it if no objections.
------
Regards,
Alexander Korotkov
Supabase