Refactor recovery conflict signaling a little

Started by Heikki Linnakangas about 2 months ago; 14 messages
#1 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

I had a look at recovery conflict signaling and a few things caught my
eye. No functional changes, but some cleanups and readability improvements:

Patch 0001: Remove useless errdetail_abort()
--------------------------------------------

The function is supposed to add DETAIL to errors when you are in an
aborted transaction, if the transaction was aborted by a recovery
conflict, like this:

ERROR: current transaction is aborted, commands ignored until end of
transaction block
DETAIL: Abort reason: recovery conflict

But I don't see how to reach that. If a transaction is aborted by a
recovery conflict, you get a different error like this:

ERROR: canceling statement due to conflict with recovery
DETAIL: User was holding a relation lock for too long.

The transaction abort clears the 'recoveryConflictPending' flag, so even
if that happens in a transaction block, you don't get that "DETAIL:
Abort reason: recovery conflict" in the subsequent errors.

errdetail_abort() was introduced in commit a8ce974cdd. I suppose it was
needed back then, but the signal handling has changed a lot since.
Looking at that commit now, though, I don't really understand how it was
reachable even back then. (Except with a race with an unrelated
transaction abort, see commit message)

Has anyone seen the "DETAIL: Abort reason: recovery conflict" in recent
years, or ever? If not, let's rip it out.

0002: Don't hint that you can reconnect when the database is dropped
--------------------------------------------------------------------

If you're connected to a database that is being dropped during recovery,
you get an error like this:

FATAL: terminating connection due to conflict with recovery
DETAIL: User was connected to a database that must be dropped.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.

The hint seems misleading. The database is being dropped, you most
likely can *not* reconnect to it. Let's remove it.

0003-0004: Separate RecoveryConflictReasons from procsignals
-------------------------------------------------------------

We're currently using different PROCSIG_* flags to indicate different
kinds of recovery conflicts. We're also abusing the same flags in
functions like LogRecoveryConflict, which isn't related to inter-process
signaling. It seems better to have a separate enum for the recovery
conflict reasons. With this patch, there's just a single
PROCSIG_RECOVERY_CONFLICT to wake up a process on a recovery conflict,
and the reason is communicated by setting a flag in a bitmask in PGPROC.

I was inspired to do this in preparation for my project to replace
latches with "interrupts". By having just a single PROCSIG flag, we
reduce the need for "interrupt bits" with that project. But it seems
nicer on its own merits too.

0005: Refactor ProcessRecoveryConflictInterrupt for readability
---------------------------------------------------------------

The function had a switch statement that fell through all of its cases.
It took me a while to understand how it works. Once I finally
understood it, I refactored it to not rely on the fallthrough. I hope
this makes it easier for others too.

- Heikki

Attachments:

0001-Remove-useless-errdetail_abort.patch (text/x-patch, +18/-67)
0002-Don-t-hint-that-you-can-reconnect-when-the-database-.patch (text/x-patch, +13/-12)
0003-Use-ProcNumber-rather-than-pid-in-ReplicationSlot.patch (text/x-patch, +53/-41)
0004-Separate-RecoveryConflictReasons-from-procsignals.patch (text/x-patch, +250/-186)
0005-Refactor-ProcessRecoveryConflictInterrupt-for-readab.patch (text/x-patch, +181/-109)
#2 Chao Li
li.evan.chao@gmail.com
In reply to: Heikki Linnakangas (#1)
Re: Refactor recovery conflict signaling a little

On Jan 23, 2026, at 07:20, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I had a look at recovery conflict signaling and a few things caught my eye. No functional changes, but some cleanups and readability improvements:

Patch 0001: Remove useless errdetail_abort()
--------------------------------------------

The function is supposed to add DETAIL to errors when you are in an aborted transaction, if the transaction was aborted by a recovery conflict, like this:

ERROR: current transaction is aborted, commands ignored until end of transaction block
DETAIL: Abort reason: recovery conflict

But I don't see how to reach that. If a transaction is aborted by a recovery conflict, you get a different error like this:

ERROR: canceling statement due to conflict with recovery
DETAIL: User was holding a relation lock for too long.

The transaction abort clears the 'recoveryConflictPending' flag, so even if that happens in a transaction block, you don't get that "DETAIL: Abort reason: recovery conflict" in the subsequent errors.

errdetail_abort() was introduced in commit a8ce974cdd. I suppose it was needed back then, but the signal handling has changed a lot since. Looking at that commit now, though, I don't really understand how it was reachable even back then. (Except with a race with an unrelated transaction abort, see commit message)

Has anyone seen the "DETAIL: Abort reason: recovery conflict" in recent years, or ever? If not, let's rip it out.

I did a Google search and couldn't find any post reporting the message, which suggests that the message can be removed.

0002: Don't hint that you can reconnect when the database is dropped
--------------------------------------------------------------------

If you're connected to a database that is being dropped during recovery, you get an error like this:

FATAL: terminating connection due to conflict with recovery
DETAIL: User was connected to a database that must be dropped.
HINT: In a moment you should be able to reconnect to the database and repeat your command.

The hint seems misleading. The database is being dropped, you most likely can *not* reconnect to it. Let's remove it.

I like this change. Besides removing the misleading message, the code is also clearer now.

0003-0004: Separate RecoveryConflictReasons from procsignals
-------------------------------------------------------------

We're currently using different PROCSIG_* flags to indicate different kinds of recovery conflicts. We're also abusing the same flags in functions like LogRecoveryConflict, which isn't related to inter-process signaling. It seems better to have a separate enum for the recovery conflict reasons. With this patch, there's just a single PROCSIG_RECOVERY_CONFLICT to wake up a process on a recovery conflict, and the reason is communicated by setting a flag in a bitmask in PGPROC.

I was inspired to do this in preparation for my project to replace latches with "interrupts". By having just a single PROCSIG flag, we reduce the need for "interrupt bits" with that project. But it seems nicer on its own merits too.

0005: Refactor ProcessRecoveryConflictInterrupt for readability
---------------------------------------------------------------

The function had a switch statement that fell through all of its cases. It took me a while to understand how it works. Once I finally understood it, I refactored it to not rely on the fallthrough. I hope this makes it easier for others too.

- Heikki
<0001-Remove-useless-errdetail_abort.patch><0002-Don-t-hint-that-you-can-reconnect-when-the-database-.patch><0003-Use-ProcNumber-rather-than-pid-in-ReplicationSlot.patch><0004-Separate-RecoveryConflictReasons-from-procsignals.patch><0005-Refactor-ProcessRecoveryConflictInterrupt-for-readab.patch>

A few comments on 0003-0005:

1 - 0003
```
ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
- int active_pid;
+ ProcNumber active_proc;
+ pid_t active_pid;
```

active_pid is only used inside the "if (active_proc != MyProcNumber)" clause, so it could be defined inside that clause instead.

2 - 0003
```
 				if (MyBackendType == B_STARTUP)
-					(void) SendProcSignal(active_pid,
-										  PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT,
-										  INVALID_PROC_NUMBER);
+					SendProcSignal(active_pid,
+								   PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT,
+								   active_proc);
```

Here active_proc != INVALID_PROC_NUMBER, so this changes the original logic without explanation. Is the change intentional?

3 - 0004
```
+	 * This is a bitmask of RecoveryConflictReasons
+	 */
+	pg_atomic_uint32 pendingRecoveryConflicts;
```

I just feel this comment is a little bit confusing because RecoveryConflictReasons is an enum. Maybe we can say something like: This is a bitmask; each bit corresponds to one RecoveryConflictReason enum value.

4 - 0004
```
-				(void) SendProcSignal(pid, sigmode, vxid.procNumber);
+				(void) SendProcSignal(pid, PROCSIG_RECOVERY_CONFLICT, vxid.procNumber);
```

Nit: here (void) is retained for SendProcSignal, but in the hunk from comment 2 on 0003, (void) is dropped when calling SendProcSignal. Is there any reason for retaining this one and deleting that one?

5 - 0004
```
+ * doesn't check for deadlock direcly, because we want to kill one of the
```

Typo: direcly -> directly

6 - 0005
```
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a2fa98ee971..bbf2254ca67 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -179,11 +179,15 @@ static bool IsTransactionExitStmt(Node *parsetree);
 static bool IsTransactionExitStmtList(List *pstmts);
 static bool IsTransactionStmtList(List *pstmts);
 static void drop_unnamed_stmt(void);
+static void ProcessRecoveryConflictInterrupts(void);
+static void ProcessRecoveryConflictInterrupt(RecoveryConflictReason reason);
+static void report_recovery_conflict(RecoveryConflictReason reason);
 static void log_disconnections(int code, Datum arg);
 static void enable_statement_timeout(void);
 static void disable_statement_timeout(void);

+
/* ----------------------------------------------------------------
```

Nit: No need to add this empty line.

7 - 0005
```
+static void
+report_recovery_conflict(RecoveryConflictReason reason)
+{
+	bool		fatal;
+	if (RECOVERY_CONFLICT_DATABASE)
+	{
```

I believe this should be if (reason == RECOVERY_CONFLICT_DATABASE).

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#3 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Chao Li (#2)
Re: Refactor recovery conflict signaling a little

Thanks for the review!

On 23/01/2026 04:41, Chao Li wrote:

On Jan 23, 2026, at 07:20, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Patch 0001: Remove useless errdetail_abort()
--------------------------------------------

...

Has anyone seen the "DETAIL: Abort reason: recovery conflict" in recent years, or ever? If not, let's rip it out.

I did a Google search and couldn't find any post reporting the message, which suggests that the message can be removed.

0002: Don't hint that you can reconnect when the database is dropped
--------------------------------------------------------------------

If you're connected to a database that is being dropped during recovery, you get an error like this:

FATAL: terminating connection due to conflict with recovery
DETAIL: User was connected to a database that must be dropped.
HINT: In a moment you should be able to reconnect to the database and repeat your command.

The hint seems misleading. The database is being dropped, you most likely can *not* reconnect to it. Let's remove it.

I like this change. Besides removing the misleading message, the code is also clearer now.

Pushed patches 0001 and 0002.

1 - 0003
```
ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
- int active_pid;
+ ProcNumber active_proc;
+ pid_t active_pid;
```

active_pid is only used inside the "if (active_proc != MyProcNumber)" clause, so it could be defined inside that clause instead.

It needs to be fetched while still holding the lock. We could arrange
the code to avoid fetching it when active_proc == MyProcNumber, but that
would be less clear, and this isn't performance critical.

2 - 0003
```
if (MyBackendType == B_STARTUP)
-					(void) SendProcSignal(active_pid,
-										  PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT,
-										  INVALID_PROC_NUMBER);
+					SendProcSignal(active_pid,
+								   PROCSIG_RECOVERY_CONFLICT_LOGICALSLOT,
+								   active_proc);
```

Here active_proc != INVALID_PROC_NUMBER, so this changes the original logic without explanation. Is the change intentional?

Yes, it's intentional. SendProcSignal() is more efficient when you pass
the procNumber argument. We didn't pass it here before because we didn't
have it, but now we do.

3 - 0004
```
+	 * This is a bitmask of RecoveryConflictReasons
+	 */
+	pg_atomic_uint32 pendingRecoveryConflicts;
```

I just feel this comment is a little bit confusing because RecoveryConflictReasons is an enum. Maybe we can say something like: This is a bitmask; each bit corresponds to one RecoveryConflictReason enum value.

Ok, adopted that wording.

4 - 0004
```
-				(void) SendProcSignal(pid, sigmode, vxid.procNumber);
+				(void) SendProcSignal(pid, PROCSIG_RECOVERY_CONFLICT, vxid.procNumber);
```

Nit: here (void) is retained for SendProcSignal, but in the hunk from comment 2 on 0003, (void) is dropped when calling SendProcSignal. Is there any reason for retaining this one and deleting that one?

I added the (void) back to the SendProcSignal call in 0003. We're not
very consistent about that, there are SendProcSignal calls with and
without the (void) in the codebase. But there was no particular reason
to change it in this patch.

7 - 0005
```
+static void
+report_recovery_conflict(RecoveryConflictReason reason)
+{
+	bool		fatal;
+	if (RECOVERY_CONFLICT_DATABASE)
+	{
```

I believe this should be if (reason == RECOVERY_CONFLICT_DATABASE).

Oops yes, fixed!

Attached are new versions of the remaining patches, with the above and
the typo and whitespace fixes that you pointed out.

- Heikki

Attachments:

v2-0001-Use-ProcNumber-rather-than-pid-in-ReplicationSlot.patch (text/x-patch, +51/-39)
v2-0002-Separate-RecoveryConflictReasons-from-procsignals.patch (text/x-patch, +251/-186)
v2-0003-Refactor-ProcessRecoveryConflictInterrupt-for-rea.patch (text/x-patch, +181/-109)
#4 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#3)
Re: Refactor recovery conflict signaling a little

On 03/02/2026 15:31, Heikki Linnakangas wrote:

Attached are new versions of the remaining patches, with the above and
the typo and whitespace fixes that you pointed out.

I hear no objections, so committed. Thanks for the review!

- Heikki

#5 Alexander Lakhin
exclusion@gmail.com
In reply to: Heikki Linnakangas (#4)
Re: Refactor recovery conflict signaling a little

Hello Heikki,

10.02.2026 16:32, Heikki Linnakangas wrote:

On 03/02/2026 15:31, Heikki Linnakangas wrote:

Attached are new versions of the remaining patches, with the above and the typo and whitespace fixes that you pointed
out.

I hear no objections, so committed. Thanks for the review!

Could you please look at a new failure of 035_standby_logical_decoding.pl
produced at buildfarm [1]?:
[05:34:51.345](233.134s) # poll_query_until timed out executing this query:
# SELECT '0/0403F950' <= replay_lsn AND state = 'streaming'
#          FROM pg_catalog.pg_stat_replication
#          WHERE application_name IN ('standby', 'walreceiver')
# expecting this output:
# t
# last actual query output:
# f
# with stderr:

035_standby_logical_decoding_standby.log contains:
2026-02-24 05:30:58.300 CET [1512377][client backend][:0] LOG: disconnection: session time: 0:00:00.035 user=bf
database=testdb host=[local]
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] LOG: invalidating obsolete replication slot
"row_removal_inactiveslot"
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] DETAIL:  The slot conflicted with xid horizon 748.
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] CONTEXT:  WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] LOG: terminating process 1512360 to release replication slot
"row_removal_activeslot"
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] DETAIL:  The slot conflicted with xid horizon 748.
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] CONTEXT:  WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-02-24 05:34:51.400 CET [1508227][walreceiver][:0] FATAL:  could not receive data from WAL stream: server closed the
connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

The "terminating process ..." message doesn't appear when the test passes
successfully.

I've managed to reproduce this with multiple (20) test instances running
in a loop (it failed within 10 iterations for me); `git bisect` for this
anomaly pointed at 17f51ea81.

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2026-02-24%2004%3A28%3A36

Best regards,
Alexander

#6 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alexander Lakhin (#5)
Re: Refactor recovery conflict signaling a little

On 24/02/2026 10:00, Alexander Lakhin wrote:

Could you please look at a new failure of 035_standby_logical_decoding.pl
produced at buildfarm [1]?:
[05:34:51.345](233.134s) # poll_query_until timed out executing this query:
# SELECT '0/0403F950' <= replay_lsn AND state = 'streaming'
#          FROM pg_catalog.pg_stat_replication
#          WHERE application_name IN ('standby', 'walreceiver')
# expecting this output:
# t
# last actual query output:
# f
# with stderr:

035_standby_logical_decoding_standby.log contains:
2026-02-24 05:30:58.300 CET [1512377][client backend][:0] LOG:
disconnection: session time: 0:00:00.035 user=bf database=testdb
host=[local]
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] LOG: invalidating
obsolete replication slot "row_removal_inactiveslot"
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] DETAIL:  The slot
conflicted with xid horizon 748.
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] CONTEXT:  WAL
redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS: snapshotConflictHorizon:
748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1,
dead: [33, 34, 35], unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] LOG: terminating
process 1512360 to release replication slot "row_removal_activeslot"
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] DETAIL:  The slot
conflicted with xid horizon 748.
2026-02-24 05:30:58.425 CET [1507982][startup][32/0:0] CONTEXT:  WAL
redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS: snapshotConflictHorizon:
748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1,
dead: [33, 34, 35], unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-02-24 05:34:51.400 CET [1508227][walreceiver][:0] FATAL:  could not
receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

The "terminating process ..." message doesn't appear when the test passes
successfully.

Hmm, right, looks like something wrong in signaling the recovery
conflict. I can't tell if the signal is being sent, or it's not
processed correctly. Looking at the code, I don't see anything wrong.

I've managed to reproduce this with multiple (20) test instances running
in a loop (it failed within 10 iterations for me); `git bisect` for this
anomaly pointed at 17f51ea81.

I've been trying to reproduce this locally, but so far no success,
after thousands of iterations.

If you can still reproduce this, can you try it with code changes from
the attached recovery-conflict-fail-extra-logging.patch, which adds some
extra logging, and send over the logs please?

The recovery-conflict-fail-repro-attempt.patch contains very hacky
changes to the test, to run just the failing part 100 times in a
loop. That's just to show what I used to try to reproduce this, but no luck.

- Heikki

Attachments:

recovery-conflict-fail-extra-logging.patch (text/x-patch, +13/-1)
recovery-conflict-fail-repro-attempt.patch (text/x-patch, +9/-597)
#7 Alexander Lakhin
exclusion@gmail.com
In reply to: Heikki Linnakangas (#6)
Re: Refactor recovery conflict signaling a little

Hello Heikki,

03.03.2026 17:39, Heikki Linnakangas wrote:

On 24/02/2026 10:00, Alexander Lakhin wrote:

The "terminating process ..." message doesn't appear when the test passes
successfully.

Hmm, right, looks like something wrong in signaling the recovery conflict. I can't tell if the signal is being sent,
or it's not processed correctly. Looking at the code, I don't see anything wrong.

I've managed to reproduce this with multiple (20) test instances running
in a loop (it failed within 10 iterations for me); `git bisect` for this
anomaly pointed at 17f51ea81.

I've been trying to reproduce this locally, but so far no success, after thousands of iterations.

If you can still reproduce this, can you try it with code changes from the attached
recovery-conflict-fail-extra-logging.patch, which adds some extra logging, and send over the logs please?

The recovery-conflict-fail-repro-attempt.patch contains very hacky changes to the test, to run just the failing
part 100 times in a loop. That's just to show what I used to try to reproduce this, but no luck.

Thank you for your attention to it!

I can easily reproduce the failure with the attached script when using an
SSD (the test didn't fail for me on tmpfs or HDD; probably some tuning
is needed):
ITERATION 1
...
19      t/035_standby_logical_decoding.pl .. ok
19      All tests successful.
19      Files=1, Tests=28, 20 wallclock secs ( 0.02 usr  0.01 sys + 0.35 cusr  0.80 csys =  1.18 CPU)
19      Result: PASS
2       # poll_query_until timed out executing this query:
2       # SELECT '0/0403F950' <= replay_lsn AND state = 'streaming'
2       #          FROM pg_catalog.pg_stat_replication
2       #          WHERE application_name IN ('standby', 'walreceiver')
2       # expecting this output:
2       # t
2       # last actual query output:
2       # f
2       # with stderr:
...

035_standby_logical_decoding_standby.log contains:
2026-03-03 20:34:14.198 EET startup[844699] LOG:  invalidating obsolete replication slot "row_removal_inactiveslot"
2026-03-03 20:34:14.198 EET startup[844699] DETAIL:  The slot conflicted with xid horizon 748.
2026-03-03 20:34:14.198 EET startup[844699] CONTEXT:  WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-03-03 20:34:14.198 EET startup[844699] LOG:  terminating process 845647 to release replication slot
"row_removal_activeslot"
2026-03-03 20:34:14.198 EET startup[844699] DETAIL:  The slot conflicted with xid horizon 748.
2026-03-03 20:34:14.198 EET startup[844699] CONTEXT:  WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-03-03 20:34:14.198 EET startup[844699] LOG:  XXX: SendProcSignal sending SIGUSR1 to pid 845647
2026-03-03 20:34:14.198 EET startup[844699] CONTEXT:  WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-03-03 20:37:26.827 EET walreceiver[844817] FATAL:  could not receive data from WAL stream: server closed the
connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

The full logs are attached too. I can add any other logging you want.

Best regards,
Alexander

Attachments:

035-failure-repro.sh.txt (text/plain, +3/-0)
2.tar.bz2 (application/x-bzip2)
#8 Xuneng Zhou
xunengzhou@gmail.com
In reply to: Alexander Lakhin (#7)
Re: Refactor recovery conflict signaling a little

Hi Alexander,

On Wed, Mar 4, 2026 at 3:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:

Hello Heikki,

03.03.2026 17:39, Heikki Linnakangas wrote:

On 24/02/2026 10:00, Alexander Lakhin wrote:

The "terminating process ..." message doesn't appear when the test passes
successfully.

Hmm, right, looks like something wrong in signaling the recovery conflict. I can't tell if the signal is being sent,
or it's not processed correctly. Looking at the code, I don't see anything wrong.

I've managed to reproduce this with multiple (20) test instances running
in a loop (it failed within 10 iterations for me); `git bisect` for this
anomaly pointed at 17f51ea81.

I've been trying to reproduce this locally, but so far no success, after thousands of iterations.

If you can still reproduce this, can you try it with code changes from the attached
recovery-conflict-fail-extra-logging.patch, which adds some extra logging, and send over the logs please?

The recovery-conflict-fail-repro-attempt.patch contains very hacky changes to the test, to run just the failing
part 100 times in a loop. That's just to show what I used to try to reproduce this, but no luck.

Thank you for your attention to it!

I can easily reproduce the failure with the attached script when using SSD
(the test didn't fail for me on tmpfs and HDD, probably some tuning needed
):
ITERATION 1
...
19 t/035_standby_logical_decoding.pl .. ok
19 All tests successful.
19 Files=1, Tests=28, 20 wallclock secs ( 0.02 usr 0.01 sys + 0.35 cusr 0.80 csys = 1.18 CPU)
19 Result: PASS
2 # poll_query_until timed out executing this query:
2 # SELECT '0/0403F950' <= replay_lsn AND state = 'streaming'
2 # FROM pg_catalog.pg_stat_replication
2 # WHERE application_name IN ('standby', 'walreceiver')
2 # expecting this output:
2 # t
2 # last actual query output:
2 # f
2 # with stderr:
...

035_standby_logical_decoding_standby.log contains:
2026-03-03 20:34:14.198 EET startup[844699] LOG: invalidating obsolete replication slot "row_removal_inactiveslot"
2026-03-03 20:34:14.198 EET startup[844699] DETAIL: The slot conflicted with xid horizon 748.
2026-03-03 20:34:14.198 EET startup[844699] CONTEXT: WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-03-03 20:34:14.198 EET startup[844699] LOG: terminating process 845647 to release replication slot
"row_removal_activeslot"
2026-03-03 20:34:14.198 EET startup[844699] DETAIL: The slot conflicted with xid horizon 748.
2026-03-03 20:34:14.198 EET startup[844699] CONTEXT: WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-03-03 20:34:14.198 EET startup[844699] LOG: XXX: SendProcSignal sending SIGUSR1 to pid 845647
2026-03-03 20:34:14.198 EET startup[844699] CONTEXT: WAL redo at 0/040214F0 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 748, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 3, nunused: 1, dead: [33, 34, 35],
unused: [36]; blkref #0: rel 1663/16384/16418, blk 10
2026-03-03 20:37:26.827 EET walreceiver[844817] FATAL: could not receive data from WAL stream: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

The full logs are attached too. I can add any other logging you want.

Best regards,
Alexander

I was unable to reproduce the issue on an x86_64 Linux machine using
the provided script. All test runs completed successfully without any
failures.

--
Best,
Xuneng

#9 Alexander Lakhin
exclusion@gmail.com
In reply to: Xuneng Zhou (#8)
Re: Refactor recovery conflict signaling a little

Hello Xuneng and Heikki,

04.03.2026 07:33, Xuneng Zhou wrote:

03.03.2026 17:39, Heikki Linnakangas wrote:

On 24/02/2026 10:00, Alexander Lakhin wrote:

The "terminating process ..." message doesn't appear when the test passes
successfully.

Hmm, right, looks like something wrong in signaling the recovery conflict. I can't tell if the signal is being sent,
or it's not processed correctly. Looking at the code, I don't see anything wrong.

I was unable to reproduce the issue on an x86_64 Linux machine using
the provided script. All test runs completed successfully without any
failures.

I've added debug logging (see attached) and saw the following:
!!!SignalRecoveryConflict[282363]
!!!ProcArrayEndTransaction| pendingRecoveryConflicts = 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
2026-03-07 12:21:24.544 EET walreceiver[282421] FATAL:  could not receive data from WAL stream: server closed the
connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2026-03-07 12:21:24.645 EET postmaster[282355] LOG:  received immediate shutdown request
2026-03-07 12:21:24.647 EET postmaster[282355] LOG:  database system is shut down

While for a successful run, I see:
2026-03-07 12:18:17.075 EET startup[285260] DETAIL:  The slot conflicted with xid horizon 677.
2026-03-07 12:18:17.075 EET startup[285260] CONTEXT:  WAL redo at 0/04022130 for Heap2/PRUNE_ON_ACCESS:
snapshotConflictHorizon: 677, isCatalogRel: T, nplans: 0, nredirected: 0, ndead: 2, nunused: 0, dead: [35, 36]; blkref
#0: rel 1663/16384/16418, blk 10
!!!SignalRecoveryConflict[285260]
!!!ProcessInterrupts[286071]| MyProc->pendingRecoveryConflicts: 16
!!!ProcessRecoveryConflictInterrupts[286071]
!!!ProcessRecoveryConflictInterrupts[286071] pending: 16, reason: 4
2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl ERROR:  canceling statement due to
conflict with recovery
2026-03-07 12:18:17.075 EET walsender[286071] 035_standby_logical_decoding.pl DETAIL:  User was using a logical
replication slot that must be invalidated.

(Full logs for this failed run and a good run are attached.)

Best regards,
Alexander

Attachments:

035_debugging.patch (text/x-patch, +10/-2)
035_logs.tar.bz2 (application/x-bzip2)
#10 Xuneng Zhou
xunengzhou@gmail.com
In reply to: Alexander Lakhin (#9)
Re: Refactor recovery conflict signaling a little

Hi Alexander,

On Sat, Mar 7, 2026 at 7:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

Hello Xuneng and Heikki,

04.03.2026 07:33, Xuneng Zhou wrote:

[...]

I've added debug logging (see attached) and saw the following:

[...]

(Full logs for this failed run and a good run are attached.)

Thanks again for the logs. I think we might frame the next run around
three explicit hypotheses:

1) Self-clear: the target walsender clears its own
pendingRecoveryConflicts during xact cleanup before
ProcessInterrupts() consumes it.
2) Group-clear-on-behalf: another backend clears the target PGPROC via
group clear before consume.
3) Visibility gap: the weak pg_atomic_read_u32() in the interrupt path
observes 0 even though the conflict bit was set.

The bad vs good traces are consistent with a receiver-side handoff failure:
- bad run: signal path reached, target walsender later sees
pendingRecoveryConflicts = 0, never dispatches conflict;
- good run: target sees pending = 0x10, dispatches reason 4
(RECOVERY_CONFLICT_LOGICALSLOT), throws expected ERROR, releases slot.

I prepared an updated diagnostics patch to classify hypotheses (1) and
(2) directly:
- SignalRecoveryConflict: logs target pid/procno and old/new mask
- clear sites: logs self_pid, target_pid, target_procno, self_is_target
- ProcessInterrupts: logs handler counter + pending-mask observations

Two caveats:
- The weak/strong telemetry is supportive but not definitive for
hypothesis (3); it may miss the exact stale-read window.
- The patch preserves the original dispatch path, but added
diagnostics (especially barriered reads in clear paths) can perturb
timing, so repro rate may shift.

What to look for:
Hypothesis (1) self-clear:
ProcArrayEndTransaction... self_is_target=t ... clearing
pendingRecoveryConflicts=0x10 (or ProcArrayClearTransaction self case)
for target pid/procno before target ProcessInterrupts reports handler
fired but pending=0.

Hypothesis (2) group-clear-on-behalf:
ProcArrayEndTransactionInternal... self_is_target=f ...
target_procno=<walsender procno> ... clearing
pendingRecoveryConflicts=0x10 (or ProcArrayClearTransaction with
self_is_target=f).

Hypothesis (3) visibility gap (suggestive):
no matching clear-site log for target procno, but target still reports
handler fired with pending=0.

If this run is inconclusive, I suggest a direct behavioral A/B as next step:
- change only the ProcessInterrupts() recovery-conflict check from
pg_atomic_read_u32() to pg_atomic_read_membarrier_u32();
- if failures disappear, that strongly supports hypothesis (3).
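For concreteness, that A/B would be a one-line swap; a hypothetical sketch of the check (paraphrased, not the actual ProcessInterrupts() source):

```diff
 /* ProcessInterrupts(): recovery-conflict check (paraphrased) */
-    if (pg_atomic_read_u32(&MyProc->pendingRecoveryConflicts) != 0)
+    if (pg_atomic_read_membarrier_u32(&MyProc->pendingRecoveryConflicts) != 0)
         ProcessRecoveryConflictInterrupts();
```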

--
Best,
Xuneng

Attachments:

035_debugging.patch (application/x-patch, +146/-2)
#11 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alexander Lakhin (#9)
Re: Refactor recovery conflict signaling a little

On 07/03/2026 13:00, Alexander Lakhin wrote:

Hello Xuneng and Heikki,

[...]

I've added debug logging (see attached) and saw the following:
!!!SignalRecoveryConflict[282363]
!!!ProcArrayEndTransaction| pendingRecoveryConflicts = 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
!!!ProcessInterrupts[283863]| MyProc->pendingRecoveryConflicts: 0
2026-03-07 12:21:24.544 EET walreceiver[282421] FATAL:  could not
receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2026-03-07 12:21:24.645 EET postmaster[282355] LOG:  received immediate
shutdown request
2026-03-07 12:21:24.647 EET postmaster[282355] LOG:  database system is
shut down

A-ha! So MyProc->pendingRecoveryConflicts is being cleared by
ProcArrayEndTransaction(). If I add a small pg_usleep() to the top of
ProcArrayEndTransaction(), I can readily reproduce this.

Thanks for narrowing this down. The attached patch fixes it.

- Heikki

Attachments:

0001-Don-t-clear-pendingRecoveryConflicts-at-end-of-trans.patch (text/x-patch, +2/-7)
#12 Xuneng Zhou
xunengzhou@gmail.com
In reply to: Heikki Linnakangas (#11)
Re: Refactor recovery conflict signaling a little

Hi Heikki,

On Mon, Mar 9, 2026 at 7:17 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

[...]

A-ha! So MyProc->pendingRecoveryConflicts is being cleared by
ProcArrayEndTransaction(). If I add a small pg_usleep() to the top of
ProcArrayEndTransaction(), I can readily reproduce this.

Thanks for narrowing this down. The attached patch fixes it.

- Heikki

Did you use Alexander’s reproducer script? I tried reproducing with a
1 ms pg_usleep() added to all three functions that clear
MyProc->pendingRecoveryConflicts, but I still couldn’t reproduce the
issue.

--
Best,
Xuneng

#13 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Xuneng Zhou (#12)
Re: Refactor recovery conflict signaling a little

On 09/03/2026 17:02, Xuneng Zhou wrote:

Did you use Alexander’s reproducer script? I tried reproducing with a
1 ms pg_usleep() added to all three functions that clear
MyProc->pendingRecoveryConflicts, but I still couldn’t reproduce the
issue.

I used the attached, to be precise. With that it fails every time for
me. I'm not sure if the "if (am_walsender)" check is necessary, I added
it just to make the test run faster.

- Heikki

Attachments:

repro-sleep.patch (text/x-patch, +3/-0)
#14 Xuneng Zhou
xunengzhou@gmail.com
In reply to: Heikki Linnakangas (#13)
Re: Refactor recovery conflict signaling a little

On Mon, Mar 9, 2026 at 11:28 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 09/03/2026 17:02, Xuneng Zhou wrote:

Did you use Alexander’s reproducer script? I tried reproducing with a
1 ms pg_usleep() added to all three functions that clear
MyProc->pendingRecoveryConflicts, but I still couldn’t reproduce the
issue.

I used the attached, to be precise. With that it fails every time for
me. I'm not sure if the "if (am_walsender)" check is necessary, I added
it just to make the test run faster.

- Heikki

I was able to reproduce the issue with the wider sleep window you
suggested, and I can confirm that the flag is no longer cleared after
applying the patch. Below are excerpts from two logs, one from a failed
run and one from a successful run. I'll look further into the patch
later on.

failed run:
startup[1418915] LOG: DBG SignalRecoveryConflict target_pid=1419118
reason=4 old_mask=0x0 new_mask=0x10
walsender[1419118] LOG: DBG ProcArrayEndTransaction(no-xid) CLEARING
pendingRecoveryConflicts=0x10

successful run:
startup[1433218] LOG: DBG SignalRecoveryConflict target_pid=1433406
reason=4 old_mask=0x0 new_mask=0x10
walsender[1433406] LOG: DBG ProcessInterrupts handler fired 1
time(s), pending=0x10 -- processing
walsender[1433406] ERROR: canceling statement due to conflict with recovery

--
Best,
Xuneng

Attachments:

035_standby_logical_decoding_standby_success.log (application/octet-stream)
035_standby_logical_decoding_standby_failure.log (application/octet-stream)