Startup process deadlock: WaitForProcSignalBarriers vs aux process
Hi,
Over in the Hackers Discord, Melany pointed out [0] a random failure
of tests on the master branch, which seemed to have nothing to do with
the commit they failed on.
The logs [1] indicate that the startup process was waiting for another
process to process a signal barrier. While there isn't enough
information available to conclusively pin the blame on any specific
component, I think I have a good understanding of what happened:
2026-04-21 15:10:50.065 UTC startup[19246] LOG: still waiting for backend with PID 19244 to accept ProcSignalBarrier
Here, the startup process is waiting for process with PID 19244 to
handle a signal barrier. It is not entirely clear which process it's
waiting on, but we can deduce this:
In the startup sequence, the postmaster creates these child processes,
in short order:
1. checkpointer
2. bgwriter
3. startup
The startup process's PID is therefore likely just two larger than the
checkpointer's, and so it's likely the startup process is waiting for
the checkpointer.
# Which code in the Startup process is waiting?
I think it's this: The startup process logged that it started with a
clean shutdown, so no recovery code should be executed. This excludes
most possible call sites of WaitForProcSignalBarrier(), except this
one: The startup process calls StartupXLOG ->
UpdateLogicalDecodingStatusEndOfRecovery(), which then calls
if (IsUnderPostmaster)
    WaitForProcSignalBarrier(
        EmitProcSignalBarrier(PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO));
# Why doesn't the Checkpointer process acknowledge the ProcSignalBarrier?
If the PSB is emitted (and signaled to checkpointer) before the
checkpointer has registered its SIGUSR1 handler, then the checkpointer
won't receive the notice to check its procsignal slots, it won't
notice the updated procsignal flags, and it won't process the PSB; not
until it receives a new SIGUSR1.
Signals are sent to all processes that have their procsignal pss_pid
set, which is true for every process which has called ProcSignalInit,
which for the checkpointer (like other aux processes) happens in
AuxiliaryProcessMainCommon. However, checkpointer (also like other aux
processes) calls AuxiliaryProcessMainCommon before registering its
signal handlers, creating a small window in time where signals are
sent, but not handled.
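In code, the window I mean looks roughly like this (a simplified
sketch of CheckpointerMain() with everything else elided; not the
exact source):

void
CheckpointerMain(const void *startup_data, size_t startup_data_len)
{
    MyBackendType = B_CHECKPOINTER;
    AuxiliaryProcessMainCommon();   /* calls ProcSignalInit(): our
                                     * pss_pid is now visible, so other
                                     * processes may SIGUSR1 us */

    /* <-- the window: a SIGUSR1 arriving here has no handler yet */

    pqsignal(SIGUSR1, procsignal_sigusr1_handler);  /* handler installed */

    sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);    /* signals unblocked */
    /* ... */
}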
# Is this new?
The issue of registering signal handlers only after opening the
process up to receiving signals has existed for a long time (unchanged
since at least 2022); only the ProcSignalBarrier in the startup
process is new: UpdateLogicalDecodingStatusEndOfRecovery was added
with Sawada-san's 67c20979.
# A solution?
I don't have one right now.
I was thinking in the direction of having a compile-time aux process
signal handlers array per process type, which is read by
AuxiliaryProcessMainCommon() to register the signal handlers ahead of
ProcSignalInit(), but I've not yet looked at the exact implications,
nor analyzed whether that's actually safe. It would move some
duplicative code patterns into compile-time structs, but that's not
necessarily a universal good.
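Roughly this shape (a hypothetical sketch; none of these names exist
today):

typedef struct AuxProcSignalHandlers
{
    pqsigfunc   sigterm;
    pqsigfunc   sigusr1;
    /* ... */
} AuxProcSignalHandlers;

static const AuxProcSignalHandlers aux_signal_handlers[] = {
    [B_CHECKPOINTER] = {
        .sigterm = SIG_IGN,     /* checkpointer ignores SIGTERM */
        .sigusr1 = procsignal_sigusr1_handler,
    },
    [B_BG_WRITER] = {
        .sigterm = SignalHandlerForShutdownRequest,
        .sigusr1 = procsignal_sigusr1_handler,
    },
    /* ... one entry per aux process type ... */
};

AuxiliaryProcessMainCommon() would then pqsignal() each handler from
the MyBackendType entry before calling ProcSignalInit().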
Kind regards,
Matthias van de Meent
[0]: https://discord.com/channels/1258108670710124574/1346208113132568646/1496179622591598592
[1]: https://api.cirrus-ci.com/v1/artifact/task/6239099197063168/log/contrib/auto_explain/log/postmaster.log
Hi,
On 2026-04-22 13:21:02 +0200, Matthias van de Meent wrote:
If the PSB is emitted (and signaled to checkpointer) before the
checkpointer has registered its SIGUSR1 handler, then the checkpointer
won't receive the notice to check its procsignal slots, it won't
notice the updated procsignal flags, and it won't process the PSB; not
until it receives a new SIGUSR1.
Signals are sent to all processes that have their procsignal pss_pid
set, which is true for every process which has called ProcSignalInit,
which for the checkpointer (like other aux processes) happens in
AuxiliaryProcessMainCommon. However, checkpointer (also like other aux
processes) calls AuxiliaryProcessMainCommon before registering its
signal handlers, creating a small window in time where signals are
sent, but not handled.
Hm. Have we confirmed this happens?
CheckpointerMain() is called with all signals masked, so it should be ok for
the signal handler to only be set up after AuxiliaryProcessMainCommon(), as
long as it happens before
/*
* Unblock signals (they were blocked when the postmaster forked us)
*/
sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
as the signal delivery should be held until after unblocking signals.
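A standalone POSIX illustration of that holding behavior (not
PostgreSQL code): a signal sent while blocked stays pending and is
only delivered, to whatever handler is installed by then, once the
mask is cleared.

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1 = 0;

static void
usr1_handler(int signo)
{
    got_usr1 = 1;
}

int
main(void)
{
    sigset_t    mask;
    struct sigaction sa = {.sa_handler = usr1_handler};

    sigfillset(&mask);
    sigprocmask(SIG_SETMASK, &mask, NULL);  /* like the postmaster fork */

    kill(getpid(), SIGUSR1);    /* "sent" in the window: held pending */

    sigaction(SIGUSR1, &sa, NULL);  /* handler installed late, but in time */

    sigemptyset(&mask);
    sigprocmask(SIG_SETMASK, &mask, NULL);  /* pending signal delivered now */

    printf("handler ran: %d\n", got_usr1);  /* prints "handler ran: 1" */
    return 0;
}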
# A solution?
I don't have one right now.
I was thinking in the direction of having a compile-time aux process
signal handlers array per process type, which is read by
AuxiliaryProcessMainCommon() to register the signal handlers ahead of
ProcSignalInit(), but I've not yet looked at the exact implications,
nor analyzed whether that's actually safe. It would move some
duplicative code patterns into compile-time structs, but that's not
necessarily a universal good.
We really should move setup of most signal handlers into
AuxiliaryProcessMainCommon(). While there are some special cases (like
checkpointer not wanting to handle SIGTERM), that can be configured after
AuxiliaryProcessMainCommon(), as signals will still be blocked.
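Something like this (sketch only, not a patch; exact call sites would
need checking):

void
AuxiliaryProcessMainCommon(void)
{
    /* ... existing setup ... */

    /* handlers common to all aux processes, installed before we
     * become visible in the procsignal array */
    pqsignal(SIGUSR1, procsignal_sigusr1_handler);
    pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
    /* ... */

    ProcSignalInit(NULL, 0);    /* pss_pid set: SIGUSR1 may be sent to
                                 * us now, but stays blocked/pending */
}

void
CheckpointerMain(const void *startup_data, size_t startup_data_len)
{
    AuxiliaryProcessMainCommon();

    /* special case: signals are still blocked here, so overriding
     * the common handler is race-free */
    pqsignal(SIGTERM, SIG_IGN);

    sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
    /* ... */
}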
Greetings,
Andres Freund
On Wed, Apr 22, 2026 at 12:05 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2026-04-22 13:21:02 +0200, Matthias van de Meent wrote:
If the PSB is emitted (and signaled to checkpointer) before the
checkpointer has registered its SIGUSR1 handler, then the checkpointer
won't receive the notice to check its procsignal slots, it won't
notice the updated procsignal flags, and it won't process the PSB; not
until it receives a new SIGUSR1.
Signals are sent to all processes that have their procsignal pss_pid
set, which is true for every process which has called ProcSignalInit,
which for the checkpointer (like other aux processes) happens in
AuxiliaryProcessMainCommon. However, checkpointer (also like other aux
processes) calls AuxiliaryProcessMainCommon before registering its
signal handlers, creating a small window in time where signals are
sent, but not handled.
Hm. Have we confirmed this happens?
CheckpointerMain() is called with all signals masked, so it should be ok for
the signal handler to only be set up after AuxiliaryProcessMainCommon(), as
long as it happens before
/*
* Unblock signals (they were blocked when the postmaster forked us)
*/
sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
as the signal delivery should be held until after unblocking signals.
Right. The postmaster blocks all signals before starting child processes
as the following comment explains:
/*
* We start postmaster children with signals blocked. This allows them to
* install their own handlers before unblocking, to avoid races where they
* might run the postmaster's handler and miss an important control
* signal. With more analysis this could potentially be relaxed.
*/
sigprocmask(SIG_SETMASK, &BlockSig, &save_mask);
Investigating the issue, I found there is a race condition between
procsignal initialization and emitting the signal barrier that could
be the cause of this issue. Imagine the following scenario:
1. In ProcSignalInit(), the checkpointer initializes its
slot->pss_barrierGeneration with the global generation.
2. In EmitProcSignalBarrier(), the startup process checks the
checkpointer's procsignal slot but skips sending it a signal, as
slot->pss_pid is still 0. This can happen even though the checkpointer
holds a spinlock on its slot during initialization, because the first
pid check is done without acquiring the spinlock.
3. The checkpointer stores its PID in slot->pss_pid and releases the
spinlock.
4. In WaitForProcSignalBarrier(), the startup process checks the
checkpointer's procsignal slot, whose pss_barrierGeneration was
initialized in step 1, and waits for it to be updated. However, the
checkpointer never updates its barrier generation, as it never got the
signal.
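The interleaving can be seen in isolation with a standalone demo
(pthreads and C11 atomics; not PostgreSQL code, and the slot is
reduced to bare variables): the "emitter" does the unlocked pid check,
sees 0, skips the wakeup, and then waits forever on a slot that became
visible with a stale generation.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

/* one "procsignal slot" plus the shared barrier generation */
static atomic_int  pss_pid;
static atomic_long pss_barrierGeneration;
static atomic_long shared_generation;
static atomic_int  signalled;   /* stands in for SIGUSR1 delivery */

/* plays the checkpointer in ProcSignalInit() */
static void *
initializer(void *arg)
{
    /* step 1: adopt the current shared generation */
    atomic_store(&pss_barrierGeneration, atomic_load(&shared_generation));
    usleep(10000);              /* widen the race window for the demo */
    /* step 3: become visible */
    atomic_store(&pss_pid, 4242);
    return NULL;
}

int
main(void)
{
    pthread_t   t;
    long        gen;

    pthread_create(&t, NULL, initializer, NULL);
    usleep(1000);               /* let step 1 happen first */

    /* EmitProcSignalBarrier(): bump the generation, then wake slots */
    gen = atomic_fetch_add(&shared_generation, 1) + 1;

    /* step 2: the unlocked pid check still sees 0 -> no wakeup sent */
    if (atomic_load(&pss_pid) != 0)
        atomic_store(&signalled, 1);

    pthread_join(t, NULL);      /* the slot is visible from here on */

    /* step 4, WaitForProcSignalBarrier(): the slot has a pid but a
     * stale generation, and nobody was woken to advance it */
    for (int i = 0; i < 100; i++)
    {
        if (atomic_load(&pss_barrierGeneration) >= gen)
        {
            printf("barrier absorbed\n");
            return 0;
        }
        usleep(1000);
    }
    printf("stuck: pid=%d, generation %ld < %ld, signalled=%d\n",
           atomic_load(&pss_pid),
           atomic_load(&pss_barrierGeneration), gen,
           atomic_load(&signalled));
    return 1;
}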
Another, similar issue I found is that child processes could miss the
PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO signal during
initialization and end up in an inconsistent state, because
InitializeProcessXLogLogicalInfo() is called (in BaseInit()) before
ProcSignalInit(). If the startup process emits the signal to a process
that is between those two steps, the process would not reflect the
latest XLogLogicalInfo state. I think we should move
InitializeProcessXLogLogicalInfo() after ProcSignalInit(), as we do
for InitLocalDataChecksumState().
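In other words (a sketch of the intended ordering only, not the
attached patch itself):

ProcSignalInit(cancel_key, cancel_key_len); /* first become visible to
                                             * signal barriers ... */
InitializeProcessXLogLogicalInfo();         /* ... then snapshot the
                                             * shared XLogLogicalInfo
                                             * state; any change after
                                             * this point reaches us
                                             * via the barrier */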
I've attached a patch fixing the latter problem, as the fix is
straightforward.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
0001-Fix-race-condition-in-XLogLogicalInfo-and-ProcSignal.patch
Hello Sawada-san,
24.04.2026 20:52, Masahiko Sawada wrote:
Right. The postmaster blocks all signals before starting child processes
as the following comment explains:
/*
* We start postmaster children with signals blocked. This allows them to
* install their own handlers before unblocking, to avoid races where they
* might run the postmaster's handler and miss an important control
* signal. With more analysis this could potentially be relaxed.
*/
sigprocmask(SIG_SETMASK, &BlockSig, &save_mask);
Investigating the issue, I found there is a race condition between
procsignal initialization and emitting the signal barrier that could
be the cause of this issue. Imagine the following scenario:
1. In ProcSignalInit(), the checkpointer initializes its
slot->pss_barrierGeneration with the global generation.
2. In EmitProcSignalBarrier(), the startup process checks the
checkpointer's procsignal slot but skips sending it a signal, as
slot->pss_pid is still 0. This can happen even though the checkpointer
holds a spinlock on its slot during initialization, because the first
pid check is done without acquiring the spinlock.
3. The checkpointer stores its PID in slot->pss_pid and releases the
spinlock.
4. In WaitForProcSignalBarrier(), the startup process checks the
checkpointer's procsignal slot, whose pss_barrierGeneration was
initialized in step 1, and waits for it to be updated. However, the
checkpointer never updates its barrier generation, as it never got the
signal.
Thank you for the investigation and explanation of the issue!
I've been puzzled by a buildfarm failure [1] with such symptoms for a while
and even reproduced it locally once, but couldn't gather more information
that time. But now that you have described the scenario, I can easily
reproduce the same test failure with:
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -206,6 +206,7 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
if (cancel_key_len > 0)
memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len);
slot->pss_cancel_key_len = cancel_key_len;
+pg_usleep(10000);
pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
just running `meson test test_oat_hooks_*/regress` with the test multiplied x30:
26/30 test_oat_hooks_28 - postgresql:test_oat_hooks_28/regress OK 1.28s 2 subtests passed
27/30 test_oat_hooks_30 - postgresql:test_oat_hooks_30/regress OK 1.25s 2 subtests passed
28/30 test_oat_hooks_2 - postgresql:test_oat_hooks_2/regress ERROR 62.49s exit status 2
2026-04-27 17:34:44.290 UTC postmaster[1578102] LOG: starting PostgreSQL 19devel on x86_64-linux, compiled by
gcc-16.0.1, 64-bit
2026-04-27 17:34:44.290 UTC postmaster[1578102] LOG: listening on Unix socket "/tmp/pg_regress-QdhMPt/.s.PGSQL.40086"
2026-04-27 17:34:44.302 UTC startup[1578114] LOG: database system was shut down at 2026-04-27 17:34:44 UTC
2026-04-27 17:34:44.325 UTC dead-end client backend[1578133] [unknown] FATAL: the database system is starting up
...
2026-04-27 17:34:49.274 UTC dead-end client backend[1578643] [unknown] FATAL: the database system is starting up
2026-04-27 17:34:49.308 UTC startup[1578114] LOG: still waiting for backend with PID 1578110 to accept ProcSignalBarrier
2026-04-27 17:34:49.325 UTC dead-end client backend[1578645] [unknown] FATAL: the database system is starting up
...
2026-04-27 17:35:44.332 UTC dead-end client backend[1582376] [unknown] FATAL: the database system is starting up
2026-04-27 17:35:44.351 UTC startup[1578114] LOG: still waiting for backend with PID 1578110 to accept ProcSignalBarrier
2026-04-27 17:35:44.383 UTC dead-end client backend[1582379] [unknown] FATAL: the database system is starting up
Best regards,
Alexander
On Mon, Apr 27, 2026 at 11:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
Hello Sawada-san,
24.04.2026 20:52, Masahiko Sawada wrote:
Right. The postmaster blocks all signals before starting child processes
as the following comment explains:
/*
* We start postmaster children with signals blocked. This allows them to
* install their own handlers before unblocking, to avoid races where they
* might run the postmaster's handler and miss an important control
* signal. With more analysis this could potentially be relaxed.
*/
sigprocmask(SIG_SETMASK, &BlockSig, &save_mask);
Investigating the issue, I found there is a race condition between
procsignal initialization and emitting the signal barrier that could
be the cause of this issue. Imagine the following scenario:
1. In ProcSignalInit(), the checkpointer initializes its
slot->pss_barrierGeneration with the global generation.
2. In EmitProcSignalBarrier(), the startup process checks the
checkpointer's procsignal slot but skips sending it a signal, as
slot->pss_pid is still 0. This can happen even though the checkpointer
holds a spinlock on its slot during initialization, because the first
pid check is done without acquiring the spinlock.
3. The checkpointer stores its PID in slot->pss_pid and releases the
spinlock.
4. In WaitForProcSignalBarrier(), the startup process checks the
checkpointer's procsignal slot, whose pss_barrierGeneration was
initialized in step 1, and waits for it to be updated. However, the
checkpointer never updates its barrier generation, as it never got the
signal.
Thank you for the investigation and explanation of the issue!
I've been puzzled by a buildfarm failure [1] with such symptoms for a
while and even reproduced it locally once, but couldn't gather more
information that time. But now that you have described the scenario, I
can easily reproduce the same test failure with:
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -206,6 +206,7 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
if (cancel_key_len > 0)
memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len);
slot->pss_cancel_key_len = cancel_key_len;
+pg_usleep(10000);
pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
Thank you for testing this.
I've attached a patch to address the issue. I haven't verified it
across all versions yet, but I suspect the issue exists in the stable
branches as well. Previously, the issue rarely occurred because
EmitProcSignalBarrier() was only used for smgr invalidation. However,
now that we use signal barriers for online wal_level changes and
checksum status updates, this race condition is likely to be
encountered more frequently.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com