stopgap fix for signal handling during restore_command
On Wed, Feb 22, 2023 at 09:48:10PM +1300, Thomas Munro wrote:
On Tue, Feb 21, 2023 at 5:50 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
I'm happy to create a new thread if needed, but I can't tell if there is
any interest in this stopgap/back-branch fix. Perhaps we should just jump
straight to the long-term fix that Thomas is looking into.Unfortunately the latch-friendly subprocess module proposal I was
talking about would be for 17. I may post a thread fairly soon with
design ideas + list of problems and decision points as I see them, and
hopefully some sketch code, but it won't be a proposal for [/me checks
calendar] next week's commitfest and probably wouldn't be appropriate
in a final commitfest anyway, and I also have some other existing
stuff to clear first. So please do continue with the stopgap ideas.
Okay, here is a new thread...
Since v8.4, the startup process will proc_exit() immediately within its
SIGTERM handler while the restore_command executes via system(). Some
recent changes added unsafe code to the section where this behavior is
enabled [0]/messages/by-id/20230201105514.rsjl4bnhb65giyvo@alap3.anarazel.de. The long-term fix likely includes moving away from system()
completely, but we may want to have a stopgap/back-branch fix while that is
under development.
I've attached a patch set for a proposed stopgap fix. 0001 simply moves
the extra code outside of the Pre/PostRestoreCommand() block so that only
system() is executed while the SIGTERM handler might proc_exit(). This
restores the behavior that was in place from v8.4 to v14, so I don't expect
it to be too controversial. 0002 adds code to startup's SIGTERM handler to
call _exit() instead of proc_exit() if we are in a forked process from
system(), etc. It also adds assertions to ensure proc_exit(), ProcKill(),
and AuxiliaryProcKill() are not called within such forked processes.
Thoughts?
[0]: /messages/by-id/20230201105514.rsjl4bnhb65giyvo@alap3.anarazel.de
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v6-0001-Move-extra-code-out-of-the-Pre-PostRestoreCommand.patchtext/x-diff; charset=us-asciiDownload
From aee0df5e44cfd631fb601e18ab50469a662ced19 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v6 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section. This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block. This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index fcc87ff44f..41684418b6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
(errmsg_internal("executing restore command \"%s\"",
xlogRestoreCmd)));
+ fflush(NULL);
+ pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
/*
- * Check signals before restore command and reset afterwards.
+ * PreRestoreCommand() informs the SIGTERM handler for the startup process
+ * that it should proc_exit() right away. This is done for the duration of
+ * the system() call because there isn't a good way to break out while it
+ * is executing. Since we might call proc_exit() in a signal handler, it
+ * is best to put any additional logic before or after the
+ * PreRestoreCommand()/PostRestoreCommand() section.
*/
PreRestoreCommand();
/*
* Copy xlog from archival storage to XLOGDIR
*/
- fflush(NULL);
- pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
rc = system(xlogRestoreCmd);
- pgstat_report_wait_end();
PostRestoreCommand();
+
+ pgstat_report_wait_end();
pfree(xlogRestoreCmd);
if (rc == 0)
--
2.25.1
v6-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-f.patchtext/x-diff; charset=us-asciiDownload
From 76e4f00c1c08513893780bc50709b5434dc3470f Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v6 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().
Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 20 +++++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..de2b56c2fa 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"
+#include <unistd.h>
+
#include "access/xlog.h"
#include "access/xlogrecovery.h"
#include "access/xlogutils.h"
@@ -121,7 +123,23 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
int save_errno = errno;
if (in_restore_command)
- proc_exit(1);
+ {
+ /*
+ * If we are in a child process (e.g., forked by system() in
+ * RestoreArchivedFile()), we don't want to call any exit callbacks.
+ * The parent will take care of that.
+ */
+ if (MyProcPid == (int) getpid())
+ proc_exit(1);
+ else
+ {
+ const char msg[] = "StartupProcShutdownHandler() called in child process";
+ int rc pg_attribute_unused();
+
+ rc = write(STDERR_FILENO, msg, sizeof(msg));
+ _exit(1);
+ }
+ }
else
shutdown_requested = true;
WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..d5097dc008 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,9 @@ static int on_proc_exit_index,
void
proc_exit(int code)
{
+ /* proc_exit() is not safe in forked processes from system(), etc. */
+ Assert(MyProcPid == (int) getpid());
+
/* Clean up everything that must be cleaned up */
proc_exit_prepare(code);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 22b4278610..e3da0622d7 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -805,6 +805,7 @@ ProcKill(int code, Datum arg)
dlist_head *procgloballist;
Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */
/* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;
Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */
auxproc = &AuxiliaryProcs[proctype];
--
2.25.1
On Fri, Feb 24, 2023 at 12:15 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
Thoughts?
I think you should have a trailing \n when writing to stderr.
Here's that reproducer I speculated about (sorry I confused SIGQUIT
and SIGTERM in my earlier email, ENOCOFFEE). Seems to do the job, and
I tested on a Linux box for good measure. If you comment out the
kill(), "check PROVE_TESTS=t/002_archiving.pl" works fine
(demonstrating that that definition of system() works fine). With the
kill(), it reliably reaches 'TRAP: failed Assert("latch->owner_pid ==
MyProcPid")' without your patch, and with your patch it avoids it. (I
believe glibc's system() could reach it too with the right timing, but
I didn't try, my point being that the use of the OpenBSD system() here
is only because it's easier to grok and to wrangle.)
Attachments:
0001-XXX-inject-SIGTERM-into-system-at-an-inconvenient-mo.xatchapplication/octet-stream; name=0001-XXX-inject-SIGTERM-into-system-at-an-inconvenient-mo.xatchDownload
From 29c8b97ae41d6d285ab01e84b9848a74636bd81a Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 24 Feb 2023 13:08:31 +1300
Subject: [PATCH] XXX inject SIGTERM into system() at an inconvenient moment
---
src/backend/access/transam/xlogarchive.c | 68 +++++++++++++++++++++++-
1 file changed, 67 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 41684418b6..87e16f8a21 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -33,6 +33,72 @@
#include "storage/ipc.h"
#include "storage/lwlock.h"
+/*
+ * Copied from OpenBSD's lib/libc/stdlib/system.c because it's short and sweet
+ * and doesn't require any weird headers. The use of vfork() also screws the
+ * timing down a little (glibc's impl would use fork(), so child and parent
+ * would race to run proc_exec, while this version runs the handler in the
+ * child, and then the parent).
+ */
+
+#define _PATH_BSHELL "/bin/sh"
+
+static int
+doomed_system(const char *command)
+{
+ pid_t pid, cpid;
+ struct sigaction intsave, quitsave, sa;
+ sigset_t mask, omask;
+ int pstat;
+ char *argp[] = {"sh", "-c", NULL, NULL};
+
+ if (!command) /* just checking... */
+ return(1);
+
+ argp[2] = (char *)command;
+
+ sigemptyset(&mask);
+ sigaddset(&mask, SIGCHLD);
+ sigaddset(&mask, SIGINT);
+ sigaddset(&mask, SIGQUIT);
+ sigprocmask(SIG_BLOCK, &mask, &omask);
+ switch (cpid = fork()) {
+ case -1: /* error */
+ sigprocmask(SIG_SETMASK, &omask, NULL);
+ return(-1);
+ case 0: /* child */
+ sigprocmask(SIG_SETMASK, &omask, NULL);
+
+ kill(-getppid(), SIGTERM); // XXX a badly timed SIGTERM to group
+
+ execve(_PATH_BSHELL, argp, environ);
+ _exit(127);
+ }
+
+ sleep(2);
+
+ /* Ignore SIGINT and SIGQUIT while waiting for command to complete. */
+ memset(&sa, 0, sizeof(sa));
+ sigemptyset(&sa.sa_mask);
+ sa.sa_handler = SIG_IGN;
+ sigaction(SIGINT, &sa, &intsave);
+ sigaction(SIGQUIT, &sa, &quitsave);
+ sigemptyset(&mask);
+ sigaddset(&mask, SIGINT);
+ sigaddset(&mask, SIGQUIT);
+ sigprocmask(SIG_UNBLOCK, &mask, NULL);
+
+ do {
+ pid = waitpid(cpid, &pstat, 0);
+ } while (pid == -1 && errno == EINTR);
+
+ sigprocmask(SIG_SETMASK, &omask, NULL);
+ sigaction(SIGINT, &intsave, NULL);
+ sigaction(SIGQUIT, &quitsave, NULL);
+
+ return (pid == -1 ? -1 : pstat);
+}
+
/*
* Attempt to retrieve the specified file from off-line archival storage.
* If successful, fill "path" with its complete path (note that this will be
@@ -175,7 +241,7 @@ RestoreArchivedFile(char *path, const char *xlogfname,
/*
* Copy xlog from archival storage to XLOGDIR
*/
- rc = system(xlogRestoreCmd);
+ rc = doomed_system(xlogRestoreCmd);
PostRestoreCommand();
--
2.39.1
On Fri, Feb 24, 2023 at 1:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:
ENOCOFFEE
Erm, I realised after sending that I'd accidentally sent a version
that uses fork() anyway, and now if I change it back to vfork() it
doesn't fail the way I wanted to demonstrate, at least on Linux. I
don't have time or desire to dig into how Linux vfork() really works
so I'll leave it at that... but the patch as posted does seem to be a
useful tool for understanding this failure... please just ignore the
confused comments about fork() vs vfork() therein.
On Fri, Feb 24, 2023 at 01:25:01PM +1300, Thomas Munro wrote:
I think you should have a trailing \n when writing to stderr.
Oops. I added that in v7.
Here's that reproducer I speculated about (sorry I confused SIGQUIT
and SIGTERM in my earlier email, ENOCOFFEE). Seems to do the job, and
I tested on a Linux box for good measure. If you comment out the
kill(), "check PROVE_TESTS=t/002_archiving.pl" works fine
(demonstrating that that definition of system() works fine). With the
kill(), it reliably reaches 'TRAP: failed Assert("latch->owner_pid ==
MyProcPid")' without your patch, and with your patch it avoids it. (I
believe glibc's system() could reach it too with the right timing, but
I didn't try, my point being that the use of the OpenBSD system() here
is only because it's easier to grok and to wrangle.)
Thanks for providing the reproducer! I am seeing the behavior that you
described on my Linux machine.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v7-0001-Move-extra-code-out-of-the-Pre-PostRestoreCommand.patchtext/x-diff; charset=us-asciiDownload
From 6bdafd9980854fc9d81f37d450b86e5f5d4a9d84 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v7 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section. This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block. This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index fcc87ff44f..41684418b6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
(errmsg_internal("executing restore command \"%s\"",
xlogRestoreCmd)));
+ fflush(NULL);
+ pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
/*
- * Check signals before restore command and reset afterwards.
+ * PreRestoreCommand() informs the SIGTERM handler for the startup process
+ * that it should proc_exit() right away. This is done for the duration of
+ * the system() call because there isn't a good way to break out while it
+ * is executing. Since we might call proc_exit() in a signal handler, it
+ * is best to put any additional logic before or after the
+ * PreRestoreCommand()/PostRestoreCommand() section.
*/
PreRestoreCommand();
/*
* Copy xlog from archival storage to XLOGDIR
*/
- fflush(NULL);
- pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
rc = system(xlogRestoreCmd);
- pgstat_report_wait_end();
PostRestoreCommand();
+
+ pgstat_report_wait_end();
pfree(xlogRestoreCmd);
if (rc == 0)
--
2.25.1
v7-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-f.patchtext/x-diff; charset=us-asciiDownload
From 52b0ad1a8b8a94db1f540b7cbf916bc9b1a785b6 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v7 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().
Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 20 +++++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..bace915881 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"
+#include <unistd.h>
+
#include "access/xlog.h"
#include "access/xlogrecovery.h"
#include "access/xlogutils.h"
@@ -121,7 +123,23 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
int save_errno = errno;
if (in_restore_command)
- proc_exit(1);
+ {
+ /*
+ * If we are in a child process (e.g., forked by system() in
+ * RestoreArchivedFile()), we don't want to call any exit callbacks.
+ * The parent will take care of that.
+ */
+ if (MyProcPid == (int) getpid())
+ proc_exit(1);
+ else
+ {
+ const char msg[] = "StartupProcShutdownHandler() called in child process\n";
+ int rc pg_attribute_unused();
+
+ rc = write(STDERR_FILENO, msg, sizeof(msg));
+ _exit(1);
+ }
+ }
else
shutdown_requested = true;
WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..d5097dc008 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,9 @@ static int on_proc_exit_index,
void
proc_exit(int code)
{
+ /* proc_exit() is not safe in forked processes from system(), etc. */
+ Assert(MyProcPid == (int) getpid());
+
/* Clean up everything that must be cleaned up */
proc_exit_prepare(code);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 22b4278610..e3da0622d7 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -805,6 +805,7 @@ ProcKill(int code, Datum arg)
dlist_head *procgloballist;
Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */
/* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;
Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */
auxproc = &AuxiliaryProcs[proctype];
--
2.25.1
Hi,
On 2023-02-23 20:33:23 -0800, Nathan Bossart wrote:>
if (in_restore_command) - proc_exit(1); + { + /* + * If we are in a child process (e.g., forked by system() in + * RestoreArchivedFile()), we don't want to call any exit callbacks. + * The parent will take care of that. + */ + if (MyProcPid == (int) getpid()) + proc_exit(1); + else + { + const char msg[] = "StartupProcShutdownHandler() called in child process\n"; + int rc pg_attribute_unused(); + + rc = write(STDERR_FILENO, msg, sizeof(msg)); + _exit(1); + } + }
Why do we need that rc variable? Don't we normally get away with (void)
write(...)?
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..e3da0622d7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
--
2.25.1
I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.
Greetings,
Andres Freund
On Sat, Feb 25, 2023 at 11:07:42AM -0800, Andres Freund wrote:
On 2023-02-23 20:33:23 -0800, Nathan Bossart wrote:>
if (in_restore_command) - proc_exit(1); + { + /* + * If we are in a child process (e.g., forked by system() in + * RestoreArchivedFile()), we don't want to call any exit callbacks. + * The parent will take care of that. + */ + if (MyProcPid == (int) getpid()) + proc_exit(1); + else + { + const char msg[] = "StartupProcShutdownHandler() called in child process\n"; + int rc pg_attribute_unused(); + + rc = write(STDERR_FILENO, msg, sizeof(msg)); + _exit(1); + } + }Why do we need that rc variable? Don't we normally get away with (void)
write(...)?
My compiler complains about that. :/
../postgresql/src/backend/postmaster/startup.c: In function ‘StartupProcShutdownHandler’:
../postgresql/src/backend/postmaster/startup.c:139:11: error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]
139 | (void) write(STDERR_FILENO, msg, sizeof(msg));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..e3da0622d7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
--
2.25.1I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.
I don't mind changing this, but why is this a more interesting assertion?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sat, Feb 25, 2023 at 11:28:25AM -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:07:42AM -0800, Andres Freund wrote:
I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.I don't mind changing this, but why is this a more interesting assertion?
Here is a new patch set with this change.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v8-0001-Move-extra-code-out-of-the-Pre-PostRestoreCommand.patchtext/x-diff; charset=us-asciiDownload
From 575b959d5e2980454c7ebe10794fff7c5aaf68ed Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v8 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section. This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block. This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index fcc87ff44f..41684418b6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
(errmsg_internal("executing restore command \"%s\"",
xlogRestoreCmd)));
+ fflush(NULL);
+ pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
/*
- * Check signals before restore command and reset afterwards.
+ * PreRestoreCommand() informs the SIGTERM handler for the startup process
+ * that it should proc_exit() right away. This is done for the duration of
+ * the system() call because there isn't a good way to break out while it
+ * is executing. Since we might call proc_exit() in a signal handler, it
+ * is best to put any additional logic before or after the
+ * PreRestoreCommand()/PostRestoreCommand() section.
*/
PreRestoreCommand();
/*
* Copy xlog from archival storage to XLOGDIR
*/
- fflush(NULL);
- pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
rc = system(xlogRestoreCmd);
- pgstat_report_wait_end();
PostRestoreCommand();
+
+ pgstat_report_wait_end();
pfree(xlogRestoreCmd);
if (rc == 0)
--
2.25.1
v8-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-f.patchtext/x-diff; charset=us-asciiDownload
From 42a9760cd34e9aafa6ffe30b0e28e2c893ae6a2c Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v8 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().
Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 20 +++++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..bace915881 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"
+#include <unistd.h>
+
#include "access/xlog.h"
#include "access/xlogrecovery.h"
#include "access/xlogutils.h"
@@ -121,7 +123,23 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
int save_errno = errno;
if (in_restore_command)
- proc_exit(1);
+ {
+ /*
+ * If we are in a child process (e.g., forked by system() in
+ * RestoreArchivedFile()), we don't want to call any exit callbacks.
+ * The parent will take care of that.
+ */
+ if (MyProcPid == (int) getpid())
+ proc_exit(1);
+ else
+ {
+ const char msg[] = "StartupProcShutdownHandler() called in child process\n";
+ int rc pg_attribute_unused();
+
+ rc = write(STDERR_FILENO, msg, sizeof(msg));
+ _exit(1);
+ }
+ }
else
shutdown_requested = true;
WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..d5097dc008 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,9 @@ static int on_proc_exit_index,
void
proc_exit(int code)
{
+ /* proc_exit() is not safe in forked processes from system(), etc. */
+ Assert(MyProcPid == (int) getpid());
+
/* Clean up everything that must be cleaned up */
proc_exit_prepare(code);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 22b4278610..b9e2c3aafe 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -805,6 +805,7 @@ ProcKill(int code, Datum arg)
dlist_head *procgloballist;
Assert(MyProc != NULL);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */
/* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;
Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */
auxproc = &AuxiliaryProcs[proctype];
--
2.25.1
Hi,
On 2023-02-25 11:28:25 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:07:42AM -0800, Andres Freund wrote:
Why do we need that rc variable? Don't we normally get away with (void)
write(...)?My compiler complains about that. :/
../postgresql/src/backend/postmaster/startup.c: In function ‘StartupProcShutdownHandler’:
../postgresql/src/backend/postmaster/startup.c:139:11: error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]
139 | (void) write(STDERR_FILENO, msg, sizeof(msg));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Ick. I guess we've already encountered this, because we've apparently removed
all the (void) write cases. Which I am certain we had at some point. We still
do it for a bunch of other functions though. Ah, yes: aa90e148ca7,
27314d32a88, 6c72a28e5ce etc.
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..e3da0622d7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
--
2.25.1I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.I don't mind changing this, but why is this a more interesting assertion?
Because we so far have little to no protection against some state corruption
leading to releasing PGPROC that's not ours. I didn't actually mean that we
shouldn't check that MyProcPid == (int) getpid(), just that the much more
interesting case to check is that MyProc->pid matches, because that protect
against multiple releases, releasing the wrong slot, etc.
Greetings,
Andres Freund
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.
While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v9-0001-Move-extra-code-out-of-the-Pre-PostRestoreCommand.patchtext/x-diff; charset=us-asciiDownload
From 7dcf8fe947ab7b6e0b37ddd42a08bbbc560d4a37 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v9 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section. This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block. This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index fcc87ff44f..41684418b6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
(errmsg_internal("executing restore command \"%s\"",
xlogRestoreCmd)));
+ fflush(NULL);
+ pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
/*
- * Check signals before restore command and reset afterwards.
+ * PreRestoreCommand() informs the SIGTERM handler for the startup process
+ * that it should proc_exit() right away. This is done for the duration of
+ * the system() call because there isn't a good way to break out while it
+ * is executing. Since we might call proc_exit() in a signal handler, it
+ * is best to put any additional logic before or after the
+ * PreRestoreCommand()/PostRestoreCommand() section.
*/
PreRestoreCommand();
/*
* Copy xlog from archival storage to XLOGDIR
*/
- fflush(NULL);
- pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
rc = system(xlogRestoreCmd);
- pgstat_report_wait_end();
PostRestoreCommand();
+
+ pgstat_report_wait_end();
pfree(xlogRestoreCmd);
if (rc == 0)
--
2.25.1
v9-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-f.patchtext/x-diff; charset=us-asciiDownload
From e41a13a77b1c6cf156387cee3ca49fb2f4b1253b Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v9 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().
Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 17 ++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..261f992357 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"
+#include <unistd.h>
+
#include "access/xlog.h"
#include "access/xlogrecovery.h"
#include "access/xlogutils.h"
@@ -121,7 +123,20 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
int save_errno = errno;
if (in_restore_command)
- proc_exit(1);
+ {
+ /*
+ * If we are in a child process (e.g., forked by system() in
+ * RestoreArchivedFile()), we don't want to call any exit callbacks.
+ * The parent will take care of that.
+ */
+ if (MyProcPid == (int) getpid())
+ proc_exit(1);
+ else
+ {
+ write_stderr("StartupProcShutdownHandler() called in child process\n");
+ _exit(1);
+ }
+ }
else
shutdown_requested = true;
WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..d5097dc008 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,9 @@ static int on_proc_exit_index,
void
proc_exit(int code)
{
+ /* proc_exit() is not safe in forked processes from system(), etc. */
+ Assert(MyProcPid == (int) getpid());
+
/* Clean up everything that must be cleaned up */
proc_exit_prepare(code);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 22b4278610..b9e2c3aafe 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -805,6 +805,7 @@ ProcKill(int code, Datum arg)
dlist_head *procgloballist;
Assert(MyProc != NULL);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */
/* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;
Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */
auxproc = &AuxiliaryProcs[proctype];
--
2.25.1
Hi,
On 2023-02-25 14:06:29 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?
write_stderr() isn't signal safe, from what I can tell.
On Sun, Feb 26, 2023 at 10:00:29AM -0800, Andres Freund wrote:
On 2023-02-25 14:06:29 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?write_stderr() isn't signal safe, from what I can tell.
*facepalm* Sorry.
What precisely did you have in mind? AFAICT you are asking for a wrapper
around write().
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-02-26 11:39:00 -0800, Nathan Bossart wrote:
On Sun, Feb 26, 2023 at 10:00:29AM -0800, Andres Freund wrote:
On 2023-02-25 14:06:29 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?write_stderr() isn't signal safe, from what I can tell.
*facepalm* Sorry.
What precisely did you have in mind? AFAICT you are asking for a wrapper
around write().
Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.
It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.
Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.
Greetings,
Andres Freund
On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
On 2023-02-26 11:39:00 -0800, Nathan Bossart wrote:
What precisely did you have in mind? AFAICT you are asking for a wrapper
around write().Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.
Got it, thanks. I will try to put something together along these lines,
although I don't know if I'll pick up the interpolation support in this
thread.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Feb 28, 2023 at 08:36:03PM -0800, Nathan Bossart wrote:
On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.Got it, thanks. I will try to put something together along these lines,
although I don't know if I'll pick up the interpolation support in this
thread.
Here is an attempt at adding a signal safe function for writing to STDERR.
I didn't add support for format strings, but looking ahead, I think one
challenge will be avoiding va_start() and friends. In any case, IMO format
string support probably deserves its own thread.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v10-0001-Move-extra-code-out-of-the-Pre-PostRestoreComman.patchtext/x-diff; charset=us-asciiDownload
From b454604da126dc0894449ef4a3f49227f42a4df6 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v10 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section. This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block. This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index fcc87ff44f..41684418b6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
(errmsg_internal("executing restore command \"%s\"",
xlogRestoreCmd)));
+ fflush(NULL);
+ pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
/*
- * Check signals before restore command and reset afterwards.
+ * PreRestoreCommand() informs the SIGTERM handler for the startup process
+ * that it should proc_exit() right away. This is done for the duration of
+ * the system() call because there isn't a good way to break out while it
+ * is executing. Since we might call proc_exit() in a signal handler, it
+ * is best to put any additional logic before or after the
+ * PreRestoreCommand()/PostRestoreCommand() section.
*/
PreRestoreCommand();
/*
* Copy xlog from archival storage to XLOGDIR
*/
- fflush(NULL);
- pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
rc = system(xlogRestoreCmd);
- pgstat_report_wait_end();
PostRestoreCommand();
+
+ pgstat_report_wait_end();
pfree(xlogRestoreCmd);
if (rc == 0)
--
2.25.1
v10-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-.patchtext/x-diff; charset=us-asciiDownload
From fb6957da01f11b75d1a1966f32b00e2e77c789a0 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v10 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().
Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 17 ++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
src/backend/utils/error/elog.c | 28 ++++++++++++++++++++++++++++
src/include/utils/elog.h | 6 +-----
5 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..0e7de26bc2 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"
+#include <unistd.h>
+
#include "access/xlog.h"
#include "access/xlogrecovery.h"
#include "access/xlogutils.h"
@@ -121,7 +123,20 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
int save_errno = errno;
if (in_restore_command)
- proc_exit(1);
+ {
+ /*
+ * If we are in a child process (e.g., forked by system() in
+ * RestoreArchivedFile()), we don't want to call any exit callbacks.
+ * The parent will take care of that.
+ */
+ if (MyProcPid == (int) getpid())
+ proc_exit(1);
+ else
+ {
+ write_stderr_signal_safe("StartupProcShutdownHandler() called in child process\n");
+ _exit(1);
+ }
+ }
else
shutdown_requested = true;
WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..d5097dc008 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,9 @@ static int on_proc_exit_index,
void
proc_exit(int code)
{
+ /* proc_exit() is not safe in forked processes from system(), etc. */
+ Assert(MyProcPid == (int) getpid());
+
/* Clean up everything that must be cleaned up */
proc_exit_prepare(code);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 22b4278610..b9e2c3aafe 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -805,6 +805,7 @@ ProcKill(int code, Datum arg)
dlist_head *procgloballist;
Assert(MyProc != NULL);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */
/* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;
Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */
auxproc = &AuxiliaryProcs[proctype];
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 5898100acb..f9925c8d8e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -3730,6 +3730,34 @@ write_stderr(const char *fmt,...)
}
+/*
+ * Write a message to STDERR using only async-signal-safe functions. This can
+ * be used to safely emit a message from a signal handler.
+ *
+ * TODO: It is likely possible to safely do a limited amount of string
+ * interpolation (e.g., %s and %d), but that is not presently supported.
+ */
+void
+write_stderr_signal_safe(const char *fmt)
+{
+ int nwritten = 0;
+ int ntotal = strlen(fmt);
+
+ while (nwritten < ntotal)
+ {
+ int rc;
+
+ rc = write(STDERR_FILENO, fmt + nwritten, ntotal - nwritten);
+
+ /* Just give up on error. There isn't much else we can do. */
+ if (rc == -1)
+ return;
+
+ nwritten += rc;
+ }
+}
+
+
/*
* Adjust the level of a recovery-related message per trace_recovery_messages.
*
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 4a9562fdaa..20f1b54c6a 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -529,11 +529,7 @@ extern void write_pipe_chunks(char *data, int len, int dest);
extern void write_csvlog(ErrorData *edata);
extern void write_jsonlog(ErrorData *edata);
-/*
- * Write errors to stderr (or by equal means when stderr is
- * not available). Used before ereport/elog can be used
- * safely (memory context, GUC load etc)
- */
extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2);
+extern void write_stderr_signal_safe(const char *fmt);
#endif /* ELOG_H */
--
2.25.1
Hi,
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
On Tue, Feb 28, 2023 at 08:36:03PM -0800, Nathan Bossart wrote:
On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.Got it, thanks. I will try to put something together along these lines,
although I don't know if I'll pick up the interpolation support in this
thread.Here is an attempt at adding a signal safe function for writing to STDERR.
Cool.
I didn't add support for format strings, but looking ahead, I think one
challenge will be avoiding va_start() and friends. In any case, IMO format
string support probably deserves its own thread.
Makes sense to split that off.
FWIW, I think we could rely on va_start() et al to be signal safe. The
standardese isn't super clear about this, because they aren't functions, and
posix only talks about functions being async signal safe...
Greetings,
Andres Freund
On Wed, Mar 01, 2023 at 03:13:04PM -0800, Andres Freund wrote:
FWIW, I think we could rely on va_start() et al to be signal safe. The
standardese isn't super clear about this, because they aren't functions, and
posix only talks about functions being async signal safe...
Good to know. I couldn't tell whether that was a safe assumption from
briefly reading around.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 01, 2023 at 03:13:04PM -0800, Andres Freund wrote:
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
Here is an attempt at adding a signal safe function for writing to STDERR.
Cool.
I'm gently bumping this thread to see if anyone had additional feedback on
the proposed patches [0]/messages/by-id/20230301224751.GA1823946@nathanxps13. The intent was to back-patch these as needed and
to pursue a long-term fix in v17. Are there any concerns with that?
[0]: /messages/by-id/20230301224751.GA1823946@nathanxps13
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 22.04.23 01:07, Nathan Bossart wrote:
On Wed, Mar 01, 2023 at 03:13:04PM -0800, Andres Freund wrote:
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
Here is an attempt at adding a signal safe function for writing to STDERR.
Cool.
I'm gently bumping this thread to see if anyone had additional feedback on
the proposed patches [0]. The intent was to back-patch these as needed and
to pursue a long-term fix in v17. Are there any concerns with that?
Is this still being contemplated? What is the status of this?
On Sun, Oct 01, 2023 at 08:50:15PM +0200, Peter Eisentraut wrote:
Is this still being contemplated? What is the status of this?
I'll plan on committing this in the next couple of days.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
Subject: [PATCH v10 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
LGTM
From fb6957da01f11b75d1a1966f32b00e2e77c789a0 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v10 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 17 ++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
src/backend/utils/error/elog.c | 28 ++++++++++++++++++++++++++++
src/include/utils/elog.h | 6 +-----
5 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..b9e2c3aafe 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
I'd make these elog(PANIC), I think. The paths are not performance critical
enough that a single branch hurts, so the overhead of the check is irrelevant,
and the consequences of calling ProcKill() twice for the same process are very
severe.
+/* + * Write a message to STDERR using only async-signal-safe functions. This can + * be used to safely emit a message from a signal handler. + * + * TODO: It is likely possible to safely do a limited amount of string + * interpolation (e.g., %s and %d), but that is not presently supported. + */ +void +write_stderr_signal_safe(const char *fmt)
As is, this isn't a format, so I'd probably just name it s or str :)
-/* - * Write errors to stderr (or by equal means when stderr is - * not available). Used before ereport/elog can be used - * safely (memory context, GUC load etc) - */ extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2); +extern void write_stderr_signal_safe(const char *fmt);
Not sure why you removed the comment?
Greetings,
Andres Freund
On Tue, Oct 10, 2023 at 04:40:28PM -0700, Andres Freund wrote:
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..b9e2c3aafe 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
I'd make these elog(PANIC), I think. The paths are not performance critical
enough that a single branch hurts, so the overhead of the check is irrelevant,
and the consequences of calling ProcKill() twice for the same process are very
severe.
Right. Should we write_stderr_signal_safe() and then abort() to keep these
paths async-signal-safe?
+/* + * Write a message to STDERR using only async-signal-safe functions. This can + * be used to safely emit a message from a signal handler. + * + * TODO: It is likely possible to safely do a limited amount of string + * interpolation (e.g., %s and %d), but that is not presently supported. + */ +void +write_stderr_signal_safe(const char *fmt)As is, this isn't a format, so I'd probably just name it s or str :)
Yup.
-/* - * Write errors to stderr (or by equal means when stderr is - * not available). Used before ereport/elog can be used - * safely (memory context, GUC load etc) - */ extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2); +extern void write_stderr_signal_safe(const char *fmt);Not sure why you removed the comment?
I think it was because it's an exact copy of the comment above the function
in elog.c, and I didn't want to give the impression that it applied to the
signal-safe one, too. I added it back along with a new comment for
write_stderr_signal_safe().
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v11-0001-Move-extra-code-out-of-the-Pre-PostRestoreComman.patchtext/x-diff; charset=us-asciiDownload
From 0a5d6a420352ec6601bd9967321d82b8a7808d67 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v11 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section. This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block. This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index f3fb92c8f9..b998cd651c 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
(errmsg_internal("executing restore command \"%s\"",
xlogRestoreCmd)));
+ fflush(NULL);
+ pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
/*
- * Check signals before restore command and reset afterwards.
+ * PreRestoreCommand() informs the SIGTERM handler for the startup process
+ * that it should proc_exit() right away. This is done for the duration of
+ * the system() call because there isn't a good way to break out while it
+ * is executing. Since we might call proc_exit() in a signal handler, it
+ * is best to put any additional logic before or after the
+ * PreRestoreCommand()/PostRestoreCommand() section.
*/
PreRestoreCommand();
/*
* Copy xlog from archival storage to XLOGDIR
*/
- fflush(NULL);
- pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
rc = system(xlogRestoreCmd);
- pgstat_report_wait_end();
PostRestoreCommand();
+
+ pgstat_report_wait_end();
pfree(xlogRestoreCmd);
if (rc == 0)
--
2.25.1
v11-0002-Don-t-proc_exit-in-startup-s-SIGTERM-handler-if-.patchtext/x-diff; charset=us-asciiDownload
From ec9b038e8f4b1ecfece456e308b658ecdc8c8f9d Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v11 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().
Instead, emit a message to STDERR and _exit() in this case. This
change also adds checks in proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 17 ++++++++++++++++-
src/backend/storage/ipc/ipc.c | 4 ++++
src/backend/storage/lmgr/proc.c | 8 ++++++++
src/backend/utils/error/elog.c | 28 ++++++++++++++++++++++++++++
src/include/utils/elog.h | 6 ++++++
5 files changed, 62 insertions(+), 1 deletion(-)
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..0e7de26bc2 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"
+#include <unistd.h>
+
#include "access/xlog.h"
#include "access/xlogrecovery.h"
#include "access/xlogutils.h"
@@ -121,7 +123,20 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
int save_errno = errno;
if (in_restore_command)
- proc_exit(1);
+ {
+ /*
+ * If we are in a child process (e.g., forked by system() in
+ * RestoreArchivedFile()), we don't want to call any exit callbacks.
+ * The parent will take care of that.
+ */
+ if (MyProcPid == (int) getpid())
+ proc_exit(1);
+ else
+ {
+ write_stderr_signal_safe("StartupProcShutdownHandler() called in child process\n");
+ _exit(1);
+ }
+ }
else
shutdown_requested = true;
WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..6591b5d6a8 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,10 @@ static int on_proc_exit_index,
void
proc_exit(int code)
{
+ /* not safe if forked by system(), etc. */
+ if (MyProcPid != (int) getpid())
+ elog(PANIC, "proc_exit() called in child process");
+
/* Clean up everything that must be cleaned up */
proc_exit_prepare(code);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5b663a2997..e9e445bb21 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -806,6 +806,10 @@ ProcKill(int code, Datum arg)
Assert(MyProc != NULL);
+ /* not safe if forked by system(), etc. */
+ if (MyProc->pid != (int) getpid())
+ elog(PANIC, "ProcKill() called in child process");
+
/* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -926,6 +930,10 @@ AuxiliaryProcKill(int code, Datum arg)
Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ /* not safe if forked by system(), etc. */
+ if (MyProc->pid != (int) getpid())
+ elog(PANIC, "AuxiliaryProcKill() called in child process");
+
auxproc = &AuxiliaryProcs[proctype];
Assert(MyProc == auxproc);
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8e1f3e8521..eeb238331e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -3733,6 +3733,34 @@ write_stderr(const char *fmt,...)
}
+/*
+ * Write a message to STDERR using only async-signal-safe functions. This can
+ * be used to safely emit a message from a signal handler.
+ *
+ * TODO: It is likely possible to safely do a limited amount of string
+ * interpolation (e.g., %s and %d), but that is not presently supported.
+ */
+void
+write_stderr_signal_safe(const char *str)
+{
+ int nwritten = 0;
+ int ntotal = strlen(str);
+
+ while (nwritten < ntotal)
+ {
+ int rc;
+
+ rc = write(STDERR_FILENO, str + nwritten, ntotal - nwritten);
+
+ /* Just give up on error. There isn't much else we can do. */
+ if (rc == -1)
+ return;
+
+ nwritten += rc;
+ }
+}
+
+
/*
* Adjust the level of a recovery-related message per trace_recovery_messages.
*
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 4a9562fdaa..0292e88b4f 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -536,4 +536,10 @@ extern void write_jsonlog(ErrorData *edata);
*/
extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2);
+/*
+ * Write a message to STDERR using only async-signal-safe functions. This can
+ * be used to safely emit a message from a signal handler.
+ */
+extern void write_stderr_signal_safe(const char *fmt);
+
#endif /* ELOG_H */
--
2.25.1
On Tue, Oct 10, 2023 at 09:54:18PM -0500, Nathan Bossart wrote:
On Tue, Oct 10, 2023 at 04:40:28PM -0700, Andres Freund wrote:
I'd make these elog(PANIC), I think. The paths are not performance critical
enough that a single branch hurts, so the overhead of the check is irrelevant,
and the consequences of calling ProcKill() twice for the same process are very
severe.Right. Should we write_stderr_signal_safe() and then abort() to keep these
paths async-signal-safe?
Hm. I see that elog() is called elsewhere in proc_exit(), and it does not
appear to be async-signal-safe. Am I missing something?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-10-10 22:29:34 -0500, Nathan Bossart wrote:
On Tue, Oct 10, 2023 at 09:54:18PM -0500, Nathan Bossart wrote:
On Tue, Oct 10, 2023 at 04:40:28PM -0700, Andres Freund wrote:
I'd make these elog(PANIC), I think. The paths are not performance critical
enough that a single branch hurts, so the overhead of the check is irrelevant,
and the consequences of calling ProcKill() twice for the same process are very
severe.Right. Should we write_stderr_signal_safe() and then abort() to keep these
paths async-signal-safe?Hm. I see that elog() is called elsewhere in proc_exit(), and it does not
appear to be async-signal-safe. Am I missing something?
We shouldn't call proc_exit() in a signal handler. We perhaps have a few
remaining calls left, but we should (and I think in some cases are) working on
removing those.
Greetings,
Andres Freund
On Tue, Oct 10, 2023 at 08:39:29PM -0700, Andres Freund wrote:
We shouldn't call proc_exit() in a signal handler. We perhaps have a few
remaining calls left, but we should (and I think in some cases are) working on
removing those.
Hmm. I don't recall anything remaining, even after a quick check.
FWIW, I was under the impression that Thomas' work done in
0da096d78e1e4 has cleaned up the last bits of that.
--
Michael
On Wed, Oct 11, 2023 at 01:02:14PM +0900, Michael Paquier wrote:
On Tue, Oct 10, 2023 at 08:39:29PM -0700, Andres Freund wrote:
We shouldn't call proc_exit() in a signal handler. We perhaps have a few
remaining calls left, but we should (and I think in some cases are) working on
removing those.
Got it.
Hmm. I don't recall anything remaining, even after a quick check.
FWIW, I was under the impression that Thomas' work done in
0da096d78e1e4 has cleaned up the last bits of that.
StartupProcShutdownHandler() remains, at least. Of the other items in
Tom's list from 2020 [0]/messages/by-id/148145.1599703626@sss.pgh.pa.us, bgworker_die() and FloatExceptionHandler() are
also still unsafe. RecoveryConflictInterrupt() should be fixed by 0da096d,
and StandbyDeadLockHandler() and StandbyTimeoutHandler() should be fixed by
8900b5a and 8f1537d, respectively.
[0]: /messages/by-id/148145.1599703626@sss.pgh.pa.us
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Committed and back-patched.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Oct 17, 2023 at 10:46:47AM -0500, Nathan Bossart wrote:
Committed and back-patched.
... and it looks like some of the back-branches are failing for Windows.
I'm assuming this is because c290e79 was only back-patched to v15. My
first instinct is just to back-patch that one all the way to v11, but maybe
there's an alternative involving #ifdef WIN32. Are there any concerns with
back-patching c290e79?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Nathan Bossart <nathandbossart@gmail.com> writes:
... and it looks like some of the back-branches are failing for Windows.
I'm assuming this is because c290e79 was only back-patched to v15. My
first instinct is just to back-patch that one all the way to v11, but maybe
there's an alternative involving #ifdef WIN32. Are there any concerns with
back-patching c290e79?
Sounds fine to me.
regards, tom lane
On Tue, Oct 17, 2023 at 12:47:29PM -0400, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
... and it looks like some of the back-branches are failing for Windows.
I'm assuming this is because c290e79 was only back-patched to v15. My
first instinct is just to back-patch that one all the way to v11, but maybe
there's an alternative involving #ifdef WIN32. Are there any concerns with
back-patching c290e79?Sounds fine to me.
Thanks, done.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com