stopgap fix for signal handling during restore_command
On Wed, Feb 22, 2023 at 09:48:10PM +1300, Thomas Munro wrote:
On Tue, Feb 21, 2023 at 5:50 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
I'm happy to create a new thread if needed, but I can't tell if there is
any interest in this stopgap/back-branch fix. Perhaps we should just jump
straight to the long-term fix that Thomas is looking into.Unfortunately the latch-friendly subprocess module proposal I was
talking about would be for 17. I may post a thread fairly soon with
design ideas + list of problems and decision points as I see them, and
hopefully some sketch code, but it won't be a proposal for [/me checks
calendar] next week's commitfest and probably wouldn't be appropriate
in a final commitfest anyway, and I also have some other existing
stuff to clear first. So please do continue with the stopgap ideas.
Okay, here is a new thread...
Since v8.4, the startup process will proc_exit() immediately within its
SIGTERM handler while the restore_command executes via system(). Some
recent changes added unsafe code to the section where this behavior is
enabled [0]/messages/by-id/20230201105514.rsjl4bnhb65giyvo@alap3.anarazel.de. The long-term fix likely includes moving away from system()
completely, but we may want to have a stopgap/back-branch fix while that is
under development.
I've attached a patch set for a proposed stopgap fix. 0001 simply moves
the extra code outside of the Pre/PostRestoreCommand() block so that only
system() is executed while the SIGTERM handler might proc_exit(). This
restores the behavior that was in place from v8.4 to v14, so I don't expect
it to be too controversial. 0002 adds code to startup's SIGTERM handler to
call _exit() instead of proc_exit() if we are in a forked process from
system(), etc. It also adds assertions to ensure proc_exit(), ProcKill(),
and AuxiliaryProcKill() are not called within such forked processes.
Thoughts?
[0]: /messages/by-id/20230201105514.rsjl4bnhb65giyvo@alap3.anarazel.de
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Feb 24, 2023 at 12:15 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
Thoughts?
I think you should have a trailing \n when writing to stderr.
Here's that reproducer I speculated about (sorry I confused SIGQUIT
and SIGTERM in my earlier email, ENOCOFFEE). Seems to do the job, and
I tested on a Linux box for good measure. If you comment out the
kill(), "check PROVE_TESTS=t/002_archiving.pl" works fine
(demonstrating that that definition of system() works fine). With the
kill(), it reliably reaches 'TRAP: failed Assert("latch->owner_pid ==
MyProcPid")' without your patch, and with your patch it avoids it. (I
believe glibc's system() could reach it too with the right timing, but
I didn't try, my point being that the use of the OpenBSD system() here
is only because it's easier to grok and to wrangle.)
Attachments:
0001-XXX-inject-SIGTERM-into-system-at-an-inconvenient-mo.xatchapplication/octet-stream; name=0001-XXX-inject-SIGTERM-into-system-at-an-inconvenient-mo.xatchDownload+67-2
On Fri, Feb 24, 2023 at 1:25 PM Thomas Munro <thomas.munro@gmail.com> wrote:
ENOCOFFEE
Erm, I realised after sending that I'd accidentally sent a version
that uses fork() anyway, and now if I change it back to vfork() it
doesn't fail the way I wanted to demonstrate, at least on Linux. I
don't have time or desire to dig into how Linux vfork() really works
so I'll leave it at that... but the patch as posted does seem to be a
useful tool for understanding this failure... please just ignore the
confused comments about fork() vs vfork() therein.
On Fri, Feb 24, 2023 at 01:25:01PM +1300, Thomas Munro wrote:
I think you should have a trailing \n when writing to stderr.
Oops. I added that in v7.
Here's that reproducer I speculated about (sorry I confused SIGQUIT
and SIGTERM in my earlier email, ENOCOFFEE). Seems to do the job, and
I tested on a Linux box for good measure. If you comment out the
kill(), "check PROVE_TESTS=t/002_archiving.pl" works fine
(demonstrating that that definition of system() works fine). With the
kill(), it reliably reaches 'TRAP: failed Assert("latch->owner_pid ==
MyProcPid")' without your patch, and with your patch it avoids it. (I
believe glibc's system() could reach it too with the right timing, but
I didn't try, my point being that the use of the OpenBSD system() here
is only because it's easier to grok and to wrangle.)
Thanks for providing the reproducer! I am seeing the behavior that you
described on my Linux machine.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-02-23 20:33:23 -0800, Nathan Bossart wrote:>
if (in_restore_command) - proc_exit(1); + { + /* + * If we are in a child process (e.g., forked by system() in + * RestoreArchivedFile()), we don't want to call any exit callbacks. + * The parent will take care of that. + */ + if (MyProcPid == (int) getpid()) + proc_exit(1); + else + { + const char msg[] = "StartupProcShutdownHandler() called in child process\n"; + int rc pg_attribute_unused(); + + rc = write(STDERR_FILENO, msg, sizeof(msg)); + _exit(1); + } + }
Why do we need that rc variable? Don't we normally get away with (void)
write(...)?
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..e3da0622d7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
--
2.25.1
I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.
Greetings,
Andres Freund
On Sat, Feb 25, 2023 at 11:07:42AM -0800, Andres Freund wrote:
On 2023-02-23 20:33:23 -0800, Nathan Bossart wrote:>
if (in_restore_command) - proc_exit(1); + { + /* + * If we are in a child process (e.g., forked by system() in + * RestoreArchivedFile()), we don't want to call any exit callbacks. + * The parent will take care of that. + */ + if (MyProcPid == (int) getpid()) + proc_exit(1); + else + { + const char msg[] = "StartupProcShutdownHandler() called in child process\n"; + int rc pg_attribute_unused(); + + rc = write(STDERR_FILENO, msg, sizeof(msg)); + _exit(1); + } + }Why do we need that rc variable? Don't we normally get away with (void)
write(...)?
My compiler complains about that. :/
../postgresql/src/backend/postmaster/startup.c: In function ‘StartupProcShutdownHandler’:
../postgresql/src/backend/postmaster/startup.c:139:11: error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]
139 | (void) write(STDERR_FILENO, msg, sizeof(msg));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..e3da0622d7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
--
2.25.1I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.
I don't mind changing this, but why is this a more interesting assertion?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sat, Feb 25, 2023 at 11:28:25AM -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:07:42AM -0800, Andres Freund wrote:
I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.I don't mind changing this, but why is this a more interesting assertion?
Here is a new patch set with this change.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-02-25 11:28:25 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:07:42AM -0800, Andres Freund wrote:
Why do we need that rc variable? Don't we normally get away with (void)
write(...)?My compiler complains about that. :/
../postgresql/src/backend/postmaster/startup.c: In function ‘StartupProcShutdownHandler’:
../postgresql/src/backend/postmaster/startup.c:139:11: error: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Werror=unused-result]
139 | (void) write(STDERR_FILENO, msg, sizeof(msg));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Ick. I guess we've already encountered this, because we've apparently removed
all the (void) write cases. Which I am certain we had at some point. We still
do it for a bunch of other functions though. Ah, yes: aa90e148ca7,
27314d32a88, 6c72a28e5ce etc.
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..e3da0622d7 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProcPid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
--
2.25.1I think the much more interesting assertion here would be to check that
MyProc->pid equals the current pid.I don't mind changing this, but why is this a more interesting assertion?
Because we so far have little to no protection against some state corruption
leading to releasing PGPROC that's not ours. I didn't actually mean that we
shouldn't check that MyProcPid == (int) getpid(), just that the much more
interesting case to check is that MyProc->pid matches, because that protect
against multiple releases, releasing the wrong slot, etc.
Greetings,
Andres Freund
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.
While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-02-25 14:06:29 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?
write_stderr() isn't signal safe, from what I can tell.
On Sun, Feb 26, 2023 at 10:00:29AM -0800, Andres Freund wrote:
On 2023-02-25 14:06:29 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?write_stderr() isn't signal safe, from what I can tell.
*facepalm* Sorry.
What precisely did you have in mind? AFAICT you are asking for a wrapper
around write().
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-02-26 11:39:00 -0800, Nathan Bossart wrote:
On Sun, Feb 26, 2023 at 10:00:29AM -0800, Andres Freund wrote:
On 2023-02-25 14:06:29 -0800, Nathan Bossart wrote:
On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote:
I think I opined on this before, but we really ought to have a function to do
some minimal signal safe output. Implemented centrally, instead of being open
coded in a bunch of places.While looking around for the right place to put this, I noticed that
there's a write_stderr() function in elog.c that we might be able to use.
I used that in v9. WDYT?write_stderr() isn't signal safe, from what I can tell.
*facepalm* Sorry.
What precisely did you have in mind? AFAICT you are asking for a wrapper
around write().
Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.
It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.
Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.
Greetings,
Andres Freund
On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
On 2023-02-26 11:39:00 -0800, Nathan Bossart wrote:
What precisely did you have in mind? AFAICT you are asking for a wrapper
around write().Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.
Got it, thanks. I will try to put something together along these lines,
although I don't know if I'll pick up the interpolation support in this
thread.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Tue, Feb 28, 2023 at 08:36:03PM -0800, Nathan Bossart wrote:
On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.Got it, thanks. I will try to put something together along these lines,
although I don't know if I'll pick up the interpolation support in this
thread.
Here is an attempt at adding a signal safe function for writing to STDERR.
I didn't add support for format strings, but looking ahead, I think one
challenge will be avoiding va_start() and friends. In any case, IMO format
string support probably deserves its own thread.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
On Tue, Feb 28, 2023 at 08:36:03PM -0800, Nathan Bossart wrote:
On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
Partially I just want something that can easily be searched for, that can have
comments attached to it documenting why what it is doing is safe.It'd not be a huge amount of work to have a slow and restricted string
interpolation support, to make it easier to write messages. Converting floats
is probably too hard to do safely, and I'm not sure %m can safely be
supported. But basic things like %d would be pretty simple.Basically a loop around the format string that directly writes to stderr using
write(), and only supports a signal safe subset of normal format strings.Got it, thanks. I will try to put something together along these lines,
although I don't know if I'll pick up the interpolation support in this
thread.Here is an attempt at adding a signal safe function for writing to STDERR.
Cool.
I didn't add support for format strings, but looking ahead, I think one
challenge will be avoiding va_start() and friends. In any case, IMO format
string support probably deserves its own thread.
Makes sense to split that off.
FWIW, I think we could rely on va_start() et al to be signal safe. The
standardese isn't super clear about this, because they aren't functions, and
posix only talks about functions being async signal safe...
Greetings,
Andres Freund
On Wed, Mar 01, 2023 at 03:13:04PM -0800, Andres Freund wrote:
FWIW, I think we could rely on va_start() et al to be signal safe. The
standardese isn't super clear about this, because they aren't functions, and
posix only talks about functions being async signal safe...
Good to know. I couldn't tell whether that was a safe assumption from
briefly reading around.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 01, 2023 at 03:13:04PM -0800, Andres Freund wrote:
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
Here is an attempt at adding a signal safe function for writing to STDERR.
Cool.
I'm gently bumping this thread to see if anyone had additional feedback on
the proposed patches [0]/messages/by-id/20230301224751.GA1823946@nathanxps13. The intent was to back-patch these as needed and
to pursue a long-term fix in v17. Are there any concerns with that?
[0]: /messages/by-id/20230301224751.GA1823946@nathanxps13
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 22.04.23 01:07, Nathan Bossart wrote:
On Wed, Mar 01, 2023 at 03:13:04PM -0800, Andres Freund wrote:
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
Here is an attempt at adding a signal safe function for writing to STDERR.
Cool.
I'm gently bumping this thread to see if anyone had additional feedback on
the proposed patches [0]. The intent was to back-patch these as needed and
to pursue a long-term fix in v17. Are there any concerns with that?
Is this still being contemplated? What is the status of this?
On Sun, Oct 01, 2023 at 08:50:15PM +0200, Peter Eisentraut wrote:
Is this still being contemplated? What is the status of this?
I'll plan on committing this in the next couple of days.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On 2023-03-01 14:47:51 -0800, Nathan Bossart wrote:
Subject: [PATCH v10 1/2] Move extra code out of the Pre/PostRestoreCommand()
block.
LGTM
From fb6957da01f11b75d1a1966f32b00e2e77c789a0 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandbossart@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v10 2/2] Don't proc_exit() in startup's SIGTERM handler if
forked by system().Instead, emit a message to STDERR and _exit() in this case. This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
src/backend/postmaster/startup.c | 17 ++++++++++++++++-
src/backend/storage/ipc/ipc.c | 3 +++
src/backend/storage/lmgr/proc.c | 2 ++
src/backend/utils/error/elog.c | 28 ++++++++++++++++++++++++++++
src/include/utils/elog.h | 6 +-----
5 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..b9e2c3aafe 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist;Assert(MyProc != NULL);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. *//* Make sure we're out of the sync rep lists */
SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
PGPROC *proc;Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+ Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */auxproc = &AuxiliaryProcs[proctype];
I'd make these elog(PANIC), I think. The paths are not performance critical
enough that a single branch hurts, so the overhead of the check is irrelevant,
and the consequences of calling ProcKill() twice for the same process are very
severe.
+/* + * Write a message to STDERR using only async-signal-safe functions. This can + * be used to safely emit a message from a signal handler. + * + * TODO: It is likely possible to safely do a limited amount of string + * interpolation (e.g., %s and %d), but that is not presently supported. + */ +void +write_stderr_signal_safe(const char *fmt)
As is, this isn't a format, so I'd probably just name it s or str :)
-/* - * Write errors to stderr (or by equal means when stderr is - * not available). Used before ereport/elog can be used - * safely (memory context, GUC load etc) - */ extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2); +extern void write_stderr_signal_safe(const char *fmt);
Not sure why you removed the comment?
Greetings,
Andres Freund